  HyPhy Documentation: Standard Analyses: PositiveSelection.bf

      Summary: Given a codon data file, a tree, and a codon substitution model, PositiveSelection performs a series of analyses, along the lines of the Yang, Goldman 2000 paper, to identify sites in the data that are under selective pressure.

     Input: A codon data file in any recognizable format. Any of the predefined genetic code translation tables can be used to interpret the data. A Newick-style tree string can either be included in the data file or supplied via a separate tree file. Note that if a data file contains more than one tree, only the first one will be read.
    The user will be prompted to specify the cut-off Bayesian level for a site to be considered under selective pressure (a number between 0 and 1), and how many rate classes should be used in discretizing continuous distributions.
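
    As an illustration of how such a cut-off might be applied, the sketch below computes, for each site, the posterior probability of belonging to a rate class with dN/dS > 1 and flags the sites that exceed the cut-off. This is not HyPhy code: the function names, the use of Bayes' rule on per-class site likelihoods, and all numbers are assumptions made for the example.

```python
def posterior_class_probs(weights, site_likelihoods):
    """Posterior probability of each rate class at one site (Bayes' rule)."""
    joint = [w * l for w, l in zip(weights, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def sites_under_selection(rates, weights, likelihoods_by_site, cutoff):
    """Return (site index, posterior) pairs for sites whose posterior mass
    on classes with rate > 1 exceeds the cut-off."""
    flagged = []
    for site, site_likelihoods in enumerate(likelihoods_by_site):
        post = posterior_class_probs(weights, site_likelihoods)
        p_selected = sum(p for r, p in zip(rates, post) if r > 1.0)
        if p_selected > cutoff:
            flagged.append((site, p_selected))
    return flagged


# Hypothetical values, for illustration only.
rates = [0.0, 1.0, 3.0]        # dN/dS of each rate class
weights = [0.6, 0.3, 0.1]      # mixing proportions of the classes
likelihoods_by_site = [        # likelihood of each site given each class
    [1e-3, 5e-4, 1e-5],
    [1e-6, 1e-5, 4e-4],
]
flagged = sites_under_selection(rates, weights, likelihoods_by_site, cutoff=0.9)
```

    With these made-up inputs only the second site (index 1) is flagged; raising the cut-off makes the test more conservative.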

    dN/dS variability is modeled by the following 13 distributions:

  1. Single Rate: no rate variation.
  2. Neutral: rates are 0 or 1 with mixing parameter P.
  3. Selection: rates are 0, 1, or W (estimated) with mixing parameters P1 and P2.
  4. Discrete: rates are R, R*M1, or R*M2 with mixing parameters P1 and P2.
  5. Freqs: rates are 0, 1/3, 2/3, 1, 3 with mixing parameters P1, P2, P3, and P4.
  6. Gamma: rates are sampled (by conditional mean) from a two-parameter gamma distribution.
  7. 2 Gamma: rates are sampled (by conditional mean) from a mixture of a two-parameter gamma and a mean 1 gamma.
  8. Beta: rates are sampled (by conditional mean) from a two-parameter beta distribution (thus the rates are all in [0,1]).
  9. Beta+w: rates are sampled (by conditional mean) from a mixture of a two-parameter beta distribution and the class with rate W.
  10. Beta & (Gamma+1): rates are sampled (by conditional mean) from a mixture of a two-parameter beta distribution and a two-parameter gamma distribution shifted to [1,Infinity).
  11. Beta & (Normal>1): rates are sampled (by conditional mean) from a mixture of a two-parameter beta distribution and a two-parameter normal distribution restricted to [1,Infinity).
  12. 0 & 2 (Normal>1): rates are sampled (by conditional mean) from a mixture of the zero rate class, a two-parameter normal, and a mean 1 normal (restricted to [0,Infinity)).
  13. 3 Normal: rates are sampled (by conditional mean) from a mixture of a mean 0 normal, a mean 1 normal, and a two-parameter normal, all restricted to [0,Infinity).
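
    Sampling "by conditional mean" can be sketched as follows: the continuous distribution is cut into rate classes, and each class is represented by the mean of the distribution within that class. The example below does this for a beta distribution using simple numerical integration. It is an illustration only, not HyPhy's implementation; the equal-probability classes, the function names, and the parameter values (2 and 5) are assumptions.

```python
import math


def beta_pdf(x, p, q):
    """Density of the Beta(p, q) distribution for 0 < x < 1."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_norm + (p - 1) * math.log(x) + (q - 1) * math.log(1 - x))


def discretize_by_conditional_mean(pdf, lower, upper, n_classes, grid=20000):
    """Split [lower, upper] into n_classes bins of (approximately) equal
    probability and return the conditional mean of the rate in each bin.
    Integration uses the midpoint rule on a uniform grid."""
    step = (upper - lower) / grid
    xs = [lower + (i + 0.5) * step for i in range(grid)]
    mass = [pdf(x) * step for x in xs]
    target = sum(mass) / n_classes

    rates = []
    acc_mass = 0.0     # probability accumulated in the current bin
    acc_moment = 0.0   # integral of x * pdf(x) over the current bin
    for x, m in zip(xs, mass):
        acc_mass += m
        acc_moment += x * m
        # Close the bin once it holds its share; the last bin takes the rest.
        if acc_mass >= target and len(rates) < n_classes - 1:
            rates.append(acc_moment / acc_mass)
            acc_mass = 0.0
            acc_moment = 0.0
    rates.append(acc_moment / acc_mass)
    return rates


# Four rate classes from a Beta(2, 5) distribution (hypothetical parameters).
rates = discretize_by_conditional_mean(lambda x: beta_pdf(x, 2.0, 5.0), 0.0, 1.0, 4)
```

    Because the classes carry equal weight in this sketch, the average of the returned rates approximates the mean of the underlying distribution (2/7 here). More rate classes follow the continuous distribution more closely but increase the cost of each likelihood evaluation.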

    The user may choose any combination of the distributions (or all 13) to run.

    Models: The MG94 or GY94 codon model, with either 3 or 9 frequency parameters, can be selected for the analysis.

    Output: A summary table is printed to the screen, and a detailed report is written to a file chosen by the user.

     Result Processing Tools: None are really applicable.

 
Sergei L. Kosakovsky Pond and Spencer V. Muse, 1997-2002