Summary:
Given two nucleotide, amino acid, or codon data files with the same number
of sequences, a tree, and a relevant substitution model, RelativeRatio
calculates two sets of maximum likelihood estimates (MLEs):
- Unconstrained Model: all model parameters (rates) for each data file are
  estimated independently. This is the alternative hypothesis.
- Ratio Constrained Model: rates in the second data set are assumed to be
  proportional to rates in the first one. This is the null hypothesis.
A likelihood ratio test is then performed to determine whether the null
hypothesis can be rejected in favor of the alternative hypothesis.
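The ratio constraint can be pictured with a minimal Python sketch. This is only an illustration of the proportionality idea, not HYPHY code: the function name `constrain_rates`, the branch names, and the numeric values are all hypothetical, and `ratio` plays the role of the ratio parameter R.

```python
def constrain_rates(first_rates, ratio):
    """Derive second-data-set rates as ratio * first-data-set rates.

    first_rates: dict mapping a branch name to its rate in data set 1.
    ratio: the single shared proportionality factor (R).
    """
    return {branch: ratio * rate for branch, rate in first_rates.items()}


# Hypothetical branch rates for the first data set.
first = {"A": 0.10, "B": 0.25, "Node1": 0.05}

# Under the null hypothesis, one extra parameter (R) determines every
# rate in the second data set; here R = 2.0 is an arbitrary example value.
second = constrain_rates(first, 2.0)
```

Under the unconstrained model, by contrast, each rate in the second data set would be a free parameter of its own.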
Input: Two nucleotide, amino acid, or codon data files in any recognizable
format. HYPHY uses the following table to translate nucleotide ambiguities
(or amino acid characters). Any of the predefined genetic code translation
tables can be used to interpret codon data.
A Newick style tree string can either be included in the first data file
or supplied via a separate tree file. Note that if a data file contains
more than one tree, only the first one will be read. The second file should
contain at least as many sequences as the first one; if it contains more
sequences, then only as many as there are in the first file will be used.
Models: Any of the standard nucleotide, amino acid, or codon models can be
selected for the analysis, based on the sequence type. If the model used in
the relative ratio analysis has two or more parameters per branch, the user
is prompted to select which of the parameters to constrain.
Output: The standard output depends on the "Likelihood Output" option
selected in "Preferences". By default, that option is to print the maximum
ln-likelihood followed by a tree string with branch lengths representing
the expected number of substitutions per codon. For a complete list of
output options, refer to the Output Formats page. The analysis will report
MLEs for the unconstrained model (alternative hypothesis), the ratio
constrained model (null hypothesis), and the likelihood ratio test. The
latter includes the value of the likelihood ratio statistic and the P-Value,
based on the fact that the LR statistic is asymptotically distributed as
Chi-squared with degrees of freedom equal to the number of constrained
model parameters (that number varies based on the model chosen).
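The likelihood ratio test described above can be sketched in a few lines of Python. This is a minimal illustration, not HYPHY's implementation. It assumes that the unconstrained ln-likelihood is the sum of the ln-likelihoods of the two independently fitted data sets, and that the degrees of freedom are a positive integer supplied by the caller. The function names and all numeric values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability P(X > x) for a Chi-squared variable
    with a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting point: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_first, lnl_second, lnl_constrained, df):
    """Return (LR statistic, asymptotic P-value)."""
    lnl_unconstrained = lnl_first + lnl_second  # assumed: independent fits
    lr = 2.0 * (lnl_unconstrained - lnl_constrained)
    return lr, chi2_sf(lr, df)


# Hypothetical ln-likelihood values and degrees of freedom.
lr, p_value = likelihood_ratio_test(-1200.5, -1350.25, -2555.0, df=3)
```

A small P-value suggests rejecting the ratio constrained (null) model. The degrees of freedom must be taken from the analysis output, since they vary with the model chosen.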
Result Processing Tools: After the analysis is finished, the following
options are available in the "Results" submenu of the "Analyses" menu:
For the first four processing tools, HYPHY will present the user with a
choice of whether constrained or unconstrained MLEs are desired. "lf1" is
the unconstrained likelihood function for the first data set, "lf2" is the
unconstrained likelihood function for the second data set, and
"lfConstrained" is the joint likelihood function with ratio constraints in
place.
Bootstrapping tools output a summary of the simulations on the screen: the
mean and variance of the unconstrained and constrained MLEs, the same
descriptors for the ratio parameter R, and the sample distribution of the
likelihood ratio statistic, including its mean, variance, and simulated
P-Value (the proportion of simulated LR values that are larger than the
original LR). They also save the LR for each data replicate in a
comma-separated file suitable for import by data processing programs.
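As a rough illustration of how the saved LR values could be post-processed, the Python sketch below recomputes the mean, variance, and simulated P-Value from comma-separated text. The exact layout of the file HYPHY writes is not specified here, so the sketch assumes a flat list of numbers separated by commas or line breaks. It also assumes the sample (n - 1) variance is wanted. The function name and the example values are hypothetical.

```python
import statistics


def summarize_bootstrap(lr_text, observed_lr):
    """Summarize simulated LR values given as comma-separated text."""
    tokens = lr_text.replace("\n", ",").split(",")
    values = [float(tok) for tok in tokens if tok.strip()]
    # Simulated P-value: share of replicates with LR above the original LR.
    larger = sum(1 for v in values if v > observed_lr)
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.variance(values),  # assumed: sample variance
        "p_value": larger / len(values),
    }


# Hypothetical replicate values and observed statistic.
summary = summarize_bootstrap("1.0,2.0,3.0,4.0", observed_lr=2.5)
```

To process a real file, read its contents into a string first and pass that string as `lr_text`.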