Prerequisites | This example uses the ideas of Simple Hypothesis testing. |
Description | The objective of this example is to do the following: using a
data file of primate mtDNA, we wish to test the hypothesis that nucleotides
at positions 1, 2 and 3 within that data set evolved under the relative ratio constraint.
This example (sans the bootstrap) is saved in 'Example3.bf' in the 'Saves' directory.
|
Create data partitions | In HyPhy, open the data file 'brown.nuc' from 'data'
in 'GUIExamples'. We need to define three data partitions: one for all the nucleotides
in position 1 within a codon (1, 4, 7, etc.), one for position 2 (2, 5, 8, etc.) and one
for position 3. We could of course just select the sites we want by shift clicking and
define the partition from the selection, but this is way too tedious! We will instead
make use of the combing tool HyPhy provides.
To that end, first select all the sites in the data ('Select All' or Command-A). Next,
click on the combing button in the lower left corner of the data window (use tooltips
to find which one is the combing button!). The following dialog will appear:
The combing width should be 3, and to select all the sites in position 1 in every
triplet, check the box at 1, and make sure the boxes at 2 and 3 are unchecked, then
click OK. You have just selected all the sites in position 1 in every triplet.
Make sure that the partition you have just defined is not selected (highlighted)
and click the combing button again. If the partition were selected, the combing
tool would apply to that partition only, not the entire dataset, which is not what
we intend to do. Repeat the combing for positions 2 and 3. Rename the partitions
to 'Position1', 'Position2' and 'Position3' (recall that if you double-click
on the partition row, you can edit its properties, including the name of the
partition). Note how the scroll bar at the top of the window is a series of alternating
color stripes. It reflects the pattern of our partitions.
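
The site pattern that the combing tool selects can be sketched in a few lines of Python. This is only an illustration of the indexing, not HyPhy code: the function name `comb_sites` and the 12-site alignment length are assumptions made up for the example.

```python
def comb_sites(n_sites, width, checked):
    """Return the 1-based site indices picked by a comb of the given width.

    `checked` holds the 1-based offsets inside each window that are ticked,
    e.g. {1} for the first position of every triplet.
    """
    return [site for site in range(1, n_sites + 1)
            if (site - 1) % width + 1 in checked]


# Hypothetical 12-site alignment, comb width 3 (one codon per window).
position1 = comb_sites(12, 3, {1})
position2 = comb_sites(12, 3, {2})
position3 = comb_sites(12, 3, {3})

print(position1)
print(position2)
print(position3)
```

Together the three selections cover every site exactly once, which is why the partition bar shows a regular pattern of alternating stripes.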
Assign the topology from the tree file (brown_tree) to each partition,
set the models to 'HKY85', parameters to 'Global' (shared transversion/
transition ratio for all branches in a tree) and frequencies to 'Partition'.
You can assign each partition individually, or use the pulldown arrows in
column headers to apply the same selection to all partitions. Construct
the likelihood function. Observe that HyPhy has changed the names of
tree topologies for the second and the third partition. The reason for that
is that each partition has its own tree (even though the topologies
of all trees are the same). HyPhy simply cloned those topologies, but
endowed each partition with its own set of model parameters and branch lengths.
The end result should look like this:
| Define the hypotheses | Optimize the likelihood function and
switch to the parameter table window. Open a window for each tree
(double click on the name of the tree) and arrange them side by side with
the table. For people with small screen sizes (myself included), it
may be convenient to switch to the more compact tree display mode.
Once you open the tree window, use 'Tree Display Options' from
the 'Tree' window to bring up the options dialog and check 'Scale
tree by resizing the window'. After that, resize the window and position
it next to the parameter table.
You may surmise, from looking at the trees, that while their lengths
are different (e.g. position 1 vs. position 3), the overall shapes of
the trees look somewhat similar, even though the relative lengths
of 'Orangutan' and 'Gibbon' are not the same between positions 1
and 3. The relative ratio assumption states that for two trees,
each branch parameter in one tree is proportional to the
corresponding branch parameter in the second tree, and the constant
of proportionality is the same for all branches.
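
As a rough illustration of what this proportionality means, here is a small Python sketch. The branch values, the internal branch name 'Node1' and the helper names are all made up for the example; it does not use HyPhy or the actual estimates from this data set.

```python
import math


def apply_relative_ratio(reference_tree, ratio):
    """Build a tree whose every branch parameter is `ratio` times the reference."""
    return {branch: ratio * value for branch, value in reference_tree.items()}


def is_proportional(tree_a, tree_b, rel_tol=1e-9):
    """True if tree_a[b] / tree_b[b] is the same constant for every branch b."""
    ratios = [tree_a[b] / tree_b[b] for b in tree_b]
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


# Made-up branch parameters for one reference tree (not real estimates).
tree3 = {"Orangutan": 0.30, "Gibbon": 0.45, "Node1": 0.10}

# Under the constraint, a second tree is determined by a single ratio.
tree2 = apply_relative_ratio(tree3, 0.25)
print(is_proportional(tree2, tree3))

# Changing one branch independently breaks the proportionality.
unconstrained = dict(tree2, Gibbon=0.50)
print(is_proportional(unconstrained, tree3))
```

The point of the sketch is that a constrained tree contributes one free parameter (the ratio) instead of one parameter per branch.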
Before we define any constraints, let us save the current MLEs for
the unconstrained model. Save the state of the likelihood function
as 'Full Model'.
Now we are ready to define the constraints. Select trees brown2
and brown3 in the parameter table, and click on the relative
ratio button. A pulldown list (with 1 item) will appear. HyPhy
presents you with the choice of parameters to impose the relative
ratio constraint on. In our example, there is only one parameter
per branch, thus we have no choice but to select it. Next, a
dialog box appears prompting you to select the name of the
proportionality parameter. Call it 'RelRatio23'.
Observe that trees 2 and 3 are now proportional because each
parameter of tree 2 is constrained to be a multiple of the
corresponding parameter in tree 3. Repeat the same procedure
for trees brown_tree and brown_tree3, but name the proportionality
variable 'RelRatio13'. Optimize the likelihood function (from
the 'Likelihood' menu).
Save the LF state as 'Relative Ratio'. Define it as the Null hypothesis,
and 'Full Model' as the Alternative.
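
The number of degrees of freedom for comparing these two models can be worked out by counting branch parameters. The sketch below assumes one parameter per branch (as in this example) and that all other model parameters are counted identically in both models. The value of seven branches per tree is an assumption, chosen because it reproduces the 12 degrees of freedom that the LR test in this example reports; the function name is hypothetical.

```python
def relative_ratio_df(branches_per_tree, n_trees=3):
    """Difference in free branch parameters between the full and constrained models.

    Full model: every tree has its own parameter for every branch.
    Relative ratio model: one reference tree keeps its branch parameters and
    each remaining tree is described by a single proportionality constant.
    """
    full = n_trees * branches_per_tree
    constrained = branches_per_tree + (n_trees - 1)
    return full - constrained


# Assumed: 7 branches per tree, three partitions.
print(relative_ratio_df(7))
```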
|
Perform the LRT | Now, run the LR test; the result is:
2*LR = 17.8021
DF = 12
P-Value = 0.121832
We fail to reject the null hypothesis at conventional significance levels
(P > 0.05), so the data are consistent with positions 1, 2 and 3 evolving
under the relative ratio constraint.
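
The asymptotic p-value can be checked by hand, since it is the upper tail of a chi-squared distribution evaluated at 2*LR. The Python sketch below is not how HyPhy computes it; it uses the closed-form tail that exists for an even number of degrees of freedom, and the function name is made up for the example.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail P(X >= x) of a chi-squared distribution with even df.

    For df = 2m the tail has the closed form
        exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Values reported above: 2*LR = 17.8021 with 12 degrees of freedom.
p_value = chi2_sf_even_df(17.8021, 12)
print(f"{p_value:.6f}")
```

The printed value should agree with the P-Value reported above to about four decimal places.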
|
Perform the bootstrap | Run 100 iterates of the parametric bootstrap to compare the asymptotic p-value to
the simulated p-value. In my run, the parametric bootstrap distribution of the
LR statistic was close to the theoretical chi-squared distribution with 12 degrees of
freedom.
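
For readers unfamiliar with how a simulated p-value is obtained from bootstrap replicates, here is a minimal Python sketch. It assumes the simple definition "fraction of simulated statistics at least as large as the observed one"; HyPhy's exact convention may differ. The simulated values are stand-ins drawn directly from a chi-squared distribution with 12 degrees of freedom, not results of refitting the models to simulated alignments, and the function name is hypothetical.

```python
import random


def bootstrap_p_value(simulated_stats, observed_stat):
    """Fraction of simulated LR statistics that are >= the observed one."""
    hits = sum(1 for s in simulated_stats if s >= observed_stat)
    return hits / len(simulated_stats)


# Stand-in for 100 bootstrap replicates: chi-squared(12) is a gamma
# distribution with shape 6 and scale 2. A real bootstrap would instead
# simulate data under the null model and refit both models each time.
rng = random.Random(0)
simulated = [rng.gammavariate(6.0, 2.0) for _ in range(100)]

print(bootstrap_p_value(simulated, 17.8021))
```

With only 100 replicates the simulated p-value is resolved in steps of 0.01, so some scatter around the asymptotic value is to be expected.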
|
|