  HyPhy Documentation: HyPhy GUI Examples: More Hypothesis Testing

Prerequisites
This example uses the ideas of Simple Hypothesis testing.
Description
The objective of this example is the following: using a data file of primate mtDNA, we wish to test the hypothesis that nucleotides at codon positions 1, 2 and 3 within that data set evolved under the relative ratio constraint.

This example (sans the bootstrap) is saved in 'Example3.bf' in the 'Saves' directory.

Create data partitions
In HyPhy, open the data file 'brown.nuc' from 'data' in 'GUIExamples'. We need to define three data partitions: one for all the nucleotides in position 1 within a codon (1,4,7 etc), one for position 2 (2,5,8 etc) and one for position 3. We could of course just select the sites we want by shift clicking and define the partition from the selection, but this is way too tedious! We will instead make use of the combing tool HyPhy provides.

To that end, first select all the sites in the data ('Select All' or Command-A). Next, click on the combing button in the lower left corner of the data window (use tooltips to find which one is the combing button!). The following dialog will appear:

The combing width should be 3. To select all the sites in position 1 of every triplet, check the box at 1, make sure the boxes at 2 and 3 are unchecked, then click OK. You have just selected all the sites in position 1 in every triplet. Make sure that the partition you have just defined is not selected (highlighted) and click the combing button again. If the partition were selected, the combing tool would apply to that partition only, not the entire dataset, which is not what we intend to do. Repeat the combing for positions 2 and 3. Rename the partitions to 'Position1', 'Position2' and 'Position3' (recall that if you double click on the partition row, you can edit its properties, including the name of the partition). Note how the scroll bar at the top of the window is now a series of alternating color stripes. It reflects the pattern of our partitions.
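
The selection rule that the combing tool applies can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not HyPhy code: the function name `comb_sites` and the 12-site alignment length are assumptions made up for the example.

```python
def comb_sites(n_sites, width, checked):
    """Return the 1-based site indices picked out by a comb of the given width.

    `checked` holds the 1-based positions inside each window whose boxes are ticked.
    """
    return [site for site in range(1, n_sites + 1)
            if (site - 1) % width + 1 in checked]

# Hypothetical 12-site alignment, used for illustration only.
position1 = comb_sites(12, 3, {1})
position2 = comb_sites(12, 3, {2})
position3 = comb_sites(12, 3, {3})

print(position1)  # [1, 4, 7, 10]
print(position2)  # [2, 5, 8, 11]
print(position3)  # [3, 6, 9, 12]
```

Each site falls into exactly one of the three lists, which is why the three partitions together cover the whole alignment.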

Assign the topology from the tree file (brown_tree) to each partition, set the models to 'HKY85', parameters to 'Global' (shared transversion/transition ratio for all branches in a tree) and frequencies to 'Partition'. You can assign each partition individually, or use the pulldown arrows in column headers to apply the same selection to all partitions. Construct the likelihood function. Observe that HyPhy has changed the names of tree topologies for the second and the third partition. The reason is that each partition has its own tree (even though the topologies of all trees are the same). HyPhy simply cloned those topologies, but endowed each partition with its own set of model parameters and branch lengths.

The end result should look like this:

Define the hypotheses
Optimize the likelihood function and switch to the parameter table window. Open a window for each tree (double click on the name of the tree) and arrange them side by side with the table. For people with small screen sizes (myself included), it may be convenient to switch to the more compact tree display mode. Once you open the tree window, use 'Tree Display Options' from the 'Tree' window to bring up the options dialog and check 'Scale tree by resizing the window'. After that, resize the window and position it next to the parameter table.

You may surmise, from looking at the trees, that while their lengths are different (e.g. position 1 vs position 3), the overall shapes of the trees look somewhat similar, even though the relative lengths of 'Orangutan' and 'Gibbon' are not the same between positions 1 and 3. The relative ratio assumption states that for two trees, each branch parameter in one tree is proportional to the corresponding branch parameter in the second tree, and the constant of proportionality is the same for all branches.
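
To make the proportionality assumption concrete, here is a minimal Python sketch. It is not HyPhy code; the helper names and the branch lengths are made up for illustration, and only the branch names 'Orangutan' and 'Gibbon' come from the example.

```python
def apply_relative_ratio(reference_lengths, ratio):
    """Build a tree whose every branch is `ratio` times the matching reference branch."""
    return {branch: ratio * length for branch, length in reference_lengths.items()}

def is_proportional(tree_a, tree_b, tol=1e-9):
    """True if tree_a[branch] / tree_b[branch] is the same for every branch."""
    ratios = [tree_a[branch] / tree_b[branch] for branch in tree_b]
    return max(ratios) - min(ratios) <= tol

# Made-up branch parameters for one tree (illustrative values only).
tree3 = {"Orangutan": 0.40, "Gibbon": 0.50, "Node1": 0.10}

# Under the constraint, a second tree is fully determined by tree3 and one ratio.
tree2 = apply_relative_ratio(tree3, 0.25)

print(is_proportional(tree2, tree3))  # True
```

The point of the sketch is the parameter count: without the constraint, two trees with n branches each need 2n branch parameters; with it, they need n branch parameters plus a single ratio.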

Before we define any constraints, let us save current MLEs for the unconstrained model. Save the state of the likelihood function as 'Full Model'.

Now we are ready to define the constraints. Select trees brown_tree2 and brown_tree3 in the parameter table, and click on the relative ratio button. A pulldown list (with 1 item) will appear. HyPhy presents you with the choice of parameters to impose the relative ratio constraint on. In our example, there is only one parameter per branch, thus we have no choice but to select it. Next, a dialog box appears prompting you to select the name of the proportionality parameter. Call it 'RelRatio23'.

Observe that trees 2 and 3 are now proportional because each parameter of tree 2 is constrained to be a multiple of the corresponding parameter in tree 3. Repeat the same procedure for trees brown_tree and brown_tree3, but name the proportionality variable 'RelRatio13'. Optimize the likelihood function (from the 'Likelihood' menu).

Save the LF state as 'Relative Ratio'. Define it as the Null hypothesis, and 'Full Model' as the Alternative.

Perform the LRT
Now, run the LR test; the result is:
	
     2*LR = 17.8021
     DF = 12
     P-Value = 0.121832 
We fail to reject the null hypothesis at the 5% level, so the data are consistent with positions 1, 2 and 3 evolving under the relative ratio constraint.
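
The reported degrees of freedom and p-value can be checked by hand with a short Python sketch. It rests on one assumption that the text does not state: each tree has 7 branches (an unrooted tree of 5 taxa), which is what makes the parameter difference come out to 12. The transversion/transition ratios and the frequencies appear in both models, so only branch parameters are counted. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with an even number of df.

    For even df the tail has the closed form
    exp(-x/2) * sum over k < df/2 of (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

# Assumption (not stated in the text): 7 branches per tree.
branches = 7
full_params = 3 * branches      # each partition has its own branch parameters
null_params = branches + 2      # one free tree plus RelRatio23 and RelRatio13
df = full_params - null_params

p_value = chi2_sf_even_df(17.8021, df)
print(df, p_value)
```

Under that assumption the sketch gives 12 degrees of freedom and a tail probability that agrees with the reported P-Value to about four decimal places.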
Perform the bootstrap
Run 100 iterates of the parametric bootstrap to compare the asymptotic p-value to the simulated p-value. I got the following results:

Parametric: close to the theoretical Chi-squared with 12 degrees of freedom.
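
For readers who want to see how a simulated p-value is obtained from bootstrap replicates, here is a minimal Python sketch. It does not call HyPhy. The replicate statistics below are stand-ins drawn from a chi-squared distribution with 12 degrees of freedom, whereas real replicates come from simulating data under the null model and refitting both models. The function name and the random seed are assumptions.

```python
import random

def bootstrap_p_value(observed, simulated):
    """Fraction of simulated LR statistics at least as large as the observed one."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)

# Stand-in for bootstrap output: 100 draws from chi-squared(12),
# generated as a Gamma variable with shape 6 and scale 2.
rng = random.Random(0)
simulated = [rng.gammavariate(6.0, 2.0) for _ in range(100)]

print(bootstrap_p_value(17.8021, simulated))
```

With only 100 iterates the simulated p-value is resolved in steps of 0.01, so small differences from the asymptotic value are expected.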

 
Sergei L. Kosakovsky Pond and Spencer V. Muse, 1997-2002