Open Data File | From the 'File' menu, select 'Open' and then 'Open Data File' (or use
the shortcut Command-Option-O). In the file dialog that appears, navigate
to the folder 'GUIExamples/data' and open the file 'brown.nuc'. A data window should open,
displaying the sequence data in 'brown.nuc', like the one below: |
Scrolling | To scroll through data sites, use the scrolling tool
located right above the sequence data (white bar with an orange rectangle) -
we'll refer to it as the partition bar.
The white bar represents the entire data set, and the orange rectangle represents the
visible portion of it. Click and drag the orange rectangle, or click anywhere
outside the rectangle to jump to that location. |
Select Data | Select some data sites from the file by clicking and dragging
in the sequence display window. You can hold down 'Command' and click on individual
sites to select them, or click on the starting site then hold 'Shift' and click on the
ending site to select a contiguous range of sites. |
Define a Partition | Select all the sites in the data ('Select All' from
the 'Edit' menu - Command-A - may be useful). Select 'Selection->Partition' from
the 'Data' menu (Command-2). There should now be a new entry
in the partition table
right below the data window, as in this picture:
Note that the partition bar
has changed to indicate that all sites in the data belong to our data partition,
which is color-coded in red. To change the color or the partition name, double click
on the partition row, or anywhere on that partition's color in the partition bar.
A dialog box will appear:
Change the default name of the partition to 'AllData'. HyPhy partition names can't
contain spaces, and must begin with a letter or an underscore.
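
The naming rule can be checked mechanically. The following is a minimal Python sketch, not part of HyPhy: the function name is hypothetical, and limiting the remaining characters to letters, digits and underscores is an assumption that goes beyond the two rules stated above.

```python
import re

# Rules taken from the text: no spaces, first character is a letter or an underscore.
# Assumption (not stated in the text): the remaining characters are limited to
# letters, digits and underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_partition_name(name: str) -> bool:
    """Return True if `name` satisfies the partition naming rules sketched above."""
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ("AllData", "All Data", "1stCodon", "_third"):
    print(candidate, is_valid_partition_name(candidate))
```

Under these assumptions 'AllData' and '_third' are accepted, while 'All Data' (contains a space) and '1stCodon' (starts with a digit) are rejected.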
|
Partition Settings | In order to start an analysis, we must assign a tree
topology to the partition, select/define an evolutionary model and choose model
options. Our data set contained a tree string, so we shall select it for the
analysis. In the partition table, click on the arrow next to 'None' in the
'Tree' column for our partition row. At the bottom of the resulting menu, there
will be the tree loaded from the data file, called 'brown_tree'. Select it. For the evolutionary
model, we will choose HKY85 (click on the arrow next to 'None' in the 'Substitution
Model' and choose HKY85). Once you have done that, note that the other columns,
representing model options, have activated. For parameters, let us choose 'Rate Het.',
for rate heterogeneity (gamma). The default setting for equilibrium frequencies,
'Partition' (i.e. observed frequencies of characters in the partition), should be
adequate. Change '4' rate classes to '8' by double clicking on '4' and typing in '8',
followed by 'Enter'. The partition table should now look like this:
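
For readers unfamiliar with HKY85, the structure of its rate matrix can be sketched in a few lines of Python. This is an illustration only: the function name, the example frequencies and the ratio value are hypothetical, and the parameterization (transitions at relative rate 1, transversions scaled by a transversion/transition ratio) is an assumption rather than a statement of how HyPhy stores the model internally.

```python
NUCLEOTIDES = "ACGT"
PURINES = {"A", "G"}  # the pyrimidines are C and T


def hky85_rate_matrix(freqs, tv_ts_ratio):
    """Build an unnormalized 4x4 HKY85-style rate matrix (order A, C, G, T).

    freqs       -- equilibrium frequencies of A, C, G, T (should sum to 1)
    tv_ts_ratio -- assumed transversion/transition rate ratio
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(NUCLEOTIDES):
        for j, y in enumerate(NUCLEOTIDES):
            if i == j:
                continue
            # A<->G and C<->T are transitions; every other change is a transversion.
            is_transition = (x in PURINES) == (y in PURINES)
            q[i][j] = freqs[j] * (1.0 if is_transition else tv_ts_ratio)
        # Diagonal entries make each row sum to zero.
        q[i][i] = -sum(q[i])
    return q


# Hypothetical inputs, not estimates from 'brown.nuc'.
for row in hky85_rate_matrix([0.3, 0.2, 0.2, 0.3], 0.25):
    print(["%.3f" % value for value in row])
```

Choosing 'Partition' for the equilibrium frequencies corresponds to filling `freqs` with the character frequencies observed in the partition.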
|
Construct the Likelihood Function | With all the components now in place,
we are ready to construct the likelihood function. This fact is indicated by the status light
in the bottom left of the window turning yellow. To construct the likelihood function,
choose 'Build Function' from the 'Likelihood' menu (or Command-L).
HyPhy will now construct the likelihood function and change the light indicator to green.
Also, the names of the partitions which are tied to the likelihood function will be
displayed in bold. An information message is printed to the console:
Created likelihood function 'brown_LF' with
1 partitions,
2 shared parameters,
7 local parameters,
0 constrained parameters.
Pruning efficiency 301 vs 595 (49.4118 % savings)
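
The percentage in the last line can be reproduced from the two counts. The short Python check below assumes that the first number (301) is the amount of work with pruning shortcuts and the second (595) is the amount without them; the function name is hypothetical.

```python
def pruning_savings(pruned_ops: int, full_ops: int) -> float:
    """Percentage of work saved, assuming `pruned_ops` out of `full_ops` remain."""
    return 100.0 * (1.0 - pruned_ops / full_ops)


print(f"{pruning_savings(301, 595):.4f} % savings")  # prints "49.4118 % savings"
```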
|
Optimize the Likelihood Function | We are ready to obtain maximum likelihood
parameter estimates (MLEs). Select 'Optimize' from the 'Likelihood Function' menu.
HyPhy goes to work obtaining MLEs; look at the progress bar in the console window to
follow optimization progress. Once the process is complete (which should take only
a few seconds for our example), a summary of results is printed to the console and
a table with parameter estimates pops up:
Note that different kinds of entries in the table are marked with different icons:
-
A tree icon: represents a tree used in the analysis. Double click on
the tree row (or select the row and click the first button in the toolbar -
if you hover your mouse pointer over a button, a tooltip will appear)
to open a separate window for that tree, like this one:
-
A global parameter (capital R): represents a model parameter which
is shared by several (or all) tree branches. For our model we have the
gamma shape parameter and the transversion/transition ratio parameter.
Their names are prefixed by the name of the partition they refer to.
This is useful for distinguishing parameters in analyses with multiple partitions.
-
A local parameter (theta): represents a model parameter which
is local to a single tree branch. For our model we have the
branch length parameter, prefixed by the name of the tree and branch
that it comes from. Select one of the local parameters, for instance,
brown_tree.Node6.a, and click the first button in the toolbar.
A tree window will come to the foreground, and the branch with
the parameter we just selected will be highlighted. You can go backwards too!
Click on a tree branch, and then choose 'Show Parameters in Table' from
the 'Tree' menu. The parameter table will come to the foreground and
all the parameters for the branch you selected will be highlighted. The same
works for multiple selections (Shift-click to define multiple item selections).
-
A rate distribution (histogram): represents a rate variation parameter.
It is not a rate distribution parameter, however, but rather the actual
rate distribution inferred by maximum likelihood. The 'Value' column has the form
1(8), meaning that there are 8 rate classes, and the number in the next column
represents the value for rate class 1. To show other rate classes, click on the arrow
next to the value in the last column and choose the rate you wish to see from
the pull-down menu. If the rate distribution came from discretizing a continuous
density (which it did in our example), you can choose 'Density Plot' from
the pull-down menu. A window like this:
will appear. The green bars are the intervals into which the continuous distribution
was partitioned, and the dotted lines are the values for rate classes.
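
To make the idea of discretizing a continuous density concrete, here is a minimal Python sketch of one common scheme: a mean-one gamma density is cut into equal-probability intervals, and each interval is represented by its mean rate. Whether HyPhy uses exactly this scheme is an assumption; the function names are hypothetical, and the shape value 0.5 is an arbitrary illustration, not the estimate from this analysis.

```python
import math


def gamma_cdf(shape: float, x: float) -> float:
    """Regularized lower incomplete gamma P(shape, x), summed as a power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while term > total * 1e-15:
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(-x + shape * math.log(x) - math.lgamma(shape))


def gamma_quantile(shape: float, p: float) -> float:
    """Quantile of a Gamma(shape, rate=1) variable, found by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(shape, hi) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(shape, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discretize_gamma(alpha: float, classes: int):
    """Return (interval boundaries, class rates) for a mean-one gamma density."""
    # Boundaries of equal-probability intervals for Gamma(shape=alpha, rate=alpha).
    cuts = [0.0]
    cuts += [gamma_quantile(alpha, k / classes) / alpha for k in range(1, classes)]
    cuts.append(math.inf)
    rates = []
    for lo, hi in zip(cuts, cuts[1:]):
        upper = 1.0 if math.isinf(hi) else gamma_cdf(alpha + 1.0, alpha * hi)
        lower = gamma_cdf(alpha + 1.0, alpha * lo)
        # Mean rate within the interval (each interval has probability 1/classes).
        rates.append(classes * (upper - lower))
    return cuts, rates


cuts, rates = discretize_gamma(0.5, 8)  # hypothetical shape, 8 rate classes
for k, rate in enumerate(rates):
    print(f"class {k}: interval [{cuts[k]:.4f}, {cuts[k + 1]:.4f}), rate {rate:.4f}")
```

In the terms of the density plot, the interval boundaries play the role of the green bars and the class rates play the role of the dotted lines.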
|
Rate class assignments | Go back to the data window, and choose the 'Data'->'Additional Info'->'Rate Class' menu item. The data panel display will change thus:
The numbers below the data columns are the most likely rate class assignments for
each site, from 0 to 7 (since we have 8 rate classes). Right click (or Control-click) on that row of numbers, and experiment with the pull-down menu that appears (it allows
you to select sites by rate class and to display rate class distribution info).
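
One plausible reading of "most likely rate class" is the class that maximizes the product of the class weight and the site likelihood under that class. The Python sketch below illustrates that reading; it is an assumption about the computation, not a description of HyPhy's code, and the function name and all numbers are hypothetical.

```python
def most_likely_classes(site_likelihoods, weights):
    """For each site, return the 0-based index of the rate class that maximizes
    weight * likelihood (proportional to the posterior probability of the class)."""
    assignments = []
    for per_class in site_likelihoods:
        joint = [w * l for w, l in zip(weights, per_class)]
        assignments.append(max(range(len(joint)), key=joint.__getitem__))
    return assignments


# Hypothetical example: 3 sites, 4 equally weighted rate classes.
weights = [0.25, 0.25, 0.25, 0.25]
site_likelihoods = [
    [0.020, 0.011, 0.004, 0.001],  # conserved site: slowest class fits best
    [0.001, 0.006, 0.009, 0.005],
    [0.000, 0.001, 0.004, 0.012],  # variable site: fastest class fits best
]
print(most_likely_classes(site_likelihoods, weights))  # prints [0, 2, 3]
```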
|
Save | Finally, we save the results of our analysis to a file for later use.
From the data panel, choose 'File'->'Save' and select the file to save the information
to. To later load it, select 'File'->'Open'->'Open Batch File' (Command-O),
or drag the file icon over the HyPhy icon.
Directory 'GUIExamples/Saves/' contains 'Example1.bf' which is a saved file
with the analysis we just did.
|
|