Summary:
Given a nucleotide or amino acid data file and a relevant
model, Topology Search reconstructs a phylogenetic tree for the
sequences in the data by exhaustively searching through all possible
unrooted trees, of which there are (2N-5)!! (N is the number of sequences).
This method is very slow and should not be used for data
sets with more than 9-10 sequences.
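To illustrate how quickly the search space grows, here is a minimal Python sketch of the (2N-5)!! count. The function name `unrooted_tree_count` is hypothetical and is not part of HYPHY.

```python
def unrooted_tree_count(n_sequences: int) -> int:
    """Number of unrooted bifurcating trees on N labelled tips: (2N-5)!!."""
    if n_sequences < 3:
        raise ValueError("an unrooted tree needs at least 3 sequences")
    count = 1
    # Double factorial: multiply the odd numbers 2N-5, 2N-7, ..., 3.
    for k in range(2 * n_sequences - 5, 1, -2):
        count *= k
    return count


for n in (5, 9, 10, 12):
    print(n, unrooted_tree_count(n))
```

Five sequences give 15 trees, while ten sequences already give 2,027,025, each of which needs its own likelihood optimization.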
Input:
A nucleotide, amino acid, or codon data file in any recognizable format. HYPHY
uses the following table
to translate nucleotide ambiguities (or amino acid
characters).
Models: Any of the standard
nucleotide or amino acid
models can be selected for the analysis. The selected model
is used to estimate the maximum likelihood parameter
values and the likelihood of each tree being tested.
Output: The standard
output depends on the "Likelihood
Output" option selected in "Preferences". By default, that option prints the maximum
ln-likelihood followed by the inferred tree string, with branch
lengths representing the expected number of substitutions per
codon. For a complete list of output options, refer to the Output Formats
page. The inferred tree topology can optionally be saved
to a file.
TopologySearch.bf will print out
the scores and the topologies of the 10 best trees found in the analysis,
a table of likelihood score distributions, the Bayesian posterior
probability of the best tree and, optionally, save all the trees
and their scores to a file. A run looks like this:
Tree#0 (((Human,Gibbon),Orangutan),Chimpanzee,Gorilla) ==> logLhd = -2694.7
Tree#1 ((Human,(Orangutan,Gibbon)),Chimpanzee,Gorilla) ==> logLhd = -2658.44
Tree#2 (((Human,Orangutan),Gibbon),Chimpanzee,Gorilla) ==> logLhd = -2694.72
Tree#3 ((Human,Orangutan),(Chimpanzee,Gibbon),Gorilla) ==> logLhd = -2699.62
Tree#4 ((Human,Orangutan),Chimpanzee,(Gorilla,Gibbon)) ==> logLhd = -2696.94
Tree#5 ((Human,Gibbon),(Chimpanzee,Orangutan),Gorilla) ==> logLhd = -2701.65
Tree#6 (Human,((Chimpanzee,Gibbon),Orangutan),Gorilla) ==> logLhd = -2696.72
Tree#7 (Human,(Chimpanzee,(Orangutan,Gibbon)),Gorilla) ==> logLhd = -2661.57
Tree#8 (Human,((Chimpanzee,Orangutan),Gibbon),Gorilla) ==> logLhd = -2697.93
Tree#9 (Human,(Chimpanzee,Orangutan),(Gorilla,Gibbon)) ==> logLhd = -2697.03
Tree#10 ((Human,Gibbon),Chimpanzee,(Gorilla,Orangutan)) ==> logLhd = -2696.01
Tree#11 (Human,(Chimpanzee,Gibbon),(Gorilla,Orangutan)) ==> logLhd = -2693.27
Tree#12 (Human,Chimpanzee,((Gorilla,Gibbon),Orangutan)) ==> logLhd = -2682.49
Tree#13 (Human,Chimpanzee,(Gorilla,(Orangutan,Gibbon))) ==> logLhd = -2652.34
Tree#14 (Human,Chimpanzee,((Gorilla,Orangutan),Gibbon)) ==> logLhd = -2683.26
--------------------- RESULTS ---------------------
BestTree =(Human,Chimpanzee,(Gorilla,(Orangutan,Gibbon)))
**************************
* TREE REPORT *
**************************
#### BEST TREES #####
1).
(Human,Chimpanzee,(Gorilla,(Orangutan,Gibbon)))
Log-likelihood = -2652.34
2). Worse by: -6.10077
((Human,(Orangutan,Gibbon)),Chimpanzee,Gorilla)
Log-likelihood = -2658.44
3). Worse by: -9.23189
(Human,(Chimpanzee,(Orangutan,Gibbon)),Gorilla)
Log-likelihood = -2661.57
4). Worse by: -30.1546
(Human,Chimpanzee,((Gorilla,Gibbon),Orangutan))
Log-likelihood = -2682.49
5). Worse by: -30.9269
(Human,Chimpanzee,((Gorilla,Orangutan),Gibbon))
Log-likelihood = -2683.26
6). Worse by: -40.9344
(Human,(Chimpanzee,Gibbon),(Gorilla,Orangutan))
Log-likelihood = -2693.27
7). Worse by: -42.3606
(((Human,Gibbon),Orangutan),Chimpanzee,Gorilla)
Log-likelihood = -2694.7
8). Worse by: -42.3872
(((Human,Orangutan),Gibbon),Chimpanzee,Gorilla)
Log-likelihood = -2694.72
9). Worse by: -43.6714
((Human,Gibbon),Chimpanzee,(Gorilla,Orangutan))
Log-likelihood = -2696.01
10). Worse by: -44.3884
(Human,((Chimpanzee,Gibbon),Orangutan),Gorilla)
Log-likelihood = -2696.72
#### STATISTICS #####
+---------------+---------------+---------------+---------------+
| From Best + | To Best + | Tree Count | % of total |
+---------------+---------------+---------------+---------------+
| 0 | 0.1 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 0.1 | 0.5 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 0.5 | 1.0 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 1.0 | 5.0 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 5.0 | 10.0 | 2 | 13.33333333 |
+---------------+---------------+---------------+---------------+
| 10.0 | 50.0 | 12 | 80.00000000 |
+---------------+---------------+---------------+---------------+
| 50.0 | 100.0 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 100.0 | 1000.0 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 1000.0 | 10000.0 | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
| 10000.0 | Infinity | 0 | 0.00000000 |
+---------------+---------------+---------------+---------------+
Posterior probability of the best tree (with uninformative prior) = 0.997666
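The STATISTICS table and the posterior probability in the report above can be cross-checked from the 15 printed log-likelihoods. The Python sketch below rests on two assumptions that are not taken from the TopologySearch.bf source: each bin counts trees whose distance from the best score falls in [from, to), with the best tree itself excluded, and the posterior under a uniform prior over topologies is the best tree's likelihood divided by the sum of all tree likelihoods. All names are illustrative.

```python
import math

# Log-likelihoods of Tree#0 .. Tree#14, copied from the run shown above.
log_likelihoods = [
    -2694.7, -2658.44, -2694.72, -2699.62, -2696.94,
    -2701.65, -2696.72, -2661.57, -2697.93, -2697.03,
    -2696.01, -2693.27, -2682.49, -2652.34, -2683.26,
]

best = max(log_likelihoods)

# Distance of every other tree from the best score (assumption: best excluded).
deltas = sorted(best - lnl for lnl in log_likelihoods)[1:]

# Bin edges as printed in the STATISTICS table.
edges = [0, 0.1, 0.5, 1.0, 5.0, 10.0, 50.0, 100.0, 1000.0, 10000.0, math.inf]
counts = [sum(lo <= d < hi for d in deltas) for lo, hi in zip(edges, edges[1:])]

# Uniform prior: posterior = L_best / sum(L_i). Subtracting the best
# log-likelihood first keeps exp() from underflowing to zero.
posterior = 1.0 / sum(math.exp(lnl - best) for lnl in log_likelihoods)

for (lo, hi), c in zip(zip(edges, edges[1:]), counts):
    print(f"{lo:>8} {hi:>8} {c:>3} {100.0 * c / len(log_likelihoods):.2f}%")
print("posterior of best tree:", posterior)
```

Because the printed scores are rounded, the computed posterior agrees with the reported 0.997666 only to about five decimal places.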
Result Processing Tools: After the analysis is finished, the
following options are available in the "Results" submenu of the "Analyses" menu:
- View Results.
- Save Results.
- (Co)Variance Estimates.
- Nonparametric Bootstrap.
Given a number of iterations, each iteration resamples (with replacement) the original
sequence data, reconstructs the phylogeny for the resampled data set, and writes
the tree string to the user-specified output file. Each resampled tree is also checked
against the tree yielded by the original data, and all the clades present in both
trees are noted. At the end of the run, a summary tree is printed; this tree has
the same topology as the tree reconstructed from the original data, and its internal
branch lengths represent the proportion (or raw count) of resampled trees that
contained the clade starting at that internal node.
The (co)variance
estimates refer to the parameters in the inferred tree.
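The two bookkeeping steps of the nonparametric bootstrap described above, resampling alignment columns with replacement and tallying how often each clade of the original tree reappears, can be sketched in Python as follows. This is an illustration under stated assumptions, not HYPHY's implementation: the function names, the toy alignment, and the replicate clade sets are all hypothetical, clades are represented simply as sets of taxon names, and the tree reconstruction step itself is omitted.

```python
import random


def resample_columns(alignment, rng):
    """Draw as many columns as the alignment has, with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


def clade_support(reference_clades, replicate_clade_sets):
    """Proportion of replicate trees that contain each reference clade."""
    n = len(replicate_clade_sets)
    return {clade: sum(clade in rep for rep in replicate_clade_sets) / n
            for clade in reference_clades}


# Toy alignment (hypothetical data, for illustration only).
alignment = {"Human": "ACGTAC", "Gorilla": "ACGTTC", "Gibbon": "ATGTTA"}
replicate = resample_columns(alignment, random.Random(1))

# Clades of the tree from the original data, and the clades found in four
# hypothetical replicate trees.
og = frozenset({"Orangutan", "Gibbon"})
gog = frozenset({"Gorilla", "Orangutan", "Gibbon"})
support = clade_support({og, gog}, [{og, gog}, {og}, {og, gog}, set()])
print(support[og], support[gog])
```

In a full run, each resampled alignment would be passed to the tree search, and the clades of the resulting tree would form one entry of the replicate list.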