
      HyPhy automatically recognizes five aligned sequence data formats and also autodetects whether the data are nucleotide (codon) or amino acid. The recognized formats are:

  • NEXUS - files in NEXUS format. The following NEXUS blocks are supported: DATA, CHARACTERS, TAXA, ASSUMPTIONS (for data partitioning) and TREES. See notes on HyPhy constants DATA_FILE_PARTITION_MATRIX and NEXUS_FILE_TREE_MATRIX for details on how to access NEXUS trees and partitions once the file has been read.
  • PHYLIP Sequential and Interleaved: PHYLIP option characters in the first line are ignored.
  • # Sequential: each taxon name is preceded by '#', and the complete sequence data follow the name of the taxon.
  • # Interleaved: a list of taxon names, each preceded by '#', is followed by blocks of sequence data in the same order as the names of the taxa.
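
To make the distinction between the format families concrete, here is a minimal Python sketch that guesses the family from the first non-blank line of a file. It is not HyPhy's own detection code, which is not described here; the function name sniff_format and the returned labels are hypothetical, and the rules (a "#NEXUS" header, a leading '#', or two leading integers) are assumptions drawn from the layout of the formats listed above.

```python
def sniff_format(text):
    """Guess the format family from the first non-blank line.

    Hypothetical helper, not part of HyPhy. Returns one of
    "NEXUS", "HASH", "PHYLIP" or "UNKNOWN".
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.upper().startswith("#NEXUS"):
            return "NEXUS"
        if stripped.startswith("#"):
            # '#'-prefixed taxon name: # Sequential or # Interleaved
            return "HASH"
        fields = stripped.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            # "ntax nchar [options]" header line
            return "PHYLIP"
        return "UNKNOWN"
    return "UNKNOWN"
```

The sketch does not try to tell the sequential variants from the interleaved ones; it only separates the three families.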

     Here is a small example file for each format, each with a tree string present in the file.

# Sequential.

#a ATGATTCAACCTCAGACCCTTTTAAATGTAGCAGATAACAGTGGAGCTCGAAAATTGATG
#b ATGATTCAACCTCAGACCCATTTAAATGTAGCGGATAACAGCGGGGCTCGAGAATTGATG
#og ATGATTCAACCTCAAACTTATTTAAATGTTGCAGATAATAGTGGAGCTCGAAAACTAATG
((a,b),og);
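
As an illustration of how the # Sequential layout can be read, the following Python sketch collects each '#'-prefixed name together with the sequence data that follow it. It is a sketch under stated assumptions, not HyPhy's reader: the function name is hypothetical, names are assumed to contain no whitespace, and any line starting with '(' is assumed to be the tree string.

```python
def parse_hash_sequential(text):
    """Read a '# Sequential' file into ({name: sequence}, tree_string).

    Hypothetical helper, not part of HyPhy.
    """
    seqs = {}
    tree = None
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line[1:].split()
            if not fields:
                continue  # a bare '#' carries no name
            current = fields[0]
            # Sequence data may start on the same line as the name.
            seqs[current] = "".join(fields[1:])
        elif line.startswith("("):
            tree = line  # assumption: a leading '(' marks the tree string
        elif current is not None:
            # Continuation line for the most recent taxon.
            seqs[current] += "".join(line.split())
    return seqs, tree
```

Applied to the example above, this would yield three 60-character sequences keyed by a, b and og, plus the tree string.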

# Interleaved. (Line Width 30, Gap Width 10)

#a
#b
#og

ATGATTCAAC CTCAGACCCT TTTAAATGTA
ATGATTCAAC CTCAGACCCA TTTAAATGTA
ATGATTCAAC CTCAAACTTA TTTAAATGTT

GCAGATAACA GTGGAGCTCG AAAATTGATG
GCGGATAACA GCGGGGCTCG AGAATTGATG
GCAGATAATA GTGGAGCTCG AAAACTAATG

((a,b),og);
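
The # Interleaved layout can be read in a similar way. The sketch below, in Python, first gathers the '#'-prefixed names and then assigns the data lines to taxa in rotation, in the order the names were listed. The function name is hypothetical and this is not HyPhy's reader; as an assumption, a line starting with '(' is treated as the tree string.

```python
def parse_hash_interleaved(text):
    """Read a '# Interleaved' file into ({name: sequence}, tree_string).

    Hypothetical helper, not part of HyPhy.
    """
    names = []
    chunks = []  # chunks[i] holds the sequence pieces of names[i]
    tree = None
    row = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            names.append(line[1:].strip())
            chunks.append([])
        elif line.startswith("("):
            tree = line  # assumption: a leading '(' marks the tree string
        elif names:
            # Data lines cycle through the taxa in the order they were named;
            # the gaps between 10-character groups are dropped.
            chunks[row % len(names)].append("".join(line.split()))
            row += 1
    seqs = {name: "".join(parts) for name, parts in zip(names, chunks)}
    return seqs, tree
```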

PHYLIP Sequential.

3	60
a          ATGATTCAAC CTCAGACCCT TTTAAATGTA GCAGATAACA GTGGAGCTCG 
           AAAATTGATG 
b          ATGATTCAAC CTCAGACCCA TTTAAATGTA GCGGATAACA GCGGGGCTCG 
           AGAATTGATG 
og         ATGATTCAAC CTCAAACTTA TTTAAATGTT GCAGATAATA GTGGAGCTCG 
           AAAACTAATG 

1
((a,b),og);
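
A PHYLIP Sequential file of this shape can be read by using the two counts in the header line. The Python sketch below is illustrative only and is not HyPhy's reader: the function name is hypothetical, names are assumed to be separated from the data by whitespace (strict PHYLIP uses fixed ten-character names), and lines starting with '(' after the data are assumed to be trees. The lone "1" before the tree is simply skipped.

```python
def parse_phylip_sequential(text):
    """Read a sequential PHYLIP file into ({name: sequence}, [tree_strings]).

    Hypothetical helper, not part of HyPhy.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Header: number of taxa, number of characters; anything after is ignored.
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    seqs = {}
    i = 1
    for _ in range(ntax):
        fields = lines[i].split()
        name, seq = fields[0], "".join(fields[1:])
        i += 1
        # Keep consuming continuation lines until this taxon is complete.
        while len(seq) < nchar:
            seq += "".join(lines[i].split())
            i += 1
        seqs[name] = seq
    # Assumption: remaining lines that start with '(' are tree strings.
    trees = [ln for ln in lines[i:] if ln.startswith("(")]
    return seqs, trees
```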

PHYLIP Interleaved. (Line Width 30, Gap Width 10)

3	60 I
a           ATGATTCAAC CTCAGACCCT TTTAAATGTA
b           ATGATTCAAC CTCAGACCCA TTTAAATGTA
og          ATGATTCAAC CTCAAACTTA TTTAAATGTT
            GCAGATAACA GTGGAGCTCG AAAATTGATG
            GCGGATAACA GCGGGGCTCG AGAATTGATG
            GCAGATAATA GTGGAGCTCG AAAACTAATG
1
((a,b),og);
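
For the interleaved variant, only the first block carries names, and later lines extend the taxa in the same order. The Python sketch below reads the two counts from the header and skips any option characters after them (such as the "I" above). It is not HyPhy's reader; the function name is hypothetical, names are assumed to be whitespace-delimited, and lines starting with '(' after the data are assumed to be trees.

```python
def parse_phylip_interleaved(text):
    """Read an interleaved PHYLIP file into ({name: sequence}, [tree_strings]).

    Hypothetical helper, not part of HyPhy.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Only the first two header fields are used; option characters are ignored.
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    names, seqs = [], []
    for ln in lines[1:1 + ntax]:
        fields = ln.split()
        names.append(fields[0])
        seqs.append("".join(fields[1:]))
    i = 1 + ntax
    row = 0
    # Later blocks have no names; rows cycle through the taxa in order.
    while i < len(lines) and any(len(s) < nchar for s in seqs):
        seqs[row % ntax] += "".join(lines[i].split())
        row += 1
        i += 1
    # Assumption: remaining lines that start with '(' are tree strings.
    trees = [ln for ln in lines[i:] if ln.startswith("(")]
    return dict(zip(names, seqs)), trees
```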

NEXUS.

#NEXUS
BEGIN TAXA;
	DIMENSIONS NTAX = 3;
	TAXLABELS
		'a' 'b' 'og' ;
END;
BEGIN CHARACTERS;
	DIMENSIONS NCHAR = 60;
	FORMAT
		DATATYPE = DNA
		GAP=-
		MISSING=?
	;
MATRIX
	'a'   ATGATTCAACCTCAGACCCTTTTAAATGTAGCAGATAACAGTGGAGCTCGAAAATTGATG
	'b'   ATGATTCAACCTCAGACCCATTTAAATGTAGCGGATAACAGCGGGGCTCGAGAATTGATG
	'og'  ATGATTCAACCTCAAACTTATTTAAATGTTGCAGATAATAGTGGAGCTCGAAAACTAATG;
END;
BEGIN TREES;
	TREE tree = ((a,b),og);
END;
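
For a NEXUS file as simple as the one above, the MATRIX rows and TREE statements can be pulled out with regular expressions. The Python sketch below is a rough illustration, not a NEXUS parser and not how HyPhy reads the file: the function names are hypothetical, taxon labels are assumed to contain no whitespace, and comments in square brackets are not handled.

```python
import re


def parse_nexus_matrix(text):
    """Return {name: sequence} from the first MATRIX ... ; statement.

    Hypothetical helper, not part of HyPhy.
    """
    match = re.search(r"\bMATRIX\b(.*?);", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return {}
    seqs = {}
    for line in match.group(1).splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a lone label
        name = fields[0].strip("'")  # drop the optional single quotes
        # Appending lets the same label recur in an interleaved matrix.
        seqs[name] = seqs.get(name, "") + "".join(fields[1:])
    return seqs


def parse_nexus_trees(text):
    """Return {tree_name: tree_string} for each 'TREE name = ...;' statement."""
    pattern = r"\bTREE\s+(\w+)\s*=\s*(.*?;)"
    found = re.findall(pattern, text, flags=re.IGNORECASE | re.DOTALL)
    return {name: tree.strip() for name, tree in found}
```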

 
Sergei L. Kosakovsky Pond and Spencer V. Muse, 1997-2002