read.nexus.data reads a file with sequences in the NEXUS
format. nexus2DNAbin is a helper function to convert the output
from the previous function into the class "DNAbin".
For the moment, only sequence data (DNA or protein) are supported.
read.nexus.data(file)
nexus2DNAbin(x)
A list of sequences, each made of a single vector of mode character where each element is a (phylogenetic) character state.
file: a file name specified by either a variable of mode character, or a double-quoted string.
x: an object output by read.nexus.data.
Johan Nylander, Thomas Guillerme, and Klaus Schliep
This parser tries to read data from a file written in a restricted NEXUS format (see examples below).
Please see files data.nex and taxacharacters.nex for
examples of formats that will work.
Some notable departures from the NEXUS standard (non-exhaustive list):
I: Comments must be either on separate lines or at the
end of lines. Examples:
[Comment] --- OK
Taxon ACGTACG [Comment] --- OK
[Comment line 1
Comment line 2] --- NOT OK!
Tax[Comment]on ACG[Comment]T --- NOT OK!
II: No spaces (or comments) are allowed in the
sequences. Examples:
name ACGT --- OK
name AC GT --- NOT OK!
III: No spaces are allowed in taxon names, not even if
names are in single quotes. That is, single-quoted names are not
treated as such by the parser. Examples:
Genus_species --- OK
'Genus_species' --- OK
'Genus species' --- NOT OK!
IV: The trailing end that closes the
matrix must be on a separate line. Examples:
taxon AACCGGT
end; --- OK
taxon AACCGGT;
end; --- OK
taxon AACCCGT; end; --- NOT OK!
V: Multistate characters are not allowed. That is,
NEXUS allows you to specify multiple character states at a
character position either as an uncertainty, (XY), or as an
actual appearance of multiple states, {XY}. This
information is not handled by the parser. Examples:
taxon 0011?110 --- OK
taxon 0011{01}110 --- NOT OK!
taxon 0011(01)110 --- NOT OK!
VI: The number of taxa must be on the same line as
ntax. The same applies to nchar. Examples:
ntax = 12 --- OK
ntax =
12 --- NOT OK!
VII: The word “matrix” cannot occur anywhere in
the file before the actual matrix command, unless it is in
a comment. Examples:
BEGIN CHARACTERS;
TITLE 'Data in file "03a-cytochromeB.nex"';
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
["This is The Matrix"] --- OK
MATRIX
BEGIN CHARACTERS;
TITLE 'Matrix in file "03a-cytochromeB.nex"'; --- NOT OK!
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
MATRIX
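Some of the restrictions listed above can be checked, or worked around, before a file is handed to the parser. The following is a minimal base-R sketch, not part of ape: check_nexus_header and collapse_multistate are hypothetical helper names introduced here for illustration. The first looks for violations of restrictions VI and VII, assuming the MATRIX command sits alone on its line and that comments are single-line. The second replaces multistate tokens (restriction V) with a missing-data symbol, which discards the information they carried.

```r
# Hypothetical helpers, not part of ape. `lines` is a character vector
# holding the file, one element per line (e.g. from readLines()).

# Restrictions VI and VII: report header problems before the MATRIX command.
check_nexus_header <- function(lines) {
  # Remove single-line [comments]; multi-line comments are not handled here.
  bare <- gsub("\\[[^]]*\\]", "", lines, perl = TRUE)

  # Assumption: the MATRIX command sits alone on its line.
  matrix_cmd <- grep("^\\s*matrix\\s*$", bare,
                     ignore.case = TRUE, perl = TRUE)[1]
  before <- if (is.na(matrix_cmd)) bare else head(bare, matrix_cmd - 1)

  # TRUE when every line naming `key` also carries its numeric value.
  value_on_same_line <- function(key) {
    has_key <- grepl(paste0("\\b", key, "\\b"), bare,
                     ignore.case = TRUE, perl = TRUE)
    has_val <- grepl(paste0("\\b", key, "\\s*=\\s*\\d+"), bare,
                     ignore.case = TRUE, perl = TRUE)
    all(has_val[has_key])
  }

  list(
    # Indices of lines that mention "matrix" before the actual command.
    early_matrix = grep("\\bmatrix\\b", before,
                        ignore.case = TRUE, perl = TRUE),
    ntax_ok  = value_on_same_line("ntax"),
    nchar_ok = value_on_same_line("nchar")
  )
}

# Restriction V: replace each {..} or (..) token in a sequence string with
# a single missing-data symbol. Apply to sequences only, not taxon names.
collapse_multistate <- function(seqs, missing = "?") {
  gsub("\\{[^}]*\\}|\\([^)]*\\)", missing, seqs, perl = TRUE)
}

header <- c("BEGIN CHARACTERS;",
            "TITLE 'Matrix in file \"03a-cytochromeB.nex\"';",
            "DIMENSIONS NCHAR=382;",
            "MATRIX")
check_nexus_header(header)
collapse_multistate(c("0011{01}110", "0011(01)110"))
```

Collapsing to the missing-data symbol is only one possible choice; whether that loss of information is acceptable depends on the analysis.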
Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an extensible file format for systematic information. Systematic Biology, 46, 590-621.
read.nexus, write.nexus,
write.nexus.data
## Use read.nexus.data to read a file in NEXUS format into object x
if (FALSE) x <- read.nexus.data("file.nex")
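A self-contained variant of the example may be easier to try out. The sketch below assumes the ape package is installed; the taxon names and sequences are invented for illustration, and the file is written in the restricted layout described in the notes (no spaces in names or sequences, closing end on its own line).

```r
if (requireNamespace("ape", quietly = TRUE)) {
  # Write a small NEXUS file to a temporary location (made-up data).
  nex <- tempfile(fileext = ".nex")
  writeLines(c(
    "#NEXUS",
    "BEGIN DATA;",
    "DIMENSIONS NTAX=2 NCHAR=8;",
    "FORMAT DATATYPE=DNA MISSING=? GAP=-;",
    "MATRIX",
    "Taxon_A ACGTACGT",
    "Taxon_B ACGTAC-T;",
    "END;"
  ), nex)

  # Read the sequences into a named list of character vectors.
  x <- ape::read.nexus.data(nex)
  str(x)

  # Convert the list to an object of class "DNAbin".
  d <- ape::nexus2DNAbin(x)
  print(d)
}
```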