ape (version 2.1-2)

read.nexus.data: Read Character Data In NEXUS Format

Description

This function reads a file with sequences in the NEXUS format.

Usage

read.nexus.data(file)

Arguments

file
a file name specified by either a variable of mode character, or a double-quoted string.

Value

  • A list of sequences each made of a single vector of mode character where each element is a (phylogenetic) character state.
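
To make the shape of the return value concrete, here is a small mock object built by hand rather than read from a file. It assumes that the list elements are named by taxon; the taxon names and states are invented for illustration.

```r
# Mock object with the shape described above: one character vector per
# taxon, one element per character state (all values are invented).
x <- list(
  Taxon_A = c("a", "c", "g", "t"),
  Taxon_B = c("a", "c", "g", "a")
)

names(x)               # taxon names
sapply(x, length)      # number of characters per taxon
x$Taxon_B[4]           # state of the fourth character of Taxon_B

# Stack the sequences into a taxa-by-characters matrix.
m <- do.call(rbind, x)
dim(m)
```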

Details

This parser tries to read data from a file written in a restricted NEXUS format (see examples below).

Please see files data.nex and taxacharacters.nex for examples of formats that will work.

Some noticeable exceptions from the NEXUS standard (non-exhaustive list):

  • I. Comments must be either on separate lines or at the end of lines. Examples:

      [Comment]                      --- OK
      Taxon ACGTACG [Comment]        --- OK
      [Comment line 1
      Comment line 2]                --- NOT OK!
      Tax[Comment]on ACG[Comment]T   --- NOT OK!

  • II. No spaces (or comments) are allowed in the sequences. Examples:

      name ACGT    --- OK
      name AC GT   --- NOT OK!

  • III. No spaces are allowed in taxon names, not even if names are in single quotes. That is, single-quoted names are not treated as such by the parser. Examples:

      Genus_species     --- OK
      'Genus_species'   --- OK
      'Genus species'   --- NOT OK!

  • IV. The trailing end that closes the matrix must be on a separate line. Examples:

      taxon AACCGGT
      end;                  --- OK
      taxon AACCGGT;
      end;                  --- OK
      taxon AACCCGT; end;   --- NOT OK!

  • V. Multistate characters are not allowed. That is, NEXUS allows you to specify multiple character states at a character position either as an uncertainty, (XY), or as an actual appearance of multiple states, {XY}. This information is not handled by the parser. Examples:

      taxon 0011?110      --- OK
      taxon 0011{01}110   --- NOT OK!
      taxon 0011(01)110   --- NOT OK!

  • VI. The number of taxa must be on the same line as ntax. The same applies to nchar. Examples:

      ntax = 12   --- OK
      ntax =
      12          --- NOT OK!

  • VII. The word "matrix" cannot occur anywhere in the file before the actual matrix command, unless it is in a comment. Examples:

      BEGIN CHARACTERS;
      TITLE 'Data in file "03a-cytochromeB.nex"';
      DIMENSIONS NCHAR=382;
      FORMAT DATATYPE=Protein GAP=- MISSING=?;
      ["This is The Matrix"]                           --- OK
      MATRIX

      BEGIN CHARACTERS;
      TITLE 'Matrix in file "03a-cytochromeB.nex"';    --- NOT OK!
      DIMENSIONS NCHAR=382;
      FORMAT DATATYPE=Protein GAP=- MISSING=?;
      MATRIX
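
Some of these restrictions can be checked before a file is handed to the parser. The following is a minimal base-R sketch, not part of ape: the helper name check_restricted_nexus and its messages are hypothetical, and it only covers restrictions I (comment placement) and VI (ntax and nchar values on the same line), using simple regular expressions that may miss unusual cases.

```r
# Hypothetical helper (not part of ape): flag a few of the restrictions
# listed above in a character vector holding the lines of a NEXUS file.
check_restricted_nexus <- function(lines) {
  problems <- character(0)
  flag <- function(hit, msg) {
    if (any(hit)) {
      problems <<- c(problems, sprintf("line %d: %s", which(hit), msg))
    }
  }
  # I: a "[" that is not closed on the same line (multi-line comment)
  flag(grepl("\\[[^\\]]*$", lines, perl = TRUE),
       "comment spans several lines")
  # I: non-blank text after a closing "]" (comment in the middle of a line)
  flag(grepl("\\]\\s*[^\\s\\[]", lines, perl = TRUE),
       "text follows a comment")
  # VI: ntax or nchar mentioned without its number on the same line
  for (key in c("ntax", "nchar")) {
    has_key <- grepl(paste0("\\b", key, "\\b"), lines,
                     ignore.case = TRUE, perl = TRUE)
    has_val <- grepl(paste0("\\b", key, "\\s*=\\s*\\d+"), lines,
                     ignore.case = TRUE, perl = TRUE)
    flag(has_key & !has_val, paste(key, "has no value on the same line"))
  }
  problems
}

good <- c("#NEXUS", "BEGIN DATA;", "DIMENSIONS NTAX=2 NCHAR=4;", "MATRIX",
          "Taxon_A ACGT [fine]", "Taxon_B ACGA;", "END;")
bad  <- c("DIMENSIONS NTAX =", "2 NCHAR=4;", "[a comment", "continued]",
          "Tax[c]on ACGT")

check_restricted_nexus(good)
check_restricted_nexus(bad)
```

For a file on disk, the lines could be obtained with readLines() first. An empty result only means that none of the checked patterns were found, not that the file is guaranteed to parse.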

References

Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an extensible file format for systematic information. Systematic Biology, 46, 590--621.

See Also

read.nexus, write.nexus, write.nexus.data

Examples

## Use read.nexus.data to read a file in NEXUS format into object x
x <- read.nexus.data("file.nex")
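
Because "file.nex" is only a placeholder, a self-contained variant may be easier to try out. The sketch below writes a small file that follows the restrictions described under Details and reads it back. The taxon names and sequences are invented, and the call to read.nexus.data is guarded so that the code still runs when ape is not installed.

```r
# Write a small file in the restricted format to a temporary location
# (taxon names and sequences are invented for illustration).
nex <- tempfile(fileext = ".nex")
writeLines(c(
  "#NEXUS",
  "BEGIN DATA;",
  "DIMENSIONS NTAX=3 NCHAR=8;",
  "FORMAT DATATYPE=DNA MISSING=? GAP=-;",
  "MATRIX",
  "Taxon_A ACGTACGT",
  "Taxon_B ACGTACGA",
  "Taxon_C ACG-ACG?;",
  "END;"
), nex)

# Read it back only if ape is available, then inspect the structure.
if (requireNamespace("ape", quietly = TRUE)) {
  x <- ape::read.nexus.data(nex)
  str(x)
}
```

Note that ntax and nchar sit on the same line as their values, the sequences contain no spaces, and the closing END; is on its own line.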