
phangorn (version 2.0.3)

phyDat: Conversion among Sequence Formats

Description

These functions transform several DNA formats into the phyDat format. allSitePattern generates an alignment of all possible site patterns.

Usage

phyDat(data, type = "DNA", levels = NULL, return.index=TRUE, ...) 
read.phyDat(file, format="phylip", type="DNA", ...)
write.phyDat(x, file, format="phylip",...)
## S3 method for class 'DNAbin':
as.phyDat(x, ...)
## S3 method for class 'alignment':
as.phyDat(x, type="DNA", ...)
## S3 method for class 'MultipleAlignment':
as.phyDat(x, ...)
## S3 method for class 'phyDat':
as.character(x, allLevels = TRUE, ...)
## S3 method for class 'phyDat':
as.data.frame(x, ...)
## S3 method for class 'phyDat':
as.DNAbin(x, ...)
## S3 method for class 'phyDat':
subset(x, subset, select, site.pattern = TRUE, ...)
## S3 method for class 'phyDat':
unique(x, incomparables = FALSE, identical = TRUE, ...)
phyDat2alignment(x)
allSitePattern(n, levels=c("a","c","g","t"), names=NULL)
acgt2ry(obj)
baseFreq(obj, freq=FALSE, all=FALSE, drop.unused.levels=FALSE)

Arguments

data
An object containing sequences.
x
An object containing sequences.
type
Type of sequences ("DNA", "AA", "CODON" or "USER").
levels
Level attributes.
return.index
If TRUE, returns an index of the site patterns.
file
A file name.
format
File format of the sequence alignment (see details). Several popular formats are supported: "phylip", "interleaved", "sequential", "clustal", "fasta" or "nexus", or any unambiguous abbreviation of these.
n
Number of sequences.
names
Names of sequences.
subset
a subset of taxa.
select
a subset of characters.
site.pattern
If TRUE, select refers to site patterns; if FALSE, to sites (alignment columns).
allLevels
return original data.
obj
An object of class phyDat.
freq
logical; if TRUE, frequencies (counts) are returned, otherwise proportions.
all
a logical; if all = TRUE, all counts of bases, ambiguous codes, missing data, and alignment gaps are returned as defined in the contrast.
drop.unused.levels
logical, drop unused levels
incomparables
for compatibility with unique.
identical
if TRUE (default), sequences have to be identical; if FALSE, sequences are considered duplicates if the distance between them is zero (this happens frequently with ambiguous sites).
...
further arguments passed to or from other methods.

Value

  • The functions return an object of class phyDat.

Details

If type is "USER", a vector has to be given to levels. For example, c("a", "c", "g", "t", "-") would create a data object that can be used in phylogenetic analysis with gaps as a fifth state. There is a more detailed example for specifying "USER" defined data formats in the vignette "phangorn-specials".
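As a minimal sketch of the "USER" type described above, the following builds a small alignment in which the gap character is the fifth state. The matrix toy and its contents are hypothetical illustration data, not part of the package.

```r
library(phangorn)

# Hypothetical toy alignment: rows are taxa, columns are sites.
toy <- matrix(c("a", "c", "g", "t", "-",
                "a", "c", "-", "t", "-",
                "a", "g", "g", "t", "a"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("t1", "t2", "t3"), NULL))

# Treat the gap "-" as a fifth character state.
gapped <- phyDat(toy, type = "USER", levels = c("a", "c", "g", "t", "-"))

attr(gapped, "levels")  # the five states supplied above
attr(gapped, "nc")      # number of character states
```

Every character in toy is one of the supplied levels, so nothing here relies on how unknown symbols are handled.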

allSitePattern returns all possible site patterns and can be useful in simulation studies. For further details see the vignette phangorn-specials.
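A short sketch of what allSitePattern produces: with the default four nucleotide levels and n sequences there are 4^n distinct site patterns, so the object stays small only for small n.

```r
library(phangorn)

# All site patterns for three sequences over a, c, g, t.
patterns <- allSitePattern(3)

n_taxa     <- length(patterns)       # one entry per sequence
n_patterns <- attr(patterns, "nr")   # number of distinct site patterns
n_expected <- 4^3                    # 4 states, 3 sequences
```

Here n_patterns should equal n_expected, since every pattern occurs exactly once.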

write.phyDat calls write.dna or write.nexus.data, and read.phyDat calls read.dna, read.aa or read.nexus.data; see the documentation of those functions for details. You may also import data directly with read.dna or read.nexus.data and then convert the data to class phyDat.
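The import route mentioned above could look roughly like this. The sketch writes the bundled Laurasiatherian data to a temporary FASTA file (the file name tmp is only for illustration), reads it with read.dna from ape, and converts the result with as.phyDat.

```r
library(phangorn)
data(Laurasiatherian)

# Write the alignment to a temporary FASTA file.
tmp <- tempfile(fileext = ".fasta")
write.phyDat(Laurasiatherian, file = tmp, format = "fasta")

# Read it with ape and convert the DNAbin object to phyDat.
dna   <- ape::read.dna(tmp, format = "fasta")
laura <- as.phyDat(dna)

# The number of taxa and the number of sites should be unchanged.
length(laura)
sum(attr(laura, "weight"))

unlink(tmp)
```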

The generic function c can be used to combine sequences, and unique to get all unique sequences or unique haplotypes.
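A hedged sketch of both generics follows. It assumes that c() concatenates the sites of phyDat objects sharing the same taxa names, which is how the method appears to behave; the names first, second, both and haplotypes are only for illustration.

```r
library(phangorn)
data(Laurasiatherian)

# Two blocks of sites (alignment columns) for the same taxa.
first  <- subset(Laurasiatherian, select = 1:100, site.pattern = FALSE)
second <- subset(Laurasiatherian, select = 101:200, site.pattern = FALSE)

# Combine the two blocks; taxa are matched by name.
both    <- c(first, second)
n_sites <- sum(attr(both, "weight"))   # total number of sites

# Keep one copy of each distinct sequence.
haplotypes <- unique(Laurasiatherian)
length(haplotypes)
```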

acgt2ry converts a phyDat object of nucleotides into a binary ry-coded dataset.
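For illustration, the recoding can be applied to the bundled Laurasiatherian data. This is a small sketch that only inspects the resulting attributes; purines (a, g) are coded as r and pyrimidines (c, t) as y.

```r
library(phangorn)
data(Laurasiatherian)

# Recode nucleotides into purines (r) and pyrimidines (y).
ry <- acgt2ry(Laurasiatherian)

attr(ry, "levels")  # the two remaining states
attr(ry, "nc")      # number of character states
```

Fewer states usually means fewer distinct site patterns, so the recoded object is typically more compact than the original.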

See Also

DNAbin, as.DNAbin, read.dna, read.aa, read.nexus.data, chapter 1 of vignette("phangorn-specials", package="phangorn"), and the example of pmlMix for the use of allSitePattern.

Examples

library(phangorn)
data(Laurasiatherian)
class(Laurasiatherian)
Laurasiatherian
baseFreq(Laurasiatherian)
baseFreq(Laurasiatherian, all=TRUE)
subset(Laurasiatherian, subset=1:5)
# transform into old ape format
LauraChar <- as.character(Laurasiatherian)
# and back 
Laura <- phyDat(LauraChar, return.index=TRUE)
all.equal(Laurasiatherian, Laura)
allSitePattern(5)
