phangorn (version 2.0.3)

simSeq: Simulate sequences.

Description

Simulate sequences for a given evolutionary tree.

Usage

simSeq(x, ...)
## S3 method for class 'phylo':
simSeq(x, l=1000, Q=NULL, bf=NULL, rootseq=NULL, type="DNA",
    model="", levels=NULL, rate=1, ancestral=FALSE, ...)
## S3 method for class 'pml':
simSeq(x, ancestral = FALSE, ...)

Arguments

x
a phylogenetic tree, i.e. an object of class phylo, or an object of class pml.
l
length of the sequence to simulate.
Q
the rate matrix.
bf
base frequencies.
rootseq
a vector of length l containing the root sequence; otherwise the root sequence is generated randomly.
type
Type of sequences ("DNA", "AA" or "USER").
model
Amino acid models: one of "WAG", "JTT", "Dayhoff" or "LG"
levels
a character vector of the different states (bases); the default is for nucleotide sequences. Only used when type = "USER".
rate
relative substitution rate, a positive number that scales the edge lengths of the tree.
ancestral
Return ancestral sequences?
...
Further arguments passed to or from other methods.

Value

  • simSeq returns an object of class phyDat.

Details

simSeq is now a generic function to simulate sequence alignments. It is quite flexible and can generate DNA, RNA, amino acid or binary sequences. If a pml object is given as input, simSeq returns a phyDat object simulated from that model. There is also a more low-level version (the method for class phylo), which lacks rate variation, but one can combine different alignments, each simulated with its own rate (see the examples).
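
As a minimal sketch of the binary case mentioned above, the following simulates a two-state alignment with type = "USER". The state labels "0" and "1", the seed and the object name bin are illustrative choices, and the sketch assumes that leaving Q and bf as NULL gives equal rates and equal state frequencies.

```r
library(phangorn)

set.seed(42)                    # illustrative seed, for reproducibility only
tree <- rtree(5)                # random tree with 5 tips

# Two-state (binary) alignment; levels is required when type = "USER".
# Q and bf are left at NULL (assumed: equal rates and equal frequencies).
bin <- simSeq(tree, l = 20, type = "USER", levels = c("0", "1"))

class(bin)                      # object of class phyDat
as.character(bin)               # character matrix, one row per tip
```

The same pattern should apply to any user-defined alphabet: pass the state labels as levels and, if needed, a matching Q and bf.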

See Also

phyDat, pml, SOWH.test

Examples

library(phangorn)

# Simulate from a fitted pml model
data(Laurasiatherian)
tree = nj(dist.ml(Laurasiatherian))
fit = pml(tree, Laurasiatherian, k=4)
fit = optim.pml(fit, optNni=TRUE, model="GTR", optGamma=TRUE)
data = simSeq(fit)

tree = rtree(5)
plot(tree)
nodelabels()

# Example for simple DNA alignment
data = simSeq(tree, l = 10, type="DNA", bf=c(.1,.2,.3,.4), Q=1:6)
as.character(data)

# Example to simulate discrete Gamma rate variation
rates = discrete.gamma(1,4)
data1 = simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[1])
data2 = simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[2])
data3 = simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[3])
data4 = simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[4])
data <- c(data1,data2, data3, data4)

write.phyDat(data, file="temp.dat", format="sequential",nbcol = -1, colsep = "")
unlink("temp.dat")
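
The four per-rate calls in the discrete Gamma example can also be written as a loop. The following is a sketch of that idea, not part of the original help page: the names parts and combined are hypothetical, and it assumes that c() on phyDat objects concatenates alignments site-wise, as in the example above.

```r
library(phangorn)

set.seed(1)                         # illustrative seed
tree <- rtree(5)
rates <- discrete.gamma(1, 4)       # four rate categories, shape parameter 1

# One alignment of 100 sites per rate category
parts <- lapply(rates, function(r)
    simSeq(tree, l = 100, type = "AA", model = "WAG", rate = r))

# Concatenate the per-category alignments pairwise
combined <- Reduce(c, parts)

# Total number of sites, i.e. the sum of the site pattern weights
sum(attr(combined, "weight"))
```

Each category contributes the same number of sites here, which mirrors the equal category probabilities of the discrete Gamma model.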
