phangorn (version 2.5.5)

simSeq: Simulate sequences.

Description

Simulate sequences for a given evolutionary tree.

Usage

simSeq(x, ...)

# S3 method for phylo
simSeq(x, l = 1000, Q = NULL, bf = NULL, rootseq = NULL, type = "DNA",
  model = NULL, levels = NULL, rate = 1, ancestral = FALSE, ...)

# S3 method for pml
simSeq(x, ancestral = FALSE, ...)

Arguments

x

a phylogenetic tree, i.e. an object of class phylo, or an object of class pml.

...

Further arguments passed to or from other methods.

l

length of the sequence to simulate.

Q

the rate matrix.

bf

base frequencies.

rootseq

a vector of length l containing the root sequence; otherwise the root sequence is randomly generated.

type

Type of sequences ("DNA", "AA", "CODON" or "USER").

model

Amino acid model, e.g. "WAG", "JTT", "Dayhoff" or "LG".

levels

a character vector of the different bases (states). The default is for nucleotide sequences; only used when type = "USER".

rate

mutation rate or scaling factor for the edge lengths; a numerical value greater than zero.

ancestral

Return ancestral sequences?
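
As a minimal sketch of the ancestral argument, the snippet below simulates a short DNA alignment on a random tree and asks for the internal-node sequences as well. The seed, tree size and object names (tree, anc_data) are arbitrary choices for illustration, and the comment about the returned object reflects what we would expect rather than documented behaviour.

```r
library(ape)       # rtree(), Ntip(), Nnode()
library(phangorn)  # simSeq()

set.seed(1)        # arbitrary seed, only for reproducibility
tree <- rtree(5)   # random 5-tip tree

# With ancestral = TRUE the simulated sequences at the internal nodes
# are expected to be returned alongside the tip sequences.
anc_data <- simSeq(tree, l = 10, type = "DNA", ancestral = TRUE)

names(anc_data)    # tip labels followed by labels for the internal nodes
length(anc_data)   # compare with Ntip(tree) + Nnode(tree)
```

Comparing length(anc_data) with the value obtained using ancestral = FALSE is a quick way to confirm which rows belong to internal nodes.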

Value

simSeq returns an object of class phyDat.

Details

simSeq is a generic function to simulate sequence alignments along a phylogeny. It is quite flexible and can generate DNA, RNA, amino acid, codon or binary sequences. It is possible to give a pml object as input, in which case simSeq returns a phyDat object simulated from that model. There is also a lower-level version (the method for phylo objects), which lacks rate variation, but one can combine different alignments, each simulated with its own rate (see the examples). The rate parameter acts as a scaling factor for the edge lengths.
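
The binary case mentioned above can be sketched with type = "USER" and a two-state levels vector. This is an illustration under assumptions: the state labels "0" and "1", the seed and the object names are made up for the example, and bf and Q are simply left at their defaults.

```r
library(ape)       # rtree()
library(phangorn)  # simSeq()

set.seed(42)       # arbitrary seed, only for reproducibility
tree <- rtree(5)

# Two-state alphabet supplied through levels; bf and Q are not set here.
bin_data <- simSeq(tree, l = 10, type = "USER", levels = c("0", "1"))

as.character(bin_data)  # one row per tip, one column per simulated site
```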

For codon models (type = "CODON"), two additional arguments can be supplied: dnds for the dN/dS ratio and tstv for the transition/transversion ratio.
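
A minimal sketch of a codon simulation using the two arguments named above. The values dnds = 0.5 and tstv = 2, the seed and the object names are arbitrary and chosen only for illustration.

```r
library(ape)       # rtree()
library(phangorn)  # simSeq()

set.seed(123)      # arbitrary seed, only for reproducibility
tree <- rtree(5)

# dnds and tstv are passed through "..." to the phylo method.
codon_data <- simSeq(tree, l = 20, type = "CODON", dnds = 0.5, tstv = 2)

codon_data                # print a summary of the simulated phyDat object
as.character(codon_data)  # inspect the simulated states per tip
```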

See Also

phyDat, pml, SOWH.test

Examples

# NOT RUN {
library(phangorn)
data(Laurasiatherian)
tree <- nj(dist.ml(Laurasiatherian))
fit <- pml(tree, Laurasiatherian, k=4)
fit <- optim.pml(fit, optNni=TRUE, model="GTR", optGamma=TRUE)
data <- simSeq(fit)
# }
# NOT RUN {
tree <- rtree(5)
plot(tree)
nodelabels()

# Example for simple DNA alignment
data <- simSeq(tree, l = 10, type="DNA", bf=c(.1,.2,.3,.4), Q=1:6)
as.character(data)

# Example to simulate discrete Gamma rate variation
rates <- discrete.gamma(1,4)
data1 <- simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[1])
data2 <- simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[2])
data3 <- simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[3])
data4 <- simSeq(tree, l = 100, type="AA", model="WAG", rate=rates[4])
data <- c(data1,data2, data3, data4)

write.phyDat(data, file="temp.dat", format="sequential", nbcol = -1,
  colsep = "")
unlink("temp.dat")

# }