Base frequencies from DNA Sequences
base.freq computes the frequencies (absolute or relative) of
the four DNA bases (adenine, cytosine, guanine, and thymine) from a
sample of sequences.
GC.content computes the proportion of G+C (using the previous
function). All missing or unknown sites are ignored.
Ftab computes the contingency table with the absolute
frequencies of the DNA bases from a pair of sequences.
base.freq(x, freq = FALSE, all = FALSE)
GC.content(x)
Ftab(x, y = NULL)
- x: a vector, a matrix, or a list which contains the DNA sequences.
- y: a vector with a single DNA sequence.
- freq: a logical specifying whether to return the proportions (the default) or the absolute frequencies (counts).
- all: a logical; by default only the counts of A, C, G, and T are returned. If all = TRUE, all counts of bases, ambiguous codes, missing data, and alignment gaps are returned.
The base frequencies are computed over all sequences in the sample.
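As an illustration of what pooling over all sequences means, here is a minimal base-R sketch. It is not the package's implementation: the toy sequences are made up, they are stored as plain lowercase character vectors rather than in the package's own sequence format, and the names seqs, pooled, counts, props, and gc are hypothetical.

```r
# Two toy sequences as lowercase character vectors (made-up data).
seqs <- list(
  s1 = c("a", "c", "g", "t", "n", "g"),
  s2 = c("g", "g", "c", "-", "a", "t")
)

# Pool every site from every sequence into a single vector.
pooled <- unlist(seqs, use.names = FALSE)

# Count only the four unambiguous bases; "n" and "-" become NA
# in the factor and are dropped by table().
bases <- c("a", "c", "g", "t")
counts <- table(factor(pooled, levels = bases))

# Relative frequencies over the retained sites only.
props <- counts / sum(counts)

# G+C proportion derived from the same frequencies.
gc <- sum(props[c("c", "g")])

print(counts)
print(gc)
```

In this sketch the counts play the role of the freq = TRUE result and props the role of the default result, with the ambiguous and gap sites left out of the denominator.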
For Ftab, if the argument y is given then both x and y are coerced
to vectors and must be of equal length. If y is not given,
x must be a matrix or a list and only
the first two sequences are used.
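The pairwise contingency table can be pictured with a short base-R sketch. This is only an illustration under stated assumptions, not the package's code: the two sequences are made-up lowercase character vectors, tab is a hypothetical name, and sites holding anything other than a, c, g, or t are simply left out of the table.

```r
# Two toy aligned sequences of equal length (made-up data).
x <- c("a", "c", "g", "t", "a", "n")
y <- c("a", "c", "a", "t", "g", "g")
stopifnot(length(x) == length(y))

# Cross-tabulate site by site: rows index the base in x,
# columns index the base in y. The site with "n" is dropped.
bases <- c("a", "c", "g", "t")
tab <- table(factor(x, levels = bases), factor(y, levels = bases))

print(tab)

# Sites where the two sequences agree lie on the diagonal.
n_same <- sum(diag(tab))
n_diff <- sum(tab) - n_same
```

The off-diagonal cells show which substitutions separate the two sequences, which is why the table is four by four rather than a single count of differences.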
- A numeric vector with names c("a", "c", "g", "t") (and possibly "r", "m", ...) for base.freq, a single numeric value for GC.content, or a four by four matrix with similar dimnames for Ftab.
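library(ape)
data(woodmouse)
base.freq(woodmouse)                 # proportions of a, c, g, t
base.freq(woodmouse, TRUE)           # counts instead of proportions
base.freq(woodmouse, TRUE, TRUE)     # counts, including ambiguous codes, missing data, and gaps
GC.content(woodmouse)
Ftab(woodmouse)                      # first two sequences
Ftab(woodmouse[1, ], woodmouse[2, ]) # same as above
Ftab(woodmouse[14:15, ])             # between the last two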