subset.haplotype: Subsetting and Filtering Haplotypes

Description

This function selects haplotypes based on their (absolute) frequencies and/or the number or proportion of missing nucleotides they contain.

Usage

# S3 method for haplotype
subset(x, minfreq = 1, maxfreq = Inf, maxna = Inf, na = c("N", "?"), ...)

Arguments

x

an object of class c("haplotype", "DNAbin").

minfreq, maxfreq

the lower and upper limits of (absolute) haplotype frequencies. By default, all haplotypes are selected whatever their frequency.

maxna

the maximum frequency (absolute or relative; see details) of missing nucleotides within a given haplotype.

na

a vector of mode character specifying which nucleotide symbols should be treated as missing data; by default, the unknown nucleotide (N) and the completely unknown site (?). Symbols can be given in lower- or uppercase. There are two shortcuts: see Details.

…

unused.
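
The effect of minfreq and maxfreq can be pictured with a small base R sketch. The frequencies below are made up for illustration, and treating both limits as inclusive is an assumption rather than something stated on this page.

```r
# Made-up absolute frequencies for four haplotypes (illustration only)
freq <- c(I = 7, II = 1, III = 3, IV = 1)

minfreq <- 2
maxfreq <- Inf

# Assumption: both limits are inclusive
keep <- freq >= minfreq & freq <= maxfreq

# Names of the haplotypes that pass the frequency filter
names(freq)[keep]
```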

Value

an object of class c("haplotype", "DNAbin").

Details

The value of maxna can be either less than one, or greater than or equal to one. In the former case, it is taken as the maximum proportion (relative frequency) of missing data within a given haplotype. In the latter case, it is taken as the maximum number (absolute frequency).
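
The rule for maxna can be sketched in base R. The helper na_limit and its arguments are hypothetical and not part of pegas; it only illustrates how a value of maxna might be turned into an absolute count for sequences of a given length (965 is an arbitrary example length).

```r
# Hypothetical helper (not part of pegas): convert 'maxna' to an
# absolute number of missing nucleotides for sequences of length 'seqlen'.
na_limit <- function(maxna, seqlen) {
    # Values below one are read as a proportion, others as a count
    if (maxna < 1) maxna * seqlen else maxna
}

na_limit(20, 965)        # taken as a count
na_limit(20 / 965, 965)  # taken as a proportion of 965 sites
```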

na = "all" is a shortcut for all ambiguous nucleotides (including N) plus alignment gaps and completely unknown site (?).

na = "ambiguous" is a shortcut for only ambiguous nucleotides (including N).
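
As a rough illustration of the two shortcuts, the sketch below expands na into an explicit set of symbols using the standard IUPAC ambiguity codes. The helper expand_na is hypothetical and not part of pegas, and the actual implementation may represent these symbols differently.

```r
# Hypothetical helper (not part of pegas): expand the 'na' argument
# into an explicit set of uppercase symbols.
expand_na <- function(na = c("N", "?")) {
    # IUPAC ambiguity codes, including N
    ambiguous <- c("R", "M", "W", "S", "K", "Y", "V", "H", "D", "B", "N")
    if (identical(na, "all")) return(c(ambiguous, "-", "?"))
    if (identical(na, "ambiguous")) return(ambiguous)
    toupper(na)  # symbols may be given in lower- or uppercase
}

expand_na()             # default symbols
expand_na("ambiguous")  # ambiguity codes only
expand_na("all")        # ambiguity codes plus gap and unknown site
```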

Examples

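library(pegas) # also attaches ape, which provides the woodmouse data
data(woodmouse)
h <- haplotype(woodmouse)
subset(h, maxna = 20) # at most 20 missing nucleotides per haplotype
subset(h, maxna = 20/ncol(h)) # same as above, given as a proportion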