poppr (version 2.1.1)

pgen: Genotype Probability

Description

Calculate the probability of genotypes based on the product of allele frequencies over all loci.

Usage

pgen(gid, pop = NULL, by_pop = TRUE, log = TRUE, freq = NULL, ...)

Arguments

gid
a genind or genclone object.
pop
either a formula to set the population factor from the strata slot or a vector specifying the population factor for each sample. Defaults to NULL.
by_pop
When this is TRUE (default), the calculation will be done by population.
log
a logical. If log = TRUE (default), the values returned will be log(Pgen). If log = FALSE, the values returned will be Pgen.
freq
a vector or matrix of allele frequencies. This defaults to NULL, indicating that the frequencies will be determined via round-robin approach in rraf. If this matrix or vector is not prov
...
options passed to rare_allele_correction. The default is to correct allele frequencies to 1/n.

Value

  • A matrix of Pgen values with one row per genotype in the object and one column per locus.

Details

Pgen is the probability of a given genotype occurring in a population, assuming Hardy-Weinberg equilibrium (HWE). Thus, the value for diploids is $$P_{gen} = \left(\prod_{i=1}^m p_i\right)2^h$$ where $p_i$ are the frequencies of the alleles in the genotype and $h$ is the number of heterozygous loci in the sample (Arnaud-Haond et al. 2007; Parks and Werth 1993). By default, the allele frequencies are calculated using a round-robin approach, in which the frequencies at a particular locus are calculated from the genotypes that have been clone-censored without that locus. To avoid problems with the numerical precision of very small numbers, this function calculates Pgen per locus by adding up log-transformed allele frequencies. These can easily be transformed back to return the true value (see examples).
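
The formula can be checked by hand in base R. The following is a minimal sketch, not poppr's implementation: the allele frequencies and the heterozygosity flags are invented for illustration, and all object names are hypothetical. It shows that summing per-locus log values and exponentiating gives the same Pgen as the direct product.

```r
# Hypothetical allele frequencies for one diploid genotype at three loci.
# Each row holds the frequencies of the two alleles carried at that locus.
allele_freqs <- matrix(
  c(0.50, 0.50,   # locus 1: heterozygous
    0.20, 0.20,   # locus 2: homozygous (the same allele twice)
    0.10, 0.40),  # locus 3: heterozygous
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("locus", 1:3), c("allele1", "allele2"))
)
is_het <- c(TRUE, FALSE, TRUE)  # assumed heterozygosity at each locus

# Per-locus log(Pgen): sum of log frequencies, plus log(2) at heterozygous loci
log_pgen_locus <- rowSums(log(allele_freqs)) + is_het * log(2)

# Pgen over all loci, computed on the log scale and back-transformed
pgen_from_logs <- exp(sum(log_pgen_locus))

# The same quantity straight from the formula: (prod p_i) * 2^h
h <- sum(is_het)
pgen_direct <- prod(allele_freqs) * 2^h

all.equal(pgen_from_logs, pgen_direct)
```

With many loci the direct product can underflow to zero, which is why working on the log scale is the safer route.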

References

Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.

Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.

See Also

psex, rraf, rrmlg, rare_allele_correction

Examples

data(Pram)
head(pgen(Pram, log = FALSE))

# You can get the Pgen values over all loci by summing over the logged results:
exp(rowSums(pgen(Pram, log = TRUE), na.rm = TRUE))

# You can also take the product of the non-logged results:
apply(pgen(Pram, log = FALSE), 1, prod, na.rm = TRUE)

## Rare Allele Correction ---------------------------------------------------
##
# By default, allele frequencies are calculated with rraf with 
# correction = TRUE. This is normally benign when analyzing large populations,
# but it can have a great effect on small populations. You can pass arguments 
# for the function rare_allele_correction() to correct the allele frequencies
# that were lost in the round robin calculations.

# Default is to correct by 1/n per population. Since the calculation is 
# performed on a smaller sample size due to round robin clone correction, it
# would be more appropriate to correct by 1/rrmlg at each locus. This is 
# achieved by setting d = "rrmlg". Since this is a diploid, we would want to
# account for the number of chromosomes, and so we set mul = 1/2
head(pgen(Pram, log = FALSE, d = "rrmlg", mul = 1/2)) # compare with the output above

# If you wanted to treat all alleles as equally rare, then you would set a
# specific value (let's say the rare alleles are 1/100):
head(pgen(Pram, log = FALSE, e = 1/100))
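
# The effect of these arguments can be pictured with a small base R sketch.
# This is an illustration only, not the code behind rare_allele_correction():
# the helper correct_zero_freqs(), the frequencies, and the sample size are
# all hypothetical, and the sketch assumes that zero-valued frequencies are
# replaced by mul * (1/d), or by e when e is supplied.

```r
# Hypothetical round-robin allele frequencies at one locus; allele "A3"
# ended up with a frequency of zero.
freqs <- c(A1 = 0.6, A2 = 0.4, A3 = 0)
n <- 50  # hypothetical number of samples in the population

# Hypothetical helper: replace zero frequencies with mul / d, or with e if given
correct_zero_freqs <- function(f, d, mul = 1, e = NULL) {
  replacement <- if (is.null(e)) mul / d else e
  f[f == 0] <- replacement
  f
}

corrected_default <- correct_zero_freqs(freqs, d = n)             # 1/n
corrected_half    <- correct_zero_freqs(freqs, d = n, mul = 1/2)  # (1/2) * (1/n)
corrected_fixed   <- correct_zero_freqs(freqs, d = n, e = 1/100)  # fixed value

rbind(corrected_default, corrected_half, corrected_fixed)
```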
