poppr (version 2.1.1)

rare_allele_correction: Correct minor allele frequencies derived from rraf (INTERNAL)

Description

This is an internal function. The documentation is for use with rraf, pgen, and psex. Do not attempt to use this function directly. Minor alleles are often lost when calculating allele frequencies with a round-robin approach, resulting in zero-valued allele frequencies (Arnaud-Haond et al. 2007, Parks and Werth 1993). This can be problematic when calculating values for pgen and psex. This function provides options for assigning a value to these zero-valued frequencies.
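To illustrate how a minor allele can drop out, here is a minimal base R sketch. It assumes a simplified reading of the round-robin approach: frequencies at one locus are computed after clone-correcting on the remaining loci. The toy haploid data and the helper round_robin_freq are hypothetical and are not part of poppr.

```r
# Toy haploid data: rows are samples, columns are loci (illustrative only)
geno <- data.frame(
  locus1 = c("a", "a", "a", "b"),
  locus2 = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a poppr function): allele frequencies at `locus`
# after keeping one sample per genotype defined by all OTHER loci.
round_robin_freq <- function(geno, locus) {
  others <- geno[, setdiff(names(geno), locus), drop = FALSE]
  keep <- !duplicated(others)  # first sample of each genotype at the other loci
  # Keep every observed allele as a level so dropped alleles show up as zero
  alleles <- factor(geno[[locus]], levels = sort(unique(geno[[locus]])))
  counts <- table(alleles[keep])
  setNames(as.numeric(counts) / sum(keep), names(counts))
}

rr <- round_robin_freq(geno, "locus1")
print(rr)
```

Allele "b" is observed in the raw data, but the only sample carrying it is treated as a duplicate at the other locus, so its round-robin frequency is zero. This is the situation the correction is meant to address.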

Usage

rare_allele_correction(rraf, rrmlg, e = NULL, sum_to_one = FALSE,
  d = c("sample", "mlg", "rrmlg"), mul = 1, mlg = NULL, pop = NULL,
  locfac = NULL)

Arguments

rraf
(internal) a list or matrix produced from rraf (with uncorrected MAF)
rrmlg
(internal) a matrix containing multilocus genotypes per locus derived from rrmlg
e
a numeric epsilon value to use for all missing allele frequencies.
sum_to_one
when TRUE, the original frequencies will be reduced so that all allele frequencies will sum to one. Default: FALSE
d
the unit by which to take the reciprocal. d = "sample" will be 1/(n samples), d = "mlg" will be 1/(n mlg), and d = "rrmlg" will be 1/(n mlg at that locus). This is overridden by e.
mul
a multiplier for d. Default is mul = 1. This parameter is overridden by e.
mlg
(internal) the number of MLGs in the sample. Only required if d = "mlg".
pop
(internal) a factor that defines the population for each observation in rrmlg. This must be supplied if rraf is a matrix.
locfac
(internal) a factor that defines which columns belong to which locus.

Value

  • a matrix or vector the same type as rraf

Details

Arguments of interest to the user are:
  • e
  • sum_to_one
  • d
  • mul
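By default (d = "sample", e = NULL, sum_to_one = FALSE, mul = 1), this will set all zero-valued allele frequencies to 1/(n samples). In general, the replacement value is mul * (1/n), where n is the count selected by d, unless e is specified; for example, mul = 1/2 with d = "sample" gives 1/(2n). If sum_to_one = TRUE, then the frequencies will be rescaled as x/sum(x) AFTER correction, which means that the originally observed allele frequencies will be reduced. See the examples for details. The general pattern of correction is that the value of the MAF will be rrmlg > mlg > sample, because the number of MLGs at a locus is no larger than the total number of MLGs, which is no larger than the number of samples.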
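The arithmetic can be sketched in base R. This sketch assumes the replacement value is mul * (1/n), which is what the examples imply (mul = 1/2 gives 1/2n). The helper correct_zero_freqs and the counts used for the three settings of d are hypothetical, chosen only for illustration. This is not the poppr implementation.

```r
# Hypothetical helper (not part of poppr): replace zero-valued frequencies at
# one locus with mul * (1 / n), or with a fixed epsilon `e` if one is supplied.
correct_zero_freqs <- function(freqs, n, mul = 1, e = NULL, sum_to_one = FALSE) {
  # `e` overrides the value derived from n and mul
  replacement <- if (is.null(e)) mul * (1 / n) else e
  freqs[freqs == 0] <- replacement
  if (sum_to_one) {
    # Rescale AFTER correction so the frequencies sum to one
    freqs <- freqs / sum(freqs)
  }
  freqs
}

# Toy round-robin frequencies at one locus; allele "c" has dropped out
locus <- c(a = 0.75, b = 0.25, c = 0)

# Assumed counts standing in for the three settings of d (illustrative only)
n_sample <- 20  # d = "sample": number of samples
n_mlg    <- 10  # d = "mlg": number of MLGs
n_rrmlg  <- 5   # d = "rrmlg": number of MLGs at that locus

by_sample <- correct_zero_freqs(locus, n_sample)
by_mlg    <- correct_zero_freqs(locus, n_mlg)
by_rrmlg  <- correct_zero_freqs(locus, n_rrmlg)

# Diploid-style correction (1/2n) followed by rescaling to sum to one
scaled <- correct_zero_freqs(locus, n_sample, mul = 1/2, sum_to_one = TRUE)

print(rbind(by_sample, by_mlg, by_rrmlg, scaled))
```

With these counts the corrected value for "c" grows as the denominator shrinks, matching the rrmlg > mlg > sample pattern, and the rescaled row shows the observed frequencies for "a" and "b" dropping slightly below their original values.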

References

Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.

Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.

See Also

rraf, pgen, psex, rrmlg

Examples

library(poppr)
data(Pram)
#-------------------------------------

# If you set correction = FALSE, you'll notice the zero-valued alleles

rraf(Pram, correction = FALSE)

# By default, however, the data will be corrected by 1/n

rraf(Pram)

# Since this is a diploid organism, we might want to set the correction to 1/2n

rraf(Pram, mul = 1/2)

# To set MAF = 1/2mlg

rraf(Pram, d = "mlg", mul = 1/2)

# Another way to think about this is, since these allele frequencies were
# derived at each locus with different sample sizes, it's only appropriate to
# correct based on those sample sizes.

rraf(Pram, d = "rrmlg", mul = 1/2)

# If we were going to use these frequencies for simulations, we might want to
# ensure that they all sum to one. 

rraf(Pram, d = "mlg", mul = 1/2, sum_to_one = TRUE) 

#-------------------------------------
# When we calculate these frequencies based on population, they are heavily
# influenced by the number of observed mlgs. 

rraf(Pram, by_pop = TRUE, d = "rrmlg", mul = 1/2)

# This can be addressed by supplying a fixed value with e

rraf(Pram, by_pop = TRUE, e = 0.01)
