testHardyWeinberg: Test for Hardy-Weinberg equilibrium

Description

For SNP and MSV markers, the sample is tested for being in Hardy-Weinberg (HW) equilibrium. Estimated B allele frequencies are returned

Usage

testHardyWeinberg(sizes, bestConf,
    polyCent = generatePolyCenters(ploidy="di"),
    afList = seq(0, 0.5, 0.05))

Arguments

sizes

Numeric vector with the size of each cluster in the current genotype category

bestConf

Character string with genotype category, or numeric index to correct genotype category in polyCent

polyCent

List with all genotype categories and the allele ratios corresponding to the different clusters. See generatePolyCenters

afList

Numeric vector of values within [0, 0.5], denoting which B allele frequencies to test in the case of two segregating paralogs (see below)

Value

Data-frame with the chi-square test statistic, the degrees of freedom, p-value, and both estimated B allele frequencies

Details

The null hypothesis of the test is that the population is in HW frequencies. As most populations deviate more or less from HW equilibrium, strict control of this test is usually not recommended. Still, it can a very powerful test to detect failed clustering, by using a sufficiently low significant level to account for naturally deviating samples. If the sample contain subjects from different populations however, the power of the test may be very low.

At duplicated loci, the observed B allele frequency (BAF) is in fact the mean BAF across both paralogues. For MSV-5 markers, which are segregating in both paralogs, the individual BAFs may be estimated assuming different candidate values at one paralogue. Several values of BAF for one paralogue are set in afList, such that the BAF for the other paralogue is given. A value of 0.5 means the BAF is the same at both loci. All values are tested for HW, and the most likely BAF at both paralogs are those resulting in the highest p-value. The accuracy of these estimates increases with the degree of HW equilibrium.

Examples

Run this code

#Arbitrary MSV-5 marker
sizes <- c(30, 200, 1600, 700, 400)
polyCent <- generatePolyCenters(ploidy='tetra')
HWstats <- testHardyWeinberg(sizes,'MSV-5',polyCent=polyCent)
print(HWstats)