CheckHaplos: Estimation of Haplotype Frequencies

Description

This function estimates haplotype frequencies by applying the EM algorithm to the multilocus genotype data output by RecodeHaplos. Haplotypes with estimated frequencies below a user-specified tolerance (zero.tol) are assumed not to exist and are removed from further consideration: pseudo-individuals carrying such a haplotype are deleted, along with the column corresponding to that haplotype. Among the remaining haplotypes, those with frequency below a user-specified pooling tolerance (pooling.tol) are combined into a single category called "pooled".
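
The two thresholds can be illustrated with a minimal sketch. The helper `threshold_haplos` and the frequencies below are hypothetical and not part of the package. The sketch only applies the cut-offs to a fixed frequency vector: it does not run the EM algorithm, delete pseudo-individuals, or re-estimate frequencies after haplotypes are dropped, all of which the real function may do.

```r
# Hypothetical helper (not from the package): applies the two tolerances
# to a named vector of haplotype frequencies.
threshold_haplos <- function(freqs, zero.tol, pooling.tol = 0.05) {
  # Haplotypes below zero.tol are treated as nonexistent and dropped.
  is_zero <- freqs < zero.tol
  kept <- freqs[!is_zero]
  # Of the remaining haplotypes, those below pooling.tol are merged.
  is_rare <- kept < pooling.tol
  out <- kept[!is_rare]
  if (any(is_rare)) {
    out <- c(out, pooled = sum(kept[is_rare]))
  }
  list(
    freqs = out,
    zeroFreqHaplos = names(freqs)[is_zero],
    pooledHaplos = names(kept)[is_rare],
    pooled = any(is_rare)
  )
}

# Made-up frequencies for the four possible two-SNP haplotypes
freqs <- c(h00 = 0.0001, h01 = 0.03, h10 = 0.55, h11 = 0.4199)
res <- threshold_haplos(freqs, zero.tol = 0.0005)
res$zeroFreqHaplos  # haplotypes assumed not to exist
res$pooledHaplos    # haplotypes merged into "pooled"
```

With these made-up numbers, h00 falls below zero.tol and is dropped, while h01 lies between the two tolerances and ends up in the pooled category.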

Usage

CheckHaplos(haplos.list, numSNPs, pooling.tol = 0.05, zero.tol = 1/(2 * sum(haplos.list$wt) * 10))

Arguments

haplos.list

list output by the function RecodeHaplos

numSNPs

number of SNPs per haplotype

pooling.tol

pooling tolerance -- set to 0.05 by default

zero.tol

tolerance for haplotype frequencies below which haplotypes are assumed not to exist -- set to $\frac{1}{2*N*10}$ where N is the number of subjects by default
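
As a worked example of the default, the sketch below evaluates the expression from the usage line on a made-up weight vector. It assumes that the weights of each subject's pseudo-individuals sum to 1, so that the sum of the weights equals N. The `wt` values are hypothetical.

```r
# Made-up weights: subject 1 has one pseudo-individual, subject 2 has two
# equally weighted pseudo-individuals, subject 3 has one.
wt <- c(1, 0.5, 0.5, 1)

N <- sum(wt)                  # 3 subjects under the stated assumption
zero.tol <- 1 / (2 * N * 10)  # default expression from the usage line

# N subjects carry 2 * N haplotype copies, so a single copy has frequency
# 1 / (2 * N); the default tolerance is one tenth of that.
single_copy_freq <- 1 / (2 * N)
zero.tol
single_copy_freq / 10
```

Read this way, the default removes only haplotypes whose estimated frequency is well below that of a single observed copy in the sample.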

Value

haplotest

T/F, True if some pooling of the haplotypes was done

initGamma

initial estimates of haplotype frequencies

zeroFreqHaplos

list of haplotypes assumed not to exist

pooledHaplos

list of haplotypes pooled into a single category

nonHaploDM

non-genetic portion of the augmented data frame

haploDM

a data frame with $2^{numSNPs}$ columns giving the number of copies of each haplotype carried by each pseudo-individual

haploMat

matrix with 2 columns giving the haplotypes of each pseudo-individual

wt

vector giving the initial weight of each pseudo-individual for the EM algorithm

ID

index of each individual in the original data frame. Note that all pseudo-individuals derived from the same individual share the same ID value

unknown

vector indicating, for each row of the augmented data, whether the haplotype information was missing
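
The relationship between haploMat and haploDM can be sketched as follows. This is an illustration built on made-up data, not the package's own code, and the `n00`-style column names are an assumption based on the names shown in the example output.

```r
# Made-up haplotype pairs for three pseudo-individuals (two SNPs each)
haploMat <- matrix(c("00", "10",
                     "10", "10",
                     "01", "11"),
                   ncol = 2, byrow = TRUE)

haplo_labels <- c("00", "01", "10", "11")  # all 2^2 possible haplotypes

# One column per haplotype: copies carried by each pseudo-individual (0, 1 or 2)
haploDM <- sapply(haplo_labels, function(h) rowSums(haploMat == h))
haploDM <- as.data.frame(haploDM)
names(haploDM) <- paste0("n", haplo_labels)  # assumed naming convention

haploDM
```

Each row sums to 2 because every pseudo-individual carries exactly two haplotypes.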

Examples

data(hypoDat)
example.haplos<-RecodeHaplos(hypoDat,2)
example.newhaplos<-CheckHaplos(example.haplos, 2)

# To get the initial estimates of the haplotype frequencies

example.newhaplos$initGamma

# Result:
#       n00         n01         n10         n11 
#0.009465535 0.246534465 0.437534465 0.306465535 
# The '10' haplotype is the most frequent
