hapassoc (version 0.1)

CheckHaplos: Estimation of Haplotype Frequencies

Description

This function estimates haplotype frequencies using the EM algorithm applied to the multilocus genotype data output by RecodeHaplos. Haplotypes with frequencies below a user-specified tolerance (zero.tol) are assumed to not exist and are removed from further consideration. (Pseudo-individuals having haplotypes with zero frequency are deleted and the column corresponding to that haplotype is deleted.) From the remaining haplotypes, those with frequency below a user-defined pooling tolerance are pooled into a single category called "pooled".
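
The two-threshold rule described above can be sketched in base R. This is only an illustration of the classification step, not the package's implementation: the helper `classify_haplos` and the frequencies are hypothetical, and the sketch does not run the EM algorithm or touch the pseudo-individual data.

```r
# Hypothetical helper (not part of hapassoc): split haplotypes into
# "assumed not to exist", "pooled", and "retained" by estimated frequency.
classify_haplos <- function(freqs, zero.tol, pooling.tol = 0.05) {
  stopifnot(is.numeric(freqs), !is.null(names(freqs)), zero.tol <= pooling.tol)
  zero <- names(freqs)[freqs < zero.tol]        # below zero.tol: dropped
  kept <- freqs[freqs >= zero.tol]              # remaining haplotypes
  pooled <- names(kept)[kept < pooling.tol]     # below pooling.tol: pooled
  list(zeroFreqHaplos = zero,
       pooledHaplos   = pooled,
       retained       = names(kept)[kept >= pooling.tol])
}

# Made-up frequencies, for illustration only
freqs <- c(h00 = 0.0001, h01 = 0.03, h10 = 0.55, h11 = 0.4199)
res <- classify_haplos(freqs, zero.tol = 0.001)
print(res)
```

With these made-up values, h00 falls below zero.tol, h01 falls below pooling.tol, and h10 and h11 are retained.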

Usage

CheckHaplos(haplos.list, numSNPs, pooling.tol = 0.05, zero.tol = 1/(2 * sum(haplos.list$wt) * 10))

Arguments

haplos.list
list output by the function RecodeHaplos
numSNPs
number of SNPs per haplotype
pooling.tol
pooling tolerance -- set to 0.05 by default
zero.tol
tolerance for haplotype frequencies below which haplotypes are assumed not to exist -- by default set to $\frac{1}{2 \times N \times 10}$, where N is the number of subjects
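
As a worked example of the default zero.tol, the sketch below evaluates the expression from the Usage line on a hypothetical weight vector. It assumes that the weights of the pseudo-individuals belonging to one subject sum to 1, so that sum(wt) equals N; the vector `wt` is made up for illustration.

```r
# Hypothetical weights: three subjects, the third of which is represented
# by two pseudo-individuals with weight 0.5 each (an assumption).
wt <- c(1, 1, 0.5, 0.5)

N <- sum(wt)                  # number of subjects under the assumption above
zero.tol <- 1 / (2 * N * 10)  # same expression as the default in Usage
print(N)
print(zero.tol)
```

Since 1/(2N) is the frequency of a single haplotype copy among the 2N copies carried by N subjects, this default corresponds to one tenth of a single copy.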

Value

  • haplotest: T/F, TRUE if some pooling of the haplotypes was done
  • initGamma: initial estimates of haplotype frequencies
  • zeroFreqHaplos: list of haplos assumed not to exist
  • pooledHaplos: list of haplos pooled into a single category
  • nonHaploDM: non-genetic portion of the augmented data frame
  • haploDM: a data frame with $2^{numSNPs}$ columns scoring the number of copies of each haplotype for each pseudo-individual
  • haploMat: matrix with 2 columns giving the haplotypes for each pseudo-individual
  • wt: vector giving initial weights for each pseudo-individual for the EM algorithm
  • ID: index for each individual in the original data frame. Note that all pseudo-individuals derived from the same individual have the same ID value
  • unknown: vector indicating whether the haplotype information was missing for each row in the augmented data

See Also

RecodeHaplos, EM, summary.EM

Examples

library(hapassoc)
data(hypoDat)
example.haplos <- RecodeHaplos(hypoDat, 2)
example.newhaplos <- CheckHaplos(example.haplos, 2)

# To get the initial estimates of the haplotype frequencies

example.newhaplos$initGamma

# Result:
#       n00         n01         n10         n11 
#0.009465535 0.246534465 0.437534465 0.306465535 
# The '10' haplotype is the most frequent
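
The printed result can be sanity-checked in base R. In this minimal sketch the values are copied by hand from the result above into a hypothetical vector `gamma`, so it runs without hapassoc; it checks that the frequencies sum to 1 and looks up the most frequent haplotype.

```r
# Values copied from the printed result above (not recomputed here)
gamma <- c(n00 = 0.009465535, n01 = 0.246534465,
           n10 = 0.437534465, n11 = 0.306465535)

total <- sum(gamma)                       # frequencies should sum to 1
most.frequent <- names(which.max(gamma))  # name of the largest entry
print(total)
print(most.frequent)
```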
