AssotesteR (version 0.1-10)

RARECOVER: RARECOVER Algorithm

Description

RARECOVER is an algorithm proposed by Bhatia et al. (2010) that selects a set of variants by forward variable selection: starting from a null model without any genetic variants, variants are selected one at a time according to their statistical significance and added to the model.
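
The forward selection described above can be sketched in a few lines of R. This is an illustrative sketch only, not the package source: the helper names union_stat and forward_select are hypothetical, the statistic is a plain Pearson chi-square on carrier status, and the stopping rule (stop when the gain in the statistic is not larger than dif) is an assumption about how the threshold is applied.

```r
# Hypothetical helper: Pearson chi-square statistic comparing case/control
# status with carrier status (at least one minor allele in the chosen columns).
union_stat <- function(y, X, cols) {
  carrier <- as.integer(rowSums(X[, cols, drop = FALSE] > 0, na.rm = TRUE) > 0)
  observed <- unclass(table(factor(y, levels = 0:1),
                            factor(carrier, levels = 0:1)))
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  if (any(expected == 0)) return(0)  # degenerate table: no information
  sum((observed - expected)^2 / expected)
}

# Greedy forward selection: add the variant that gives the largest statistic,
# and stop when the improvement is not larger than dif (assumed rule).
forward_select <- function(y, X, dif = 0.5) {
  selected <- integer(0)
  current <- 0
  remaining <- seq_len(ncol(X))
  while (length(remaining) > 0) {
    scores <- vapply(remaining,
                     function(j) union_stat(y, X, c(selected, j)),
                     numeric(1))
    best <- which.max(scores)
    if (scores[best] - current <= dif) break
    selected <- c(selected, remaining[best])
    current <- scores[best]
    remaining <- remaining[-best]
  }
  list(stat = current, set = selected)
}
```

Calling forward_select(y, X) on a 0/1 phenotype vector and a 0/1/2 genotype matrix returns the final statistic and the column indices of the selected variants, loosely mirroring the rc.stat and set elements of the returned object.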

Usage

RARECOVER(y, X, maf = 0.05, dif = 0.5, perm = 100)

Arguments

y
numeric vector with phenotype status: 0=controls, 1=cases. No missing data allowed
X
numeric matrix or data frame with genotype data coded as 0, 1, 2. Missing data is allowed
maf
numeric value indicating the minor allele frequency threshold for rare variants (maf=0.05 by default)
dif
numeric value between 0 and 1 as a threshold for the decision criterion in the RARECOVER algorithm (default dif=0.5)
perm
positive integer indicating the number of permutations (100 by default)

Value

An object of class "assoctest", basically a list with the following elements:
rc.stat
rarecover statistic
perm.pval
permuted p-value
set
set of selected variants
args
descriptive information with number of controls, cases, variants, rare variants, maf, number of selected variants, and permutations
name
name of the statistic
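
As a rough illustration of how a permuted p-value such as perm.pval can be obtained, the sketch below shuffles the phenotype labels, recomputes a statistic each time, and reports the fraction of permuted statistics at least as large as the observed one. The names carrier_diff and perm_pvalue are hypothetical, the statistic is a simple stand-in rather than the RARECOVER statistic, and the counting convention (>= with no +1 correction) is an assumption that may differ from the package.

```r
# Hypothetical stand-in statistic: absolute difference in the proportion of
# carriers (at least one minor allele) between cases and controls.
carrier_diff <- function(y, X) {
  carrier <- rowSums(X > 0, na.rm = TRUE) > 0
  abs(mean(carrier[y == 1]) - mean(carrier[y == 0]))
}

# Permutation p-value: shuffle the phenotype, recompute the statistic, and
# count how often the permuted value reaches the observed one.
perm_pvalue <- function(y, X, stat_fun, perm = 100) {
  observed <- stat_fun(y, X)
  permuted <- replicate(perm, stat_fun(sample(y), X))
  mean(permuted >= observed)
}

set.seed(1234)
y <- c(rep(1, 100), rep(0, 100))
X <- matrix(rbinom(200 * 5, 2, 0.03), nrow = 200)
p <- perm_pvalue(y, X, carrier_diff, perm = 200)
```

With this convention the smallest nonzero p-value is 1/perm, so a larger perm gives a finer resolution at the cost of run time.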

Details

The applied association test statistic (denoted as XCORR in Bhatia et al., 2010) is based on Pearson's chi-square statistic.
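
For readers unfamiliar with the underlying quantity, the following toy example computes a Pearson chi-square statistic for a 2x2 table of case/control status against carrier status, once by hand and once with stats::chisq.test. The data are invented for illustration, and the exact form of XCORR used by the package may differ from this plain chi-square.

```r
# Toy data: y = 1 for cases, 0 for controls;
# carrier = 1 if the individual carries at least one rare allele.
y       <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
carrier <- c(1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0)

tab <- table(status = y, carrier = carrier)

# Pearson chi-square by hand: sum over cells of (observed - expected)^2 / expected
observed <- unclass(tab)
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
chi_manual <- sum((observed - expected)^2 / expected)

# Same quantity from chisq.test without continuity correction
# (the small-count warning is suppressed for this toy table)
chi_builtin <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
```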

The argument maf specifies the minor allele frequency threshold for rare variants. By default, only variants with a minor allele frequency below maf=0.05 are included in the analysis. If all variants in X should be treated as rare, set maf=1 to include all of them.

There is no imputation for the missing data. Missing values are simply ignored in the computations.
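
The two points above (the maf threshold and the handling of missing values) can be illustrated with a small sketch. It assumes that genotypes count copies of the minor allele, so that the allele frequency of a variant is half its mean genotype, and that missing values are dropped with na.rm = TRUE; the variable names and the toy threshold of 0.2 are chosen for illustration only.

```r
# Toy genotype matrix: 6 individuals (rows), 3 variants (columns), one NA
X <- matrix(c(0, 0, 1, 0, NA, 0,
              1, 2, 1, 1, 2, 1,
              0, 0, 0, 0, 0, 1),
            nrow = 6, ncol = 3)

# Allele frequency per variant; missing genotypes are ignored, not imputed
freq <- colMeans(X, na.rm = TRUE) / 2

# Keep only variants strictly below the threshold
maf_threshold <- 0.2
rare <- freq < maf_threshold
X_rare <- X[, rare, drop = FALSE]
```

Here the first variant has one minor allele among five non-missing individuals, so its frequency is computed over 10 alleles rather than 12.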

References

Bhatia G, Bansal V, Harismendy O, Schork NJ, Topol EJ, Frazer K, Bafna V (2010) A Covering Method for Detecting Genetic Associations between Rare Variants and Common Phenotypes. PLoS Computational Biology, 6(10): e1000954

See Also

WSS

Examples

## Not run: 
  
  # number of cases
  cases = 500

  # number of controls
  controls = 500

  # total (cases + controls)
  total = cases + controls

  # phenotype vector
  phenotype = c(rep(1, cases), rep(0, controls))

  # genotype matrix with 10 variants (random data)
  set.seed(1234)
  genotype = matrix(rbinom(total*10, 2, 0.051), nrow=total, ncol=10)

  # apply RARECOVER with maf=0.05 (default dif=0.5) and 500 permutations
  myrc = RARECOVER(phenotype, genotype, maf=0.05, perm=500)
  myrc
## End(Not run)
