MDR (version 1.2)

mdr.ca.adj: Function to calculate a post-hoc adjusted prediction estimate of classification accuracy, corrected for prospective data with previously estimated population prevalence

Description

After fitting an object of class 'mdr' and identifying a best model, this function calculates an adjusted estimate of classification accuracy for use in prediction. The adjustment accounts for retrospective sampling and incorporates disease prevalence, as described in Winham and Motsinger-Reif (2010).

Usage

mdr.ca.adj(data, model, hr, prev, genotype = c(0, 1, 2))

Arguments

data
the dataset; an n by (p+1) matrix where the first column is the binary response vector (coded 0 or 1) and the remaining columns are the p SNP genotypes (coded numerically)
model
a numeric vector of the final MDR model loci
hr
a vector of binary indicators giving the high-risk/low-risk status of each genotype combination of the final model loci
prev
an estimate of population prevalence
genotype
a numeric vector of the possible genotypes arising in data; the default is c(0,1,2), but this vector can be longer or shorter depending on whether more or fewer than three genotypes are possible
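
The argument descriptions above imply a particular data layout, which the following minimal sketch illustrates with simulated values. All object names (toy_data, response, snps, n_combinations) and the choice of loci are hypothetical, and the sketch does not call any MDR function. It only shows an n by (p+1) matrix with a 0/1 response in the first column, and counts the genotype combinations that a two-locus model would have, on the assumption that hr holds one indicator per combination.

```r
# Hypothetical toy data: 200 subjects and 4 SNPs (names are illustrative only)
set.seed(1)
n <- 200
p <- 4

# Binary response coded 0 or 1
response <- rbinom(n, size = 1, prob = 0.5)

# SNP genotypes coded numerically as 0, 1, or 2
snps <- matrix(sample(c(0, 1, 2), n * p, replace = TRUE), nrow = n, ncol = p)

# n by (p + 1) matrix with the response in the first column
toy_data <- cbind(response, snps)
colnames(toy_data) <- c("Response", paste0("SNP", seq_len(p)))

# With three possible genotypes and a two-locus model there are 3^2 = 9
# genotype combinations (assumption: 'hr' has one entry per combination)
genotype <- c(0, 1, 2)
model <- c(1, 3)  # hypothetical final model loci
n_combinations <- length(genotype)^length(model)
```

The order in which the package enumerates genotype combinations is not stated here, so in practice hr is best taken directly from a fitted object rather than built by hand.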

Value

List containing:
adjusted classification accuracy
post-hoc prediction estimate of classification accuracy adjusted for prevalence, measured as a percentage
adjusted classification error
post-hoc prediction estimate of classification error (100 minus the adjusted classification accuracy) adjusted for prevalence, measured as a percentage
...

Details

MDR provides a prediction error estimate of the final model calculated from retrospective data. To provide a prospective prediction estimate, an accurate estimate of the population prevalence rate must be incorporated.
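
To make the role of prevalence concrete, the sketch below uses the standard identity that prospective accuracy equals prevalence times sensitivity plus (1 - prevalence) times specificity. This is an illustration under that assumption and is not taken from the package source; the helper adjusted_ca and its arguments are hypothetical and may differ from what mdr.ca.adj computes internally.

```r
# Hypothetical helper, not part of the MDR package.
# truth:     0/1 vector of observed control/case status
# predicted: 0/1 vector of predicted low-risk (0) / high-risk (1) status
# prev:      assumed population prevalence, between 0 and 1
adjusted_ca <- function(truth, predicted, prev) {
  stopifnot(length(truth) == length(predicted), prev >= 0, prev <= 1)
  # Sensitivity and specificity are conditional on true status, so they
  # do not depend on the case/control ratio of the retrospective sample
  sensitivity <- mean(predicted[truth == 1] == 1)
  specificity <- mean(predicted[truth == 0] == 0)
  # Reweight by the population prevalence and express as a percentage
  ca <- 100 * (prev * sensitivity + (1 - prev) * specificity)
  list(adjusted_accuracy = ca, adjusted_error = 100 - ca)
}

# Balanced toy sample: 4 cases and 4 controls
truth     <- c(1, 1, 1, 1, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 0, 0, 1, 1)
result <- adjusted_ca(truth, predicted, prev = 0.10)
```

In this toy sample the unadjusted accuracy is 62.5% (5 of 8 correct), while the prevalence-weighted value is 52.5%, because with a 10% prevalence the lower specificity (0.5) carries most of the weight.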

References

Ritchie MD et al (2001). Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet 69(1): 138-147.

Winham SJ and Motsinger-Reif AA (2010). The effect of retrospective sampling on estimates of prediction error for multifactor dimensionality reduction. Annals of Human Genetics.

See Also

mdr.cv, mdr.3WS, boot.error

Examples

# load the package and the test data
library(MDR)
data(mdr1)

# run MDR with 5-fold cross-validation on a subset of the sample data, considering all pairwise combinations (K = 2)
fit <- mdr.cv(mdr1[, 1:11], K = 2, cv = 5)

# calculate the adjusted CA estimate from the sample data for the previously fit MDR object 'fit', assuming a population prevalence of 10%
mdr.ca.adj(mdr1, model = fit$'final model', hr = fit$'high-risk/low-risk', prev = 0.10)