test: Sample data for MDR package for n=1000, p=5000
Description
This dataset provides case/control disease status and genetic information.
Format
A simulated data frame with 1000 observations on 5001 variables. 'Response' is a binary vector representing case (1) or control (0) status for a disease. Variables 'SNP.1' to 'SNP.5000' are numeric variables which represent genotype information (coded as 0, 1, 2) at 5000 loci.
Details
This data was simulated with a larger number of samples and genetic predictor variables than mdr1 and mdr2, as an example for a larger association study. It can be used as part of a training-testing framework to assess models built with train.
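Examples
The layout described under Format can be illustrated with a small stand-in built in base R. This is only a sketch: the object name toy, the reduced sizes, and the way genotypes and case status are drawn (independent binomial draws) are assumptions made for illustration, not the procedure used to simulate this dataset.

```r
# Hypothetical stand-in with the same column layout as the dataset,
# but much smaller (the real data has n = 1000 and p = 5000).
set.seed(1)
n <- 100
p <- 50

# Genotypes coded 0/1/2. Drawing them as Binomial(2, 0.3) is an assumption.
snps <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)
colnames(snps) <- paste0("SNP.", seq_len(p))

# 'Response' is case (1) or control (0); here assigned at random.
toy <- data.frame(Response = rbinom(n, size = 1, prob = 0.5), snps)

dim(toy)               # n rows, p + 1 columns
table(toy$Response)    # counts of controls and cases
```

The same two calls, dim() and table(), are a quick way to confirm the shape and case/control balance of the real data frame once it is loaded.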