Generate a data set consisting of:
"anno": A SNP-length annotation vector.
"covar": A subject by 6 covariate matrix.
"geno": A subject by SNP genotype matrix.
"pheno": A subject-length phenotype vector.
DGP(
anno = NULL,
beta = c(0, 1, 2),
binary = FALSE,
geno = NULL,
include_residual = TRUE,
indicator = FALSE,
maf_range = c(0.005, 0.01),
method = "none",
n = 100,
p_dmv = 0.4,
p_ptv = 0.1,
prop_causal = 1,
random_signs = FALSE,
random_var = 0,
snps = 100,
weights = c(1, 2, 3)
)
Returns a list containing the genotypes, annotations, covariates, and phenotypes.
anno: Annotation vector, if providing genotypes. Its length should match the number of columns in geno.
beta: If method = "none", a (3 x 1) coefficient vector for BMVs (benign missense variants), DMVs, and PTVs, respectively. If method != "none", a scalar effect size.
binary: Generate a binary phenotype? Default: FALSE.
geno: Genotype matrix, if providing genotypes.
include_residual: Include the residual? If FALSE, returns the expected value. Intended for testing.
indicator: Convert raw counts to indicators? Default: FALSE.
maf_range: Range of minor allele frequencies: c(MIN, MAX).
method: Genotype aggregation method. Default: "none".
n: Sample size.
p_dmv: Frequency of deleterious missense variants (DMVs). The default of 40% is based on the frequency of DMVs among rare coding variants in the UK Biobank.
p_ptv: Frequency of protein truncating variants (PTVs). The default of 10% is based on the frequency of PTVs among rare coding variants in the UK Biobank.
prop_causal: Proportion of variants that are causal. Default: 1.0.
random_signs: Randomize signs? FALSE for a burden-type genetic architecture, TRUE for SKAT-type.
random_var: Frailty variance in the case of random signs. Default: 0.
snps: Number of SNPs in the gene. Default: 100.
weights: Aggregation weights.
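To make the roles of maf_range, p_dmv, p_ptv, and beta more concrete, the base-R sketch below mimics the kind of data these arguments describe for the method = "none" case. It is not the function's implementation. The 0/1/2 annotation coding, the uniform draw of minor allele frequencies, the binomial genotype model, and the standard normal residual are all assumptions made for illustration, and covariates are omitted.

```r
set.seed(101)

n <- 100                      # Sample size.
snps <- 100                   # Number of SNPs.
maf_range <- c(0.005, 0.01)   # Range of minor allele frequencies.
p_dmv <- 0.4                  # Frequency of DMVs.
p_ptv <- 0.1                  # Frequency of PTVs.
beta <- c(0, 1, 2)            # Effects for BMVs, DMVs, and PTVs.

# Assumed coding (illustrative): 0 = BMV, 1 = DMV, 2 = PTV.
# Whatever probability is not assigned to DMVs or PTVs goes to BMVs.
p_bmv <- 1 - p_dmv - p_ptv
anno <- sample(0:2, size = snps, replace = TRUE,
               prob = c(p_bmv, p_dmv, p_ptv))

# Assumed genotype model (illustrative): one MAF per SNP drawn uniformly
# from maf_range, then minor allele counts drawn as Binomial(2, MAF).
maf <- runif(snps, min = maf_range[1], max = maf_range[2])
geno <- sapply(maf, function(f) rbinom(n, size = 2, prob = f))

# Each variant's effect is set by its annotation category.
effects <- beta[anno + 1]
eta <- as.numeric(geno %*% effects)

# Assumed residual (illustrative): standard normal noise.
pheno <- eta + rnorm(n)

# Inspect the simulated components.
table(anno)
dim(geno)
summary(pheno)
```

With beta = c(0, 1, 2), BMVs contribute nothing to the linear predictor in this sketch, while each PTV minor allele contributes twice as much as a DMV minor allele.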
# Generate data.
data <- DGP(n = 100)
# View components.
table(data$anno)
head(data$covar)
head(data$geno[, 1:5])
hist(data$pheno)
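The anno and geno arguments allow a phenotype to be simulated on top of existing genotypes. The sketch below reuses the genotypes and annotations from a first call and sets include_residual = FALSE to obtain the expected value. This page does not name the package that exports DGP; the sketch assumes it is AllelicSeries and skips the calls if that package is not installed.

```r
# Assumption: DGP() is exported by the AllelicSeries package.
if (requireNamespace("AllelicSeries", quietly = TRUE)) {
  library(AllelicSeries)

  # Simulate a first data set with the default sample size and SNP count.
  data <- DGP(n = 100)

  # Reuse its genotypes and annotations. The length of anno matches the
  # number of columns in geno, as the anno argument requires.
  data2 <- DGP(
    anno = data$anno,
    geno = data$geno,
    include_residual = FALSE
  )

  # With include_residual = FALSE, pheno holds the expected value.
  summary(data2$pheno)
}
```

Because the residual is omitted, this form is mainly useful for testing, as noted in the description of include_residual.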