RVFam (version 1.1)

gc.fun: function that applies genomic control correction to single SNP analysis, sum of squares test and RData for survival trait analysis

Description

When a high genomic control (GC) parameter (lambda) estimate is observed, gc.fun applies GC correction to SNPs whose minor allele count (MAC) is below a user-specified threshold, since such SNPs may have an inflated type I error rate, particularly for survival traits. It then adjusts the RData output accordingly and recomputes the sum of squares statistic.

Usage

gc.fun(path,phen,snpinfoRdata,snp.cor,mac,aggregateBy="SKATgene", maf.file,mafRange,ssq.beta.wts=c(1,25))

Arguments

path
path to the directory containing all 23 tab-delimited single SNP analysis result files
phen
a character string for the phenotype name of a trait of interest
snpinfoRdata
a character string naming the RData containing the SNP info to be loaded; this should include at least the 'Name' (for SNP name), 'Chr', and aggregateBy (default='SKATgene') columns
snp.cor
a character string naming the RData containing lists of SNP correlation matrices within each 'SKATgene'
mac
user-specified MAC threshold; GC correction is applied to SNPs with MAC below this value
aggregateBy
the column of SNP info on which single SNPs are to be aggregated for burden tests, default is 'SKATgene'
maf.file
a character string naming the comma delimited file containing 'snp.names' for SNP name and 'maf' for MAF
mafRange
range of MAF to include SNPs for gene-based burden tests, default is c(0,0.05)
ssq.beta.wts
a vector of parameters of beta weights used in proposed sum of squares test, default=c(1,25) as in SKAT
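
To illustrate how mafRange and ssq.beta.wts might interact, here is a minimal sketch in R. It assumes the weights are the Beta density evaluated at each SNP's MAF, which is the SKAT convention for the (1, 25) parameters; whether gc.fun uses them in exactly this form is not stated here. The data frame maf_tab and its values are hypothetical stand-ins for the contents of maf.file, and the handling of the range boundaries is also an assumption.

```r
# Hypothetical MAF table with the two columns expected in maf.file
maf_tab <- data.frame(
  snp.names = c("snp1", "snp2", "snp3", "snp4"),
  maf       = c(0.001, 0.01, 0.04, 0.20)
)

mafRange     <- c(0, 0.05)  # documented default range
ssq.beta.wts <- c(1, 25)    # documented default beta parameters

# Keep SNPs whose MAF lies in the range (boundary handling is an assumption)
keep <- maf_tab$maf > mafRange[1] & maf_tab$maf <= mafRange[2]
used <- maf_tab[keep, ]

# Beta density weights: rarer variants receive larger weights
used$weight <- dbeta(used$maf, ssq.beta.wts[1], ssq.beta.wts[2])
print(used)
```

With these parameters the weight decreases steadily as MAF increases, so the rarest SNPs that pass the MAF filter contribute most to a weighted statistic.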

Value

No value is returned. Instead, tab-delimited result files and an RData file are generated. The single SNP result file, named with phen and singleSNP, contains the columns gene, Name, maf, ntotal, nmiss, maf_ntotal, beta, se, Z, remark, p (p-value from LRT), MAC, n0, n1, and n2. The SSQ test result file, named with phen and SSQ, contains the columns gene, SSQ, cmafTotal, cmafUsed, nsnpsTotal, nsnpsUsed, nmiss, df, and p. The generated RData contains a list, named by gene, that holds scores, cov, n, maf and sey for each gene. Note that maf in the RData is the MAF based on ntotal.

Details

When a high lambda is observed in the single SNP analysis of a survival trait, gc.fun applies GC correction to SNPs with MAC below the user-defined threshold, adjusts the RData output based on the GC-corrected single SNP results, recomputes the sum of squares statistic, and then outputs the corrected single SNP analysis results, SSQ analysis results and RData. The initial single SNP analysis result files are required, and the input arguments should be identical to those used in the initial analysis (except for path).
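
The general idea of genomic control can be sketched in R as follows. This is a textbook-style illustration, not the code used inside gc.fun: it assumes 1-df chi-square test statistics, estimates lambda as the ratio of the observed median statistic to the null median, and computes lambda from the low-MAC SNPs only. The data are simulated, and the object names (res, chisq, chisq_gc, p_gc) are hypothetical.

```r
set.seed(1)

# Simulated single SNP results (hypothetical data, not RVFam output)
res <- data.frame(
  Name  = paste0("snp", 1:1000),
  MAC   = sample(1:50, 1000, replace = TRUE),
  chisq = rchisq(1000, df = 1) * 2   # deliberately inflated statistics
)

mac <- 20              # MAC threshold, playing the role of the mac argument
low <- res$MAC < mac   # SNPs eligible for correction

# Lambda: observed median statistic over the expected median under the null
# (estimating it from the low-MAC SNPs only is an assumption of this sketch)
lambda <- median(res$chisq[low]) / qchisq(0.5, df = 1)

# Deflate the statistics of low-MAC SNPs only, and only if lambda exceeds 1
res$chisq_gc <- res$chisq
if (lambda > 1) {
  res$chisq_gc[low] <- res$chisq[low] / lambda
}

# Recompute p-values from the corrected statistics
res$p_gc <- pchisq(res$chisq_gc, df = 1, lower.tail = FALSE)
```

SNPs at or above the threshold keep their original statistics, which mirrors the description above of correcting only SNPs below the user-defined MAC.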

Examples

## Not run: 
# gc.fun(path="/home/mhchen/",phen="trait1",mafRange=c(0,0.01),
# snpinfoRdata="SNPinfo_EC.RData",aggregateBy="SKATgene",
# maf.file="EC_MAF.csv",snp.cor="EC_SNPcor.RData",ssq.beta.wts=c(1,25))
## End(Not run)
