pathway.warm.start: pathway.warm.start

Description

This function is designed to accelerate the ARTP2 test in practice. It takes pre-calculated, reusable statistics as input and allows users to try different testing configurations more efficiently. See Details for more information.

Usage

pathway.warm.start(setup, nperm = NULL, lambda = 1.0, nthread = NULL)

Arguments

setup

an R object created by pathway.summaryData or pathway.rawData. It is a list containing necessary statistics for computing p-values.

nperm

the number of permutations. If NULL (the default), the value saved in setup$options$nperm is used. See Details.

lambda

inflation factor to be adjusted in the pathway analysis. Unlike in pathway.summaryData, lambda here must be a single numeric value rather than a vector. The default is 1.0.

nthread

number of threads to be used in permutation. If NULL (the default), setup$options$nthread is used.

Value

pathway.warm.start returns an object of class ARTP2. It is a list containing the following components:

pathway.pvalue

final pathway p-value accounting for multiple comparisons.

gene.pvalue

a data frame containing gene name, number of SNPs in the gene that were included in the analysis, chromosome name, and the p-value for the gene accounting for multiple comparisons.

pathway

a data frame defining the pathway that was actually tested after the various filters were applied.

model

a list containing detailed information on the selected SNPs in each gene.

most.sig.genes

a character vector of genes selected by ARTP2. They are the most promising candidates, although their statistical significance is not guaranteed.

deleted.snps

a data frame containing the SNPs excluded from the analysis and the reasons for their exclusion.

deleted.genes

a data frame containing genes excluded from the analysis because they are subsets of other remaining genes.

options

a list of options used in the analysis. See options.

test.timing

timing information (in seconds).

accurate

TRUE if options$nperm is large enough to estimate p-values accurately, i.e., if the criterion sqrt(pvalue*(1-pvalue)/nperm)/pvalue < 0.1 is satisfied.
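The accuracy criterion can be checked by hand. The following is a minimal sketch in base R and is not part of ARTP2; the helper names rel_se, is_accurate and nperm_bound are hypothetical. It evaluates the relative standard error of a permutation p-value and, by rearranging the inequality, the bound that nperm must exceed: nperm > (1 - pvalue) / (pvalue * 0.1^2).

```r
# Hypothetical helpers for illustration only; they are not ARTP2 functions.

# Relative standard error of a p-value estimated from nperm permutations.
rel_se <- function(pvalue, nperm) {
  sqrt(pvalue * (1 - pvalue) / nperm) / pvalue
}

# TRUE when the relative standard error is below the tolerance (0.1 above).
is_accurate <- function(pvalue, nperm, tol = 0.1) {
  rel_se(pvalue, nperm) < tol
}

# Rearranged criterion: nperm must be larger than this value.
nperm_bound <- function(pvalue, tol = 0.1) {
  (1 - pvalue) / (pvalue * tol^2)
}

is_accurate(0.05, 1e4)    # a moderate p-value with 1e4 permutations
is_accurate(0.001, 1e4)   # a small p-value with the same nperm
nperm_bound(0.001)        # permutations needed for p = 0.001
```

The bound grows roughly as 100 / pvalue, so smaller p-values need proportionally more permutations, which is one reason to rerun only the testing step with a larger nperm.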

Details

An ARTP2 test has two major steps. The first step applies data-cleaning criteria and creates the necessary, reusable statistics, which can be time-consuming for large pathways. The second step performs the testing procedure to estimate the pathway- and gene-level p-values. pathway.warm.start covers only the second step.

The first step can be done with pathway.summaryData or pathway.rawData by setting options$only.setup to TRUE. Their output object, setup, can be used as the first argument of pathway.warm.start. With pathway.warm.start, users can try different configurations to perform the various tests allowed by the ARTP2 framework without repeating the time-consuming data cleaning. Commonly used options in setup$options include method, inspect.snp.n, inspect.gene.n, nperm, etc.

Note that both pathway.summaryData and pathway.rawData can produce the final p-value directly if options$only.setup is FALSE.

The setup object is expected to have all components defined in pathway.summaryData and pathway.rawData. If nperm is NULL, it is set to setup$options$nperm. Users can also pass lambda if a second round of genomic control is needed. However, unlike in pathway.summaryData, lambda here can only be a single numeric value rather than a vector. The options nperm and lambda are the most useful ones when using pathway.warm.start, so they are exposed directly in the interface. Users can modify any value in setup$options directly to gain more control over the testing procedure. See options for more details on how to set setup$options.

Users should not modify any component of setup other than setup$options.
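One way to follow this rule is to route every change through a small helper that touches only the options component. The sketch below is an assumption for illustration, not part of ARTP2: update_setup_options is a hypothetical helper, and mock_setup is a stand-in list with arbitrary values, since a real setup object comes from pathway.summaryData or pathway.rawData.

```r
# Hypothetical helper, not part of ARTP2: return a copy of `setup` in which
# only the `options` component has been updated.
update_setup_options <- function(setup, ...) {
  stopifnot(is.list(setup), is.list(setup$options))
  setup$options <- utils::modifyList(setup$options, list(...))
  setup
}

# Stand-in object with arbitrary values, for illustration only. A real setup
# is created by pathway.summaryData() or pathway.rawData() with
# options$only.setup = TRUE.
mock_setup <- list(
  options = list(method = 2, inspect.snp.n = 2, nperm = 1e4),
  other = "pre-computed statistics that must stay untouched"
)

new_setup <- update_setup_options(mock_setup, inspect.snp.n = 3, nperm = 1e5)

# With a real setup object one would then run, for example:
# ret <- pathway.warm.start(new_setup)
```

Because R copies lists on modification, mock_setup itself is left unchanged, so several configurations can be derived from the same saved setup.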

References

Zhang H, Wheeler W, Hyland LP, Yang Y, Shi J, Chatterjee N, Yu K. A powerful procedure for pathway-based meta-analysis using summary statistics identifies multiple pathways associated with type II diabetes.

Yu K, Li Q, Bergen AW, Pfeiffer RM, Rosenberg PS, Caporaso N, Kraft P, Chatterjee N. (2009) Pathway analysis by adaptive combination of P-values. Genet Epidemiol 33(8): 700-709.

Zhang H, Shi J, Liang F, Wheeler W, Stolzenberg-Solomon R, Yu K. (2014) A fast multilocus test with adaptive SNP selection for large-scale genetic association studies. European Journal of Human Genetics 22: 696-702.

Examples

## First, run the example in pathway.summaryData.
## Users can adjust the second round of inflation in pathway.warm.start.
## The first round of inflation can be study-specific and is adjusted in
## pathway.rawData or pathway.summaryData.

library(ARTP2)
study1 <- system.file("extdata", package = "ARTP2", "study1.txt.gz")
study2 <- system.file("extdata", package = "ARTP2", "study2.txt.gz")
pathway <- system.file("extdata", package = "ARTP2", "pathway.txt.gz")
chr <- 1:22
nchr <- length(chr)
fam <- vector("character", nchr)
bim <- vector("character", nchr)
bed <- vector("character", nchr)
for(i in 1:nchr){
  fam[i] <- system.file("extdata", package = "ARTP2", paste("chr", chr[i], ".fam", sep = ""))
  bim[i] <- system.file("extdata", package = "ARTP2", paste("chr", chr[i], ".bim", sep = ""))
  bed[i] <- system.file("extdata", package = "ARTP2", paste("chr", chr[i], ".bed", sep = ""))
}
reference <- data.frame(fam, bim, bed)
options <- list(inspect.snp.n = 2, nperm = 1e4, 
                maf = .01, HWE.p = 1e-6, 
                gene.R2 = .9, 
                id.str = "unique-pathway-id", 
                out.dir = getwd(), save.setup = FALSE)
                
## different inflation factors are adjusted in two studies
## first round adjustment
lambda <- c(1.10, 1.08)
ncases <- list()
ncontrols <- list()
ncases[[1]] <- c(9580, 2591)
ncontrols[[1]] <- c(53810, 3052)
ncases[[2]] <- 7638
ncontrols[[2]] <- 54319

family <- 'binomial'

## do not run permutation
options$only.setup <- TRUE
## the first round study-specific inflation is adjusted as lambda = c(1.10, 1.08)
# setup <- pathway.summaryData(summary.files = c(study1, study2), pathway, family, 
#                              reference, lambda, ncases, ncontrols, options = options)

## the two rounds of inflation combined are adjusted as lambda2 = c(1.17370, 1.15236)
lambda2 <- lambda * 1.067
## run permutation to calculate p-value
options$only.setup <- FALSE
# ret1 <- pathway.summaryData(summary.files = c(study1, study2), pathway, family, 
#                             reference, lambda2, ncases, ncontrols, options = options)

## or adjust the second round of inflation in pathway.warm.start
# ret2 <- pathway.warm.start(setup, lambda = 1.067)

## the two ways of adjusting inflation should give the same result
## (all.equal compares numeric values with a tolerance, unlike ==)
# isTRUE(all.equal(ret1$pathway.pvalue, ret2$pathway.pvalue))

###############################################################
###############################################################
## modify or specify the method
# setup$options$method <- 2
# setup$options$inspect.snp.n <- 3

# ret3 <- pathway.warm.start(setup, nperm = 1e5, nthread = 2)
