assocTestCPH: Cox proportional hazards

Description

Fits a Cox proportional hazards model of time to event on SNP genotype and other covariates.

Usage

assocTestCPH(genoData, event, time.to.event, covars, factor.covars = NULL,
             scan.chromosome.filter = NULL, scan.exclude = NULL,
             maf.filter = FALSE, GxE = NULL, strata.vars = NULL,
             chromosome.set = NULL, block.size = 5000, verbose = TRUE,
             outfile = NULL)

Arguments

genoData

GenotypeData object, should contain sex and phenotypes in scan annotation. Chromosomes are expected to be in contiguous blocks.

event

name of scan variable in genoData for event to analyze

time.to.event

name of scan variable in genoData for time to event

covars

vector of covariate terms for model (can include interactions as 'a:b', main effects correspond to scan variable names in genoData)

factor.covars

vector of names of covariates to be converted to factor

scan.chromosome.filter

a logical matrix that can be used to exclude some chromosomes, some scans, or some specific scan-chromosome pairs. Entries should be TRUE if that scan-chromosome pair should be included in the analysis, FALSE if not. The number of rows must be equal to the number of scans in genoData, and the number of columns must be equal to the largest integer chromosome value in genoData. The column number must match the chromosome number. For example, a scan.chromosome.filter matrix used for an analysis when genoData has SNPs with chromosome=(1-24, 26, 27) (i.e. no Y (25) chromosome SNPs) must have 27 columns (all FALSE in the 25th column). But a scan.chromosome.filter matrix used for an analysis when genoData has SNPs with chromosome=(1-26) (i.e. no Unmapped (27) chromosome SNPs) must have only 26 columns.

scan.exclude

an integer vector containing the IDs of entire scans to be excluded.

maf.filter

whether to filter results returned using

MAF*(1-MAF) > 75/(2*n)

where MAF = minor allele frequency and n = number of events

GxE

name of the covariate to use for E if genotype-by-environment (i.e. SNP:E) model is to be analyzed, in addition to the main effects (E can be a covariate interaction)

strata.vars

vector of names of variables to stratify on for a stratified analysis (use NULL if no stratified analysis needed)

chromosome.set

integer vector with chromosome(s) to be analyzed. Use 23, 24, 25, 26, 27 for X, XY, Y, M, Unmapped respectively.

block.size

number of SNPs from a given chromosome to read in one block from genoData

verbose

Logical value specifying whether to show progress information.

outfile

a character string to prepend to ".chr.i_k.RData" when naming the output data frames, where i is the first chromosome and k is the last chromosome used in that call to the function. "chr.i_k." is omitted if chromosome.set=NULL.
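The naming rule for outfile can be sketched in a few lines of base R. This is only an illustration of the convention described above: output_file_name is a hypothetical helper, not a GWASTools function, and it assumes that "first" and "last" mean the first and last elements of chromosome.set.

```r
# Hypothetical helper (not part of GWASTools): builds the output file name
# following the convention described for the outfile argument.
output_file_name <- function(outfile, chromosome.set = NULL) {
  if (is.null(chromosome.set)) {
    # "chr.i_k." is omitted when chromosome.set is NULL
    return(paste0(outfile, ".RData"))
  }
  i <- chromosome.set[1]                       # first chromosome in the call
  k <- chromosome.set[length(chromosome.set)]  # last chromosome in the call
  paste0(outfile, ".chr.", i, "_", k, ".RData")
}

output_file_name("cph", chromosome.set = 21:22)  # "cph.chr.21_22.RData"
output_file_name("cph")                          # "cph.RData"
```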

Value

Columns for the main effects model (GxE = NULL):

index: SNP index
snpID: unique integer ID for SNP
chr: chromosome
maf: minor allele frequency calculated as appropriate for autosomal loci
mafx: minor allele frequency calculated as appropriate for X-linked loci
beta: regression coefficient returned by the coxph function
se: standard error of the regression coefficient returned by the coxph function
z: z statistic returned by the coxph function
pval: p-value for the z-statistic returned by the coxph function
warned: TRUE if a warning was issued
n.events: number of events in complete cases for the given SNP

Columns for the genotype-by-environment model (GxE specified):

index: SNP index
snpID: unique integer ID for SNP
chr: chromosome
maf: minor allele frequency calculated as appropriate for autosomal loci
mafx: minor allele frequency calculated as appropriate for X-linked loci
warned: TRUE if a warning was issued
n.events: number of events in complete cases for the given SNP
ge.lrtest: Likelihood ratio test statistic for the GxE interaction
ge.pval: p-value for the likelihood ratio test statistic
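To make the statistical columns concrete, the following sketch shows how quantities like beta, se, z, pval, n.events, ge.lrtest and ge.pval can be obtained from coxph for a single SNP. It uses simulated data and the survival package directly. The data frame, the variable names and the choice of age as E are assumptions for illustration, and this is not a reproduction of the internal code of assocTestCPH.

```r
library(survival)  # recommended package distributed with R

set.seed(1)
n <- 200
dat <- data.frame(
  ttoe     = rexp(n, rate = 0.01),                 # simulated time to event
  event    = rbinom(n, 1, 0.6),                    # 1 = event, 0 = censored
  age      = rnorm(n, mean = 40, sd = 10),
  sex      = factor(sample(c("M", "F"), n, replace = TRUE)),
  genotype = rbinom(n, 2, 0.3)                     # allele count for one SNP
)

# Main effects model: take the genotype row of the coefficient table
fit <- coxph(Surv(ttoe, event) ~ genotype + sex + age, data = dat)
coefs <- summary(fit)$coefficients
main <- data.frame(
  beta     = coefs["genotype", "coef"],
  se       = coefs["genotype", "se(coef)"],
  z        = coefs["genotype", "z"],
  pval     = coefs["genotype", "Pr(>|z|)"],
  n.events = fit$nevent
)

# GxE model with E = age: likelihood ratio test of the genotype:age term,
# comparing the final log-likelihoods of the nested models
fit.gxe <- coxph(Surv(ttoe, event) ~ genotype + sex + age + genotype:age,
                 data = dat)
ge.lrtest <- 2 * (fit.gxe$loglik[2] - fit$loglik[2])
ge.pval <- pchisq(ge.lrtest, df = 1, lower.tail = FALSE)
```

The interaction term adds one parameter here, so the test uses one degree of freedom. A factor-valued E would add one parameter per non-reference level.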

Details

This function performs Cox proportional hazards regression of a survival object (using the Surv function) on SNP genotype and other covariates. It uses the coxph function from the R survival package.

Individual samples can be included or excluded from the analysis using the scan.exclude parameter. Individual chromosomes can be included or excluded by specifying the indices of the chromosomes to be included in the chromosome.set parameter. Specific chromosomes for specific samples can be included or excluded using the scan.chromosome.filter parameter.

Both scan.chromosome.filter and scan.exclude may be used together. If a scan is excluded in EITHER, then it will be excluded from the analysis, but it does NOT need to be excluded in both. This design allows for easy filtering of anomalous scan-chromosome pairs using the scan.chromosome.filter matrix, but still allows easy exclusion of a specific group of scans (e.g. males or Caucasians) using scan.exclude.
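The way scan.chromosome.filter and scan.exclude combine can be illustrated with a toy example in base R. The scan IDs, the three-chromosome matrix and the scans_used helper are all made up for illustration. The sketch only mirrors the "excluded in either" rule described above.

```r
# Toy data: 4 scans and chromosomes 1-3 (all values are illustrative)
scanID <- c(101L, 102L, 103L, 104L)
scan.chromosome.filter <- matrix(TRUE, nrow = length(scanID), ncol = 3)
scan.chromosome.filter[2, 3] <- FALSE  # anomalous pair: scan 102, chromosome 3
scan.exclude <- 104L                   # drop scan 104 from every chromosome

# Hypothetical helper: a scan is analyzed for a chromosome only if the filter
# matrix keeps that pair AND the scan is not listed in scan.exclude
scans_used <- function(chromosome) {
  keep <- scan.chromosome.filter[, chromosome] & !(scanID %in% scan.exclude)
  scanID[keep]
}

scans_used(1)  # 101 102 103
scans_used(3)  # 101 103
```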

The argument maf.filter indicates whether to filter results returned using 2 * MAF * (1-MAF) * n > 75 where MAF = minor allele frequency and n = number of events. This filter was suggested by Ken Rice and Thomas Lumley, who found that without this requirement, at threshold levels of significance for genome-wide studies, Cox regression p-values based on standard asymptotic approximations can be notably anti-conservative.
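As a worked example of the filter, the condition can be written as a one-line function. passes_maf_filter is a hypothetical name used only for this sketch.

```r
# Hypothetical helper: TRUE when 2 * MAF * (1 - MAF) * n > 75,
# where n is the number of events
passes_maf_filter <- function(maf, n.events) {
  2 * maf * (1 - maf) * n.events > 75
}

passes_maf_filter(maf = 0.05, n.events = 500)  # 2*0.05*0.95*500 = 47.5, FALSE
passes_maf_filter(maf = 0.10, n.events = 500)  # 2*0.10*0.90*500 = 90,   TRUE
```

With 500 events, a SNP with MAF of 5% would be filtered out, while one with MAF of 10% would be kept.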

Examples

# an example of a scan chromosome matrix
# designed to eliminate duplicated individuals
# and scans with missing values of sex
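library(GWASTools)  # provides assocTestCPH, GdsGenotypeReader, GenotypeData
library(GWASdata)   # provides the example data sets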
data(illuminaScanADF)
scanAnnot <- illuminaScanADF
samp.chr.matrix <- matrix(TRUE,nrow(scanAnnot),26)
dup <- duplicated(scanAnnot$subjectID)
samp.chr.matrix[dup | is.na(scanAnnot$sex),] <- FALSE
samp.chr.matrix[scanAnnot$sex=="F", 25] <- FALSE

# additionally, exclude YRI subjects
scan.exclude <- scanAnnot$scanID[scanAnnot$race == "YRI"]

# create some variables for the scans
scanAnnot$age <- rnorm(nrow(scanAnnot),mean=40, sd=10)
scanAnnot$event <- rbinom(nrow(scanAnnot),1,0.4)
scanAnnot$ttoe <- rnorm(nrow(scanAnnot),mean=100,sd=10)

# create data object
gdsfile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gds <- GdsGenotypeReader(gdsfile)
genoData <-  GenotypeData(gds, scanAnnot=scanAnnot)

# variables
event <- "event"
time.to.event <- "ttoe"
covars <- c("sex", "age")
factor.covars <- "sex"

chr.set <- 21

res <- assocTestCPH(genoData,
  event=event, time.to.event=time.to.event,
  covars=covars, factor.covars=factor.covars,
  scan.chromosome.filter=samp.chr.matrix,
  scan.exclude=scan.exclude,
  chromosome.set=chr.set)

close(genoData)