GWASTools (version 1.12.2)

assocTestCPH: Cox proportional hazards

Description

Fits a Cox proportional hazards model.

Usage

assocTestCPH(genoData, event, time.to.event, covars,
             factor.covars = NULL,
             scan.chromosome.filter = NULL,
             scan.exclude = NULL,
             maf.filter = FALSE,
             GxE = NULL,
             strata.vars = NULL,
             chromosome.set = NULL,
             block.size = 5000,
             verbose = TRUE,
             outfile = NULL)

Arguments

genoData
GenotypeData object, should contain sex and phenotypes in scan annotation. Chromosomes are expected to be in contiguous blocks.
event
name of scan variable in genoData for event to analyze
time.to.event
name of scan variable in genoData for time to event
covars
vector of covariate terms for model (can include interactions as 'a:b', main effects correspond to scan variable names in genoData)
factor.covars
vector of names of covariates to be converted to factor
scan.chromosome.filter
a logical matrix that can be used to exclude some chromosomes, some scans, or some specific scan-chromosome pairs. Entries should be TRUE if that scan-chromosome pair should be included in the analysis, FALSE if not. The number of rows must be equal to the number of scans in genoData, and the number of columns must be equal to the largest integer chromosome value in genoData. The column number must match the chromosome number. For example, a scan.chromosome.filter matrix used for an analysis when genoData has SNPs with chromosome=(1-24, 26, 27) (i.e. no Y (25) chromosome SNPs) must have 27 columns (all FALSE in the 25th column). But a scan.chromosome.filter matrix used for an analysis when genoData has SNPs with chromosome=(1-26) (i.e. no Unmapped (27) chromosome SNPs) must have only 26 columns.
scan.exclude
an integer vector containing the IDs of entire scans to be excluded.
maf.filter
whether to filter results returned using MAF*(1-MAF) > 75/(2*n) where MAF = minor allele frequency and n = number of events
GxE
name of the covariate to use for E if genotype-by-environment (i.e. SNP:E) model is to be analyzed, in addition to the main effects (E can be a covariate interaction)
strata.vars
vector of names of variables to stratify on for a stratified analysis (use NULL if no stratified analysis needed)
chromosome.set
integer vector with chromosome(s) to be analyzed. Use 23, 24, 25, 26, 27 for X, XY, Y, M, Unmapped respectively.
block.size
number of SNPs from a given chromosome to read in one block from genoData
verbose
Logical value specifying whether to show progress information.
outfile
a character string to append in front of ".chr.i_k.RData" for naming the output data-frames; where i is the first chromosome, and k is the last chromosome used in that call to the function. "chr.i_k." will be omitted if chromosome.set=NULL.
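
The column rule for scan.chromosome.filter can be sketched in base R. This is only an illustration of the dimensions described above, not code from the package; n.scans and snp.chromosomes are hypothetical stand-ins for values that would normally come from genoData.

```r
# Hypothetical inputs: 4 scans, SNPs on chromosomes 1-24, 26 and 27 (no Y = 25)
n.scans <- 4
snp.chromosomes <- c(1:24, 26, 27)

# One column per integer chromosome code, up to the largest code present
filt <- matrix(TRUE, nrow = n.scans, ncol = max(snp.chromosomes))

# Columns for chromosome codes that have no SNPs are set to FALSE
absent <- setdiff(seq_len(ncol(filt)), snp.chromosomes)
filt[, absent] <- FALSE

dim(filt)   # number of scans by largest chromosome code
absent      # chromosome codes with no SNPs
```

With these inputs the matrix has 27 columns and column 25 is entirely FALSE, matching the first case described for scan.chromosome.filter.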

Value

If outfile=NULL (default), all results are returned as a data.frame. If outfile is specified, no data is returned but the function saves a data.frame with the naming convention as described by the argument outfile. Columns for the main effects model are:
index
snp index
snpID
unique integer ID for SNP
chr
chromosome
maf
minor allele frequency calculated as appropriate for autosomal loci
mafx
minor allele frequency calculated as appropriate for X-linked loci
beta
regression coefficient returned by the coxph function
se
standard error of the regression coefficient returned by the coxph function
z
z statistic returned by the coxph function
pval
p-value for the z-statistic returned by the coxph function
warned
TRUE if a warning was issued
n.events
number of events in complete cases for the given SNP
If GxE is not NULL, another data.frame is returned with the results of the genotype-by-environment model. If outfile=NULL, the function returns a list with names (main, GxE); otherwise the GxE data.frame is saved as a separate output file. Columns are:
index
snp index
snpID
unique integer ID for SNP
chr
chromosome
maf
minor allele frequency calculated as appropriate for autosomal loci
mafx
minor allele frequency calculated as appropriate for X-linked loci
warned
TRUE if a warning was issued
n.events
number of events in complete cases for the given SNP
ge.lrtest
Likelihood ratio test statistic for the GxE interaction
ge.pval
p-value for the likelihood ratio test statistic
Warnings: If outfile is not NULL, another file will be saved with the name "outfile.chr.i_k.warnings.RData" that contains any warnings generated by the function.
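
Because the return value is either a single data.frame or a list with names (main, GxE), downstream code may need to handle both shapes. The following is a minimal sketch using mock data: the column names follow the tables above, but every value is invented, and the filtering step is just one plausible way to use the warned column.

```r
# Mock results with a subset of the documented columns (values are invented)
main <- data.frame(snpID  = 1:4,
                   chr    = 21L,
                   pval   = c(0.2, 1e-9, 0.03, NA),
                   warned = c(FALSE, FALSE, TRUE, FALSE))
gxe  <- data.frame(snpID = 1:4, ge.pval = c(0.5, 0.01, NA, 0.8))
res  <- list(main = main, GxE = gxe)   # shape described for GxE != NULL

# Accept either return shape: a bare data.frame or a list with element "main"
main.res <- if (is.data.frame(res)) res else res$main

# Keep SNPs that were fitted without a warning and have a p-value
clean <- main.res[!main.res$warned & !is.na(main.res$pval), ]
clean <- clean[order(clean$pval), c("snpID", "pval")]
clean
```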

Details

This function performs Cox proportional hazards regression of a survival object (created with the Surv function) on SNP genotype and other covariates, using the coxph function from the R survival package. Individual samples can be excluded from the analysis using the scan.exclude parameter. Chromosomes can be selected by listing them in the chromosome.set parameter. Specific chromosomes for specific samples can be included or excluded using the scan.chromosome.filter parameter. scan.chromosome.filter and scan.exclude may be used together: a scan excluded by EITHER is excluded from the analysis; it does NOT need to be excluded by both. This design allows easy filtering of anomalous scan-chromosome pairs with the scan.chromosome.filter matrix while still allowing easy exclusion of a specific group of scans (e.g. males or Caucasians) with scan.exclude.
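
For orientation, a single-SNP fit of the kind described above might look like the following sketch, which calls the survival package directly on simulated data. The data frame, its column names, and the coding of genotype as an allele count (0/1/2) are assumptions made for illustration; they are not taken from the assocTestCPH source.

```r
library(survival)

set.seed(1)
n <- 200
# Simulated stand-ins for scan annotation variables and one SNP's genotypes
dat <- data.frame(
  ttoe     = rexp(n, rate = 0.01),                        # time to event
  event    = rbinom(n, 1, 0.4),                           # 1 = event, 0 = censored
  age      = rnorm(n, mean = 40, sd = 10),
  sex      = factor(sample(c("M", "F"), n, replace = TRUE)),
  genotype = rbinom(n, 2, 0.3)                            # assumed allele count 0/1/2
)

# Cox regression of the survival object on genotype plus covariates
fit <- coxph(Surv(ttoe, event) ~ genotype + sex + age, data = dat)

# Coefficient table row for the genotype term
snp.row <- summary(fit)$coefficients["genotype", ]
snp.row[c("coef", "se(coef)", "z", "Pr(>|z|)")]
```

The four quantities extracted at the end correspond to the beta, se, z and pval columns listed under Value.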

The argument maf.filter indicates whether to filter results returned using 2 * MAF * (1-MAF) * n > 75 where MAF = minor allele frequency and n = number of events. This filter was suggested by Ken Rice and Thomas Lumley, who found that without this requirement, at threshold levels of significance for genome-wide studies, Cox regression p-values based on standard asymptotic approximations can be notably anti-conservative.
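
The filter condition is simple arithmetic and can be checked by hand. The sketch below applies it to a few made-up allele frequencies with a made-up event count; maf and n.events are illustrative values only.

```r
# Made-up minor allele frequencies and number of events
maf      <- c(0.01, 0.05, 0.20, 0.50)
n.events <- 300

# Left-hand side of the condition 2 * MAF * (1 - MAF) * n > 75
effective <- 2 * maf * (1 - maf) * n.events
keep <- effective > 75

data.frame(maf, effective, keep)
```

With 300 events, the SNPs at MAF 0.01 and 0.05 fall below the threshold (5.94 and 28.5) while those at 0.20 and 0.50 pass (96 and 150). The same result follows from the equivalent form MAF*(1-MAF) > 75/(2*n) given for the maf.filter argument.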

See Also

GenotypeData, coxph

Examples

# an example of a scan chromosome matrix
# designed to eliminate duplicated individuals
# and scans with missing values of sex
library(GWASdata)
data(illuminaScanADF)
scanAnnot <- illuminaScanADF
samp.chr.matrix <- matrix(TRUE,nrow(scanAnnot),26)
dup <- duplicated(scanAnnot$subjectID)
samp.chr.matrix[dup | is.na(scanAnnot$sex),] <- FALSE
samp.chr.matrix[scanAnnot$sex=="F", 25] <- FALSE

# additionally, exclude YRI subjects
scan.exclude <- scanAnnot$scanID[scanAnnot$race == "YRI"]

# create some variables for the scans
scanAnnot$age <- rnorm(nrow(scanAnnot),mean=40, sd=10)
scanAnnot$event <- rbinom(nrow(scanAnnot),1,0.4)
scanAnnot$ttoe <- rnorm(nrow(scanAnnot),mean=100,sd=10)

# create data object
gdsfile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gds <- GdsGenotypeReader(gdsfile)
genoData <-  GenotypeData(gds, scanAnnot=scanAnnot)

# variables
event <- "event"
time.to.event <- "ttoe"
covars <- c("sex", "age")
factor.covars <- "sex"

chr.set <- 21

# run the test on chromosome 21, using the variables defined above
res <- assocTestCPH(genoData,
  event=event, time.to.event=time.to.event,
  covars=covars, factor.covars=factor.covars,
  scan.chromosome.filter=samp.chr.matrix,
  scan.exclude=scan.exclude,
  chromosome.set=chr.set)

close(genoData)
