
RVFam (version 1.1)

coxph.ped: single SNP analysis and gene-based tests for survival traits in family data using a Cox proportional hazards regression model

Description

Fits a Cox proportional hazards regression model with a shared frailty (random effect) in each pedigree. The function performs single SNP analysis, which tests the association between a survival phenotype and each genotyped SNP on a chromosome in a genotype file, and gene-based tests in family data. The association test is carried out by the coxph.EC function, and the likelihood ratio test (LRT) result is reported. In each test, the coxph function from the survival package is used.
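
As a rough illustration of the kind of model described here, the sketch below fits a Cox model with a family-level frailty term using the survival package and then shows the LRT arithmetic on a plain Cox fit. The simulated data, the column names (famid, snp, time, status), and the choice of a gamma frailty are assumptions made for this example; it is not the internal code of coxph.EC, and the package may compute its LRT differently when the frailty term is present.

```r
library(survival)

set.seed(1)
n_fam <- 50
fam_size <- 4
famid <- rep(seq_len(n_fam), each = fam_size)
# Simulated family effect shared by all members of a pedigree (assumption)
fam_effect <- rep(rnorm(n_fam, sd = 0.5), each = fam_size)
snp <- rbinom(n_fam * fam_size, size = 2, prob = 0.2)  # genotype coded 0/1/2

event_time <- rexp(length(snp), rate = exp(0.4 * snp + fam_effect))
cens_time  <- rexp(length(snp), rate = 0.3)
d <- data.frame(famid, snp,
                time   = pmin(event_time, cens_time),
                status = as.integer(event_time <= cens_time))

# Cox model with a frailty term shared within each family
fit_frailty  <- coxph(Surv(time, status) ~ snp + frailty(famid), data = d)
beta_frailty <- coef(fit_frailty)["snp"]

# LRT arithmetic, shown on a plain Cox fit for simplicity:
# fit$loglik holds the log partial likelihood at beta = 0 and at the estimate
fit <- coxph(Surv(time, status) ~ snp, data = d)
lrt <- 2 * (fit$loglik[2] - fit$loglik[1])
z   <- sign(coef(fit)["snp"]) * sqrt(lrt)       # signed LRT statistic
p   <- pchisq(lrt, df = 1, lower.tail = FALSE)  # 1-df LRT p-value
```

Here z squared equals the LRT statistic, which matches the description of Z as a signed likelihood ratio statistic in the Value section.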

Usage

coxph.ped(phenfile, phen, covars=NULL, mafRange=c(0,0.05), chr, genfile,
          pedfile, snpinfoRdata, sep.ped=",", sep.phe=",", sep.gen=" ", time,
          aggregateBy="SKATgene", maf.file, snp.cor, ssq.beta.wts=c(1,25),
          singleSNP.outfile=F)

Arguments

genfile
a character string naming the genotype file for reading
phenfile
a character string naming the phenotype file for reading
pedfile
a character string naming the pedigree file for reading
phen
a character string for the phenotype name of a survival trait of interest in test.dat
covars
a character vector for covariates in phenfile
sep.ped
the field separator character for pedigree file
sep.phe
the field separator character for phenotype file
sep.gen
the field separator character for genotype file
time
a character string naming the survival time variable
mafRange
range of MAF to include SNPs for gene-based burden tests, default is c(0,0.05)
chr
chromosome number, one of 1, 2, ..., 22, or 'X'
snpinfoRdata
a character string naming the RData containing SNP info to be loaded; it should include at least the 'Name' (for SNP name), 'Chr', and aggregateBy (default='SKATgene') columns
aggregateBy
the column of SNP info on which single SNPs are to be aggregated for burden tests, default is 'SKATgene'
maf.file
a character string naming the comma delimited file containing 'Name' for SNP name and 'maf' for MAF
snp.cor
a character string naming the RData containing lists of SNP correlation matrix within each 'SKATgene'
ssq.beta.wts
a vector of parameters of beta weights used in proposed sum of squares test, default=c(1,25) as in SKAT
singleSNP.outfile
a logical value, TRUE indicating single SNP analysis has been done and result files are available for computing SSQ using a different mafRange
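
To make the mafRange and ssq.beta.wts arguments more concrete, the following sketch selects SNPs by MAF and computes beta-density weights in the style of SKAT, where the weight is dbeta(maf, 1, 25). The SNP names and MAF values are made up, the treatment of the range boundaries is an assumption, and whether coxph.ped applies the weights in exactly this form is not stated in this page.

```r
# Hypothetical MAFs for four SNPs in one gene
maf <- c(rs1 = 0.002, rs2 = 0.01, rs3 = 0.04, rs4 = 0.20)

mafRange     <- c(0, 0.05)  # default range for gene-based tests
ssq.beta.wts <- c(1, 25)    # default beta parameters, as in SKAT

# Keep SNPs whose MAF falls in the range
# (assumption: lower bound exclusive, upper bound inclusive)
keep <- maf > mafRange[1] & maf <= mafRange[2]

# Beta-density weights: rarer variants receive larger weights
wts <- setNames(dbeta(maf[keep], ssq.beta.wts[1], ssq.beta.wts[2]),
                names(maf)[keep])
print(round(wts, 2))
```

With the default parameters the weight reduces to 25 * (1 - maf)^24, so it decreases as the MAF grows, and the common SNP rs4 is excluded by mafRange before any weighting.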

Value

No value is returned. Instead, tab delimited result files and an RData are generated. A single SNP result file, named with phen and singleSNP, contains columns: gene, Name, maf, ntotal, nmiss, maf_ntotal, beta, se, Z, remark, p (p-value from LRT), MAC, n0, n1, and n2. A burden test result file, named with phen and T/MB for Li & Leal 2008/Madsen & Browning 2009 respectively, contains columns: gene, beta, se, Z, cmafTotal, cmafUsed, nsnpsTotal, nsnpsUsed, nmiss, remark, and p. An SSQ test result file, named with phen and SSQ, contains columns: gene, SSQ, cmafTotal, cmafUsed, nsnpsTotal, nsnpsUsed, nmiss, df, and p. The generated RData is a list that contains scores, cov, n, maf and sey for each gene, with the gene names being the names of the list. Note that maf in the RData is the MAF based on ntotal.
gene
gene name
Name
SNP name
maf
minor allele frequency based on genotyped sample
ntotal
number of individuals with genotype, phenotype and covariates
nmiss
number of individuals with missing genotype among ntotal
maf_ntotal
minor allele frequency based on ntotal
beta
regression coefficient of single SNP test or burden test
se
standard error of beta
Z
signed likelihood ratio statistic
remark
additional information of the analysis
p
p-value of single SNP test or burden test by LRT
cmafTotal
sum of maf_ntotal of SNPs in a gene
cmafUsed
sum of maf_ntotal of SNPs selected with mafRange in a gene for burden tests or SSQ test
nsnpsTotal
total number of SNPs in a gene
nsnpsUsed
number of SNPs selected and used in burden tests and SSQ test
SSQ
sum of squares statistic
df
degrees of freedom of SSQ
MAC
minor allele count
n0
the number of individuals with 0 copies of the coded allele
n1
the number of individuals with 1 copy of the coded allele
n2
the number of individuals with 2 copies of the coded allele
scores
beta/se^2 in output RData, where beta and se are vectors
cov
diag(1/se)*LD matrix*diag(1/se) in output RData
n
maximum ntotal in a gene in output RData
sey
1 in output RData
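
The definitions of scores and cov above can be sketched in a few lines of R. The beta, se, maf, and LD values below are invented for illustration, and the layout of the per-gene list entry is an assumption based only on the element names listed in this section.

```r
# Hypothetical single SNP results for three SNPs in one gene
beta <- c(rs1 = 0.30, rs2 = -0.10, rs3 = 0.05)
se   <- c(rs1 = 0.15, rs2 = 0.20, rs3 = 0.10)

# Hypothetical SNP correlation (LD) matrix for the same SNPs
ld <- matrix(c(1.0, 0.2, 0.0,
               0.2, 1.0, 0.5,
               0.0, 0.5, 1.0),
             nrow = 3, dimnames = list(names(beta), names(beta)))

# scores = beta / se^2
scores <- beta / se^2

# cov = diag(1/se) %*% LD %*% diag(1/se)
cov_mat <- diag(1 / se) %*% ld %*% diag(1 / se)
dimnames(cov_mat) <- dimnames(ld)

# One per-gene entry with the element names listed above (layout assumed)
gene_entry <- list(scores = scores, cov = cov_mat, n = 200,
                   maf = c(rs1 = 0.002, rs2 = 0.01, rs3 = 0.04), sey = 1)
str(gene_entry)
```

Because the LD matrix has ones on its diagonal, the diagonal of cov is 1/se^2, so each SNP's score divided by its cov diagonal recovers the single SNP beta.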

Details

The coxph.ped function reads in and merges the phenotype, genotype, and pedigree files to perform single SNP analysis, two burden tests (weight=1 for Li & Leal 2008; weight=1/(MAF)/(1-MAF) for Madsen & Browning 2009), and one sum of squares (SSQ) test (Wei 2009). All tests use a Cox proportional hazards regression model with a shared frailty (random effect) in each family, as implemented in the coxph function of the survival R package. The function also outputs an RData that is computed from the single SNP results and is compatible with the seqMeta R package for conducting meta-analysis. For the burden tests and the SSQ test, SNP genotypes/results are aggregated by aggregateBy (default = "SKATgene", that is, within each gene), using the SNPs selected according to the user-specified mafRange.

genfile contains a unique numerical individual id and the genotype data on a chromosome, with the column names being "id" and the SNP names. For each SNP, the genotype data should be coded as 0, 1, 2, indicating the number of coded alleles. SNP names in the genotype file should not contain a dash ('-') or other special characters (dots and underscores are OK). phenfile contains a unique individual id, phenotype, and covariate data, with the column names being "id" and the phenotype and covariate names. pedfile contains pedigree information, with the column names being "famid","id","fa","mo","sex". The LRT is used in all genetic association tests.
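
The input layout and the two burden weights described above can be sketched with toy data frames standing in for genfile, phenfile, and pedfile. All values are invented; the founder code 0 for fa/mo, the sex coding, and the wide mafRange used for this tiny sample are assumptions, and the merging and scoring steps are an illustration rather than the function's actual code.

```r
# Toy stand-ins for the three input files (values are made up)
gen <- data.frame(id = 1:6,
                  rs1 = c(0, 1, 0, 0, 0, 0),
                  rs2 = c(0, 0, 1, 0, 2, 0),
                  rs3 = c(1, 1, 0, 2, 1, 0))
phe <- data.frame(id = 1:6,
                  trait1 = c(1, 0, 1, 0, 1, 0),
                  survival_time = c(5.1, 8.0, 3.2, 9.5, 4.4, 7.7))
ped <- data.frame(famid = c(1, 1, 1, 2, 2, 2), id = 1:6,
                  fa = c(0, 0, 1, 0, 0, 4),   # 0 = founder (assumption)
                  mo = c(0, 0, 2, 0, 0, 5),
                  sex = c(1, 2, 1, 1, 2, 2))  # coding is an assumption

# Merge pedigree, phenotype, and genotype data on the shared "id" column
dat <- Reduce(function(x, y) merge(x, y, by = "id"), list(ped, phe, gen))

snps <- c("rs1", "rs2", "rs3")
G    <- as.matrix(dat[, snps])
maf  <- colMeans(G) / 2  # allele frequency of the coded allele

# Genotype counts for one SNP: n0, n1, n2
counts <- table(factor(G[, "rs2"], levels = 0:2))

# Select SNPs by MAF (wide range only because the toy sample is tiny)
mafRange <- c(0, 0.3)
keep <- maf >= mafRange[1] & maf <= mafRange[2]

# Burden scores: weight 1 (T) and weight 1/(MAF*(1-MAF)) (MB, as stated above)
w_MB      <- 1 / (maf * (1 - maf))
burden_T  <- rowSums(G[, keep, drop = FALSE])
burden_MB <- as.vector(G[, keep, drop = FALSE] %*% w_MB[keep])
```

Each burden score is then a single per-individual covariate that can take the place of a SNP genotype in the regression model.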

References

Therneau T (2014). A Package for Survival Analysis in S. R package version 2.37-7, http://CRAN.R-project.org/package=survival.

Terry M. Therneau and Patricia M. Grambsch (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.

Li, B. and Leal, S. M. (2008). Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data. Am J Hum Genet, 83(3), 311-321.

Madsen, B. E. and Browning, S. R. (2009). A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic. PLoS Genet, 5(2), e1000384.

Wei P (2009). Asymptotic Tests of Association with Multiple SNPs in Linkage Disequilibrium. Genet Epidemiol, 33(6), 497-507.

Examples

## Not run: 
# coxph.ped(genfile="EC_chr1.txt",phenfile="trait1.csv",pedfile="ped.csv",
# phen="trait1",covars=NULL,sep.ped=",",sep.phe=",",sep.gen=" ",
# mafRange=c(0,0.01),chr=1,snpinfoRdata="SNPinfo_EC.RData",
# aggregateBy="SKATgene",time="survival_time",maf.file="EC_MAF.csv",
# snp.cor="EC_SNPcor.RData")
## End(Not run)
