lme.ped: function for single SNP analysis and gene-based tests for continuous traits in family data using a linear mixed effects model

Description

Fits a linear mixed effects (LME) model for single SNP analysis, testing the association between a continuous phenotype and each genotyped SNP on a chromosome in a genotype file, and for gene-based tests in family data. The association tests are carried out by the lme.EC function. Each test uses the lmekin function from the coxme package.

Usage

lme.ped(phenfile,genfile,pedfile,phen,covars=NULL,mafRange=c(0,0.05),chr,
snpinfoRdata,sep.ped=",",sep.phe=",",sep.gen=" ",aggregateBy="SKATgene",
maf.file,snp.cor,ssq.beta.wts=c(1,25),singleSNP.outfile=F)

Arguments

phenfile

a character string naming the phenotype file for reading

genfile

a character string naming the genotype file for reading

pedfile

a character string naming the pedigree file for reading

phen

a character string for the phenotype name of a continuous trait of interest in phenfile

covars

a character vector for covariates in phenfile

mafRange

range of MAF to include SNPs for gene-based burden tests, default is c(0,0.05)

chr

chromosome number, one of 1,2,...,22, or 'X'

snpinfoRdata

a character string naming the RData file containing the SNP info to be loaded; it should include at least the 'Name' (for SNP name), 'Chr', and aggregateBy (default='SKATgene') columns

sep.ped

the field separator character for pedigree file

sep.phe

the field separator character for phenotype file

sep.gen

the field separator character for genotype file

aggregateBy

the column of SNP info on which single SNPs are to be aggregated for burden tests, default is 'SKATgene'

maf.file

a character string naming the comma delimited file containing 'Name' for SNP name and 'maf' for MAF

snp.cor

a character string naming the RData file containing a list of SNP correlation matrices, one within each 'SKATgene'

ssq.beta.wts

a vector of parameters of beta weights used in proposed sum of squares test, default=c(1,25) as in SKAT

singleSNP.outfile

a logical value, TRUE indicating single SNP analysis has been done and result files are available for computing SSQ using a different mafRange
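
As a rough illustration of what ssq.beta.wts=c(1,25) means, the sketch below computes beta-density weights from MAF, the form SKAT uses for its default weights. How lme.ped applies these weights internally is not documented here, so the MAF values and variable names are assumptions made for illustration only.

```r
# Hypothetical MAFs for four SNPs in one gene (illustrative values only)
maf <- c(0.001, 0.01, 0.05, 0.2)

# Beta density weights with shape parameters c(1, 25), as in SKAT's default
ssq.beta.wts <- c(1, 25)
w <- dbeta(maf, ssq.beta.wts[1], ssq.beta.wts[2])

# Rarer variants receive larger weights; common variants are down-weighted
print(round(w, 3))
```

With shape parameters c(1, 25) the weight decreases steadily as MAF increases, so rare variants dominate the weighted statistic.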

Value

gene: gene name
Name: SNP name
maf: minor allele frequency based on genotyped sample
ntotal: number of individuals with genotype, phenotype and covariates
nmiss: number of individuals with missing genotype among ntotal
maf_ntotal: minor allele frequency based on ntotal
beta: regression coefficient of single SNP test or burden test
se: standard error of beta
Z: Wald Z statistic
remark: additional information of the analysis
p: p-value of single SNP test or burden test
cmafTotal: sum of maf_ntotal of SNPs in a gene
cmafUsed: sum of maf_ntotal of SNPs selected with mafRange in a gene for burden tests or SSQ test
nsnpsTotal: total number of SNPs in a gene
nsnpsUsed: number of SNPs selected and used in burden tests and SSQ test
SSQ: sum of squares statistic
df: degrees of freedom of SSQ
MAC: minor allele count
n0: the number of individuals with 0 copies of the coded allele
n1: the number of individuals with 1 copy of the coded allele
n2: the number of individuals with 2 copies of the coded allele
scores: beta/se^2 in output RData, where beta and se are vectors
cov: diag(1/se)*LD matrix*diag(1/se) in output RData
n: maximum ntotal in a gene in output RData
sey: residual standard error in output RData
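
The relationships among beta, se, Z, p, scores and cov listed above can be sketched in a few lines of base R. The numbers and the LD matrix are made up, and the object names (ld, covmat) are hypothetical. This is not the package's internal code, only a sketch of the formulas as stated.

```r
# Hypothetical single SNP results for three SNPs in one gene
beta <- c(0.20, -0.10, 0.05)
se   <- c(0.10,  0.08, 0.12)

# Hypothetical SNP correlation (LD) matrix for the same three SNPs
ld <- matrix(c(1.0, 0.3, 0.1,
               0.3, 1.0, 0.2,
               0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)

# Wald Z statistic and the p-value from a 1-df chi-square test on Z^2
Z <- beta / se
p <- pchisq(Z^2, df = 1, lower.tail = FALSE)

# Quantities described for the output RData
scores <- beta / se^2
covmat <- diag(1 / se) %*% ld %*% diag(1 / se)

print(data.frame(beta, se, Z, p, scores))
print(covmat)
```

Because the LD matrix has ones on its diagonal, the diagonal of covmat equals 1/se^2.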

Details

The lme.ped function reads in and merges the phenotype, genotype, and pedigree files, and creates a relationship coefficient matrix from pedfile using the kinship2 package. It then uses an LME model, as implemented in the lmekin function of the coxme R package, to perform single SNP analysis, two burden tests (weight=1 for Li & Leal 2008; weight=1/(MAF)/(1-MAF) for Madsen & Browning 2009), and one sum of squares (SSQ) test (Wei 2009). It also outputs an RData file that is computed from the single SNP results and is compatible with seqMeta for conducting meta-analysis. For the burden tests and the SSQ test, SNP genotypes/results are aggregated by aggregateBy (default = "SKATgene", that is, within each gene), using the SNPs selected according to the user-specified mafRange.

genfile contains a unique numerical individual id and genotype data on a chromosome, with the column names being "id" and the SNP names. For each SNP, the genotype data should be coded as 0, 1, 2, indicating the number of copies of the coded allele. The SNP names in the genotype file should not contain dashes ('-') or other special characters (dots and underscores are OK). phenfile contains a unique individual id, phenotype and covariate data, with the column names being "id" and the phenotype and covariate names. pedfile contains pedigree information, with the column names being "famid","id","fa","mo","sex".

A Wald chi-square test is used in all genetic association tests.
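
To make the expected file layouts concrete, here is a minimal sketch that writes toy versions of the three input files with base R and checks them against the rules described above. All values are invented. The covariate name "age", the coding of founders' parents as 0, and the sex coding 1/2 are assumptions rather than documented requirements.

```r
# Toy input files written to a temporary directory; every value below is made up.
tmp <- tempdir()
pedfile  <- file.path(tmp, "ped.csv")
phenfile <- file.path(tmp, "trait1.csv")
genfile  <- file.path(tmp, "EC_chr1.txt")

# Pedigree: one nuclear family. Coding founders' parents as 0 and sex as 1/2
# is an assumption, not something this page specifies.
ped <- data.frame(famid = 1, id = 1:4,
                  fa = c(0, 0, 1, 1), mo = c(0, 0, 2, 2),
                  sex = c(1, 2, 1, 2))
write.table(ped, pedfile, sep = ",", row.names = FALSE, quote = FALSE)

# Phenotype: "id", the trait, and a hypothetical covariate "age"
phe <- data.frame(id = 1:4, trait1 = c(1.2, 0.7, 1.9, 1.1),
                  age = c(52, 50, 25, 22))
write.table(phe, phenfile, sep = ",", row.names = FALSE, quote = FALSE)

# Genotype: "id" plus one column per SNP, coded 0/1/2, space-delimited
gen <- data.frame(id = 1:4, rs1 = c(0, 1, 1, 0), snp_2.a = c(2, 1, 1, 2))
write.table(gen, genfile, sep = " ", row.names = FALSE, quote = FALSE)

# Read the files back with the matching separators
ped_in <- read.table(pedfile, header = TRUE, sep = ",")
phe_in <- read.table(phenfile, header = TRUE, sep = ",")
gen_in <- read.table(genfile, header = TRUE, sep = " ")

# SNP names: letters, digits, dots and underscores only (no dashes)
snp_names <- setdiff(names(gen_in), "id")
stopifnot(all(grepl("^[A-Za-z0-9._]+$", snp_names)))

# Genotypes must be coded 0, 1 or 2
stopifnot(all(as.matrix(gen_in[snp_names]) %in% 0:2))

# Merge on "id" to mimic the combined analysis data set
merged <- merge(merge(ped_in, phe_in, by = "id"), gen_in, by = "id")
print(merged)
```

Checks like these catch malformed SNP names or genotype codes before the files are passed on as phenfile, genfile and pedfile.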

References

coxme package: mixed-effects Cox models, sparse matrices, and modeling data from large pedigrees. Beth Atkinson (atkinson@mayo.edu) for pedigree functions. Terry Therneau (therneau@mayo.edu) for all other functions. 2007. Ref Type: Computer Program. http://cran.r-project.org/web/packages/coxme/.

Abecasis, G. R., Cardon, L. R., Cookson, W. O., Sham, P. C., & Cherny, S. S. (2001). Association analysis in a variance components framework. Genet Epidemiol, 21 Suppl 1, S341-S346.

Li, B. and Leal, S. M. (2008). Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data. Am J Hum Genet, 83(3), 311-321.

Madsen, B. E. and Browning, S. R. (2009). A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic. PLoS Genet, 5(2), e1000384.

Wei, P. (2009). Asymptotic Tests of Association with Multiple SNPs in Linkage Disequilibrium. Genet Epidemiol, 33(6), 497-507.

Examples

## Not run: 
# lme.ped(genfile="EC_chr1.txt",phenfile="trait1.csv",pedfile="ped.csv",
# phen="trait1",covars=NULL,sep.ped=",",sep.phe=",",sep.gen=" ",mafRange=c(0,0.01),
# chr=1,snpinfoRdata="SNPinfo_EC.RData",aggregateBy="SKATgene",maf.file="EC_MAF.csv",
# snp.cor="EC_SNPcor.RData",ssq.beta.wts=c(1,25))
## End(Not run)