
GWAF (version 2.2)

geepack.quant.batch: function to test genetic associations between a continuous trait and a batch of genotyped SNPs in families using a Generalized Estimating Equation model

Description

Fits a Generalized Estimating Equation (GEE) model to test the association between a continuous phenotype and every genotyped SNP in a genotype file, using family data and a user-specified genetic model. Each pedigree is treated as a cluster, and an independence working correlation matrix is used in the robust variance estimator. The proportion of phenotypic variation explained by the tested SNP is not provided. The same trait-SNP association test is applied to each SNP in the genotype data and is carried out with the geese function from the geepack package.
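
The per-SNP test described above can be sketched directly with geese. This is a minimal illustration under stated assumptions, not the GWAF source code: the data are simulated, the column names (famid, y, snp) and the effect size are hypothetical, and the Wald statistic is computed by hand from the robust variance returned by geese. The block is skipped if geepack is not installed.

```r
# Sketch only: one SNP, additive coding, pedigree used as the GEE cluster.
# All data are simulated; column names are illustrative assumptions.
if (requireNamespace("geepack", quietly = TRUE)) {
  set.seed(1)
  n_fam <- 50                                   # hypothetical number of pedigrees
  dat <- data.frame(famid = rep(seq_len(n_fam), each = 4))
  dat$snp <- rbinom(nrow(dat), size = 2, prob = 0.3)   # 0/1/2 copies of coded allele
  fam_effect <- rnorm(n_fam)[dat$famid]         # shared within-family component
  dat$y <- 0.5 * dat$snp + fam_effect + rnorm(nrow(dat))

  # geese expects rows belonging to the same cluster to be contiguous
  dat <- dat[order(dat$famid), ]

  fit <- geepack::geese(y ~ snp, id = famid, data = dat,
                        family = gaussian, corstr = "independence")

  beta  <- unname(fit$beta[2])                  # coefficient of the SNP term
  se    <- sqrt(fit$vbeta[2, 2])                # robust (sandwich) standard error
  chisq <- (beta / se)^2                        # 1-df Wald statistic
  pval  <- pchisq(chisq, df = 1, lower.tail = FALSE)
  print(c(beta = beta, se = se, chisq = chisq, pval = pval))
}
```

With an independence working correlation the point estimate matches ordinary least squares; the clustering by pedigree only affects the standard error.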

Usage

geepack.quant.batch(phenfile,genfile,pedfile,phen,model="a",covars=NULL,outfile, col.names=T,sep.ped=",",sep.phe=",",sep.gen=",")

Arguments

genfile
a character string naming the genotype file for reading (see format requirements in Details)
phenfile
a character string naming the phenotype file for reading (see format requirements in Details)
pedfile
a character string naming the pedigree file for reading (see format requirements in Details)
outfile
a character string naming the result file for writing
phen
a character string for a phenotype name in phenfile
covars
a character vector for covariates in phenfile
model
a single character, one of 'a', 'd', 'g' or 'r', where 'a'=additive, 'd'=dominant, 'g'=general and 'r'=recessive model
col.names
a logical value indicating whether the output file should contain column names
sep.ped
the field separator character for pedigree file
sep.phe
the field separator character for phenotype file
sep.gen
the field separator character for genotype file
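
To make the four values of model concrete, the sketch below recodes 0/1/2 genotype counts under the conventional additive, dominant, recessive and general codings. The helper code_genotype is hypothetical and not part of GWAF; whether the package applies exactly these codings internally is an assumption.

```r
# Hypothetical helper (not part of GWAF): recode 0/1/2 genotype counts
# under each of the four genetic models named by the `model` argument.
code_genotype <- function(g, model = c("a", "d", "r", "g")) {
  model <- match.arg(model)
  switch(model,
    a = g,                          # additive: number of coded alleles
    d = as.numeric(g >= 1),         # dominant: at least one coded allele
    r = as.numeric(g == 2),         # recessive: two coded alleles
    g = factor(g, levels = 0:2)     # general: one level per genotype
  )
}

g <- c(0, 1, 2, NA, 1)              # toy genotypes with one missing value
code_genotype(g, "a")
code_genotype(g, "d")
code_genotype(g, "r")
code_genotype(g, "g")
```

Missing genotypes stay missing under every coding, and the general model yields a three-level factor, which is why it produces several coefficients rather than a single beta.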

Value

No value is returned; instead, results are written to outfile. When the genetic model is 'a', 'd' or 'r', the result includes the columns phen through pval listed below. When the genetic model is 'g', beta and se are replaced with beta10, beta20, beta21, se10, se20 and se21.
phen
phenotype name
snp
SNP name
n0
the number of individuals with 0 copies of the coded allele
n1
the number of individuals with 1 copy of the coded allele
n2
the number of individuals with 2 copies of the coded allele
beta
regression coefficient of SNP covariate
se
standard error of beta
chisq
chi-square statistic for testing whether beta differs from zero
df
degrees of freedom of the chi-square statistic
model
model actually used in the analysis
pval
p-value of the chi-square statistic
beta10
regression coefficient of the genotype with 1 copy of the coded allele vs. that with 0 copies
beta20
regression coefficient of the genotype with 2 copies of the coded allele vs. that with 0 copies
beta21
regression coefficient of the genotype with 2 copies of the coded allele vs. that with 1 copy
se10
standard error of beta10
se20
standard error of beta20
se21
standard error of beta21
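
For the single-coefficient models, the chisq and pval columns are presumably tied to beta and se through the usual 1-df Wald test. The sketch below shows that relationship with made-up numbers; the values are hypothetical and the formula is the standard Wald construction, not code taken from GWAF.

```r
# Hypothetical values standing in for one row of the output file
beta <- 0.42
se   <- 0.15

chisq <- (beta / se)^2                               # Wald chi-square, df = 1
pval  <- pchisq(chisq, df = 1, lower.tail = FALSE)   # upper-tail p-value

# The equivalent two-sided normal test gives the same p-value
pval_z <- 2 * pnorm(-abs(beta / se))
all.equal(pval, pval_z)
```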

Details

For a continuous trait, the geepack.quant.batch function first reads in and merges the phenotype-covariate, genotype and pedigree files, then tests the association of phen with every SNP in genfile.

genfile contains unique individual ids and genotype data, with column names "id" followed by the SNP names. For each SNP, genotypes should be coded as 0, 1 or 2, indicating the number of coded alleles. SNP names in the genotype file should not contain dashes ('-') or other special characters (dots and underscores are OK). phenfile contains unique individual ids, phenotype and covariate data, with column names "id" followed by the phenotype and covariate names. pedfile contains pedigree information, with column names "famid", "id", "fa", "mo" and "sex". In all files, a missing value should be an empty space, except for missing parental ids in pedfile.

SNPs with low genotype counts (especially for the minor allele homozygote) may be omitted or analyzed with the dominant model. The geepack.quant.batch function fits the GEE model with the geese function from the geepack package, using each pedigree as a cluster.
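
The file layout described above can be illustrated with toy phenotype and genotype files. Everything in this sketch is invented for illustration: the file names, the covariate age, the SNP names and the values. It also assumes that "an empty space" means an empty field in the delimited file. The pedigree file is left out because the coding of missing parental ids is not stated here.

```r
# Toy phenotype and genotype files; all names and values are hypothetical.
tmp <- tempdir()
phenfile <- file.path(tmp, "toyphen.csv")
genfile  <- file.path(tmp, "toygen.csv")

phe <- data.frame(id    = 1:6,
                  SIMQT = c(1.2, NA, 0.7, -0.3, 2.1, 0.4),
                  age   = c(60, 58, 31, 65, 63, 35))
gen <- data.frame(id    = 1:6,
                  rs1   = c(0, 1, 1, 2, 0, 1),
                  rs2.a = c(1, NA, 0, 0, 1, 2))

# Missing values are written as empty fields (assumed reading of the format)
write.csv(phe, phenfile, row.names = FALSE, na = "", quote = FALSE)
write.csv(gen, genfile,  row.names = FALSE, na = "", quote = FALSE)

# Read the files back and merge them on the shared "id" column
phe_in <- read.csv(phenfile, na.strings = "")
gen_in <- read.csv(genfile,  na.strings = "")
merged <- merge(phe_in, gen_in, by = "id")

# Flag SNP names with characters other than letters, digits, '.' and '_'
snp_names <- setdiff(names(gen_in), "id")
bad <- snp_names[!grepl("^[A-Za-z0-9._]+$", snp_names)]
print(merged)
print(bad)
```

Checking SNP names before running the batch is a cheap way to catch names containing dashes, which the format description above disallows.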

References

Liang, K.Y. and Zeger, S.L. (1986) Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.

Zeger, S.L. and Liang, K.Y. (1986) Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42, 121-130.

Yan, J. and Fine, J. (2004) Estimating equations for association structures. Stat Med, 23, 859-874.

Examples

## Not run: 
geepack.quant.batch(phenfile="simphen.csv", genfile="simgen.csv", pedfile="simped.csv",
                    phen="SIMQT", model="a", outfile="simout.csv",
                    sep.ped=",", sep.phe=",", sep.gen=",")
## End(Not run)
