
GWAF (version 2.2)

geepack.lgst.int.batch: function to test gene-environment or gene-gene interactions between a dichotomous trait and a batch of genotyped SNPs in families using Generalized Estimation Equation model

Description

Fits a logistic regression model via generalized estimating equations (GEE) to test gene-environment or gene-gene interaction between a dichotomous phenotype and every genotyped SNP in a genotype file, using family data and an additive genetic model. The interaction term is the product of the SNP genotype and a covariate for interaction (cov.int), which can be another SNP genotype (gene-gene interaction) or an environmental factor (gene-environment interaction). Only one interaction term is allowed. When cov.int is dichotomous, stratified analyses can be requested by specifying sub="Y". When stratified analysis is not requested, the output includes the covariance between the main effect (SNP) and the interaction effect. Each pedigree is treated as a cluster, and an independence working correlation matrix is used in the robust variance estimator. The same interaction test is applied to all SNPs in the genotype file; each test is carried out by the geepack.lgst.int function from GWAF, which uses the geese function from package geepack.
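The model described above can be sketched for a single SNP as follows. This is an illustration only, not GWAF's internal code: the data are simulated, the column names (famid, snp, sex, age, CVD, snp_x_age) are made up for the example, and the geese call runs only when package geepack is installed. Rows are kept sorted by famid so that each pedigree forms one contiguous cluster.

```r
set.seed(42)

# Simulated family data: 150 pedigrees of 4 members each (illustrative only).
n_fam <- 150
fam_size <- 4
dat <- data.frame(famid = rep(seq_len(n_fam), each = fam_size))
n <- nrow(dat)
dat$snp <- rbinom(n, size = 2, prob = 0.3)   # additive coding: 0, 1, 2
dat$sex <- rbinom(n, size = 1, prob = 0.5)
dat$age <- round(rnorm(n, mean = 50, sd = 10))
dat$CVD <- rbinom(n, size = 1,
                  prob = plogis(-1 + 0.3 * dat$snp + 0.02 * (dat$age - 50)))

# The interaction term is the product of the SNP genotype and cov.int (age here).
dat$snp_x_age <- dat$snp * dat$age

if (requireNamespace("geepack", quietly = TRUE)) {
  # Logistic GEE, one cluster per pedigree, independence working correlation.
  fit <- geepack::geese(CVD ~ snp + sex + age + snp_x_age,
                        id = famid, data = dat,
                        family = binomial, corstr = "independence")
  # Estimates with robust (sandwich) standard errors and Wald p-values.
  print(summary(fit)$mean)
  # Robust covariance between the SNP main effect (2nd coefficient)
  # and the interaction effect (5th coefficient).
  print(fit$vbeta[2, 5])
}
```

In this sketch the rows of summary(fit)$mean for snp and snp_x_age play the roles of the SNP main effect and the interaction effect; the batch function repeats a test of this kind for each SNP in genfile.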

Usage

geepack.lgst.int.batch(genfile,phenfile,pedfile,outfile,phen,covars,cov.int,sub="N", col.names=T,sep.ped=",",sep.phe=",",sep.gen=",")

Arguments

genfile
a character string naming the genotype file for reading (see format requirement in details)
phenfile
a character string naming the phenotype file for reading (see format requirement in details)
pedfile
a character string naming the pedigree file for reading (see format requirement in details)
outfile
a character string naming the result file for writing
phen
a character string for a phenotype name in phenfile
covars
a character vector for covariates in phenfile
cov.int
a character string naming the covariate for interaction; this covariate must also be included in covars
sub
"N" (default) for no stratified analysis, and "Y" for requesting stratified analyses (only when cov.int is dichotomous)
col.names
a logical value indicating whether the output file should contain column names
sep.ped
the field separator character for pedigree file
sep.phe
the field separator character for phenotype file
sep.gen
the field separator character for genotype file

Value

No value is returned; results are written to outfile. If stratified analyses are requested (sub="Y"), the result file includes the columns listed below. Otherwise, the six stratified-analysis columns (beta_snp_cov0, se_snp_cov0, pval_snp_cov0, beta_snp_cov1, se_snp_cov1, and pval_snp_cov1) are replaced by a single column, cov_beta_snp_beta_int, which holds the covariance between beta_snp and beta_int.
phen
phenotype name
snp
SNP name
covar_int
the covariate for interaction
n
sample size used in analysis
AF
allele frequency of the coded allele
nd
the number of individuals in affected sample
AFd
allele frequency of the coded allele in affected sample
model
genetic model used in analysis, additive model only
beta_snp
regression coefficient of SNP covariate
se_snp
standard error of beta_snp
pval_snp
p-value of testing beta_snp not equal to zero
beta_snp_cov0
regression coefficient of SNP covariate in stratified analysis using the subset where cov.int level is 0
se_snp_cov0
standard error of beta_snp_cov0
pval_snp_cov0
p-value of testing beta_snp_cov0 not equal to zero
beta_snp_cov1
regression coefficient of SNP covariate in stratified analysis using the subset where cov.int level is 1
se_snp_cov1
standard error of beta_snp_cov1
pval_snp_cov1
p-value of testing beta_snp_cov1 not equal to zero
beta_int
regression coefficient of the interaction term
se_int
standard error of beta_int
pval_int
p-value of testing beta_int not equal to zero
remark
warning or additional information for the analysis: 'not converged' indicates the GEE analysis did not converge; 'logistic reg' indicates the GEE model is replaced by logistic regression; 'exp count<5' indicates any expected count is less than 5 in the phenotype-genotype table; 'not converged and exp count<5' and 'logistic reg & exp count<5' are noted similarly; 'collinearity' indicates collinearity exists between the SNP and some covariates
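One way the covariance column might be used, offered as an interpretation and not as something the GWAF documentation states: under a model with a SNP main effect and a SNP-by-covariate product term, the SNP log-odds ratio at covariate value `at` is beta_snp + at * beta_int, and its variance is se_snp^2 + at^2 * se_int^2 + 2 * at * cov_beta_snp_beta_int. The helper below, snp_effect_at, is hypothetical, and the numbers in the example row are invented for illustration.

```r
# Hypothetical helper (not part of GWAF): SNP effect at a chosen value of the
# interaction covariate, from the columns of a non-stratified result file.
snp_effect_at <- function(beta_snp, se_snp, beta_int, se_int, cov_snp_int, at) {
  est <- beta_snp + at * beta_int
  se <- sqrt(se_snp^2 + at^2 * se_int^2 + 2 * at * cov_snp_int)
  z <- est / se
  data.frame(at = at, beta = est, se = se, pval = 2 * pnorm(-abs(z)))
}

# Invented result row, shaped like one line of outfile when sub = "N".
res <- data.frame(snp = "rs1", beta_snp = 0.20, se_snp = 0.10,
                  beta_int = 0.05, se_int = 0.04,
                  cov_beta_snp_beta_int = -0.002)

out <- snp_effect_at(res$beta_snp, res$se_snp, res$beta_int, res$se_int,
                     res$cov_beta_snp_beta_int, at = 1)
print(out)
```

The p-value here is a two-sided Wald test against the normal distribution, which is an assumption of this sketch and may differ from how GWAF computes its own p-values.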

Details

The geepack.lgst.int.batch function first reads in and merges the phenotype-covariate, genotype and pedigree files, then tests gene-environment or gene-gene interaction for phen against all SNPs in genfile. Only one interaction term, and therefore only one covariate for interaction (cov.int), is allowed. When cov.int is dichotomous, stratified analyses can be requested by specifying sub="Y". The covariance between the main effect (SNP) and the interaction effect is provided in the output when stratified analysis is not requested.

genfile contains unique individual ids and genotype data, with the column names being "id" and the SNP names. For each genotyped SNP, the genotype data should be coded as 0, 1, 2, indicating the number of copies of the coded allele. The SNP names in the genotype file should not contain dashes ('-') or other special characters (dots and underscores are OK). phenfile contains unique individual ids, phenotype and covariate data, with the column names being "id" and the phenotype and covariate names. pedfile contains pedigree information, with the column names being "famid","id","fa","mo","sex". In all files, a missing value should be an empty space, except for a missing parental id in pedfile.

Only phenotypes with two categories are analyzed. A phenotype should be coded as 0 and 1, with 1 denoting affected and 0 unaffected. SNPs with low genotype counts (especially for the minor allele homozygote) may be omitted or analyzed with logistic regression.

The geepack.lgst.int.batch function fits the GEE model using each pedigree as a cluster, with the geepack.lgst.int function from the GWAF package and the geese function from the geepack package.
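The format requirements above can be checked before a batch run with a few lines of base R. The helpers below are hypothetical and not part of GWAF; the rule that SNP names may contain only letters, digits, dots and underscores is this sketch's reading of the "no special characters" requirement, and the example data are invented.

```r
# Hypothetical helpers (not part of GWAF). Each returns a character vector of
# problems found, or character(0) when the data frame passes the checks.
check_genotypes <- function(gen) {
  problems <- character(0)
  if (!"id" %in% names(gen)) {
    problems <- c(problems, "genotype data has no 'id' column")
  } else if (anyDuplicated(gen$id) > 0) {
    problems <- c(problems, "genotype ids are not unique")
  }
  snps <- setdiff(names(gen), "id")
  # Assumption: letters, digits, dots and underscores are the allowed characters.
  bad_names <- snps[!grepl("^[A-Za-z0-9._]+$", snps)]
  if (length(bad_names) > 0) {
    problems <- c(problems, paste("disallowed characters in SNP names:",
                                  paste(bad_names, collapse = ", ")))
  }
  # Genotypes must be 0, 1 or 2; missing values are allowed.
  coded_ok <- vapply(gen[snps], function(x) all(x %in% c(0, 1, 2, NA)),
                     logical(1))
  if (any(!coded_ok)) {
    problems <- c(problems, paste("genotypes not coded 0/1/2:",
                                  paste(snps[!coded_ok], collapse = ", ")))
  }
  problems
}

check_phenotype <- function(phe, phen) {
  if (!all(c("id", phen) %in% names(phe))) {
    return(paste0("phenotype data needs columns 'id' and '", phen, "'"))
  }
  # The phenotype must be 0 (unaffected) or 1 (affected); NA is allowed.
  if (!all(phe[[phen]] %in% c(0, 1, NA))) {
    return(paste(phen, "is not coded 0/1"))
  }
  character(0)
}

# Invented example data; check.names = FALSE keeps the dash in "rs-456".
gen <- data.frame(id = 1:4, rs123 = c(0, 1, 2, NA), `rs-456` = c(0, 1, 1, 3),
                  check.names = FALSE)
phe <- data.frame(id = 1:4, CVD = c(0, 1, 1, NA), age = c(40, 52, 61, 35))

print(check_genotypes(gen))
print(check_phenotype(phe, "CVD"))
```

When applying checks like these to real files, reading them with read.csv(..., check.names = FALSE) keeps R from silently rewriting column names that contain dashes, so the name check sees the names as they appear in the file.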

Examples

## Not run: 
# geepack.lgst.int.batch(phenfile="simphen.csv",genfile="simgen.csv",pedfile="simped.csv",
# phen="CVD",outfile="simout.csv",covars=c("sex","age"),cov.int="age",
# sep.ped=",",sep.phe=",",sep.gen=",")
# ## End(Not run)
