GWAF (version 2.2)

auto: function to generate scripts for genome-wide association/interaction analysis

Description

Given a path/directory (genopath) that holds genotype files, a phenotype file, a pedigree file, a phenotype of interest, covariates, an analysis of interest (one of 'lmepack', 'lmepack.imputed', 'lmeVpack.imputed', 'glmm', 'geepack', 'geepack.imputed', 'geepack.quant', 'geepack.quant.imputed', 'lmepack.int', 'lmepack.int.imputed', 'geepack.int', 'geepack.int.imputed', 'geepack.quant.int', 'geepack.quant.int.imputed') and other arguments, the auto function generates, for each genotype file, one R script and one shell script that executes the R script. It also generates a list file that executes all shell scripts in batch mode. Once the list file (XXXX.lst) is generated, the user can run ksh XXXX.lst to submit all jobs and test all SNPs in genopath.

Usage

auto(genopath, phenfile, pedfile, outfile, phen, covars, cov.int, sub="N", analysis, lib.loc, model = NULL, kinmat = NULL, col.names = F, sep.ped = ",", sep.phe = ",", sep.gen = ",")

Arguments

genopath
a character string indicating the path/directory that keeps genotype files to be analyzed
phenfile
a character string naming the phenotype file for reading (see format requirement in details)
pedfile
a character string naming the pedigree file for reading (see format requirement in details)
outfile
a character string naming the result file for writing
phen
a character string for a phenotype name in phenfile
covars
a character vector for covariates in phenfile
cov.int
a character string naming the covariate for interaction, the covariate has to be included in covars
sub
"N" (default) for no stratified analysis, and "Y" for requesting stratified analyses (only when cov.int is dichotomous)
analysis
a character string indicating the analysis of interest available in the GWAF package; the accepted values are those listed in the Description (for example 'lmepack', 'lmepack.imputed', 'geepack' or 'geepack.imputed')
lib.loc
a character string indicating the location of GWAF package
model
a single character of 'a','d','g', or 'r', with 'a'=additive, 'd'=dominant, 'g'=general and 'r'=recessive models; Not appropriate/needed for analyzing imputed SNPs
kinmat
a character string naming the file where kinship coefficient matrix is kept; needed for LME analyses
col.names
a logical value indicating whether the output file should contain column names
sep.ped
the field separator character for pedigree file
sep.phe
the field separator character for phenotype file
sep.gen
the field separator character for genotype file
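
Several of the constraints stated in the argument descriptions (cov.int must appear in covars, sub = "Y" only makes sense for a dichotomous cov.int, model is one of four letters) can be checked before any scripts are generated. The following is a minimal sketch of such a check. The helper check_auto_args and the toy data frame are hypothetical and are not part of GWAF; the sketch only mirrors the rules as written above and does not reflect how auto validates its input internally.

```r
# Hypothetical helper (not part of GWAF): collects violations of a few
# constraints stated in the argument descriptions of auto().
check_auto_args <- function(phen_data, covars, cov.int = NULL, sub = "N", model = NULL) {
  problems <- character(0)
  # cov.int has to be included in covars
  if (!is.null(cov.int) && !(cov.int %in% covars)) {
    problems <- c(problems, "cov.int must be one of covars")
  }
  # sub is documented as "N" or "Y"
  if (!(sub %in% c("N", "Y"))) {
    problems <- c(problems, "sub must be \"N\" or \"Y\"")
  }
  # stratified analysis is documented only for a dichotomous cov.int
  if (identical(sub, "Y") && !is.null(cov.int) && cov.int %in% names(phen_data)) {
    n_levels <- length(unique(stats::na.omit(phen_data[[cov.int]])))
    if (n_levels != 2) {
      problems <- c(problems, "sub = \"Y\" requires a dichotomous cov.int")
    }
  }
  # model is a single character among 'a', 'd', 'g', 'r'
  if (!is.null(model) && !(length(model) == 1 && model %in% c("a", "d", "g", "r"))) {
    problems <- c(problems, "model must be one of 'a', 'd', 'g', 'r'")
  }
  problems
}

# Toy phenotype data, invented for illustration only
toy <- data.frame(id = 1:6, CVD = c(0, 1, 0, 1, 1, 0),
                  sex = c(1, 2, 1, 2, 2, 1), age = c(40, 52, 61, 47, 55, 38))

ok  <- check_auto_args(toy, covars = c("sex", "age"), cov.int = "sex", sub = "Y", model = "a")
bad <- check_auto_args(toy, covars = "sex", cov.int = "age", sub = "Y", model = "x")
print(ok)
print(bad)
```

An empty result means none of the checked rules is violated; it does not guarantee that auto will accept the call.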

Value

No value is returned. Instead, results are written to outfile.

Details

The auto function generates one R script and one shell script that executes the R script for each genotype file, plus one list file that executes all shell scripts in batch mode. These scripts are named after the phenotype of interest, the analysis of interest and the time at which they are generated. After generating the scripts, auto prints a message telling the user how to submit ALL the jobs (using ksh XXXX.lst). When a submitted job completes, a log file indicating which genotype file was analyzed is generated, and the R script and the shell script are removed. If all jobs have completed, the number of log files should equal the number of genotype files. All results are appended to the single output file specified by the user. A different outfile should be assigned for each genopath to avoid overwriting.
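
The completion rule above (one log file per genotype file) can be checked by counting files. The sketch below is one way to do that in base R. The helper jobs_complete, the ".log" pattern and the demo file names are assumptions made for illustration; the actual log file names produced by the generated scripts are not specified here, so the pattern may need adjusting. It also assumes genopath contains nothing but genotype files.

```r
# Hypothetical helper (not part of GWAF): compares the number of genotype
# files with the number of log files to judge whether all jobs finished.
# The log-file pattern is an assumption; adapt it to the real file names.
jobs_complete <- function(genopath, logdir = ".", log_pattern = "\\.log$") {
  n_geno <- length(list.files(genopath))
  n_logs <- length(list.files(logdir, pattern = log_pattern))
  list(n_geno = n_geno, n_logs = n_logs,
       complete = n_geno > 0 && n_logs == n_geno)
}

# Self-contained demonstration using empty placeholder files in tempdir()
geno_dir <- file.path(tempdir(), "geno_demo")
log_dir  <- file.path(tempdir(), "log_demo")
dir.create(geno_dir, showWarnings = FALSE)
dir.create(log_dir, showWarnings = FALSE)
invisible(file.create(file.path(geno_dir, paste0("chunk", 1:3, ".csv"))))
invisible(file.create(file.path(log_dir, paste0("job", 1:2, ".log"))))

status <- jobs_complete(geno_dir, log_dir)
print(status)
```

In the demonstration three genotype placeholders face two log placeholders, so the jobs are reported as not yet complete.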

Examples

## Not run: 
# auto(phenfile="simphen.csv",genopath="/home/data/exomechip/chr1",pedfile="simped.csv",
# outfile="exomechip_chr1_SIMQT.csv",phen="CVD",covars="sex",analysis="geepack",model="a",
# col.names=FALSE,sep.ped=",",sep.phe=",",sep.gen=",")
## End(Not run)
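
Because results are appended to outfile, each genopath needs its own output file. The sketch below shows one way to plan that for several directories before calling auto. The chr2 and chr3 directories are hypothetical extensions of the path used in the example, and the commented-out loop simply reuses the arguments from that example; it is not run here.

```r
# Hypothetical chromosome directories, extending the path from the example
genopaths <- file.path("/home/data/exomechip", paste0("chr", 1:3))

# One outfile per genopath, so results from different directories
# are never appended to the same file
outfiles <- paste0("exomechip_", basename(genopaths), "_SIMQT.csv")
plan <- data.frame(genopath = genopaths, outfile = outfiles,
                   stringsAsFactors = FALSE)

# Guard against accidentally reusing an output file name
stopifnot(anyDuplicated(plan$outfile) == 0)
print(plan)

## Not run: 
# library(GWAF)
# for (i in seq_len(nrow(plan))) {
#   auto(phenfile="simphen.csv", genopath=plan$genopath[i], pedfile="simped.csv",
#        outfile=plan$outfile[i], phen="CVD", covars="sex", analysis="geepack",
#        model="a", col.names=FALSE, sep.ped=",", sep.phe=",", sep.gen=",")
# }
## End(Not run)
```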