
pbatR (version 0.7)

pbat: PBAT Graphical and Command Line Interface

Description

The following routines are for the graphical and command line pbat interface. The command line interfaces are listed in an order of suggested usage. Most users of the command line will only want to use pbat.m.

pbat runs a GUI (Graphical User Interface) for pbat.

pbat.last returns an object of class pbat of the last command file run from pbat() (this is also returned from pbat; this command is retained because rerunning a command in pbat can be a very time-consuming process).

pbat.last.rawResults prints out the raw text file of the output (particularly useful if the output of pbat cannot be parsed properly, although all of this should have been fixed now).

pbat.m runs pbat according to an expression, using a 'phe' class object (phenotype information), a 'ped' class object (pedigree information), and various options.

pbat.obj runs pbat with a 'ped' class object (pedigree information), a 'phe' class object (phenotype information), and various other options.

pbat.files runs pbat according to a set of filenames and commands.

pbat.create.commandfile creates a command file for Christoph Lange's pbat software with respect to two files on disk (.phe, .ped).

Some options are only available for the respective pbat-gee (G), pbat-pc (P), pbat-logrank (L). If a parameter is 'R'equired for a specific version, it will be denoted, for example, by (G-R).

Usage

pbat()

pbat.last()

pbat.last.rawResults()

pbat.m( formula, phe, ped, fbat="",
        max.pheno=1, min.pheno=1,
        null="no linkage, no association", alpha=0.05,
        trans.pheno="none", trans.pred="none", trans.inter="none",
        scan.pred="all", scan.inter="all",
        scan.genetic="additive",
        offset="default",
        screening="conditional power",
        distribution="continuous",
        logfile="",
        max.gee=1,
        max.ped=7, min.info=20,
        incl.ambhaplos=TRUE, infer.mis.snp=TRUE,
        sub.haplos=FALSE, length.haplos=2, adj.snps=TRUE,
        overall.haplo=FALSE, cutoff.haplo=FALSE,
        output="normal",
        max.mating.types=10000,
        commandfile="",
        future.expansion=NULL )

pbat.obj( phe, ped, file.prefix, ... )

pbat.files( pedfile, fbat="gee", commandfile="", logrank.outfile="", ... )

pbat.create.commandfile(
        pedfile, phefile="",
        snps="",
        phenos="", time="",  # (only one of 'phenos' and 'time' can be set)
        preds="", preds.order="",
        inters="",
        groups.var="", groups="",
        fbat="gee",
        censor="",
        max.pheno=1, min.pheno=1,
        null="no linkage, no association", alpha=0.05,
        trans.pheno="none", trans.pred="none", trans.inter="none",
        scan.pred="all", scan.inter="all",
        scan.genetic="additive",
        offset="default",
        screening="conditional power",
        distribution="continuous",
        logfile="",
        max.gee=1,
        max.ped=7, min.info=20,
        haplos=NULL, incl.ambhaplos=TRUE, infer.mis.snp=TRUE,
        sub.haplos=FALSE, length.haplos=2, adj.snps=TRUE,
        overall.haplo=FALSE, cutoff.haplo=FALSE,
        output="normal",
        max.mating.types=10000,
        commandfile="",
        future.expansion=NULL )
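
The usage of pbat.create.commandfile notes that only one of 'phenos' and 'time' can be set. A caller-side check along the following lines could catch the conflict before a command file is written. This is a minimal sketch in base R: check_phenos_time is a hypothetical helper, not part of pbatR, and it assumes (following the defaults shown above) that an unset argument is the empty string.

```r
# Hypothetical helper (not part of pbatR): enforce that at most one of
# 'phenos' and 'time' is set, treating "" as "not set", as in the
# defaults of pbat.create.commandfile.
check_phenos_time <- function( phenos="", time="" ) {
  is_set <- function( x ) length( x ) > 0 && any( nzchar( x ) )
  if( is_set( phenos ) && is_set( time ) ) {
    stop( "only one of 'phenos' and 'time' can be set" )
  }
  invisible( TRUE )
}

check_phenos_time( phenos=c("phenos1","phenos2") )  # passes
check_phenos_time( time="time" )                    # passes

# Setting both is an error; capture the message instead of stopping.
conflict <- tryCatch(
  check_phenos_time( phenos="phenos1", time="time" ),
  error=function( e ) conditionMessage( e )
)
print( conflict )
```

Whether pbatR performs an equivalent check internally is not stated here, so this only guards the caller's own inputs.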

Arguments

formula
Symbolic expression describing what should be processed. See 'examples' for more information.
phe
'phe' object as described in write.phe.
ped
'ped' object as described in write.ped.
file.prefix
Prefix of the output datafile (phe & ped must match)
pedfile
Name of the pedigree file (.ped) in PBAT-format (extension '.ped' is optional).
phefile
Name of the phenotype file (.phe) in PBAT-format. The default assumes the same prefix as that in 'pedfile'.
...
Options in higher level functions to be passed to 'pbat.create.commandfile'.
fbat
Selects the fbat statistic used in the data analysis. "gee" = The FBAT-GEE statistic simplifies to the standard univariate FBAT-statistic. If several phenotypes are selected, all phenotypes are tested simultaneously, using FBAT-
max.pheno
(G,P) The maximum number of phenotypes that will be analyzed in the FBAT-statistic.
min.pheno
(G,P) The minimum number of phenotypes that will be analyzed in the FBAT-statistic.
null
Specification of the null-hypothesis. "no linkage, no association" = Null-hypothesis of no linkage and no association. "linkage, no association" = Null-hypothesis of linkage, but no association.
alpha
Specification of the significance level.
trans.pheno
Transformation of the selected phenotypes. "none" = no transformation "ranks" = transformation to ranks "normal score" = transformation to normal score The default choice is
trans.pred
Transformation of the selected predictor variables/covariates: "none" = no transformation "ranks" = transformation to ranks "normal score" = transformation to normal score The
trans.inter
Transformation of the selected interaction variables "none" = no transformation "ranks" = transformation to ranks "normal score" = transformation to normal score The default choice
scan.pred
(G,P) Computation of all covariate sub-models:

"all" = The selected FBAT statistic is computed with adjustment for all selected covariates/predictors.

"subsets" = The selected FBAT statistic is computed for all

scan.inter
(G,P) Computation of all interaction sub-models:

"all" = The selected FBAT statistic is computed including all selected interaction variables.

"subsets" = The selected FBAT statistic is computed for all possible

scan.genetic
Specification of the mode of inheritance: "additive" = Additive model "dominant" = Dominant model "recessive" = Recessive model "heterozygous advantage" = Heterozygous
offset
Specification of the covariate/predictor variables adjustment: "no" = No adjustments for covariates/predictor variables "max power" = Offset (=FBAT adjustment for covariates and interaction variables) th
screening
Specification of the screening methods to handle the multiple comparison problem for multiple SNPs/haplotypes and a set of phenotypes. "conditional power" = Screening based on conditional power (parametric approach)
distribution
Specification of the phenotypic distribution "continuous" = Phenotypes are treated as continuous phenotypes in the power calculation "categorical" = Phenotypes are treated as categorical/integer variables
logfile
Specification of the log-file. By default, PBAT selects a unique file-name for the log-file, i.e. "pbatlog...".
max.gee
(G) Specification of the maximal number of iterations in the GEE-estimation procedure.
max.ped
Specification of the maximal number of probands in one extended pedigree.
min.info
Specification of the minimum number of informative families required for the computation of the FBAT-statistics.
incl.ambhaplos
This command defines the handling of ambiguous haplotypes in the haplotype analysis. Choices:

TRUE = Ambiguous haplotypes (phase cannot be inferred) are included in the analysis and are weighted according to their estimated

infer.mis.snp
Handling of missing genotype information in the haplotype analysis.

FALSE = Individuals with missing genotype information are excluded from the analysis. This is the analysis also implemented in the HBAT option of the FBAT-p

sub.haplos
FALSE = The haplotypes defined by all the SNPs given in the haplotype-block definition are analyzed.

TRUE = All haplotypes are analyzed that are defined by any subset of SNPs in the haplotype-block definition.

length.haplos
Defines the haplotype length when sub.haplos=TRUE.
adj.snps
Takes effect when sub.haplos=TRUE.

FALSE = All sub-haplotypes are analyzed

TRUE = Only the sub-haplotypes are analyzed for which the first constituting SNPs are adjacent.

overall.haplo
Specification of an overall haplotype test. When this command is included in the batch-file, only one level of the "groups" variable can be specified.

FALSE = no overall test

TRUE = an overall test

cutoff.haplo
The minimum haplotype frequency required for a haplotype to be included in the overall test.
output
"normal" = Normal PBAT output. "short" = Short output. "detailed" = Detailed output for each family is created.
max.mating.types
Maximal number of mating types in the haplotype analysis.
commandfile
Name of the temporary command file that will be created to send to pbat. It is suggested to leave this blank, in which case an appropriate time-stamped name will be chosen.
future.expansion
(Only included for future expansion of pbat.) A vector of strings for extra lines to write to the batchfile for pbat.
logrank.outfile
(L) Name of the file to store the R source code to generate the plots for logrank analysis.
snps
Vector of strings for the SNPs to process. Default processes all of the SNPs.
phenos
(G,P) Vector of strings for the phenotypes/traits for the analysis. If none are specified, then all are analyzed. (Note: this must be left empty for logrank analysis; instead, specify the time to onset with the 'time' variable.)
time
(L-R) Time to onset variable. 'phenos' cannot be specified when this is used, but it must be set for logrank.
preds
Vector of strings for the covariates for the test statistic.
preds.order
Vector of integers indicating the order of 'preds' - the order for the vector of covariates for the test statistic.
inters
Vector of strings for the interaction variables.
groups.var
String for the grouping variable.
groups
Vector of strings corresponding to the groups of the grouping variable (groups.var).
censor
(L-R) String of the censoring variables. In the corresponding data, this variable has to be binary.
haplos
List of string vectors representing the haplotype blocks for the haplotype analysis. For example, list( block1=c("m1","m2"), block2=c("m3","m4") ) defines 2 haplotype-blocks where the first block is defined by SNPs m1 and m2,
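
The 'haplos' argument is an ordinary named list of character vectors, so it can be built and inspected in base R before it is passed on. The sketch below reuses the block and marker names from the description of 'haplos'; the checks are only an illustration and are not something pbatR is known to require.

```r
# Two haplotype blocks, using the marker names from the 'haplos' description.
haplos <- list( block1=c("m1","m2"), block2=c("m3","m4") )

# Basic sanity checks before handing the list to pbat.
stopifnot( is.list( haplos ), all( sapply( haplos, is.character ) ) )

n.blocks <- length( haplos )                      # number of haplotype blocks
snps.per.block <- lengths( haplos )               # number of SNPs in each block
all.snps <- unlist( haplos, use.names=FALSE )     # every SNP named in a block
```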

Value

  • 'pbat', 'pbat.last', 'pbat.m', 'pbat.obj', and 'pbat.files' return an object of class pbat. Methods supported by this include plot(...), summary(...), and print(...). Follow the first three links in the 'see also' section of this file for more details.

Details

These commands require 'pbatdata.txt' to be in the working directory; if not found, the program will attempt to (1) copy the file from the directory where pbat is, (2) copy it from anywhere in the path, or (3) download it from the internet.

These commands will also generate a lot of output files in the current working directory when interfacing with pbat. These files will be time-stamped so that concurrent analyses can be run in the same directory. Race condition: if two logrank analyses finish at exactly the same time, then the plots for one might be lost and/or get linked to the wrong analysis. This should be a rather rare occurrence, and is a result of pbat always sending this output to only one filename.
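
Because these commands write many files to the current working directory, one option is to run each analysis from a dedicated directory. The following is a minimal sketch in base R: in.scratch.dir is a hypothetical helper, not a pbatR function, and a placeholder file stands in for pbat's output. Any input files an analysis reads by relative name would also need to be reachable from that directory.

```r
# Hypothetical helper (not part of pbatR): evaluate 'expr' with the working
# directory temporarily switched to 'dir', restoring it afterwards.
in.scratch.dir <- function( dir, expr ) {
  dir.create( dir, recursive=TRUE, showWarnings=FALSE )
  old <- setwd( dir )
  on.exit( setwd( old ) )
  force( expr )  # 'expr' is evaluated lazily, so it runs after setwd( dir )
}

scratch <- file.path( tempdir(), "pbat-run1" )
before <- getwd()
files <- in.scratch.dir( scratch, {
  # An analysis call such as pbat.m(...) would go here; this sketch only
  # writes a placeholder file to show where output would land.
  writeLines( "placeholder", "output.txt" )
  list.files()
} )
```

Using one directory per run also sidesteps the logrank race condition described above, on the assumption that the single output filename is written relative to the working directory.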

References

http://www.biostat.harvard.edu/~clange/default.htm This was taken, with only slight modification to accommodate the interface, from Christoph Lange's description of the commands for the pbat program (which was available with the software at the time of this writing).

http://www.people.fas.harvard.edu/~tjhoffm/pbatR.html

See Also

summary.pbat, plot.pbat, print.pbat, as.ped, as.pedlist, read.ped

as.phe, read.phe

Examples

##########################
## pbat.m(...) examples ##
##########################

# Note that none of these can be run as they are verbatim.
# They are just meant to be examples.

# load in the data
# Here we assume that:
#  data.phe contains 'preds1', 'preds2', 'preds3', 'time',
#                     'censor', 'phenos1', ... 'phenos4'
#  data.ped contains 'snp1', 'snp2', 'snp3',
#                     'block1snp1','block1snp2',
#                     'block2snp1','block2snp2'
data.phe <- read.phe( "data" )
data.ped <- read.ped( "data" )

# empty model, does all phenotypes, no predictor covariates, and all
#  snps for a snps analysis.
# ALL and NONE are special phrases that should only be used here;
#  note that they are case sensitive.
res <- pbat.m( ALL ~ NONE, data.phe, data.ped, fbat="pc", ... )
summary( res )
res  # equivalent to print(res)

# basic model with one phenotype, does all snps (if none specified)
pbat.m( phenos1 ~ preds1, data.phe, data.ped, fbat="gee" )

# same model, but with more phenotypes
pbat.m( phenos1 + phenos2 + phenos3 ~ preds1, data.phe, data.ped, fbat="gee" )

# does all snps, the mi() tells it should be a marker interaction
pbat.m( phenos1 ~ mi(preds1), data.phe, data.ped, fbat="gee" )

# logrank analysis - fbat need not be set
# uses more than one predictor variable
res <- pbat.m( time & censor ~ preds1 + preds2 + preds3, data.phe, data.ped )
plot( res )

# single snp analysis (because each snp is separated by a vertical bar
#  '|'), and stratified by group (presence of censor auto-indicates
#  log-rank analysis).  Note that the group is at the end of the
#  expression, and _must_ be at the end of the expression.
#  (This assumes data.phe also contains a 'group' variable.)
res <- pbat.m( time & censor ~ preds1^3 + preds2 | snp1 | snp2 |
         snp3 / group, data.phe, data.ped )
plot( res )

# haplotype analysis, stratified by group
pbat.m( time & censor ~ mi(preds1^2) + mi(preds2^3) | block1snp1
        + block1snp2 | block2snp1 + block2snp2 / group,
        data.phe, data.ped )

# set any of the various options
pbat.m( phenos1 ~ preds1, data.phe, data.ped, fbat="pc",
        null="linkage, no association", alpha=0.1 )


############################
## pbat.obj(...) examples ##
############################

# These will not function; they only serve as examples.

# ... just indicates there are various options to be put here!
res <- pbat.obj( data.phe, data.ped, "data",
                 snps=c("snp1","snp2"), preds="preds1", ... )
summary(res)
res

# plot is only available for "logrank"
res <- pbat.obj(..., fbat="logrank")
plot( res )


##############################
## pbat.files(...) examples ##
##############################

# These will not function, but only serve as examples.

# Note in the following example, both "pedfile.ped" and "pedfile.phe"
#  must exist.  If the names differed, then you must specify the
#  option 'phefile="phefile.phe"', for example.
res <- pbat.files( "pedfile", phenos=c("phenos1","phenos2"),
                   screening="conditional power" )
summary(res)
res
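
The expressions passed to pbat.m are ordinary R formula objects, so the variable names they mention can be checked against the data before a run. The sketch below uses only base R (all.vars); it does not show how pbat.m itself parses the expression, and phe.cols is a made-up stand-in for the variable names held in a 'phe' object.

```r
# Formulas are not evaluated when created, so the variables need not exist.
f <- time & censor ~ preds1 + preds2 + preds3
vars <- all.vars( f )  # names used on either side of the formula

# Made-up variable names standing in for those of a 'phe' object.
phe.cols <- c( "preds1", "preds2", "preds3", "time", "censor" )

# Names used in the formula but absent from the data (empty when all match).
missing.vars <- setdiff( vars, phe.cols )
print( missing.vars )
```

A check like this would not apply directly to the special phrases ALL and NONE, or to SNP names after a '|', which come from the pedigree data rather than the phenotype data.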
