Usage
haplin(filename,
markers = "ALL", n.vars = 0, sep = " ", allele.sep = ";",
na.strings = "NA", design = "triad", use.missing = FALSE,
xchrom = FALSE, maternal = FALSE, test.maternal = FALSE,
scoretest = "no", ccvar = NULL, covar = NULL, sex = NULL,
reference = "reciprocal", response = "free",
threshold = 0.01, max.haplos = NULL, haplo.file = NULL,
resampling = FALSE, max.EM.iter = 50, data.out = "no",
verbose = TRUE, printout = TRUE)
Arguments
Of the following arguments, only filename is required. Use of the remaining arguments will depend on the type of analysis.
filename
A character string giving the name and path of the ASCII data file to be read.
markers
Default is "ALL", which means HAPLIN uses all available markers in the data set in the analysis. For the current version of HAPLIN, the number of markers used in a single run should probably not exceed 4 or 5 due to the computational burden. The markers ar
n.vars
Numeric. The number of variables (columns) in the data file before (to the left of) the genetic data.
sep
The character separator used in the data file to separate "columns", where each column contains the two alleles of a single individual at a single marker.
allele.sep
The character separator used in the data file to separate the two alleles of a single individual at a single marker. The recommended (default) separator is ";", but for SNPs an empty string "" is also common.
na.strings
The character string indicating missing data in the data file. The default is "NA", which is used in place of, for instance, C;T for a SNP that has not been typed in that individual.
design
The value "triad" is used for the standard case triad design, without independent controls. The value "cc.triad" means a combination of case triads and control triads. This requires the argument ccvar to point to the data column containing t
use.missing
A logical value used to determine whether triads with missing data should be included in the analysis. When set to TRUE, Haplin uses the EM algorithm to obtain risk estimates, also taking into account triads with missing data. The standard errors and p-va
xchrom
Logical, defaults to FALSE. If set to TRUE, haplin assumes the markers are on the X chromosome. This option should be combined with specifying the sex argument, and setting (for the time being) response = "mult", reference = "ref.ca
maternal
If TRUE, maternal effects are estimated as well as the standard fetal effects.
test.maternal
Not yet implemented.
scoretest
Special interest only. If "no", no score test is computed. If "yes", an overall score p-value is included in the output, and the individual score values are returned in the haplin object. If "only", haplin is only run under the null hypothesis, and a simp
ccvar
Numeric. The column number of the column containing the case-control indicator in the data file. Needed for the "cc" and "cc.triad" designs. The column should contain two numeric values, of which the larger one is always used to denote cases
covar
Not yet implemented.
sex
A numeric value specifying which data column contains the sex variable. The variable should be coded 1 for males and 2 for females. To be used with xchrom = TRUE.
reference
Decides how HAPLIN chooses its reference category for the effect estimates. Default value is "reciprocal". With the reciprocal reference the effect of a single or double dose of each haplotype is measured relative to the remaining haplotypes. This means t
response
The default value "free" means that both single- and double-dose effects are estimated. Choosing "mult" instead specifies a multiplicative dose-response model.
threshold
Sets the (approximate) lower limit for the haplotype frequencies of those haplotypes that should be retained in the analysis. Haplotypes that are less frequent are removed, and information about this is given in the output.
max.haplos
Not yet implemented.
haplo.file
Not yet implemented.
resampling
Default is FALSE. When FALSE, the individual haplotypes reconstructed by the EM algorithm are assumed known when computing CIs and p-values. If set to "jackknife", a jackknife-based resampling procedure is used when computing confidence intervals and p-valu
max.EM.iter
The maximum number of iterations used by the EM algorithm. This value can be increased if necessary, which is sometimes the case with, e.g., case-control data with a substantial amount of missing data. However, for triad data with little missing information the
data.out
Character. Accepts values "no", "prelim", "null" or "full", with "no" as default. For values other than the default, haplin returns the data file prepared for analysis rather than the usual haplin estimation results. The data file co
verbose
Default is T (=TRUE). During the EM algorithm, HAPLIN prints the estimated parameters and deviance for each step. To suppress this output, set this argument to F (=FALSE).
printout
Logical. If TRUE (default), haplin prints a full summary of the results after finishing the estimation. If FALSE, no such printout is given, but the summary function can later be applied to a saved result to get the same summary.
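Examples
The file-format arguments (n.vars, sep, allele.sep, na.strings) can be illustrated with a small base R sketch. The toy file contents, the use of three genotype columns per row, and the helper split_alleles are assumptions made for illustration only; they are not part of haplin, and the sketch does not reproduce how haplin itself reads data. The commented call at the end uses only arguments listed under Usage and is not run here.

```r
# Toy file (hypothetical contents): one leading variable column,
# followed by three genotype columns for a single marker.
path <- tempfile(fileext = ".dat")
writeLines(c("1 C;T T;T C;T",
             "0 C;C NA C;T",
             "1 T;T C;T T;T"), path)

n.vars <- 1          # columns to the left of the genetic data
sep <- " "           # separates the columns
allele.sep <- ";"    # separates the two alleles within a column

# Read everything as character; "NA" entries become missing values.
raw <- read.table(path, sep = sep, na.strings = "NA",
                  colClasses = "character")
geno <- raw[, (n.vars + 1):ncol(raw), drop = FALSE]

# Hypothetical helper: split one genotype column into its two alleles.
# Untyped entries are returned as a pair of NAs.
split_alleles <- function(x, allele.sep) {
  parts <- strsplit(x, allele.sep, fixed = TRUE)
  t(vapply(parts,
           function(p) if (length(p) == 2) p else rep(NA_character_, 2),
           character(2)))
}
alleles <- split_alleles(geno[[2]], allele.sep)
print(alleles)

## Not run: requires the haplin function documented above, and a
## real data set rather than this three-line toy file.
# res <- haplin(path, n.vars = 1, sep = " ", allele.sep = ";",
#               na.strings = "NA", design = "triad",
#               use.missing = TRUE, printout = FALSE)
# summary(res)   # gives the summary that printout = FALSE suppressed
## End(Not run)
```

For SNP data written without a separator (for instance CT instead of C;T), the same helper works with allele.sep = "".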