Usage
## Default S3 method:
trioFS(x, y, B = 20, nleaves = 5, replace = TRUE, sub.frac = 0.632, control = lrControl(), fast = FALSE, addMatImp = TRUE, addModels = TRUE, verbose = FALSE, rand = NA, ...)
## S3 method for class 'trioPrepare'
trioFS(x, ...)
## S3 method for class 'formula'
trioFS(formula, data, recdom = TRUE, ...)
Arguments
x
either an object of class trioPrepare
, i.e. the output of trio.prepare
, or
a binary matrix consisting of zeros and ones. If the latter, then each column of x
must correspond to a binary variable
(e.g., coding for a dominant or a recessive effect of a SNP), and each row to a case or a pseudo-control,
where each trio is represented by a block of four consecutive rows of x
containing the data for the case
and the three matched pseudo-controls (in this order) so that the first four rows of x
comprise
the data for the first trio, rows 5-8 the data for the second trio, and so on. Missing values are not
allowed. A convenient way to generate this matrix is to use the function trio.prepare
. Afterwards, trioFS
can be directly applied to the output of trio.prepare
.
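As a rough illustration of the row layout described above, the following sketch builds a toy binary matrix by hand in base R. The values are random placeholders rather than real genotype data, and the row and column names are hypothetical; in practice the matrix would normally come from trio.prepare.

```r
# Toy layout only: random 0/1 entries, not real pseudo-control data.
set.seed(1)
n.trios <- 3   # number of trios (assumed for this sketch)
n.vars  <- 4   # number of binary variables (assumed for this sketch)

x <- matrix(rbinom(4 * n.trios * n.vars, size = 1, prob = 0.5),
            nrow = 4 * n.trios, ncol = n.vars)
colnames(x) <- paste0("V", seq_len(n.vars))   # hypothetical variable names

# Label the rows so that the blocks of four are visible:
# case first, then the three matched pseudo-controls.
rownames(x) <- paste0("trio", rep(seq_len(n.trios), each = 4), "_",
                      rep(c("case", "pc1", "pc2", "pc3"), times = n.trios))

# Rows 1-4 belong to trio 1, rows 5-8 to trio 2, and so on.
trio.id <- rep(seq_len(n.trios), each = 4)

# Basic checks matching the stated requirements.
stopifnot(all(x %in% c(0, 1)), !anyNA(x), nrow(x) %% 4 == 0)
```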
y
a numeric vector specifying the case-pseudo-control status for the observations in x
(if x
is a binary matrix).
Since in trio logic regression, cases are coded by a 3
and pseudo-controls by a 0
,
y
is given by rep(c(3, 0, 0, 0), n.trios)
, where n.trios
is
the number of trios for which genotype data is stored in x
. Thus, the length of y
must be equal to the number of rows in x
. No missing values are allowed in y
.
If not specified, y
will be automatically generated.
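A minimal sketch of how such a response vector can be built by hand, assuming three trios (the value of n.trios here is only an example):

```r
n.trios <- 3                       # example value, not taken from real data
y <- rep(c(3, 0, 0, 0), n.trios)   # case = 3, three pseudo-controls = 0

# The length of y must equal the number of rows of x, i.e. 4 * n.trios.
stopifnot(length(y) == 4 * n.trios, !anyNA(y))
```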
B
number of bootstrap samples or subsamples used in trioFS
nleaves
maximum number of leaves, i.e. variables, in the logic tree considered in each of the B
trio logic regression models (please note in trio logic regression the model consists only of one logic tree).
replace
should sampling of the trios be done with replacement? If
TRUE
, a bootstrap sample of size n.trios
is drawn
from the n.trios
trios in each of the B
iterations. If
FALSE
, ceiling(sub.frac * n.trios)
of the trios
are drawn without replacement in each iteration.
sub.frac
a proportion specifying the fraction of trios that
are used in each iteration to fit a trio logic regression model if replace = FALSE
.
Ignored if replace = TRUE
.
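The two sampling schemes controlled by replace and sub.frac can be sketched in base R as follows. This only illustrates the description above and is not the package's internal code; the helper trio.rows is hypothetical and simply maps a trio index to its block of four rows in x.

```r
set.seed(42)
n.trios  <- 10      # example value
sub.frac <- 0.632

# replace = TRUE: bootstrap sample of n.trios trios (duplicates possible).
boot.trios <- sample(n.trios, n.trios, replace = TRUE)

# replace = FALSE: ceiling(sub.frac * n.trios) distinct trios.
sub.trios <- sample(n.trios, ceiling(sub.frac * n.trios), replace = FALSE)

# Hypothetical helper: rows of x belonging to the selected trios.
# Trio i occupies rows 4 * (i - 1) + 1, ..., 4 * i.
trio.rows <- function(ids) as.vector(sapply(ids, function(i) 4 * (i - 1) + 1:4))

rows.boot <- trio.rows(boot.trios)
rows.sub  <- trio.rows(sub.trios)
```

Sampling whole trios rather than single rows keeps each case together with its three pseudo-controls.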
control
a list of control parameters for the search algorithms and the logic trees considered when fitting the
trio logic regression model, where the parameters for an MC logic regression are ignored. For details and the parameters,
see lrControl
, which is the function that should be used to specify control
.
fast
should a greedy search be used instead of simulated annealing, i.e. the standard
search algorithm in (trio) logic regression?
addMatImp
should the matrix containing the improvements due to the interactions
in each of the iterations be added to the output, where the importance of each interaction
is computed by the average over the B
improvements due to this interaction?
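The averaging step can be sketched as below. The matrix orientation (one row per interaction, one column per iteration), the interaction labels, and the use of 0 for iterations in which an interaction was not found are assumptions made for this illustration; they are not taken from the package output.

```r
# Hypothetical improvement matrix for B = 4 iterations.
mat.imp <- matrix(c(2, 0, 4, 2,
                    1, 1, 0, 0),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("A & B", "!C"),        # made-up interactions
                                  paste0("iter", 1:4)))

# Importance of each interaction: average of its B improvements.
imp <- rowMeans(mat.imp)
imp
```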
addModels
should the B
trio logic regression models be added to the output?
verbose
should some comments on the progress of the trioFS
analysis be printed?
rand
positive integer. If specified, the random number generator is set into a reproducible state.
formula
an object of class formula
describing the model that should be fitted.
data
a data frame containing the variables in the model. Each row of data
must correspond to an observation, and each column to a binary variable (coded by 0 and 1)
or a factor (for details, see recdom
) except for the column comprising
the response, where no missing values are allowed in data
. For a description of the specification
of the response, see y
.
recdom
a logical value or vector of length ncol(data)
indicating whether a SNP should
be transformed into two binary dummy variables coding for a recessive and a dominant effect.
If recdom
is TRUE
(and a logical value), then all factors/variables with three levels will be coded by two dummy
variables as described in make.snp.dummy
. Each level of each of the other factors
(also factors specifying a SNP that shows only two genotypes) is coded by one indicator variable.
If recdom
is FALSE
(and a logical value),
each level of each factor is coded by an indicator variable. If recdom
is a logical vector,
all factors corresponding to an entry in recdom
that is TRUE
are assumed to be SNPs
and transformed into two binary variables as described above. All variables corresponding
to entries of recdom
that are TRUE
(no matter whether recdom
is a vector or a value)
must be coded either by the integers 1 (coding for the homozygous reference genotype), 2 (heterozygous),
and 3 (homozygous variant), or alternatively by the number of minor alleles, i.e. 0, 1, and 2, where
no mixing of the two coding schemes is allowed. Thus, it is not allowed that some SNPs are coded by
1, 2, and 3, and others are coded by 0, 1, and 2.
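To make the two dummy variables concrete, here is a small base R sketch. It assumes the usual definitions (dominant: at least one minor allele; recessive: two minor alleles) and does not reproduce the exact output format of make.snp.dummy; the variable names are hypothetical.

```r
# SNP coded by the number of minor alleles (0, 1, 2).
snp <- c(0, 1, 2, 1, 0, 2)

dominant  <- as.integer(snp >= 1)   # 1 if at least one minor allele
recessive <- as.integer(snp == 2)   # 1 if homozygous for the minor allele

cbind(snp, dominant, recessive)

# The same SNP in the alternative 1/2/3 coding carries identical
# information once shifted by one; the two schemes must not be mixed.
snp123 <- snp + 1
stopifnot(identical(as.integer(snp123 - 1 >= 1), dominant),
          identical(as.integer(snp123 - 1 == 2), recessive))
```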
...
for the trioPrepare
and the formula
method, optional parameters to be passed to
the low level function trioFS.default
, i.e. all arguments of trioFS.default
except for
x
and y
. Otherwise, ignored.