inbagg(formula, data, pFUN = NULL, cFUN = list(model = NULL, predict = NULL, training.set = NULL), nbagg = 25, ns = 0.5, replace = FALSE, ...)
formula: a formula specified as y~w1+w2+w3~x1+x2+x3 describes how to model the intermediate variables w1, w2, w3 and the response variable y, if no other formula is specified by the elements of pFUN or in cFUN.
cFUN: either a fixed function with argument newdata and returning the class membership by default, or a list specifying a classifying model, similar to one element of pFUN. Details are given below.
...: additional arguments (e.g. subset).

An object of class "inbagg", that is a list with elements including mtrees, a list of length nbagg describing the prediction models corresponding to each bootstrap sample. Each element of mtrees is a list with elements bindx (observations of bag sample), btree (classifying function of bag sample) and bfct (predictive models for intermediates of bag sample).

Here, each specified intermediate variable is modelled separately
following pFUN, a list of lists with elements specifying an arbitrary number of models for the intermediate variables and an optional element training.set = c("oob", "bag", "all"). The element training.set determines whether the predictive models for the intermediate variables are calculated based on the out-of-bag sample ("oob"), the default, on the bag sample ("bag") or on all available observations ("all"). The elements of pFUN that specify the models for the intermediate variables are lists as described in inclass. Note that if no formula is given in these elements, the functional relationship of formula is used.
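The three training.set choices can be pictured as index sets over the observations. The following is a minimal base R sketch, not ipred's internal code: it assumes that ns is the proportion of observations drawn into the bag, and the names n, bag, oob and all_obs are hypothetical.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n       <- 10      # hypothetical number of observations
ns      <- 0.5     # assumed: proportion of observations drawn into the bag
replace <- FALSE   # draw without replacement, as in the inbagg() default

# Indices of the bag sample, the out-of-bag sample and all observations
bag     <- sample(n, size = floor(ns * n), replace = replace)
oob     <- setdiff(seq_len(n), bag)
all_obs <- seq_len(n)

# One index set per documented value of training.set
training_sets <- list(oob = oob, bag = bag, all = all_obs)
str(training_sets)
```

With replace = FALSE and ns = 0.5, bag and oob in this sketch are disjoint and each hold half of the observations.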
The response variable is modelled following cFUN. This can either be a fixed classifying function as described in Peters et al. (2003) or a list which specifies the modelling technique to be applied. The list contains the arguments model (which model to be fitted), predict (optional, how to predict), formula (optional, of type y~w1+w2+w3+x1+x2, determines the variables the classifying function is based on) and the optional argument training.set = c("fitted.bag", "original", "fitted.subset"), specifying whether the classifying function is trained on the predicted observations of the bag sample ("fitted.bag"), on the original observations ("original") or on the predicted observations not included in a defined subset ("fitted.subset"). By default, the formula specified in formula determines the variables the classifying function is based on.
Note that the default cFUN = list(model = NULL, training.set = "fitted.bag") uses the function rpart and the predict function predict(object, newdata, type = "class").
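As an illustration of the list form of cFUN, the sketch below spells out the documented default explicitly. It is a hedged example rather than ipred's own code: the name cFUN_explicit is hypothetical, and whether inbagg treats this list exactly like the default is an assumption. The two pieces are checked on their own with the built-in iris data, without calling inbagg.

```r
library("rpart")

# Hypothetical explicit counterpart of the documented default:
# an rpart model with class predictions, trained on the fitted bag sample
cFUN_explicit <- list(
  model        = rpart,
  predict      = function(object, newdata) predict(object, newdata, type = "class"),
  training.set = "fitted.bag"
)

# Stand-alone check of the model and predict elements (not run through inbagg)
fit  <- cFUN_explicit$model(Species ~ ., data = iris)
pred <- cFUN_explicit$predict(fit, newdata = iris)
table(pred, iris$Species)
```

Writing the predict element as a function of object and newdata keeps it in line with the predict(object, newdata, type = "class") call named above.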
Andrea Peters, Berthold Lausen, Georg Michelson and Olaf Gefeller (2003), Diagnosis of glaucoma by indirect classifiers. Methods of Information in Medicine 42(1), 99-103.
See also: rpart, bagging, lm
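library("ipred")   # provides inbagg() and mypredict.lm()
library("MASS")    # provides mvrnorm()
library("rpart")

# Simulated data: a two-class response y, intermediate variables w1-w3
# and explanatory variables x1-x3
y <- as.factor(sample(1:2, 100, replace = TRUE))
W <- mvrnorm(n = 200, mu = rep(0, 3), Sigma = diag(3))
X <- mvrnorm(n = 200, mu = rep(2, 3), Sigma = diag(3))
colnames(W) <- c("w1", "w2", "w3")
colnames(X) <- c("x1", "x2", "x3")
DATA <- data.frame(y, W, X)   # y (length 100) is recycled to the 200 rows of W and X

# Models for the intermediate variables: a linear model with its own formula,
# and an rpart tree without one, so the formula passed to inbagg() is used
pFUN <- list(list(formula = w1~x1+x2, model = lm, predict = mypredict.lm),
             list(model = rpart))

inbagg(y~w1+w2+w3~x1+x2+x3, data = DATA, pFUN = pFUN)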