Partitioned and aggregated transformation models
Usage

traforest(object, parm = 1:length(coef(object)), reparm = NULL,
intercept = c("none", "shift", "scale", "shift-scale"),
update = TRUE, min_update = length(coef(object)) * 2,
mltargs = list(), ...)
# S3 method for traforest
predict(object, newdata, mnewdata = data.frame(1), K = 20, q = NULL,
type = c("weights", "node", "coef", "trafo", "distribution", "survivor", "density",
"logdensity", "hazard", "loghazard", "cumhazard", "quantile"),
OOB = FALSE, simplify = FALSE, trace = FALSE, updatestart = FALSE,
applyfun = NULL, cores = NULL, ...)
# S3 method for traforest
logLik(object, newdata, weights = NULL, OOB = FALSE, coef = NULL, ...)
Value

An object of class traforest with corresponding logLik and predict methods.
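For orientation, a minimal sketch of how such an object is typically created and queried; the model m0, the data frame d, and the splitting variables x1 and x2 are placeholders, not objects defined on this page:

library("trtf")
### m0 stands in for a fitted ctm or mlt model, d for a data frame
tf <- traforest(m0, formula = y ~ x1 + x2, data = d, ntree = 50)
### personalised coefficients via the predict method
cf <- predict(tf, newdata = d, type = "coef")
### log-likelihood contributions via the logLik method
ll <- logLik(tf, newdata = d)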
Arguments

object: an object of class ctm or mlt specifying the abstract model to be partitioned.

parm: parameters of object whose corresponding scores are used for finding partitions.

reparm: optional matrix of contrasts for reparameterisation of the scores. teststat = "quadratic" is invariant to this operation, but teststat = "max" might be more powerful, for example when an implicit intercept term is turned into an explicit one.

intercept: add optional intercept parameters (constrained to zero) to the model.

mltargs: arguments passed to mlt for fitting the transformation models.

update: logical; if TRUE, models and thus scores are updated in every node. If FALSE, the model and scores are computed once in the root node. The latter option is faster but less accurate.

min_update: number of observations necessary to refit the model in a node. If fewer observations are available, the parameters from the parent node are reused.

newdata: an optional data frame of observations for the forest.

mnewdata: an optional data frame of observations for the model.

K: number of grid points to generate (in the absence of q).

q: quantiles at which to evaluate the model.

type: type of prediction or plot to generate.

OOB: compute out-of-bag predictions.

simplify: simplify predictions (if possible).

trace: a logical indicating if a progress bar shall be printed while the predictions are computed.

updatestart: try to be smart about starting values for computing predictions (experimental).

applyfun: an optional lapply-style function with arguments function(X, FUN, ...) for looping over newdata. The default is the basic lapply function unless the cores argument is specified (see below); a usage sketch follows this list.

cores: numeric. If set to an integer, applyfun is set to mclapply with the desired number of cores.

weights: an optional vector of weights.

coef: an optional matrix of coefficients precomputed for newdata (using predict). Helps to compute the coefficients once and reuse them later (with different weights, for example); see the sketch after this list.

...: arguments passed to cforest, at least formula and data.
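To make the interplay of applyfun, cores, and coef concrete, here is a hedged sketch; tf and nd stand in for a fitted traforest object and a data frame of new observations:

### loop over newdata serially with a user-supplied lapply-style function
cf <- predict(tf, newdata = nd, applyfun = lapply, type = "coef")
### alternatively, cores = 2 switches applyfun to parallel::mclapply
### (no parallelisation on Windows)
cf <- predict(tf, newdata = nd, cores = 2, type = "coef")
### precompute coefficients once and reuse them in several logLik calls
ll1 <- logLik(tf, newdata = nd, coef = cf)
ll2 <- logLik(tf, newdata = nd, coef = cf, weights = rep(1, nrow(nd)))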
Details

Conditional inference trees are used for partitioning likelihood-based transformation models as described in Hothorn and Zeileis (2021). The method can be seen in action in Hothorn (2018) and the corresponding code is available as demo("BMI").
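Beyond coefficients, the type argument of the predict method evaluates the full predictive distribution; a brief sketch under the same placeholder assumptions (tf for a fitted forest, nd for new observations):

### survivor functions evaluated on an automatic grid of K = 50 points
s <- predict(tf, newdata = nd, type = "survivor", K = 50)
### densities evaluated at user-specified response quantiles q
f <- predict(tf, newdata = nd, type = "density", q = c(250, 500, 1000))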
References

Torsten Hothorn and Achim Zeileis (2021). Predictive Distribution Modelling Using Transformation Forests. Journal of Computational and Graphical Statistics, doi:10.1080/10618600.2021.1872581.

Torsten Hothorn (2018). Top-Down Transformation Choice. Statistical Modelling, 18(3-4), 274-298. doi:10.1177/1471082X17748081.

Natalia Korepanova, Heidi Seibold, Verena Steffen and Torsten Hothorn (2019). Survival Forests under Test: Impact of the Proportional Hazards Assumption on Prognostic and Predictive Forests for ALS Survival. Statistical Methods in Medical Research, doi:10.1177/0962280219862586.
Examples

### Example: Personalised Medicine Using Partitioned and Aggregated Cox-Models
### based on infrastructure in the mlt R add-on package described in
### https://cran.r-project.org/web/packages/mlt.docreg/vignettes/mlt.pdf
library("trtf")
library("survival")
### German Breast Cancer Study Group 2 data set
data("GBSG2", package = "TH.data")
GBSG2$y <- with(GBSG2, Surv(time, cens))
### set-up Cox model with overall treatment effect in hormonal therapy
cmod <- Coxph(y ~ horTh, data = GBSG2, support = c(100, 2000), order = 5)
### overall log-hazard ratio
coef(cmod)
### roughly the same as
coef(coxph(y ~ horTh, data = GBSG2))
### the following is time-consuming and therefore not run by default
if (FALSE) {
### estimate age-dependent Cox models (here ignoring all other covariates)
ctrl <- ctree_control(minsplit = 50, minbucket = 20, mincriterion = 0)
set.seed(290875)
tf_cmod <- traforest(cmod, formula = y ~ horTh | age, control = ctrl,
ntree = 50, mtry = 1, trace = TRUE, data = GBSG2)
### plot age-dependent treatment effects vs. overall treatment effect
nd <- data.frame(age = 30:70)
cf <- predict(tf_cmod, newdata = nd, type = "coef")
nd$logHR <- sapply(cf, function(x) x["horThyes"])
plot(logHR ~ age, data = nd, pch = 19, xlab = "Age", ylab = "log-Hazard Ratio")
abline(h = coef(cmod)["horThyes"])
### treatment most beneficial in very young patients
### NOTE: the scale of the log-hazard ratios depends on the
### corresponding baseline hazard function, which _differs_
### across age; the sign of the treatment effect (positive / negative)
### is, however, safe to interpret.
### mclapply does not support multiple cores on Windows
if (.Platform$OS.type != "windows") {
### computing predictions: predicted coefficients
cf1 <- predict(tf_cmod, newdata = nd, type = "coef")
### speedup with plenty of RAM and 4 cores
cf2 <- predict(tf_cmod, newdata = nd, cores = 4, type = "coef")
### memory-efficient with low RAM and _one_ core
cf3 <- predict(tf_cmod, newdata = nd, applyfun = lapply, type = "coef")
all.equal(cf1, cf2)
all.equal(cf1, cf3)
}
}