
explain shows which rules apply to which observations and visualizes the contribution of rules and linear predictors to the predicted values.
explain(
  object,
  newdata,
  penalty.par.val = "lambda.1se",
  response = 1L,
  plot = TRUE,
  intercept = FALSE,
  center.linear = FALSE,
  plot.max.nobs = 4,
  plot.dim = c(2, 2),
  plot.obs.names = TRUE,
  pred.type = "response",
  digits = 3L,
  cex = 0.8,
  ylab = "Contribution to linear predictor",
  bar.col = c("#E495A5", "#39BEB1"),
  rule.col = "darkgrey",
  ...
)
object: object of class pre.
newdata: optional dataframe of new (test) observations, including all predictor variables used for deriving the prediction rule ensemble.
penalty.par.val: character or numeric. Value of the penalty parameter λ to be employed for selecting the final ensemble. The default "lambda.1se" employs the λ value within 1 standard error of the minimum cross-validated error. Alternatively, "lambda.min" may be specified, to employ the λ value with minimum cross-validated error, or a numeric value > 0 may be specified, with higher values yielding a sparser ensemble. To evaluate the trade-off between accuracy and sparsity of the final ensemble, inspect pre_object$glmnet.fit and plot(pre_object$glmnet.fit).
response: numeric or character vector of length one. Specifies the name or number of the response variable (for multivariate responses) or the name or number of the factor level (for multinomial responses) for which explanations and contributions should be computed and/or plotted. Only used for pre objects fitted to multivariate or multinomial responses.
plot: logical. Should explanations be plotted?
intercept: logical. Specifies whether the intercept should be included in explaining predictions.
center.linear: logical. Specifies whether linear terms should be centered with respect to the training sample mean before computing their contribution to the predicted value. If intercept = TRUE, this will also affect the intercept. That is, the value of the intercept returned will differ from the value returned by the print method.
plot.max.nobs: numeric. Specifies the maximum number of observations for which explanations will be plotted. The default (4) plots the explanation for the first four observations supplied in newdata.
plot.dim: numeric vector of length 2. Specifies the number of rows and columns in the resulting plot.
plot.obs.names: logical vector of length 1, NULL, or character vector of length nrow(newdata) supplying the names that should be used for individual observations' plots. If TRUE (default), rownames(newdata) will be used as titles. If NULL, paste("Observation", 1:nrow(newdata)) will be used as titles. If FALSE, no titles will be plotted.
pred.type: character. Specifies the type of predicted values to be computed, returned and provided in the plot(s). Note that the computed contributions must be additive and are therefore always on the scale of the linear predictor.
digits: integer. Specifies the number of digits used in depicting the predicted values in the plot.
cex: numeric. Specifies the relative text size of title, tick and axis labels.
ylab: character. Specifies the label for the horizontal (y-) axis.
bar.col: character vector of length two. Specifies the colors to be used for plotting the positive and negative contributions to the predictions, respectively.
rule.col: character. Specifies the color to be used for plotting the rule descriptions. If NULL, rule descriptions are not plotted.
...: Further arguments to be passed to predict.pre and predict.cv.glmnet.
Provides a graphical depiction of the contribution of rules and linear terms to the individual predictions (if plot = TRUE).
Invisibly returns a list with objects predictors and contribution. predictors contains the values of the rules and linear terms for each observation in newdata, for those rules and linear terms included in the final ensemble with the specified value of penalty.par.val. contribution contains the values of predictors, multiplied by the estimated values of the coefficients in the final ensemble selected with the specified value of penalty.par.val.
All contributions are calculated w.r.t. the intercept, by default. Thus, if a given rule applies to an observation in newdata, the contribution of that rule equals the estimated coefficient of that rule. If a given rule does not apply to an observation in newdata, the contribution of that rule equals 0.
For linear terms, contributions can be centered, or not (the default). Thus, by default the contribution of a linear term for an observation in newdata equals the observation's value of the linear term, times the estimated coefficient of the linear term. If center.linear = TRUE, the contribution of a linear term for an observation in newdata equals (the value of the linear term, minus the mean value of the linear term in the training data) times the estimated coefficient for the linear term.
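The arithmetic described above can be sketched in base R with a toy ensemble. This is an illustration only, not the package's internal code: the rule (Temp > 75), the coefficient values, the intercept and the tiny data frames are all hypothetical, and the intercept adjustment under centering is an assumption derived from requiring that the linear predictor stays unchanged.

```r
# Hypothetical training and new data (not taken from a fitted pre object)
train   <- data.frame(Temp = c(60, 70, 80, 90), Wind = c(12, 10, 8, 6))
newdata <- data.frame(Temp = c(85, 65),         Wind = c(7, 11))

# Assumed (not estimated) intercept and coefficients: one rule, one linear term
intercept <- 50
coefs     <- c(rule1 = 10, Wind = -2)

# Rule values are 0/1 indicators of whether the rule applies
predictors <- cbind(
  rule1 = as.numeric(newdata$Temp > 75),
  Wind  = newdata$Wind
)

# Uncentered contributions: predictor value times coefficient
contribution <- sweep(predictors, 2, coefs, `*`)

# Centered contributions: subtract the training mean from the linear term first
centered <- predictors
centered[, "Wind"] <- centered[, "Wind"] - mean(train$Wind)
contribution_centered <- sweep(centered, 2, coefs, `*`)

# Centering moves coefficient * training mean into the intercept,
# so the linear predictor itself is unchanged
intercept_centered <- intercept + unname(coefs["Wind"]) * mean(train$Wind)
lp          <- intercept + rowSums(contribution)
lp_centered <- intercept_centered + rowSums(contribution_centered)

print(contribution)
print(contribution_centered)
```

In this toy case the Wind contributions are negative for both observations before centering, and take opposite signs after centering, because one observation lies below and the other above the training mean of Wind. The rule contributions are the same in both versions.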
Fokkema, M. & Strobl, C. (2020). Fitting prediction rule ensembles to psychological research data: An introduction and tutorial. Psychological Methods 25(5), 636-652. doi:10.1037/met0000256, https://arxiv.org/abs/1907.05302
pre, plot.pre, coef.pre, importance.pre, cvpre, interact, print.pre
library("pre")
airq <- airquality[complete.cases(airquality), ]
set.seed(1)
train <- sample(1:nrow(airq), size = 100)
set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airq[train,])
airq.ens.exp <- explain(airq.ens, newdata = airq[-train,])
airq.ens.exp$predictors
airq.ens.exp$contribution
## Can also include intercept in explanation:
airq.ens.exp <- explain(airq.ens, newdata = airq[-train,], intercept = TRUE)
## Fit PRE with linear terms only to illustrate effect of center.linear:
set.seed(42)
airq.ens2 <- pre(Ozone ~ ., data = airq[train,], type = "linear")
## When not centered around their means, Month has negative and
## Day has positive contribution:
explain(airq.ens2, newdata = airq[-train,][1:2,],
        penalty.par.val = "lambda.min")$contribution
## After mean centering, contributions of Month and Day have switched
## sign (for these two observations):
explain(airq.ens2, newdata = airq[-train,][1:2,],
        penalty.par.val = "lambda.min", center.linear = TRUE)$contribution