interact
Calculates test statistics for assessing the strength of interactions
between the specified input variable(s) and all other input variables.
interact(object, varnames = NULL, nullmods = NULL,
penalty.par.val = "lambda.1se", quantprobs = c(0.05, 0.95), plot = TRUE,
col = c("yellow", "blue"), ylab = "Interaction strength",
main = "Interaction test statistics", se.linewidth = 0.05,
parallel = FALSE, k = 10, verbose = FALSE, ...)
object
  an object of class pre.
varnames
  character vector. Names of variables for which interaction statistics
  should be calculated. If NULL, interaction statistics for all predictor
  variables with non-zero coefficients will be calculated (which may take a
  long time).
nullmods
  object with bootstrapped null interaction models, resulting from
  application of bsnullinteract.
penalty.par.val
  character. Which value of the penalty parameter criterion should be used?
  The value yielding minimum cv error ("lambda.min") or the penalty parameter
  yielding error within 1 standard error of minimum cv error ("lambda.1se")?
  Alternatively, a numeric value may be specified, corresponding to one of
  the values of lambda in the sequence used by glmnet, for which estimated cv
  error can be inspected by running object$glmnet.fit and
  plot(object$glmnet.fit).
quantprobs
  numeric vector of length two. Probabilities that should be used for
  plotting the range of bootstrapped null interaction model statistics. Only
  used when the nullmods argument is specified and plot = TRUE. The default
  yields sample quantiles corresponding to .05 and .95 probabilities.
plot
  logical. Should interaction statistics be plotted?
col
  character vector of length one or two. Color for plotting interaction
  statistics. The first color specified is used to plot the interaction
  statistic from the training data, the second color specified is used to
  plot the interaction statistic distribution from the bootstrapped null
  interaction models. Only used when plot = TRUE. Only the first element of
  the vector is used if nullmods = NULL.
ylab
  character string. Label to be used for plotting the y-axis.
main
  character. Main title for the bar plot.
se.linewidth
  numeric. Width of the whiskers of the plotted standard error bars (in
  inches).
parallel
  logical. Should computations be run in parallel through foreach? A parallel
  backend (such as doMC or others) must be registered beforehand.
k
  integer. Calculating interaction test statistics is computationally
  intensive, so calculations are split up in several parts to prevent memory
  allocation errors. If a memory allocation error still occurs, increase k.
verbose
  logical. Should progress information be printed to the command line?
...
  Additional arguments to be passed to barplot.
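For the penalty.par.val argument described above, the following is a minimal sketch of how the cross-validated error might be inspected and a numeric penalty value passed on. It assumes the pre package is installed and that object$glmnet.fit behaves like a cv.glmnet object with lambda, lambda.min and lambda.1se elements; the object name cvfit is introduced here for illustration only.

```r
library("pre")

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])

# Cross-validated error along the lambda sequence used by glmnet
# (assumption: glmnet.fit is a cv.glmnet-like object).
cvfit <- airq.ens$glmnet.fit
plot(cvfit)
c(lambda.min = cvfit$lambda.min, lambda.1se = cvfit$lambda.1se)

# Pass a numeric value taken from the fitted sequence itself,
# here the value that "lambda.min" refers to.
interact(airq.ens, varnames = c("Temp", "Wind"),
         penalty.par.val = cvfit$lambda.min, plot = FALSE)
```

Taking the numeric value from the fitted sequence, rather than typing it in, avoids specifying a lambda that does not occur in the sequence.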
Function interact() returns and plots interaction statistics for the
specified predictor variables. If nullmods is not specified, it returns and
plots only the interaction test statistics for the specified fitted
prediction rule ensemble. If nullmods is specified, the function returns a
list, with elements $fittedH2, containing the interaction statistics of the
fitted ensemble, and $nullH2, which contains the interaction test statistics
for each of the bootstrapped null interaction models.
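The return value described above can be explored with a sketch along the following lines. It assumes the pre package is installed and that bsnullinteract() accepts the fitted ensemble and an nsamp argument; nsamp = 10 is chosen only to keep the run time short, and the object names airq, nullmods and H2 are illustrative.

```r
library("pre")

set.seed(42)
airq <- airquality[complete.cases(airquality), ]
airq.ens <- pre(Ozone ~ ., data = airq)

# Bootstrapped null interaction models (a small nsamp, for speed only).
nullmods <- bsnullinteract(airq.ens, nsamp = 10)

# With nullmods supplied, the result is a list rather than a plain vector.
H2 <- interact(airq.ens, varnames = c("Temp", "Wind"),
               nullmods = nullmods, plot = FALSE)

str(H2$fittedH2)  # statistics of the fitted ensemble
str(H2$nullH2)    # statistics of each bootstrapped null model
```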
If plot = TRUE (the default), a barplot is created with the interaction test
statistic from the fitted prediction rule ensemble. If nullmods is specified,
bars representing the median of the distribution of interaction test
statistics of the bootstrapped null interaction models are plotted. In
addition, error bars representing the quantiles of the distribution (their
value specified by the quantprobs argument) are plotted. These allow for
testing the null hypothesis of no interaction effect for each of the input
variables.
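The comparison described above can be sketched in base R. All numbers below are made up for illustration and are not output of interact(); the object names (fitted_h2, null_h2, and so on) are hypothetical, and the code only mimics the median and quantile summaries the plot is described as showing.

```r
# Hypothetical fitted statistics, for illustration only.
fitted_h2 <- c(Temp = 0.21, Wind = 0.18, Solar.R = 0.04)

# Simulated stand-in for null-model statistics: one row per bootstrapped
# null interaction model, one column per variable.
set.seed(1)
null_h2 <- cbind(
  Temp    = runif(100, 0, 0.10),
  Wind    = runif(100, 0, 0.10),
  Solar.R = runif(100, 0, 0.10)
)

quantprobs <- c(0.05, 0.95)

# Median and quantile range of the null distribution, per variable.
null_median <- apply(null_h2, 2, median)
null_range  <- apply(null_h2, 2, quantile, probs = quantprobs)

# Flag variables whose fitted statistic exceeds the upper null quantile.
exceeds_null <- fitted_h2 > null_range[2, ]

print(rbind(fitted = fitted_h2, null_median = null_median, null_range))
print(exceeds_null)
```

A fitted statistic well above the upper quantile of its null distribution would be read as evidence against the null hypothesis of no interaction, subject to the caveats about error rates that follow.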
Note that the error rates of null hypothesis tests of interaction effects
have not yet been studied in detail, but likely depend on the number of
generated bootstrapped null interaction models as well as the complexity of
the fitted ensembles. Users are therefore advised to test for the presence
of interaction effects by setting the nsamp argument of the function
bsnullinteract to \(\geq 100\) (even though this may take a lot of
computation time). Also, users are advised to test for the presence of
interactions only with fitted ensembles that are neither too sparse nor too
complex, that is, ensembles that are selected by setting the penalty.par.val
argument equal to "lambda.min" or "lambda.1se".
Calculating interaction statistics can be computationally intensive,
especially when nullmods is specified, in which case setting parallel = TRUE
may improve speed.
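A possible way to set up parallel computation is sketched below. It assumes the pre and doMC packages are installed (doMC relies on forking, so it is generally limited to Unix-alike systems), and that bsnullinteract() takes nsamp and parallel arguments; the choice of two cores is arbitrary.

```r
library("pre")
library("doMC")

# Register a foreach backend before requesting parallel = TRUE.
registerDoMC(cores = 2)

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])

# Small nsamp to keep the sketch quick; see the advice on nsamp above.
nullmods <- bsnullinteract(airq.ens, nsamp = 10, parallel = TRUE)
H2 <- interact(airq.ens, varnames = c("Temp", "Wind"),
               nullmods = nullmods, parallel = TRUE, plot = FALSE)
```

Other foreach backends may work as well, but that is not verified here.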
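# NOT RUN {
library("pre")
set.seed(42)
# Fit a prediction rule ensemble on the complete cases of airquality
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])
# Interaction statistics for three predictors (also drawn as a bar plot)
interact(airq.ens, c("Temp", "Wind", "Solar.R"))
# }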