Given a model, this function reports a data frame with all the variables that may be interchanged in the model without affecting its classification performance. For each variable in the model, the function loops over all candidate variables and reports those that yield a zIDI (the z-score of the integrated discrimination improvement, IDI) equivalent to or better than that of the original model.
reportEquivalentVariables(object,
                          pvalue = 0.05,
                          data,
                          variableList,
                          Outcome = "Class",
                          timeOutcome = NULL,
                          type = c("LOGIT", "LM", "COX"),
                          description = ".",
                          method = "BH",
                          osize = 0,
                          fitFRESA = TRUE)
object: An object of class lm, glm, or coxph containing the model to be analyzed
pvalue: The maximum p-value, associated with the IDI, allowed for a pair of variables to be considered equivalent
data: A data frame where all variables are stored in different columns
variableList: A data frame with two columns. The first one must have the names of the candidate variables and the other one the description of such variables
Outcome: The name of the column in data that stores the variable to be predicted by the model
timeOutcome: The name of the column in data that stores the time to event
type: Fit type: logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX")
description: The name of the column in variableList that stores the variable description
method: The method used by the p-value adjustment algorithm
osize: The number of features used for p-value adjustment
fitFRESA: If TRUE, the C++-based fitting method is used
A list with all the unadjusted p-values of the equivalent features per model variable
A data frame with three columns. The first column is the original variable of the model. The second column lists all variables that, if interchanged, will not statistically affect the performance of the model. The third column lists the corresponding z-scores of the IDI for each equivalent variable.
A character vector with all the equivalent formulas
A bagged model built from all the equivalent formulas. The model size is limited by the number of observations
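
The method and osize arguments describe a p-value adjustment step. As a rough illustration only, the sketch below uses base R's p.adjust, whose method and n arguments play a similar role. That reportEquivalentVariables applies the adjustment in exactly this way is an assumption, and the p-values and variable names (x1 to x4) are made up for the example.

```r
# Illustrative raw p-values for four hypothetical candidate variables
# (made-up numbers, not output of reportEquivalentVariables).
raw_p <- c(x1 = 0.01, x2 = 0.02, x3 = 0.03, x4 = 0.20)

# Benjamini-Hochberg adjustment over the four tested candidates only.
adj_default <- p.adjust(raw_p, method = "BH")

# Same adjustment, but declaring that 10 features were under consideration
# (assumed analogue of osize = 10); n must be >= length(raw_p).
adj_wide <- p.adjust(raw_p, method = "BH", n = 10)

# Side-by-side view of raw and adjusted values.
print(rbind(raw = raw_p, BH_n4 = adj_default, BH_n10 = adj_wide))

# Candidates that would pass a 0.05 threshold under each setting.
print(names(raw_p)[adj_default <= 0.05])
print(names(raw_p)[adj_wide <= 0.05])
```

Declaring a larger number of features makes the adjustment more conservative, so fewer candidates pass the same threshold.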