Creates a summary table showing the performance of the models selected by each combination of stepwise regression strategy and selection metric.
performance(x, ...)
A data frame where:
The formula of each selected model
Columns for each combination of strategy and metric used
For linear, poisson, gamma, and negative binomial regression:
adj_r2_train/adj_r2_test: Adjusted R-squared measures the proportion of variance explained by the model, adjusted for the number of predictors. Values are at most 1 and can fall below 0 for very poor fits (especially on test data); higher values indicate better model fit. A good model should have high adjusted R-squared on both training and test data, with minimal difference between them. Large differences suggest overfitting.
mse_train/mse_test: Mean Squared Error measures the average squared difference between predicted and actual values. Lower values indicate better model performance. The test MSE should be close to training MSE; significantly higher test MSE suggests overfitting.
mae_train/mae_test: Mean Absolute Error measures the average absolute difference between predicted and actual values. Lower values indicate better model performance. As with MSE, test MAE should be close to training MAE; a significantly higher test MAE suggests overfitting.
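To make the definitions above concrete, the following base R sketch computes the three regression metrics on a training and a test split. It is an illustration only, not the package's internal computation: the helper `regression_metrics()`, the 70/30 split, and the choice of predictors are assumptions made for this example, and adjusted R-squared on test data is computed here with the usual formula applied to the test observations.

```r
# Illustrative sketch; regression_metrics() is a hypothetical helper,
# not a function exported by the package.
set.seed(42)
data(mtcars)

# Assumed 70/30 train/test split
train_idx <- sample(nrow(mtcars), size = floor(0.7 * nrow(mtcars)))
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

fit <- lm(mpg ~ wt + hp, data = train)

regression_metrics <- function(model, data, response) {
  actual    <- data[[response]]
  predicted <- predict(model, newdata = data)
  n <- length(actual)
  p <- length(coef(model)) - 1  # number of predictors, excluding the intercept
  r2 <- 1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
  c(
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1),
    mse    = mean((actual - predicted)^2),
    mae    = mean(abs(actual - predicted))
  )
}

# One row per data set, one column per metric
metrics <- rbind(
  train = regression_metrics(fit, train, "mpg"),
  test  = regression_metrics(fit, test, "mpg")
)
print(round(metrics, 3))
```

Comparing the two rows side by side is the check described above: a test row that is much worse than the training row points to overfitting.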
For logistic regression:
accuracy_train/accuracy_test: Accuracy is the proportion of correct predictions: (true positives + true negatives) / total predictions. Values range from 0 to 1, with higher values indicating better classification performance. Test accuracy should be close to training accuracy; large differences suggest overfitting.
auc_train/auc_test: Area Under the ROC Curve (AUC) measures the model's ability to distinguish between classes. Values typically range from 0.5 (no better than random) to 1.0 (perfect discrimination). As a rule of thumb, AUC > 0.7 is considered acceptable, > 0.8 good, and > 0.9 excellent. Test AUC should be close to training AUC; a much lower test AUC suggests overfitting.
log_loss_train/log_loss_test: Log Loss (logarithmic loss) measures how well the predicted probabilities match the observed classes, penalizing confident but wrong predictions especially heavily. Lower values indicate better model performance, and values close to 0 are ideal. Test log loss should be close to training log loss; a higher test log loss suggests overfitting.
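The classification metrics above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper `classification_metrics()`, the 0.5 classification threshold, the odd/even row split, and the model `vs ~ mpg` are all assumptions chosen for the example. AUC is computed through the rank-sum (Mann-Whitney) identity rather than by tracing the ROC curve.

```r
# Illustrative sketch; classification_metrics() is a hypothetical helper.
classification_metrics <- function(actual, prob, threshold = 0.5, eps = 1e-15) {
  predicted_class <- as.integer(prob >= threshold)
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  # AUC via the rank-sum identity; tied probabilities receive average ranks
  auc <- (sum(rank(prob)[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  # Clip probabilities so log() never sees exactly 0 or 1
  p <- pmin(pmax(prob, eps), 1 - eps)
  c(
    accuracy = mean(predicted_class == actual),
    auc      = auc,
    log_loss = -mean(actual * log(p) + (1 - actual) * log(1 - p))
  )
}

data(mtcars)
# Assumed deterministic split: odd rows for training, even rows for testing
train <- mtcars[seq(1, nrow(mtcars), by = 2), ]
test  <- mtcars[seq(2, nrow(mtcars), by = 2), ]

fit <- glm(vs ~ mpg, data = train, family = binomial)

logit_metrics <- rbind(
  train = classification_metrics(train$vs, predict(fit, newdata = train, type = "response")),
  test  = classification_metrics(test$vs,  predict(fit, newdata = test,  type = "response"))
)
print(round(logit_metrics, 3))
```

Because log loss works on the predicted probabilities rather than on thresholded classes, it can reveal overconfident models that accuracy alone would not flag.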
For Cox regression:
c-index_train/c-index_test: Concordance Index (C-index) measures the model's ability to correctly rank survival times. Values typically range from 0.5 (no better than random) to 1.0 (perfect ranking). As a rule of thumb, C-index > 0.7 is considered acceptable, > 0.8 good, and > 0.9 excellent. Test C-index should be close to training C-index; a much lower test C-index suggests overfitting.
auc_hc: Harrell's C-index for time-dependent AUC, measuring discrimination at specific time points. Higher values indicate better discrimination ability.
auc_uno: Uno's C-index for time-dependent AUC, providing an alternative measure of discrimination that may be more robust to censoring patterns.
auc_sh: Schemper and Henderson's C-index for time-dependent AUC, offering another perspective on model discrimination performance.
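The ranking idea behind the concordance index can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the estimator the package uses: `concordance_index()` is a hypothetical helper, tied event times are ignored, and the toy data are made up for the example. A pair of subjects is comparable when one has an observed event strictly before the other's event or censoring time; the pair is concordant when the subject who failed earlier also has the higher predicted risk.

```r
# Illustrative sketch; concordance_index() is a hypothetical helper that
# ignores tied event times for simplicity.
concordance_index <- function(time, event, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Comparable pair: subject i has an observed event strictly before
      # subject j's event or censoring time
      if (event[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1      # earlier failure, higher risk
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5    # tied risk scores count as half
        }
      }
    }
  }
  concordant / comparable
}

# Toy data: event = 1 is an observed event, event = 0 is censored
time  <- c(5, 8, 12, 15, 20)
event <- c(1, 1, 0, 1, 0)
risk  <- c(2.1, 1.7, 0.4, 1.9, 0.2)

c_index <- concordance_index(time, event, risk)
print(c_index)  # 7 of 8 comparable pairs are concordant: 0.875
```

A risk score that orders every comparable pair correctly gives 1, while uninformative scores give values near 0.5.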
Each cell contains the performance of the model selected by the corresponding strategy-metric combination. For the subset strategy with Information Criteria (IC), only the single best model across all variable numbers is shown. This does not apply to Significance Level (SL), since F/Rao statistics can only be compared between models with the same number of variables.
x: A list object returned by the stepwise() function
...: Additional arguments (currently not used)
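# Load the package that provides stepwise() and performance()
# (assumed here to be StepReg)
library(StepReg)
# Load example data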
data(mtcars)
# Run stepwise regression with multiple strategies and metrics
formula <- mpg ~ .
result <- stepwise(
  formula = formula,
  data = mtcars,
  type = "linear",
  strategy = c("forward", "backward", "bidirection"),
  metric = c("AIC", "BIC")
)
# Get performance summary
performance(result)