fastml: Guarded Resampling Workflows for Safe and Automated Machine Learning in R

fastml is an R package for training, evaluating, and comparing machine learning models with a guarded resampling workflow.
Rather than introducing new learning algorithms, fastml focuses on reducing leakage risk by keeping preprocessing, model fitting, and evaluation aligned within supported resampling paths.

In fastml, fast refers to the rapid construction of statistically valid workflows, not to computational shortcuts. By guarding against common user-induced errors, most notably preprocessing leakage, fastml helps practitioners obtain more reliable performance estimates with minimal configuration.

Core Principles

  • Guarded resampling workflow
    When the guarded resampling path is used, preprocessing and model fitting are re-estimated independently within each resampling split. This reduces leakage risk, but it cannot protect inputs that were already preprocessed on the full dataset before being passed to fastml.

  • Leakage risk reduction, not guarantees
    fastml can mitigate common leakage modes (e.g., global scaling or imputation before resampling) when workflows are fit within resamples. It does not universally prevent all leakage scenarios.

  • Single, unified interface
    Multiple models can be trained and benchmarked through a single call, while internally coordinating resampling, preprocessing, and evaluation in supported paths.

  • Compatibility with established engines
    fastml orchestrates existing modeling infrastructure (recipes, rsample, parsnip, yardstick) without modifying their statistical behavior.
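
The difference between global and fold-wise preprocessing can be sketched in base R. This is an illustration of the principle only, not fastml's internal code; the data, the fold assignment, and all variable names are made up for the example.

```r
# Illustrative only: base R, not fastml internals.
set.seed(1)
x <- c(rnorm(40, mean = 0), rnorm(10, mean = 5))   # one numeric predictor
folds <- sample(rep(1:5, length.out = length(x)))  # 5-fold assignment

# Leaky: scaling parameters are estimated from every row,
# including rows that will later be used for assessment
leaky_scaled <- (x - mean(x)) / sd(x)

# Guarded: parameters are re-estimated from the training rows of each split
# and then applied to that split's assessment rows
guarded_scaled <- numeric(length(x))
for (k in 1:5) {
  train <- folds != k
  mu <- mean(x[train])
  sigma <- sd(x[train])
  guarded_scaled[!train] <- (x[!train] - mu) / sigma
}

# The two versions differ because the guarded one never uses assessment rows
# when estimating the mean and standard deviation
max(abs(leaky_scaled - guarded_scaled))
```

The same logic applies to imputation and encoding: anything estimated from data is estimated from the training portion of a split only.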

Features

  • Preprocessing isolation within resampling
    Transformations (scaling, imputation, encoding, feature construction) are learned from training folds and applied to assessment folds when the guarded resampling path is used.

  • Support for multiple algorithms
    Includes tree-based models, linear and penalized models, kernel methods, neural networks, and boosting approaches via established engines.

  • Hyperparameter tuning within guarded resampling
    Grid search and Bayesian optimization are run inside the resampling loop when the guarded resampling path is used.

  • Consistent performance evaluation
    Metrics such as Accuracy, ROC AUC, Sensitivity, Specificity, Precision, and F1 are computed without leakage.

  • Multiclass ROC AUC averaging
    Macro averaging (tidymodels default) weights each class equally. Set multiclass_auc = "macro_weighted" to weight by class prevalence; this can change model rankings on imbalanced data, so keep the choice consistent across runs.

  • Visualization and comparison tools
    Built-in plots facilitate comparison across models while preserving statistical validity.
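
The effect of the two multiclass ROC AUC averaging schemes can be worked through by hand. The sketch below assumes one-vs-all AUCs per class and uses a small made-up probability matrix; auc_binary() is a hypothetical helper written for this example and is not part of fastml or yardstick.

```r
# One-vs-all AUC via the rank (Mann-Whitney) formula
auc_binary <- function(score, is_pos) {
  r <- rank(score)
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Imbalanced toy data: 6 x "a", 3 x "b", 1 x "c" (hypothetical values)
truth <- factor(c("a", "a", "a", "a", "a", "a", "b", "b", "b", "c"))
p_a <- c(0.90, 0.80, 0.70, 0.60, 0.50, 0.35, 0.40, 0.30, 0.20, 0.10)
p_b <- c(0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.50, 0.60, 0.35, 0.25)
probs <- cbind(a = p_a, b = p_b, c = 1 - p_a - p_b)

# AUC of each class against the rest
per_class <- sapply(levels(truth), function(cl) {
  auc_binary(probs[, cl], truth == cl)
})

# Macro: every class counts equally
macro <- mean(per_class)

# Prevalence-weighted: classes count in proportion to their frequency
prevalence <- as.numeric(table(truth)) / length(truth)
macro_weighted <- sum(per_class * prevalence)

c(macro = macro, macro_weighted = macro_weighted)
```

Because the rare class "c" is separated perfectly in this toy data, the unweighted average comes out higher than the prevalence-weighted one. This is the kind of shift that can reorder models on imbalanced data.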

Installation

From CRAN

You can install the latest stable version of fastml from CRAN using:

install.packages("fastml")

To also install the suggested packages (additional model engines), use:

# Install fastml together with all of its dependencies, including suggested packages (recommended)
install.packages("fastml", dependencies = TRUE)

From GitHub

For the development version, install directly from GitHub using the devtools package:

# Install devtools if you haven't already
install.packages("devtools")

# Install fastml from GitHub
devtools::install_github("selcukorkmaz/fastml")

Quick Start

Here's a simple workflow to get you started with fastml:

library(fastml)
library(dplyr)

# Example dataset
data(iris)

iris_binary <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = factor(Species))

# Train models
fit <- fastml(
  data = iris_binary,
  label = "Species",
  algorithms = c("rand_forest", "logistic_reg")
)

# View model summary
summary(fit)

# Plot the performance metrics
plot(fit, type = "bar")

# Plot ROC curves
plot(fit, type = "roc")

# Plot model calibration
plot(fit, type = "calibration")

Tuning Strategies

Hyperparameter tuning is supported via:

  • grid - regular grid search

  • bayes - Bayesian optimization

fit <- fastml(
  data = iris_binary,
  label = "Species",
  algorithms = c("rand_forest", "logistic_reg"),
  tuning_strategy = "bayes",
  tuning_iterations = 20
)

summary(fit)

tuning_iterations is used only for Bayesian optimization.
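
To make the idea of tuning inside resampling concrete, here is a minimal base R sketch of a regular grid search evaluated by cross-validation. It does not use fastml; the simulated data, the polynomial-degree hyperparameter, and the variable names are assumptions chosen for illustration.

```r
# Illustrative only: grid search over one hyperparameter, scored by 5-fold CV
set.seed(42)
n <- 120
x <- runif(n, -2, 2)
dat <- data.frame(x = x, y = sin(x) + rnorm(n, sd = 0.3))

grid <- expand.grid(degree = 1:5)          # regular grid of candidate values
folds <- sample(rep(1:5, length.out = n))  # fold assignment per row

grid$rmse <- sapply(grid$degree, function(d) {
  fold_rmse <- sapply(1:5, function(k) {
    train <- dat[folds != k, ]
    test <- dat[folds == k, ]
    # The model (including the polynomial basis) is fit on training rows only
    fit <- lm(y ~ poly(x, d), data = train)
    sqrt(mean((test$y - predict(fit, newdata = test))^2))
  })
  mean(fold_rmse)
})

# Candidate with the lowest cross-validated RMSE
best <- grid[which.min(grid$rmse), ]
best
```

Bayesian optimization differs in how candidates are chosen: instead of a fixed grid, each new candidate is proposed from the results of earlier ones, which is why an iteration budget applies to that strategy.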

Explainability

Model explainability tools are provided through fastexplain():

# Prepare data
library(survival)
data(pbc, package = "survival")
  
# The pbc dataset has two parts; we only want the baseline data (rows 1-312).
# The first four columns (id, time, status, trt) are dropped.
pbc_baseline <- pbc[1:312, -c(1:4)]

# Train a regression model
fit_reg <- fastml(
  data = pbc_baseline,
  label = "albumin",
  algorithms = c("xgboost"),
  metric = "rmse",
  impute_method = "medianImpute"
)
  
# Feature importance and SHAP values based on DALEX
fastexplain(fit_reg, method = "dalex")

# Breakdown profile for the first patient; column 9 is the outcome (albumin), so it is dropped
fastexplain(fit_reg, method = "breakdown", observation = pbc_baseline[1, -9])

# Counterfactual explanation (Ceteris Paribus profile) for the same patient
fastexplain(fit_reg, method = "counterfactual", observation = pbc_baseline[1, -9])

Explainability is performed on trained models and does not interfere with resampling or preprocessing.
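
DALEX-style feature importance is permutation based: a feature matters if shuffling it makes the model's error grow. The following base R sketch shows that idea on a plain linear model. It is not what fastexplain() runs internally; the model, the dataset (mtcars), and the rmse() helper are assumptions for illustration.

```r
# Illustrative only: permutation importance for an already fitted model
set.seed(7)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Root mean squared error of a model on a data frame
rmse <- function(model, data) {
  sqrt(mean((data$mpg - predict(model, newdata = data))^2))
}
baseline <- rmse(fit, mtcars)

# Average increase in RMSE after shuffling one feature at a time
perm_importance <- sapply(c("wt", "hp", "qsec"), function(v) {
  increases <- replicate(50, {
    shuffled <- mtcars
    shuffled[[v]] <- sample(shuffled[[v]])  # break the link to the outcome
    rmse(fit, shuffled) - baseline
  })
  mean(increases)
})

sort(perm_importance, decreasing = TRUE)
```

The model is never refit here, which matches the point above: explanation works on the trained model and leaves the resampling results untouched.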

Exploratory Diagnostics

fastexplore() provides read-only exploratory diagnostics prior to model training. It summarizes distributions, missingness, correlations, and basic structure without invoking resampling, preprocessing, or model fitting.

fastexplore(iris, label = "Species")

This function is decoupled from fastml's guarded resampling core and does not influence model evaluation unless its outputs are explicitly used in later modeling calls.
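
For orientation, the kinds of read-only summaries described above can be produced in base R as follows. This is a rough sketch of the concept, not the output or implementation of fastexplore(); the airquality dataset is used only because it contains missing values.

```r
# Illustrative only: read-only diagnostics with base R
df <- airquality

# Basic structure and per-column distributions
str(df)
summary(df)

# Share of missing values per column
missing_share <- colMeans(is.na(df))
missing_share

# Pairwise correlations between numeric columns, ignoring missing pairs
num_cols <- vapply(df, is.numeric, logical(1))
cors <- cor(df[, num_cols], use = "pairwise.complete.obs")
round(cors, 2)
```

None of these steps modifies the data or estimates anything that is later applied to it, so they carry no leakage risk on their own.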

Scope

fastml is intended for users who require reliable performance estimation under cross-validation, particularly in:

  • multi-site or grouped data

  • high-dimensional biomedical applications

  • workflows prone to preprocessing leakage

It prioritizes correctness-oriented defaults and workflow clarity over maximum flexibility.
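
For multi-site or grouped data, the usual safeguard is to assign folds by group so that all rows from one site land in the same fold. The base R sketch below illustrates that idea; it is not fastml code, and the site labels and sizes are invented. In the tidymodels ecosystem, rsample::group_vfold_cv() provides this kind of split.

```r
# Illustrative only: fold assignment at the group (site) level
set.seed(3)
site <- rep(paste0("site_", 1:8), times = c(12, 9, 15, 7, 10, 11, 8, 13))
v <- 4  # number of folds

# Assign each site, not each row, to a fold
unique_sites <- unique(site)
site_fold <- setNames(
  sample(rep(seq_len(v), length.out = length(unique_sites))),
  unique_sites
)

# Every row inherits the fold of its site
row_fold <- unname(site_fold[site])

# Check: each site appears in exactly one fold
folds_per_site <- tapply(row_fold, site, function(f) length(unique(f)))
all(folds_per_site == 1)

table(row_fold)
```

With row-level folds, observations from the same site could sit on both sides of a split, and performance estimates would then reflect within-site similarity rather than generalization to new sites.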

License

MIT License. See LICENSE for details.

Monthly Downloads: 477

Version: 0.7.7

License: MIT + file LICENSE

Maintainer: Selcuk Korkmaz

Last Published: January 27th, 2026

Functions in fastml (0.7.7)

  • fastexplain - Explain a fastml model using various techniques
  • fastml_compute_holdout_results - Evaluate Models Function
  • explain_lime - Generate LIME explanations for a fastml model
  • fastml_normalize_survival_status - Internal helpers for survival-specific preprocessing
  • fastml_guard_validate_indices - Guarded Resampling Utilities
  • fastml_prepare_explainer_inputs - Internal helper to prepare explainer inputs from a fastml object
  • fastml - Fast Machine Learning Function
  • explain_stability - Analyze Feature Importance Stability Across Cross-Validation Folds
  • fastexplore - Lightweight exploratory helper
  • get_best_model_idx - Get Best Model Indices by Metric and Group
  • extract_survreg_components - Extract survreg Linear Predictor and Scale
  • get_default_differences - Get All Default Differences Summary
  • flatten_and_rename_models - Flatten and Rename Models
  • get_best_workflows - Get Best Workflows
  • get_default_params_with_warnings - Get Default Parameters with Transparency Warnings
  • get_default_tune_params - Get Default Tuning Parameters
  • get_best_model_names - Get Best Model Names
  • get_default_engine - Get Default Engine
  • get_default_params - Get Default Parameters for an Algorithm
  • format_default_override_warning - Format Default Override Warning Message
  • interaction_strength - Compute feature interaction strengths for a fastml model
  • load_model - Load Model Function
  • get_engine_names - Get Engine Names from Model Workflows
  • get_model_engine_names - Get Model Engine Names
  • get_surv_info - Extract Time and Status from Survival Matrix
  • get_tuning_complexity - Tuning Complexity Presets
  • get_tuning_params_for_complexity - Get Tuning Parameters for Complexity Level
  • get_expanded_tune_params - Expanded Default Tuning Parameters
  • print.fastml_stability - Print method for fastml_stability objects
  • plot.fastml_stability - Plot method for fastml_stability objects
  • plot_ice - Plot ICE curves for a fastml model
  • map_brier_values - Map Brier Curve Values to Specific Horizons
  • get_parsnip_default_params - Get Parsnip Default Parameters for an Algorithm
  • get_parsnip_default_engine - Get Parsnip Default Engine for an Algorithm
  • predict_model.model_fit - Internal predict_model method for parsnip fits
  • plot.fastml - Plot Methods for fastml Objects
  • predict.fastml - Predict method for fastml objects
  • predict_survival - Predict survival probabilities from a survival model
  • predict_risk - Predict Risk Scores from a Survival Model
  • print_default_differences - Print Default Differences Table
  • surrogate_tree - Fit a surrogate decision tree for a fastml model
  • summary.fastml - Summary Function for fastml (Using yardstick for ROC Curves)
  • save.fastml - Save Model Function
  • sanitize - Clean Column Names or Character Vectors by Removing Special Characters
  • resolve_positive_class - Resolve the positive class for binary classification
  • train_models - Train Specified Machine Learning Algorithms on the Training Data
  • reset_default_warnings - Reset Default Override Warnings
  • recommend_tuning_config - Recommend Tuning Configuration
  • warn_default_override - Warn About Default Overrides
  • tuning_config - Tuning Configuration and Complexity Presets
  • process_model - Process and Evaluate a Model Workflow
  • validate_defaults_registry - Validate Defaults Registry Against Parsnip
  • print_tuning_presets - Print Tuning Presets Summary
  • compute_rmst_difference - Compute Difference in Restricted Mean Survival Time (RMST)
  • availableMethods - Get Available Methods
  • defaults_registry - Defaults Registry for Engine and Parameter Transparency
  • estimate_tuning_time - Estimate Tuning Time
  • counterfactual_explain - Generate counterfactual explanations for a fastml model
  • explain_dalex - Generate DALEX explanations for a fastml model
  • explain_ale - Compute Accumulated Local Effects (ALE) for a fastml model
  • align_survival_curve - Align Survival Curve to Evaluation Times
  • assign_risk_group - Assign Risk Groups
  • clamp01 - Clamp Values to [0, 1]
  • compute_uno_c_index - Compute Uno's C-index (Time-Dependent AUC)
  • compute_survreg_matrix - Compute Survival Matrix from survreg Model
  • .fastml_warned_defaults - Environment for Tracking Warned Defaults
  • compute_tau_limit - Compute Tau Limit (t_max)
  • compute_ibrier - Compute Integrated Brier Score and Curve
  • compare_defaults - Compare fastml and parsnip defaults
  • convert_survival_predictions - Convert Various Prediction Formats to Survival Matrix
  • build_survfit_matrix - Build Survival Matrix from survfit Object
  • determine_round_digits - Determine rounding digits for time horizons
  • create_censor_eval - Create Censoring Distribution Evaluator