modeldata

modeldata contains data sets used in documentation and testing for tidymodels packages. The package also contains a suite of simulation functions for classification and regression data.
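
As a rough illustration of the two kinds of content described above, the sketch below loads one packaged data set and generates one simulated data set. It assumes modeldata is installed. The choice of `penguins`, the sample size of 100, and the seed are arbitrary choices made for this example, and `num_samples` is assumed to be the argument that controls the number of simulated rows.

```r
library(modeldata)

# Load one of the packaged data sets into the current environment.
data(penguins, package = "modeldata")
str(penguins)

# Simulate a small classification data set.
# The seed is only there to make the example reproducible.
set.seed(1)
sim_dat <- sim_classification(num_samples = 100)

# Inspect the simulated outcome (assumed to be stored in a column named `class`).
table(sim_dat$class)
```

See `?sim_classification` for the remaining simulation options; they are not covered here.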

Installation

You can install the released version of modeldata from CRAN with:

install.packages("modeldata")

And the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/modeldata")

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Monthly Downloads: 26,642

Version: 1.5.0

License: MIT + file LICENSE

Maintainer: Max Kuhn

Last Published: July 31st, 2025

Functions in modeldata (1.5.0)

check_times: Execution time data
mlc_churn: Customer churn data
modeldata-package: modeldata: Data Sets Useful for Modeling Examples
hpc_data: High-performance computing system data
pathology: Liver pathology data
hpc_cv: Class probability predictions
pd_speech: Parkinson's disease speech classification data set
ischemic_stroke: Clinical data used to predict ischemic stroke
leaf_id_flavia: Leaf identification data (Flavia)
solubility_test: Solubility predictions from MARS model
small_fine_foods: Fine foods example data
steroidogenic_toxicity: Predicting steroidogenic toxicity with assay data
sim_classification: Simulate datasets
stackoverflow: Annual Stack Overflow Developer Survey Data
scat: Morphometric data on scat
taxi: Chicago taxi data set
oils: Fatty acid composition of commercial oils
parabolic: Parabolic class boundary data
tate_text: Tate Gallery modern artwork metadata
hepatic_injury_qsar: Predicting hepatic injury from chemical information
hotel_rates: Daily Hotel Rate Data
two_class_example: Two class predictions
two_class_dat: Two class data
lending_club: Loan data
meats: Fat, water and protein content of meat samples
permeability_qsar: Predicting permeability from chemical information
penguins: Palmer Station penguin data
wa_churn: Watson churn data
biomass: Biomass data
Smithsonian: Smithsonian museums
ames: Ames Housing Data
chem_proc_yield: Chemical manufacturing process data set
crickets: Rates of Cricket Chirps
drinks: Sample time series data
Sacramento: Sacramento CA home prices
bivariate: Example bivariate classification data
attrition: Job attrition
car_prices: Kelly Blue Book resale data for 2005 model year GM cars
Chicago: Chicago ridership data
credit_data: Credit data
cells: Cell body segmentation
grants: Grant acceptance data
cat_adoption: Cat Adoption
deliveries: Food Delivery Time Data
ad_data: Alzheimer's disease data
concrete: Compressive strength of concrete mixtures
covers: Raw cover type data
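
The index above reflects version 1.5.0. To check what an installed copy of the package actually ships, one option is base R's `data()` function, as in the sketch below. It assumes modeldata is installed; the variable name `available` is made up for this example.

```r
# Query the data sets provided by the installed version of modeldata.
# data() returns an object whose `results` element is a character matrix.
available <- data(package = "modeldata")$results

# Show the data set names alongside their titles.
head(available[, c("Item", "Title")])
```

Simulation functions such as `sim_classification` are not data sets, so they will not appear in this listing.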