auctestr (version 1.0.0)

sample_experiment_data: Performance of several predictive models over three different datasets, using multiple cutoff points for time within each dataset.

Description

A dataset containing the performance of several predictive models over three different datasets, where models are built using data from multiple time points (where time 1 has less data than time 2, but each subsequent time point T also uses data from all prior time points up to that time t<= T.) This represents the typical output of a machine learning experiment where several models are being considered across multiple datasets, often with different variants of each model type being considered (i.e., different hyperparameter settings of each model).

Usage

sample_experiment_data

Arguments

Format

A data frame with 180 rows and 10 variables:

auc

Area Under the Receiver Operating Characteristic Curve, or AUC, for this model configuration.

precision

Precision for this model configuration.

accuracy

Accuracy for this model configuration.

n

Number of observations in this dataset.

n_n

Number of negative observations (i.e., outcome == 0) in this dataset (required for standard error estimation of AUC statistic).

n_p

Number of positive observations (i.e., outcome == 1) in this dataset (required for standard error estimation of AUC statistic).

dataset

indicator for different datasets.

time

indicator for different time points used to build each dataset; these represent dependent observations of model performance.

model_id

Indicator for the statistical algorithm used (this could be 'Logistic Regression', 'SVM', etc.).

model_variant

Indicator for different variants of each model which are not equivalent and should be used individually (model should not be averaged over these, and instead should be held fixed when comparing to other model). Example of this could be various hyperparameter settings for a given model (i.e., cost for an SVM).