CLRtools (version 0.1.0)

cutpoints: Table with Sensitivity and Specificity at Different Cutpoints

Description

This function computes the sensitivity and specificity at various cutpoints for a given logistic regression model. It generates a table summarizing the performance metrics (sensitivity, specificity) at different probability cutoffs and optionally plots these metrics and the distribution of probabilities for each class. This is useful for selecting an optimal threshold for classification.

Usage

cutpoints(model, cmin = 0, cmax = 1, byval = 0.05, plot = TRUE)

Value

A data frame with one row per cutpoint, giving the cutpoint, the sensitivity, the specificity, and the complement of the specificity (1 - specificity). If plot = TRUE, a ggplot2-based visualization is also printed. It shows the sensitivity and specificity curves and the distribution of predicted probabilities by outcome class, with the optimal cutoff (the one where sensitivity and specificity are closest) marked on the histogram.

Arguments

model

A fitted logistic regression model (either glm or clogit).

cmin

The minimum cutoff value for the predicted probabilities. Defaults to 0.

cmax

The maximum cutoff value for the predicted probabilities. Defaults to 1.

byval

The increment for cutpoints. Defaults to 0.05.

plot

Logical value indicating whether to generate plots. Defaults to TRUE.

Details

The function calculates sensitivity and specificity for a sequence of cutpoints from cmin to cmax in steps of byval. If plot = TRUE, it also plots the sensitivity and specificity curves together with histograms of the estimated probabilities for each outcome class. The cutpoint with the smallest difference between sensitivity and specificity is marked on the histograms, which can help in choosing a classification threshold.
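
The calculation described above can be sketched in base R. This is a minimal illustration, not the CLRtools implementation: the helper name sens_spec_table, its column names, the simulated data, and the rule that an observation is classified as positive when its predicted probability is at or above the cutpoint are all assumptions made for the example.

```r
# Hypothetical helper (not part of CLRtools): sensitivity and specificity
# over a grid of probability cutpoints.
sens_spec_table <- function(prob, y, cmin = 0, cmax = 1, byval = 0.05) {
  cuts <- seq(cmin, cmax, by = byval)
  rows <- lapply(cuts, function(cut) {
    # Assumption: classify as positive when the probability is >= the cutpoint
    pred <- as.integer(prob >= cut)
    data.frame(
      cutpoint    = cut,
      sensitivity = sum(pred == 1 & y == 1) / sum(y == 1),
      specificity = sum(pred == 0 & y == 0) / sum(y == 0)
    )
  })
  tab <- do.call(rbind, rows)
  tab$one_minus_specificity <- 1 - tab$specificity
  tab
}

# Simulated data, for illustration only
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

tab <- sens_spec_table(fitted(fit), y, cmin = 0.05, cmax = 0.75, byval = 0.05)

# Cutpoint where sensitivity and specificity are closest
closest <- tab$cutpoint[which.min(abs(tab$sensitivity - tab$specificity))]
```

As the cutpoint rises, fewer observations are classified as positive, so sensitivity can only fall and specificity can only rise. The cutpoint where the two are closest is the one the function marks on the histograms.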

Examples

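# Example from Hosmer et al., 2013
# Applied Logistic Regression (3rd ed.), Chapter 5, Table 5.8

# Attach the package that provides cutpoints(); the example also expects
# the glow500 data to be available once it is attached
library(CLRtools)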

# Recode 'raterisk' into a binary variable 'raterisk_cat'
glow500 <- dplyr::mutate(
  glow500,
  raterisk_cat = dplyr::case_when(
    raterisk %in% c("Less", "Same") ~ "C1",
    raterisk == "Greater" ~ "C2"
  )
)

# Fit a multiple logistic regression model with interactions
model.int <- glm(
  fracture ~ age + height + priorfrac + momfrac + armassist +
    raterisk_cat + age * priorfrac + momfrac * armassist,
  family = binomial,
  data = glow500
)

# Compute sensitivity and specificity at multiple cutpoints
cutpoints(model.int, cmin = 0.05, cmax = 0.75, byval = 0.05, plot = FALSE)