CLRtools (version 0.1.0)

DRtest: Deviance Residuals Test (HL Test)

Description

Performs the Hosmer-Lemeshow (HL) goodness-of-fit test, which evaluates how well a logistic regression model fits the data by comparing observed and expected frequencies across groups defined by the predicted probabilities. The test can be computed either from a fitted model or from user-supplied observed responses and predicted probabilities. The data are divided into g groups based on the predicted values, and a chi-squared statistic is calculated to assess the fit.

Usage

DRtest(model = NULL, yvar = NULL, yhatvar = NULL, g = 10)

Value

A list containing the following components:

results

A data frame giving, for each group, the observed and expected counts for both outcomes (1 and 0) and the total count.

chisq

The chi-squared statistic used to assess the fit of the model.

df

The degrees of freedom for the chi-squared test.

p.value

The p-value of the chi-squared test, which assesses the overall goodness of fit.

groups

The number of groups (quantiles) used in the test.

Arguments

model

A fitted logistic regression model from glm() with family = binomial. If model is supplied, yvar and yhatvar are ignored.

yvar

A vector of observed response values. Required if model is NULL.

yhatvar

A vector of predicted probabilities. Required if model is NULL.

g

The number of groups (quantiles) to divide the data into for the Hosmer-Lemeshow test. Default is 10.
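The arguments above describe two ways of calling the function: with a fitted model, or with vectors of observed responses and predicted probabilities. The sketch below illustrates both on simulated data. It assumes that CLRtools is installed, that yvar accepts a numeric 0/1 vector, and that the returned list can be indexed by the component names listed under Value; the simulated data and the object names (fit, p_hat, res_model, res_vec) are made up for illustration.

```r
# Simulated data for illustration only (not from the package or the book)
set.seed(42)
n <- 300
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 * x))  # numeric 0/1 outcome

fit   <- glm(y ~ x, family = binomial)
p_hat <- fitted(fit)                              # predicted probabilities

# Only call DRtest() if CLRtools is actually installed
if (requireNamespace("CLRtools", quietly = TRUE)) {
  # 1) Supply the fitted model; yvar and yhatvar are left at NULL
  res_model <- CLRtools::DRtest(model = fit)

  # 2) Supply observed responses and predicted probabilities directly
  #    (assumption: a numeric 0/1 vector is accepted for yvar)
  res_vec <- CLRtools::DRtest(yvar = y, yhatvar = p_hat, g = 10)

  # Components documented under Value
  print(res_vec$results)
  print(res_vec$p.value)
}
```

One would expect both calls to describe the same fit here, since the vectors are taken from the same model, but that equivalence is not verified in this sketch.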

Details

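The Hosmer-Lemeshow test sorts the observations by predicted probability, splits them into g quantile-based groups, and compares the observed and expected numbers of events and non-events in each group. Under the null hypothesis that the model fits, the test statistic approximately follows a chi-squared distribution (with g - 2 degrees of freedom in the standard formulation of the test).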
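To make the computation concrete, here is a minimal base R sketch of the standard Hosmer-Lemeshow calculation on simulated data. It is not the CLRtools source: the grouping rule (cutting at the g-quantiles of the predicted probabilities), the g - 2 degrees of freedom, and all object names are assumptions taken from the textbook form of the test, so DRtest() may differ in details such as how ties or group boundaries are handled.

```r
# Hand-rolled Hosmer-Lemeshow sketch in base R (illustrative only)
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x))  # simulated 0/1 outcome

fit   <- glm(y ~ x, family = binomial)
p_hat <- fitted(fit)   # predicted probabilities
g     <- 10            # number of groups

# Assumption: groups are formed by cutting at the g-quantiles of p_hat
breaks <- quantile(p_hat, probs = seq(0, 1, length.out = g + 1))
group  <- cut(p_hat, breaks = breaks, include.lowest = TRUE)

total <- as.vector(tapply(y, group, length))   # observations per group
obs1  <- as.vector(tapply(y, group, sum))      # observed events (y = 1)
exp1  <- as.vector(tapply(p_hat, group, sum))  # expected events
obs0  <- total - obs1                          # observed non-events (y = 0)
exp0  <- total - exp1                          # expected non-events

# Pearson-type statistic summed over both outcomes in every group
chisq   <- sum((obs1 - exp1)^2 / exp1 + (obs0 - exp0)^2 / exp0)
df      <- g - 2                               # standard HL degrees of freedom
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

hl_table <- data.frame(group = levels(group), total,
                       obs1, exp1, obs0, exp0)
print(hl_table)
print(c(chisq = chisq, df = df, p.value = p_value))
```

A large p-value gives no evidence of lack of fit; a small one suggests the predicted probabilities do not match the observed event rates across groups.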

Examples

# Example from Hosmer et al., 2013
# Applied Logistic Regression (3rd ed.), Chapter 5, Table 5.2
# Assumes the glow500 data set is already available in the session.

library(CLRtools)

# Recode 'raterisk' into a binary variable 'raterisk_cat'
glow500 <- dplyr::mutate(
  glow500,
  raterisk_cat = dplyr::case_when(
    raterisk %in% c("Less", "Same") ~ "C1",
    raterisk == "Greater" ~ "C2"
  )
)

# Fit a multiple logistic regression model with interactions
model.int <- glm(
  fracture ~ age + height + priorfrac + momfrac + armassist +
    raterisk_cat + age * priorfrac + momfrac * armassist,
  family = binomial,
  data = glow500
)

# Perform the Hosmer-Lemeshow test with the default of 10 groups
DRtest(model.int)