## Lizards data from brglm2: MDYPL with `alpha = 1` reproduces the
## maximum likelihood (ML) fit exactly.
data("lizards", package = "brglm2")
liz_fm <- cbind(grahami, opalinus) ~ height + diameter + light + time
## Plain ML fit
liz_ml0 <- glm(liz_fm, family = binomial(), data = lizards)
## MDYPL fit with `alpha = 1`; this is the same fit as `liz_ml0`
liz_ml <- glm(liz_fm, family = binomial(), data = lizards,
              method = "mdyplFit", alpha = 1)
summ_liz_ml <- summary(liz_ml)
summ_liz_ml0 <- summary(liz_ml0)
## Coefficient tables agree between the two fits
all.equal(coef(summ_liz_ml), coef(summ_liz_ml0))
## MDYPL fit with default `alpha` (see `?mdyplControl`)
## FIX: pass the model formula `liz_fm` to `glm()`; the original code
## passed the fitted model object `liz_ml` by mistake.
liz_mdypl <- glm(liz_fm, family = binomial(), data = lizards,
                 method = "mdyplFit")
## Comparing outputs from ML and MDYPL, with and without
## high-dimensionality corrections.
summary(liz_mdypl)
summary(liz_mdypl, hd_correction = TRUE)
summ_liz_ml
summary(liz_ml, hd_correction = TRUE)
## Not much difference in fits here as this is a low-dimensional
## problem with dimensionality constant
(liz_ml$rank - 1) / sum(weights(liz_ml))
## The case study in Section 8 of Sterzinger and
## Kosmidis (2024)
data("MultipleFeatures", package = "brglm2")
## Center the fou.* and kar.* features (training rows only)
vars <- grep("fou|kar", names(MultipleFeatures), value = TRUE)
train_id <- which(MultipleFeatures$training)
MultipleFeatures[train_id, vars] <-
  scale(MultipleFeatures[train_id, vars], scale = FALSE)
## Compute the MDYPL fits; `kappa` is the dimensionality constant p / n
kappa <- length(vars) / sum(MultipleFeatures$training)
full_fm <- formula(paste("I(digit == 7) ~", paste(vars, collapse = " + ")))
nest_vars <- grep("fou", vars, value = TRUE)
nest_fm <- formula(paste("I(digit == 7) ~", paste(nest_vars, collapse = " + ")))
## Quote the method name for consistency with the other `glm()` calls
full_m <- glm(full_fm, data = MultipleFeatures, family = binomial(),
              method = "mdyplFit", alpha = 1 / (1 + kappa), subset = training)
nest_m <- update(full_m, nest_fm)
## With a naive penalized likelihood ratio test we get no evidence
## against the hypothesis that the model with only `fou` features
## is as good a description of `7` as the model with both `fou` and
## `kar` features.
plrtest(nest_m, full_m)
## With a high-dimensionality correction there is strong evidence
## against the model with only `fou` features
plrtest(nest_m, full_m, hd_correction = TRUE)
# \donttest{
## A simulated data set as in Rigon & Aliverti (2023, Section 4.3)
set.seed(123)
n <- 1000
p <- 500
gamma <- sqrt(5)
## Standard-normal design matrix (rnorm defaults are mean 0, sd 1)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
## Coefficients in five equal-sized blocks, rescaled so the signal
## strength (Euclidean norm of the coefficient vector) equals `gamma`
beta_pattern <- rep(c(-1, -1/2, 0, 2, 3), each = p / 5)
betas <- gamma * beta_pattern / sqrt(sum(beta_pattern^2))
y <- rbinom(n, 1, plogis(drop(X %*% betas)))
fit_mdypl <- glm(y ~ -1 + X, family = binomial(), method = "mdyplFit")
## The default value of `alpha` is `n / (n + p)` here
identical(n / (n + p), fit_mdypl$alpha)
## Aggregate bias of the MDYPL and rescaled MDYPL estimators
ag_bias <- function(estimates, beta) mean(estimates - beta)
ag_bias(coef(summary(fit_mdypl))[, "Estimate"], betas)
ag_bias(coef(summary(fit_mdypl, hd_correction = TRUE))[, "Estimate"], betas)
# }
## Run the code above in your browser using DataLab