predict.svem_model: Predict Method for SVEM Models (Gaussian and Binomial)

Description

Generate predictions from a fitted SVEM model (Gaussian or binomial), with optional bootstrap uncertainty and family-appropriate output scales.

Usage

# S3 method for svem_model
predict(
  object,
  newdata,
  type = c("response", "link", "class"),
  debias = FALSE,
  se.fit = FALSE,
  interval = FALSE,
  level = 0.95,
  ...
)

Value

If se.fit = FALSE and interval = FALSE:

Gaussian: a numeric vector of predictions on the response (identity) scale.
Binomial: a numeric vector for type = "response" (probabilities) or type = "link" (log-odds), or an integer vector of 0/1 labels for type = "class".

If se.fit and/or interval are TRUE (and type != "class"), a list with components:

fit: predictions on the requested scale.
se.fit: bootstrap standard errors (when se.fit = TRUE).
lwr, upr: percentile confidence limits (when interval = TRUE).

Rows containing unseen or missing factor levels produce NA predictions (and NA SEs/intervals), with a warning.

Arguments

object

A fitted SVEM model (class svem_model; binomial models typically also inherit class svem_binomial). Created by SVEMnet().

newdata

A data frame of new predictor values.

type

(Binomial only) One of:

"response" (default): predicted probabilities in $[0,1]$.
"link": linear predictor (log-odds).
"class": 0/1 class labels (threshold 0.5). Uncertainty summaries are not available for this type.

Ignored for Gaussian models.

debias

(Gaussian only) Logical; default FALSE. If TRUE, apply the linear calibration fit lm(y ~ y_pred) learned at training when available. Ignored (and internally set to FALSE) for binomial models.

se.fit

Logical; if TRUE, return bootstrap standard errors computed from member predictions (requires coef_matrix). Not available for type = "class". For Gaussian models, this forces use of bootstrap member predictions instead of aggregate coefficients.

interval

Logical; if TRUE, return percentile confidence limits from member predictions (requires coef_matrix). Not available for type = "class". For Gaussian models, this forces use of bootstrap member predictions instead of aggregate coefficients.

level

Confidence level for percentile intervals. Default 0.95.

...

Currently unused.

Design-matrix reconstruction

The function rebuilds the design matrix for newdata to match the training design:

Uses the training terms (with environment set to baseenv()).
Harmonizes factor and character predictors to the training xlevels.
Reuses stored per-factor contrasts when available; otherwise falls back to saved global contrast options.
Zero-fills any columns present at training but absent in newdata, and reorders columns to match the training order.

Rows containing unseen factor levels yield NA predictions (with a warning).
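
The reconstruction steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the toy data, the variable names (train, newdata, xlev, train_cols, bad_rows), and the use of lm() coefficients as a stand-in for fitted SVEM coefficients are all hypothetical.

```r
## Toy training data with one numeric and one factor predictor (hypothetical)
train <- data.frame(
  y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  x = c(0.1, 0.5, 0.9, 1.3, 1.7, 2.1),
  g = factor(c("a", "b", "c", "a", "b", "c"))
)

## Pieces a fitted object would need to keep from training
mf_train   <- model.frame(y ~ x + g, train)
tt         <- delete.response(terms(mf_train))
environment(tt) <- baseenv()                 # detach terms from the calling env
xlev       <- .getXlevels(tt, mf_train)      # training factor levels
X_train    <- model.matrix(tt, mf_train)
train_cols <- colnames(X_train)
contr      <- attr(X_train, "contrasts")     # per-factor contrasts

## New data: g arrives as character and contains an unseen level "z"
newdata <- data.frame(x = c(0.2, 1.0, 1.5), g = c("b", "z", "a"),
                      stringsAsFactors = FALSE)

## 1. Harmonize factor/character columns; unseen levels become NA
for (v in names(xlev)) {
  newdata[[v]] <- factor(as.character(newdata[[v]]), levels = xlev[[v]])
}
bad_rows <- !complete.cases(newdata[names(xlev)])

## 2. Rebuild the design matrix, keeping NA rows so output aligns with newdata
mf_new <- model.frame(tt, newdata, na.action = na.pass, xlev = xlev)
X_new  <- model.matrix(tt, mf_new, contrasts.arg = contr)

## 3. Zero-fill training columns that are absent, then restore training order
missing_cols <- setdiff(train_cols, colnames(X_new))
if (length(missing_cols) > 0) {
  filler <- matrix(0, nrow(X_new), length(missing_cols),
                   dimnames = list(NULL, missing_cols))
  X_new <- cbind(X_new, filler)
}
X_new <- X_new[, train_cols, drop = FALSE]

## 4. Predict; rows with unseen levels are forced to NA
beta <- coef(lm(y ~ x + g, train))           # stand-in coefficients
pred <- drop(X_new %*% beta[train_cols])
pred[bad_rows] <- NA_real_
pred
```

In this sketch the second row of newdata carries the unseen level "z", so its prediction is NA while the other two rows are predicted normally.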

Aggregation and debiasing

For Gaussian SVEM models:

Point predictions: When se.fit = FALSE and interval = FALSE, predictions are computed from the aggregated coefficients saved at fit time (object$parms; or object$parms_debiased when debias = TRUE). This is algebraically equivalent to averaging member predictions when the coefficients were formed as bootstrap means.
Bootstrap-based summaries: When se.fit = TRUE and/or interval = TRUE, predictions are computed from per-bootstrap member predictions using object$coef_matrix. For debias = TRUE, the linear calibration is applied to member predictions before summarizing.

For binomial SVEM models, predictions are always aggregated from member predictions on the requested scale (probability or link) using coef_matrix; the stored coefficient averages (parms, parms_debiased) are retained for diagnostics but are not used in prediction. The debias argument is ignored and treated as FALSE for binomial models.
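
The difference between the two aggregation routes can be checked numerically. The sketch below uses a simulated matrix as a hypothetical stand-in for object$coef_matrix and assumes one row per bootstrap member; the actual orientation of the stored matrix is not stated here. It shows that averaging coefficients and averaging member predictions coincide on the identity scale but generally differ on the probability scale, which is consistent with binomial predictions being aggregated from members.

```r
set.seed(10)
n_boot <- 25

## Design matrix for 8 new rows: intercept plus two predictors
X <- cbind(1, rnorm(8), rnorm(8))

## Hypothetical stand-in for object$coef_matrix (rows = bootstrap members)
coef_matrix <- cbind(
  rnorm(n_boot, -0.3, 0.2),
  rnorm(n_boot,  1.1, 0.3),
  rnorm(n_boot, -0.8, 0.3)
)

member_link <- X %*% t(coef_matrix)   # 8 x 25 matrix of linear predictors
mean_coef   <- colMeans(coef_matrix)  # aggregated coefficients

## Identity scale: prediction from mean coefficients equals mean member prediction
from_mean_coef <- drop(X %*% mean_coef)
from_members   <- rowMeans(member_link)
same_on_identity <- isTRUE(all.equal(from_mean_coef, from_members))

## Probability scale: plogis() is nonlinear, so the two routes differ
p_from_members   <- rowMeans(plogis(member_link))
p_from_mean_coef <- plogis(from_mean_coef)
gap <- max(abs(p_from_members - p_from_mean_coef))

same_on_identity
gap
```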

For Gaussian fits, if debias = TRUE and a calibration model lm(y ~ y_pred) was learned at training, predictions (and, when applicable, member predictions) are transformed by that calibration. This debiasing is never applied for binomial fits.
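
A linear calibration of this kind can be sketched as follows. The data are simulated and the object names (calib, new_pred, member_pred) are hypothetical; how the package stores and applies its calibration internally is not shown here.

```r
set.seed(3)

## Simulated training response and deliberately shrunken in-sample predictions
y      <- rnorm(50, mean = 10, sd = 2)
y_pred <- 10 + 0.6 * (y - 10) + rnorm(50, sd = 0.3)

## Calibration model learned at training: regress y on its own predictions
calib <- lm(y ~ y_pred)
a <- unname(coef(calib)[1])
b <- unname(coef(calib)[2])

## Debias aggregate predictions for new rows
new_pred <- c(8.5, 10, 11.5)
debiased <- a + b * new_pred

## Debias member predictions elementwise before computing SEs or intervals
member_pred <- matrix(c(8.3, 8.6, 8.7,
                        9.9, 10.1, 10.0,
                        11.4, 11.6, 11.5), nrow = 3, byrow = TRUE)
member_debiased <- a + b * member_pred

debiased
```

Because the simulated predictions are shrunk toward the mean, the fitted slope b exceeds 1 and the calibration stretches predictions back out.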

Uncertainty

When se.fit = TRUE, standard errors are computed as the row-wise standard deviations of member predictions on the requested scale. When interval = TRUE, percentile intervals are computed from member predictions on the requested scale, using the requested level. Both require a non-null coef_matrix. For type = "class" (binomial), uncertainty summaries are not available.
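
As an illustration, these summaries can be computed from a matrix of member predictions in a few lines. The matrix below is simulated (rows are new observations, columns are bootstrap members). Using the row mean as fit and the default quantile() type are assumptions of this sketch, not confirmed details of the method.

```r
set.seed(4)

## Hypothetical member predictions: 5 new rows x 200 bootstrap members.
## Filling is column-wise, so row i is centered at i.
member_pred <- matrix(
  rnorm(5 * 200, mean = rep(1:5, times = 200), sd = 0.5),
  nrow = 5
)

level <- 0.90
alpha <- 1 - level

out <- list(
  fit    = rowMeans(member_pred),                 # aggregate prediction
  se.fit = apply(member_pred, 1, sd),             # row-wise standard deviation
  lwr    = apply(member_pred, 1, quantile,
                 probs = alpha / 2, names = FALSE),      # lower percentile limit
  upr    = apply(member_pred, 1, quantile,
                 probs = 1 - alpha / 2, names = FALSE)   # upper percentile limit
)
str(out)
```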

Details

This method dispatches on object$family:

Gaussian: returns predictions on the response (identity) scale. Optional linear calibration ("debias") learned at training may be applied.
Binomial: supports glmnet-style type = "link", "response", or "class" predictions. No debiasing is applied; type = "response" returns probabilities in $[0,1]$.

Uncertainty summaries (se.fit, interval) and all binomial predictions are based on per-bootstrap member predictions obtained from the coefficient matrix stored in object$coef_matrix. If coef_matrix is NULL, these options are not available (and binomial prediction will fail). For Gaussian models with se.fit = FALSE and interval = FALSE, predictions are computed directly from the aggregated coefficients.
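
The relationship between the three binomial output types can be sketched from a small matrix of member linear predictors. The numbers are made up, aggregation by the mean is an assumption, and so is the exact class rule (probability >= 0.5); only the 0.5 threshold itself is stated in this documentation.

```r
## Hypothetical member linear predictors: 4 new rows x 5 bootstrap members
member_link <- rbind(
  c(-2.0, -1.5, -1.8, -2.2, -1.6),
  c(-0.4,  0.3, -0.1,  0.2, -0.3),
  c( 0.6,  0.9,  0.4,  1.1,  0.7),
  c( 2.1,  1.8,  2.5,  1.9,  2.2)
)

p_link <- rowMeans(member_link)          # type = "link": averaged log-odds
p_resp <- rowMeans(plogis(member_link))  # type = "response": averaged probabilities
y_hat  <- as.integer(p_resp >= 0.5)      # type = "class": assumed thresholding rule

data.frame(link = p_link, response = p_resp, class = y_hat)
```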

Examples

## ---- Gaussian example -------------------------------------------------
set.seed(1)
n  <- 60
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n)
y  <- 1 + 0.8 * X1 - 0.6 * X2 + 0.2 * X3 + rnorm(n, 0, 0.4)
dat <- data.frame(y, X1, X2, X3)

fit_g <- SVEMnet(
  y ~ (X1 + X2 + X3)^2, dat,
  nBoot = 40, glmnet_alpha = c(1, 0.5),
  relaxed = TRUE, family = "gaussian"
)

## Aggregate-coefficient predictions (with and without debiasing)
p_g  <- predict(fit_g, dat)                # debias = FALSE (default)
p_gd <- predict(fit_g, dat, debias = TRUE) # apply calibration, if available

## Bootstrap-based uncertainty (requires coef_matrix)
out_g <- predict(
  fit_g, dat,
  debias   = TRUE,
  se.fit   = TRUE,
  interval = TRUE,
  level    = 0.90
)
str(out_g)

# \donttest{
## ---- Binomial example ------------------------------------------------
set.seed(2)
n  <- 120
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n)
eta <- -0.3 + 1.1 * X1 - 0.8 * X2 + 0.5 * X1 * X3
p   <- plogis(eta)
yb  <- rbinom(n, 1, p)
db  <- data.frame(yb = yb, X1 = X1, X2 = X2, X3 = X3)

fit_b <- SVEMnet(
  yb ~ (X1 + X2 + X3)^2, db,
  nBoot = 50, glmnet_alpha = c(1, 0.5),
  relaxed = TRUE, family = "binomial"
)

## Probabilities, link, and classes
p_resp <- predict(fit_b, db, type = "response")
p_link <- predict(fit_b, db, type = "link")
y_hat  <- predict(fit_b, db, type = "class")  # 0/1 labels (no SE or interval)

## Bootstrap-based uncertainty on the probability scale
out_b <- predict(
  fit_b, db,
  type     = "response",
  se.fit   = TRUE,
  interval = TRUE,
  level    = 0.90
)
str(out_b)
# }