AirHOLP: Adaptive Iterative Ridge HOLP Screening

Description

This function ranks features with the Adaptive Iterative Ridge High-dimensional Ordinary Least-squares Projection (Air-HOLP) method of Joudah et al. (2025) and returns both the per-feature ranks and the ordered feature indices. AirHOLP is intended for the high-dimensional case \(p \ge n\). When \(n > p\), use AirOLS instead.

Usage

AirHOLP(
  X,
  y,
  Threshold = min(ncol(X) - 1, ceiling(nrow(X)/log(nrow(X)))),
  r0 = 10,
  adapt = TRUE,
  iter = 10,
  Lambda,
  Un,
  XUn
)

Value

An object of class AirResult containing

order_r: Integer vector of feature indices sorted by absolute Air-HOLP score, from largest to smallest.
index_r: Integer vector of feature ranks; entry j is the rank of feature j, so rank 1 corresponds to the first entry of order_r.
Beta_r: Numeric vector of Air-HOLP coefficient estimates.
r: Final ridge-penalty value used.
iter_last: Number of iterations performed for adaptive penalty selection.
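
To make the relationship between the two ranking outputs concrete, here is a small base-R sketch on made-up scores. The names scores, ord, and rk are hypothetical, and the code does not call AirHOLP; it only illustrates how an ordering and a rank vector of this kind relate, assuming rank 1 goes to the largest absolute score.

```r
# Toy scores for four features (illustrative values, not AirHOLP output)
scores <- c(0.2, -1.5, 0.7, 0.05)

# Feature indices sorted by absolute score, largest first (analogous to order_r)
ord <- order(abs(scores), decreasing = TRUE)

# Rank of each feature: rk[j] is the position of feature j in ord
# (analogous to index_r)
rk <- integer(length(scores))
rk[ord] <- seq_along(ord)

print(ord)  # 2 3 1 4
print(rk)   # 3 1 2 4
```

Reading the two together: feature 2 has the largest absolute score, so it comes first in ord and has rank 1 in rk.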

Arguments

X: Numeric predictor matrix of dimension \(n \times p\).
y: Numeric response vector of length \(n\).
Threshold: Integer specifying the number of coefficients retained at each adaptive-penalty iteration (default \(\lceil n/\log(n)\rceil\), capped at \(p-1\)).
r0: Numeric initial ridge penalty (default \(10\)). With adapt = FALSE, a vector of fixed penalties can be supplied, as in Example 3.
adapt: Logical; set to TRUE (default) to enable adaptive penalty selection.
iter: Integer; maximum number of iterations for adaptive-penalty selection (default \(10\)).
Lambda: Eigenvalues of \(XX^T\). If missing, the function computes them.
Un: Eigenvectors of \(XX^T\). If missing, the function computes them.
XUn: The product \(X^T U_n\), where \(U_n\) is the matrix Un. If missing, the function computes it.
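
As a quick check of what the default Threshold expression in Usage evaluates to, the following sketch wraps it in a helper. The name default_threshold is hypothetical and not part of the package; the formula itself is copied from the Usage section.

```r
# Hypothetical helper reproducing the default Threshold expression from Usage:
# min(ncol(X) - 1, ceiling(nrow(X) / log(nrow(X))))
default_threshold <- function(n, p) {
  min(p - 1, ceiling(n / log(n)))
}

# Dimensions used in Example 1 (n = 50, p = 200)
t_small <- default_threshold(50, 200)
# Dimensions used in Example 2 (n = 1000, p = 2000)
t_large <- default_threshold(1000, 2000)

print(t_small)  # 13
print(t_large)  # 145
```

In both cases the cap at p - 1 is far from binding, so the default is determined by n alone.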

Details

The Threshold parameter controls how many coefficients are kept at each iteration of the adaptive-penalty procedure. The default value, \(\lceil n/\log(n)\rceil\) capped at \(p-1\), performs well in most settings; changing it can reduce stability, so we recommend keeping the default unless you have a specific reason to adjust it. The arguments Lambda, Un, and XUn let you run AirHOLP on two or more response vectors y for the same X without repeating the costly eigendecomposition of \(XX^T\) (see Example 2).
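
The sketch below suggests why precomputing Lambda, Un, and XUn can pay off. It assumes a ridge-HOLP-style estimate of the form \(X^T (XX^T + rI)^{-1} y\) with a fixed penalty r; this is an illustration in base R, not the package's internal code, and the variable names beta_direct, beta_eigen, and agree are hypothetical. Once the eigendecomposition of \(XX^T\) is available, the same estimate for any y needs only matrix-vector products.

```r
set.seed(1)
n <- 20; p <- 50; r <- 10
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- X[, 1] + rnorm(n)

# Direct computation: solve an n x n linear system for this particular y
beta_direct <- crossprod(X, solve(tcrossprod(X) + r * diag(n), y))

# One-off precomputation, reusable for any response vector with the same X
e <- eigen(tcrossprod(X), symmetric = TRUE)
Lambda <- e$values        # eigenvalues of X X^T
Un <- e$vectors           # eigenvectors of X X^T
XUn <- crossprod(X, Un)   # X^T Un

# Same estimate using only the precomputed pieces:
# X^T Un diag(1 / (Lambda + r)) Un^T y
beta_eigen <- XUn %*% (crossprod(Un, y) / (Lambda + r))

# The two routes should agree up to numerical tolerance
agree <- isTRUE(all.equal(as.vector(beta_direct), as.vector(beta_eigen)))
print(agree)
```

The adaptive procedure in AirHOLP revisits the penalty over several iterations, so under this assumption the eigendecomposition is the part worth sharing across responses.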

References

Joudah, I., Muller, S., and Zhu, H. (2025). "Air-HOLP: Adaptive Regularized Feature Screening for High-Dimensional Data." Statistics and Computing. doi:10.1007/s11222-025-10599-6

Examples

# Example 1 (default parameters)
set.seed(314)
X <- matrix(rnorm(10000), nrow = 50, ncol = 200)
y <- X[, 1] + X[, 10] + rnorm(50)
result <- AirHOLP(X, y)
str(result)
result$order_r[1:7] # the top 7 features
result$index_r[c(1, 10),] # ranks of the true features (x1 and x10)

# Example 2 (multiple responses, same X)
set.seed(314)
X <- matrix(rnorm(2000000), nrow = 1000, ncol = 2000)
y1 <- X[, 1] + X[, 2] + 6*rnorm(1000)
y2 <- X[, 1] - X[, 2] + 12*rnorm(1000)
y3 <- X[, 1] + X[, 2] - X[, 3] + 3*rnorm(1000)
y4 <- X[, 1] - X[, 2] + X[, 3] + 9*rnorm(1000)
XXT <- tcrossprod(X)
eXXT <- eigen(XXT)
Lambda <- eXXT$values
Un <- eXXT$vectors
XUn <- crossprod(X,Un)
result1 <- AirHOLP(X, y1, Lambda = Lambda, Un = Un, XUn = XUn)
result1$order_r[1:7] # the top 7 features
result1$index_r[1:2,] # ranks of the true features (x1 and x2)
result2 <- AirHOLP(X, y2, Lambda = Lambda, Un = Un, XUn = XUn)
result2$order_r[1:7] # the top 7 features
result2$index_r[1:2,] # ranks of the true features (x1 and x2)
result3 <- AirHOLP(X, y3, Lambda = Lambda, Un = Un, XUn = XUn)
result3$order_r[1:7] # the top 7 features
result3$index_r[1:3,] # ranks of the true features (x1, x2, and x3)
result4 <- AirHOLP(X, y4, Lambda = Lambda, Un = Un, XUn = XUn)
result4$order_r[1:7] # the top 7 features
result4$index_r[1:3,] # ranks of the true features (x1, x2, and x3)

# Example 3 (multiple fixed penalties)
set.seed(314)
X <- matrix(rnorm(20000), nrow = 100, ncol = 200) # 100 * 200 draws, so no values are recycled
y <- X[, 1] - X[, 2] + X[, 3] + 2*rnorm(100)
result <- AirHOLP(X, y, r0 = c(1, 100, 10000), adapt = FALSE)
str(result)
result$order_r0[1:7,] # the top 7 features (for each penalty)
result$index_r0[1:3,] # ranks of the true features (x1, x2, and x3)
