AirOLS: Adaptive Iterative Ridge OLS Screening

Description

This function ranks features with the Adaptive Iterative Ridge Ordinary Least Squares (Air-OLS) method of Joudah et al. (2025) and returns both the per-feature ranks and the ordered feature indices. AirOLS is intended for the case \(n \ge p\), where the number of observations is at least the number of features. For high-dimensional data with \(p > n\), use AirHOLP instead.

Usage

AirOLS(
  X,
  y,
  Threshold = min(ncol(X) - 1, ceiling(nrow(X)/log(nrow(X)))),
  r0 = 10,
  adapt = TRUE,
  iter = 10,
  Lambda,
  Up,
  XUp
)

Value

An object of class AirResult containing

order_r: Integer vector of feature indices sorted by absolute Air-OLS score, from largest to smallest.
index_r: Integer vector of feature ranks matching order_r; entry \(j\) is the rank of feature \(j\).
Beta_r: Numeric vector of Air-OLS coefficient estimates.
r: Final ridge-penalty value used.
iter_last: Number of iterations performed for adaptive penalty selection.

Arguments

X: Numeric predictor matrix of dimension \(n \times p\).
y: Numeric response vector of length \(n\).
Threshold: Integer specifying the number of coefficients retained at each adaptive-penalty iteration (default \(\min(p-1, \lceil n/\log(n)\rceil)\)).
r0: Numeric initial ridge penalty (default \(10\)).
adapt: Logical; set to TRUE (default) to enable adaptive penalty selection.
iter: Integer; maximum number of iterations for adaptive-penalty selection (default \(10\)).
Lambda: Eigenvalues of \(X^T X\). If missing, the function computes them.
Up: Eigenvectors of \(X^T X\). If missing, the function computes them.
XUp: The matrix product of X and Up. If missing, the function computes it.
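The default for Threshold can be evaluated by hand for a given problem size. The sketch below simply restates the default expression from the Usage section; default_threshold is a hypothetical helper name used for illustration and is not part of the package.

```r
# Hypothetical helper (not exported by the package): restates the default
# Threshold expression from the Usage section for given n and p.
default_threshold <- function(n, p) {
  min(p - 1, ceiling(n / log(n)))
}

default_threshold(200, 50)     # n = 200, p = 50   -> 38
default_threshold(2000, 1000)  # n = 2000, p = 1000 -> 264
default_threshold(200, 20)     # the cap p - 1 binds -> 19
```

When \(p\) is small relative to \(n/\log(n)\), the cap \(p-1\) determines the value, as in the last call.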

Details

The Threshold parameter controls how many coefficients are kept at each iteration of the adaptive-penalty procedure. The default value \(\lceil n/\log(n)\rceil\) (capped at \(p-1\)) performs well in most settings; changing it can reduce stability, so we recommend keeping the default unless you have a specific reason to adjust it. The parameters Lambda, Up, and XUp are useful when running AirOLS on two or more response vectors y with the same X: they depend only on X, so computing them once and passing them to each call avoids repeating the expensive eigendecomposition.
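To see why caching the eigendecomposition pays off, the following base-R sketch compares two ways of computing ridge coefficients. It assumes the ridge estimate has the standard form \((X^T X + r I)^{-1} X^T y\); this page does not document the internal computation of AirOLS, so treat this as an illustration of the reuse idea rather than the package's implementation. The variable names beta_cached and beta_direct are illustrative.

```r
# Sketch only: assumes a standard ridge estimate (X'X + r I)^{-1} X'y.
set.seed(1)
n <- 60; p <- 8; r <- 10
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- X[, 1] - X[, 2] + rnorm(n)

# Quantities that depend on X only (named after the AirOLS arguments)
eXTX <- eigen(crossprod(X), symmetric = TRUE)
Lambda <- eXTX$values
Up <- eXTX$vectors
XUp <- X %*% Up

# Ridge coefficients from the cached pieces:
# Up %*% diag(1 / (Lambda + r)) %*% t(XUp) %*% y
beta_cached <- drop(Up %*% (crossprod(XUp, y) / (Lambda + r)))

# The same coefficients from a direct linear solve
beta_direct <- drop(solve(crossprod(X) + r * diag(p), crossprod(X, y)))

# Compare the two results up to numerical tolerance
all.equal(beta_cached, beta_direct)
```

Under this assumption, changing y or the penalty r only requires cheap matrix-vector products once Lambda, Up, and XUp are available, which is the saving that Example 2 exploits.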

References

Joudah, I., Muller, S., and Zhu, H. (2025). "Air-HOLP: Adaptive Regularized Feature Screening for High-Dimensional Data." Statistics and Computing. doi:10.1007/s11222-025-10599-6

Examples

# Example 1 (default parameters)
set.seed(314)
X <- matrix(rnorm(10000), nrow = 200, ncol = 50)
y <- X[, 1] + X[, 10] + 2*rnorm(200)
result <- AirOLS(X, y)
str(result)
result$order_r[1:7] # the top 7 features
result$index_r[c(1, 10),] # ranks of the true features (x1 and x10)

# Example 2 (multiple responses, same X)
set.seed(314)
X <- matrix(rnorm(2000000), nrow = 2000, ncol = 1000)
y1 <- X[, 1] + X[, 2] + 6*rnorm(2000)
y2 <- X[, 1] - X[, 2] + 12*rnorm(2000)
y3 <- X[, 1] + X[, 2] - X[, 3] + 5*rnorm(2000)
y4 <- X[, 1] - X[, 2] + X[, 3] + 10*rnorm(2000)
XTX <- crossprod(X)
eXTX <- eigen(XTX)
Lambda <- eXTX$values
Up <- eXTX$vectors
XUp <- X%*%Up
result1 <- AirOLS(X, y1, Lambda = Lambda, Up = Up, XUp = XUp)
result1$order_r[1:7] # the top 7 features
result1$index_r[1:2,] # ranks of the true features (x1 and x2)
result2 <- AirOLS(X, y2, Lambda = Lambda, Up = Up, XUp = XUp)
result2$order_r[1:7] # the top 7 features
result2$index_r[1:2,] # ranks of the true features (x1 and x2)
result3 <- AirOLS(X, y3, Lambda = Lambda, Up = Up, XUp = XUp)
result3$order_r[1:7] # the top 7 features
result3$index_r[1:3,] # ranks of the true features (x1, x2, and x3)
result4 <- AirOLS(X, y4, Lambda = Lambda, Up = Up, XUp = XUp)
result4$order_r[1:7] # the top 7 features
result4$index_r[1:3,] # ranks of the true features (x1, x2, and x3)

# Example 3 (multiple fixed penalties)
set.seed(314)
X <- matrix(rnorm(20000), nrow = 200, ncol = 100) # 200 * 100 = 20000 draws, so no values are recycled
y <- X[, 1] - X[, 2] + X[, 3] + 3*rnorm(200)
result <- AirOLS(X, y, r0 = c(1, 100, 10000), adapt = FALSE)
str(result)
result$order_r0[1:7,] # the top 7 features for each penalty
result$index_r0[1:3,] # ranks of the true features (x1, x2, and x3)
