R/hal9001

The Scalable Highly Adaptive Lasso

Authors: Jeremy Coyle, Nima Hejazi, Rachael Phillips, Lars van der Laan, and Mark van der Laan


What’s hal9001?

hal9001 is an R package providing an implementation of the scalable highly adaptive lasso (HAL), a nonparametric regression estimator that applies L1-regularized lasso regression to a design matrix composed of indicator functions corresponding to the support of the functional over a set of covariates and interactions thereof. HAL regression allows arbitrarily complex functional forms to be estimated at fast (near-parametric) convergence rates under only global smoothness assumptions (van der Laan 2017a; Bibaut and van der Laan 2019). For detailed theoretical discussions of the highly adaptive lasso estimator, see van der Laan (2017a), van der Laan (2017b), and van der Laan and Bibaut (2017). For a computational demonstration of the versatility of HAL regression, see Benkeser and van der Laan (2016). Recent theoretical work has demonstrated success in building efficient estimators of complex parameters when particular variations of HAL regression are used to estimate nuisance parameters (e.g., van der Laan, Benkeser, and Cai 2019; Ertefaie, Hejazi, and van der Laan 2020).
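
To make the "design matrix composed of indicator functions" concrete, here is a minimal base R sketch for a single covariate. It is an illustration only and does not use hal9001: the package builds a sparse matrix and also covers interactions, whereas this toy version is dense and one-dimensional. The name `indicator_basis` is an assumption introduced for this example.

```r
# Zero-order, HAL-style basis for one covariate (base R sketch, not hal9001).
set.seed(1)
x <- sort(runif(6))

# One knot per observed value; column j holds the indicator I(x >= x[j]).
indicator_basis <- outer(x, x, FUN = ">=") * 1

dim(indicator_basis)      # 6 observations by 6 knots
indicator_basis[, 1]      # the first knot is the minimum, so every entry is 1
rowSums(indicator_basis)  # x is sorted, so row i has i active indicators
```

A lasso fit on such a matrix selects a sparse set of knots, which is roughly how a piecewise-constant fit can adapt to the data without a pre-specified functional form.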


Installation

For standard use, we recommend installing the package from CRAN via

install.packages("hal9001")

To contribute, install the development version of hal9001 from GitHub via remotes:

remotes::install_github("tlverse/hal9001")

Issues

If you encounter any bugs or have any specific feature requests, please file an issue.


Example

Consider the following minimal example of using hal9001 to generate predictions via Highly Adaptive Lasso regression:

# load the package and set a seed
library(hal9001)
#> Loading required package: Rcpp
#> hal9001 v0.4.5: The Scalable Highly Adaptive Lasso
#> note: fit_hal defaults have changed. See ?fit_hal for details
set.seed(385971)

# simulate data
n <- 100
p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)

# fit the HAL regression
hal_fit <- fit_hal(X = x, Y = y, yolo = TRUE)
#> [1] "I'm sorry, Dave. I'm afraid I can't do that."
hal_fit$times
#>                   user.self sys.self elapsed user.child sys.child
#> enumerate_basis       0.014    0.003   0.059          0         0
#> design_matrix         0.004    0.001   0.005          0         0
#> reduce_basis          0.000    0.000   0.000          0         0
#> remove_duplicates     0.000    0.000   0.000          0         0
#> lasso                 2.684    0.343   6.583          0         0
#> total                 2.703    0.348   6.655          0         0

# training sample prediction
preds <- predict(hal_fit, new_data = x)
(hal_mse <- mean((preds - y)^2))
#> [1] 0.03667466
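
The example above reports the mean squared error on the training sample, which tends to be optimistic. As a hedged sketch, the same fit can be evaluated on freshly simulated data. The helper `simulate` and the sample sizes are assumptions made for this illustration; only `fit_hal()` and `predict()` with `new_data` come from the example above.

```r
library(hal9001)
set.seed(385971)

# Hypothetical helper: same data-generating process as the example above.
simulate <- function(n, p = 3) {
  x <- matrix(rnorm(n * p), n, p)
  y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)
  list(x = x, y = y)
}
train <- simulate(100)
test <- simulate(1000)

# Fit on the training sample only, then predict on unseen covariates.
hal_fit <- fit_hal(X = train$x, Y = train$y)
test_preds <- predict(hal_fit, new_data = test$x)

# Held-out mean squared error.
test_mse <- mean((test_preds - test$y)^2)
test_mse
```

The noise standard deviation in the simulation is 0.2, so a held-out MSE cannot be expected to fall much below 0.04 on average.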

Contributions

Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.


Citation

After using the hal9001 R package, please cite both of the following:

    @software{coyle2022hal9001-rpkg,
      author = {Coyle, Jeremy R and Hejazi, Nima S and Phillips, Rachael V
        and {van der Laan}, Lars and {van der Laan}, Mark J},
      title = {{hal9001}: The scalable highly adaptive lasso},
      year  = {2022},
      url = {https://doi.org/10.5281/zenodo.3558313},
      doi = {10.5281/zenodo.3558313},
      note = {{R} package version 0.4.2}
    }

    @article{hejazi2020hal9001-joss,
      author = {Hejazi, Nima S and Coyle, Jeremy R and {van der Laan}, Mark
        J},
      title = {{hal9001}: Scalable highly adaptive lasso regression in
        {R}},
      year  = {2020},
      url = {https://doi.org/10.21105/joss.02526},
      doi = {10.21105/joss.02526},
      journal = {Journal of Open Source Software},
      publisher = {The Open Journal}
    }

License

© 2017-2022 Jeremy R. Coyle & Nima S. Hejazi

The contents of this repository are distributed under the GPL-3 license. See file LICENSE for details.


References

Benkeser, David, and Mark J van der Laan. 2016. “The Highly Adaptive Lasso Estimator.” In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE. https://doi.org/10.1109/dsaa.2016.93.

Bibaut, Aurélien F, and Mark J van der Laan. 2019. “Fast Rates for Empirical Risk Minimization over Càdlàg Functions with Bounded Sectional Variation Norm.” https://arxiv.org/abs/1907.09244.

Ertefaie, Ashkan, Nima S Hejazi, and Mark J van der Laan. 2020. “Nonparametric Inverse Probability Weighted Estimators Based on the Highly Adaptive Lasso.” https://arxiv.org/abs/2005.11303.

van der Laan, Mark J. 2017a. “A Generally Efficient Targeted Minimum Loss Based Estimator Based on the Highly Adaptive Lasso.” The International Journal of Biostatistics. https://doi.org/10.1515/ijb-2015-0097.

———. 2017b. “Finite Sample Inference for Targeted Learning.” https://arxiv.org/abs/1708.09502.

van der Laan, Mark J, David Benkeser, and Weixin Cai. 2019. “Efficient Estimation of Pathwise Differentiable Target Parameters with the Undersmoothed Highly Adaptive Lasso.” https://arxiv.org/abs/1908.05607.

van der Laan, Mark J, and Aurélien F Bibaut. 2017. “Uniform Consistency of the Highly Adaptive Lasso Estimator of Infinite-Dimensional Parameters.” https://arxiv.org/abs/1709.06256.

Version: 0.4.6

License: GPL-3

Maintainer: Jeremy Coyle

Last Published: November 14th, 2023

Monthly Downloads: 1,298

Functions in hal9001 (0.4.6)

meets_basis
Compute Values of Basis Functions

num_knots_generator
A default generator for the num_knots argument for each degree of interactions and the smoothness orders.

print.formula_hal9001
Print formula_hal9001 object

formula_hal
HAL Formula: Convert formula or string to formula_HAL object.

predict.hal9001
Prediction from HAL fits

squash_hal_fit
Squash HAL objects

make_copy_map
Build Copy Maps

quantizer
Discretize Variables into Number of Bins by Unique Values

summary.hal9001
Summary Method for HAL fit objects

print.summary.hal9001
Print Method for Summary Class of HAL fits

+.formula_hal9001
HAL Formula addition: Adding formula term object together into a single formula object term.

predict.SL.hal9001
predict.SL.hal9001

index_first_copy
Find Copies of Columns

make_reduced_basis_map
Mass-based reduction of basis functions

make_design_matrix
Build HAL Design Matrix

as_dgCMatrix
Fast Coercion to Sparse Matrix

SL.hal9001
Wrapper for Classic SuperLearner

enumerate_edge_basis
Enumerate Basis Functions at Generalized Edges

basis_of_degree
Compute Degree of Basis Functions

evaluate_basis
Generate Basis Functions

enumerate_basis
Enumerate Basis Functions

apply_copy_map
Apply copy map

basis_list_cols
List Basis Functions

make_basis_list
Sort Basis Functions

fit_hal
HAL: The Highly Adaptive Lasso

h
HAL Formula term: Generate a single term of the HAL basis

hal9000
HAL 9000 Quotes

hal9001
hal9001

hal_quotes
HAL9000 Quotes from "2001: A Space Odyssey"

calc_pnz
Calculate Proportion of Nonzero Entries

generate_all_rules
Generates rules based on knot points of the fitted HAL basis functions with non-zero coefficients.

calc_xscale
Calculating Centered and Scaled Matrices
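
Several of the functions listed here appear to correspond to the stages reported in hal_fit$times in the example (enumerate_basis, design_matrix, remove_duplicates). The following sketch calls them by hand on simulated covariates, assuming the default arguments of enumerate_basis(), make_design_matrix(), and make_copy_map(). It is meant to show the shape of the intermediate objects and is not a replacement for fit_hal(), which also handles basis reduction and the lasso step.

```r
library(hal9001)
set.seed(385971)

n <- 100
p <- 3
x <- matrix(rnorm(n * p), n, p)

# Step 1: enumerate the basis functions implied by the observed covariates.
basis_list <- enumerate_basis(x)

# Step 2: evaluate every basis function at every observation (sparse matrix).
x_basis <- make_design_matrix(x, basis_list)

# Step 3: build a map that groups columns that are copies of one another.
copy_map <- make_copy_map(x_basis)

length(basis_list)  # number of enumerated basis functions
dim(x_basis)        # n rows, one column per basis function
length(copy_map)    # number of distinct columns
```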