- X
A numeric design matrix, each row of which represents a vector of
covariates/independent variables/features. Though not required, it is
recommended to center and scale the columns to have norm
sqrt(nrow(X)).
- Y
An nrow(X)-dimensional response vector, numeric if
family = "linear" and binary if family = "logistic".
- family
A character string selecting the regression model, either
"linear" or "logistic".
- slab
A character string specifying the prior slab density, either
"laplace" or "gaussian".
- mu
An ncol(X)-dimensional numeric vector, serving as initial
guess for the variational means. If omitted, mu will be estimated
via ridge regression to initialize the coordinate ascent algorithm.
- sigma
A positive ncol(X)-dimensional numeric vector, serving as
initial guess for the variational standard deviations.
- gamma
An ncol(X)-dimensional vector of probabilities, serving
as initial guess for the variational inclusion probabilities. If omitted,
gamma will be estimated via LASSO regression to initialize the
coordinate ascent algorithm.
- alpha
A positive numeric value, parametrizing the beta hyper-prior on
the inclusion probabilities. If omitted, alpha will be chosen
empirically via LASSO regression.
- beta
A positive numeric value, parametrizing the beta hyper-prior on
the inclusion probabilities. If omitted, beta will be chosen
empirically via LASSO regression.
- prior_scale
A numeric value, controlling the scale parameter of the
prior slab density. Used as the scale parameter \(\lambda\) when
slab = "laplace", or as the standard deviation \(\sigma\) if
slab = "gaussian".
- update_order
A permutation of 1:ncol(X), giving the update
order of the coordinate-ascent algorithm. If omitted, a data-driven
updating order is used; see Ray and Szabo (2020) in the Journal of
the American Statistical Association for details.
- intercept
A Boolean variable, controlling whether an intercept should be
included. NB: This feature is still experimental in logistic regression.
- noise_sd
A positive numerical value, serving as an estimate of the
residual noise standard deviation in linear regression. If missing, it is
estimated; see estimateSigma from the selectiveInference
package for more details. Has no effect when family = "logistic".
- max_iter
A positive integer, controlling the maximum number of
iterations for the variational update loop.
- tol
A small, positive numerical value, controlling the termination
criterion: the update loop stops once the maximum absolute difference
between the binary entropies of successive iterates falls below tol.
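
The column scaling recommended for X and the entropy-based stopping rule described for tol can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's own code: the helper names standardize_columns, binary_entropy and has_converged are hypothetical, the entropy is assumed to use the natural logarithm, and the comparison is assumed to be a strict "less than tol".

```r
# Hypothetical helper: center each column of X and rescale it so that its
# Euclidean norm equals sqrt(nrow(X)), as recommended for the X argument.
standardize_columns <- function(X) {
  X <- as.matrix(X)
  Xc <- sweep(X, 2, colMeans(X), "-")         # subtract column means
  norms <- sqrt(colSums(Xc^2))                # norm of each centered column
  if (any(norms == 0)) stop("constant columns cannot be rescaled")
  sweep(Xc, 2, norms / sqrt(nrow(X)), "/")    # rescale to norm sqrt(n)
}

# Hypothetical helper: binary entropy of inclusion probabilities
# (natural log assumed), using the convention 0 * log(0) = 0.
binary_entropy <- function(p) {
  h <- -(p * log(p) + (1 - p) * log(1 - p))
  h[p == 0 | p == 1] <- 0
  h
}

# Hypothetical stopping rule: compare the entropies of two successive
# iterates of gamma and check the largest absolute change against tol.
has_converged <- function(gamma_old, gamma_new, tol) {
  max(abs(binary_entropy(gamma_new) - binary_entropy(gamma_old))) < tol
}

set.seed(1)
X <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
Xs <- standardize_columns(X)
col_norms <- sqrt(colSums(Xs^2))              # each entry should be sqrt(200)
```

Measuring change through the binary entropy rather than through gamma directly makes the rule most sensitive to movement of inclusion probabilities away from 0 and 1, where the entropy changes fastest.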