FSR
FSR(Xy, max_poly_degree = 3, max_interaction_degree = 2, outcome = NULL,
linear_estimation = FALSE, threshold_include = 0.01,
threshold_estimate = 0.001, min_models = NULL, max_fails = 2,
standardize = FALSE, pTraining = 0.8, file_name = NULL,
store_fit = "none", max_block = 250, noisy = TRUE, seed = NULL)
Xy: matrix or data.frame; the outcome must be in the final column.
max_poly_degree: highest power to which continuous features are raised; default 3 (cubic).
max_interaction_degree: highest interaction order; default 2 (allow x_i*x_j). Also interacts each level of factors with continuous features.
outcome: Treat y as 'continuous', 'binary', 'multinomial', or NULL (auto-detect based on the response).
linear_estimation: Logical: model the outcome as linear and estimate with ordinary least squares? Recommended for speed on large datasets even if the outcome is categorical. (For a multinomial outcome, this means the response is treated as a vector.) If FALSE, the estimator is chosen based on 'outcome' (i.e., OLS for 'continuous' outcomes, glm() to estimate logistic regression models for 'binary' outcomes, and nnet::multinom() for 'multinomial' outcomes).
threshold_include: minimum improvement required to include a recently added term in the model (change in fit, originally on a 0 to 1 scale). -1.001 means 'include all'. Default: 0.01. (The fit measure is adjusted R^2 for linear models, pseudo R^2 for logistic regression, and out-of-sample accuracy for multinomial models. In the latter two cases, the same adjustment for the number of predictors is applied.)
threshold_estimate: minimum improvement required to keep estimating (pseudo R^2, so on a 0 to 1 scale). -1.001 means 'estimate all'. Default: 0.001.
min_models: minimum number of models to estimate. Defaults to the number of features (unless P > N).
max_fails: maximum number of models that FSR() can fail to estimate computationally before exiting. Default: 2.
standardize: if TRUE (not the default), standardizes continuous variables.
pTraining: proportion of the data used for training. Default: 0.8.
file_name: If a file name (and path) is provided, saves output after each model is estimated as an .RData file, e.g. file_name = "results.RData". See also store_fit for options on how much to store in the output object.
store_fit: If file_name is provided, FSR() will return coefficients, measures of fit, and call details. Save entire fit objects? Options are "none" (default, save only those other items), "accepted_only" (only models that meet the threshold), and "all".
max_block: Most of the linear algebra is done recursively in blocks to ease memory management. Default 250. Increasing or decreasing it may slow things down.
noisy: display measures of fit, progress, etc. Recommended.
seed: Automatically set, but can also be passed as a parameter.
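To make the roles of max_poly_degree and max_interaction_degree more concrete, the following is a minimal base R sketch of the kind of candidate terms they describe at their defaults (powers up to 3, two-way interactions, factor levels interacted with continuous features). It is an illustration only and not FSR()'s internal code; the choice of the wt, hp, and am columns of mtcars is an assumption made for the example.

```r
# Illustrative only: build cubic terms and two-way interactions with base R.
# This does not call FSR() and is not how FSR() constructs its features.
X <- mtcars[, c("wt", "hp")]
X$am <- factor(mtcars$am)           # a two-level factor

design <- model.matrix(
  ~ (wt + hp + am)^2 +              # main effects plus all two-way interactions
    I(wt^2) + I(wt^3) +             # powers up to 3 for continuous features
    I(hp^2) + I(hp^3),
  data = X
)

colnames(design)                    # names of the generated candidate terms
```

The factor am is never raised to a power; it only enters through its dummy column and its interactions with the continuous features.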
list with slope coefficients, model details, and measures of fit
# NOT RUN {
out <- FSR(mtcars)
# }
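The threshold_include rule described in the arguments can be sketched in base R for the linear case. The helper below is hypothetical (improves_fit is not part of FSR() or its package) and uses in-sample adjusted R^2 from lm(); whether FSR() computes the gain on training or held-out data is not stated in this page, so treat that detail as an assumption.

```r
# Hypothetical helper, not part of FSR(): keep a candidate term only if it
# raises adjusted R^2 by at least `threshold` (mirrors threshold_include).
improves_fit <- function(base_formula, candidate, data, threshold = 0.01) {
  base_fit <- lm(base_formula, data = data)
  # Add the candidate term to the right-hand side of the base formula.
  new_formula <- update(base_formula, paste(". ~ . +", candidate))
  new_fit <- lm(new_formula, data = data)
  gain <- summary(new_fit)$adj.r.squared - summary(base_fit)$adj.r.squared
  list(gain = gain, include = gain >= threshold)
}

# Does a quadratic term in wt improve on mpg ~ wt by at least 0.01?
res <- improves_fit(mpg ~ wt, "I(wt^2)", mtcars)
res$gain
res$include
```

A forward stepwise procedure would repeat this check for each candidate term in turn, carrying accepted terms forward into the base formula.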