CIpostSelect (version 0.2.1)

lmps: Function that handles storing our estimation and variable selection matrices during the different splits.

Description

Repeatedly splits the data, performs variable selection on one part and estimation on the other, and stores the resulting variable selection and estimation matrices across the splits.

Usage

lmps(
  formula,
  data,
  method,
  N,
  p_split = 0.5,
  cores = NULL,
  direction = "backward",
  forced_var = NULL
)

Value

An object of class lmps

Arguments

formula

Regression model to use, specified as a formula.

data

Data set to be used for regression modeling.

method

Method for variable selection. Should be one of "Lasso" or "BIC".

N

Number of splits.

p_split

Probability (proportion) used to divide the data into two parts at each split. Defaults to 0.5.

cores

Number of cores for parallel processing.

direction

Direction of the selection when method is "BIC". Should be one of "backward" (the default) or "forward".

forced_var

A character string naming a predictor variable to be forced into the selection. The default, NULL, means that no variable is forced. If provided, this variable is selected in every one of the N splits.
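As an illustration of how these arguments combine, the call below is a minimal sketch that uses only the arguments listed in Usage. It assumes the CIpostSelect and mlbench packages are installed; the choice of "rm" as the forced variable and N = 10 are arbitrary values picked for the example, not recommendations.

```r
# Assumes CIpostSelect and mlbench are installed.
library(CIpostSelect)
library(mlbench)
data("BostonHousing")

# BIC-based selection in the forward direction over 10 splits,
# with the predictor "rm" forced into the selection at every split
# ("rm" and N = 10 are arbitrary illustrative choices).
model_bic <- lmps(medv ~ ., data = BostonHousing, method = "BIC", N = 10,
                  direction = "forward", forced_var = "rm")
class(model_bic)
```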

Details

The data are shuffled and split N times. At each split, the data are divided into two parts according to the splitting probability p_split. Model selection is performed on the first part, and the selected model is then calibrated on the second part. At the end of these steps, we obtain matrices of dimensions N*p (one row per split) that represent the selected models and the estimated coefficients associated with these models.
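The procedure described above can be sketched in base R. This is an illustration of the split, select and calibrate idea only, not the package's implementation: it uses the built-in mtcars data, backward stepwise selection with a BIC penalty via step(), and hypothetical object names (selected, coefs). Treating p_split as the share of rows used for selection is an assumption.

```r
# Illustrative sketch only: NOT the CIpostSelect implementation.
set.seed(1)                       # reproducible shuffles
dat <- mtcars                     # built-in data set standing in for `data`
N <- 20                           # number of splits
p_split <- 0.5                    # assumed: share of rows used for selection
predictors <- c("wt", "hp", "disp", "qsec", "drat")

# One row per split, one column per candidate predictor.
selected <- matrix(0L, nrow = N, ncol = length(predictors),
                   dimnames = list(NULL, predictors))
coefs <- matrix(NA_real_, nrow = N, ncol = length(predictors),
                dimnames = list(NULL, predictors))

for (i in seq_len(N)) {
  idx <- sample(nrow(dat))                    # shuffle the row indices
  n_sel <- floor(p_split * nrow(dat))
  sel_part <- dat[idx[seq_len(n_sel)], ]      # part used for model selection
  est_part <- dat[idx[-seq_len(n_sel)], ]     # part used for calibration

  # Backward stepwise selection; k = log(n) gives the BIC penalty.
  full <- lm(mpg ~ wt + hp + disp + qsec + drat, data = sel_part)
  chosen <- step(full, direction = "backward",
                 k = log(nrow(sel_part)), trace = 0)
  vars <- attr(terms(chosen), "term.labels")

  # Refit the selected model on the other part and record the results.
  if (length(vars) > 0) {
    fit <- lm(reformulate(vars, response = "mpg"), data = est_part)
    selected[i, vars] <- 1L
    coefs[i, vars] <- coef(fit)[vars]
  }
}

colMeans(selected)   # selection frequency of each predictor across splits
```

Each row of selected marks which predictors were chosen at that split, and the matching row of coefs holds their coefficients as estimated on the held-out part; entries for unselected predictors stay NA.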

Examples

library(CIpostSelect)
library(mlbench)
data("BostonHousing")

# Fit an lmps object: Lasso selection over 50 splits
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50)

# A parallelized example: the same fit run on 2 cores
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50, cores = 2)