asympDiag (version 0.3.1)

select_covariates: Select covariates

Description

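Select covariates by stepwise selection (forward, backward, or both), adding or removing variables according to a measure such as the coefficient p-values.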

Usage

select_covariates(
  model,
  threshold = 0.15,
  direction = c("both", "backward", "forward"),
  addable_coefs = names(get_fixef(model)),
  measure_fn = function(x) summary(x)[["coefficients"]][, 4],
  measure_one_at_time = FALSE,
  minimize_only = FALSE,
  max_steps = 1000,
  return_step_results = FALSE,
  do_not_remove = c("(Intercept)"),
  ...
)

Value

A fitted model with selected covariates based on the variable selection process. If return_step_results is TRUE, a list containing the final fitted model and a log of the selection steps is returned.

Arguments

model

A fitted model object that has stats::update() and stats::coef() methods.

threshold

Threshold against which the values returned by measure_fn are compared. It can be a fixed value or a function. A variable is removed if its measure_fn value is greater than threshold, and added if its value is less than or equal to threshold.
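The removal rule can be illustrated by hand in base R. This is a sketch of the comparison only, assuming the default p-value measure and the default threshold of 0.15; it is not the package's implementation, and the variable name removable is introduced here for illustration.

```r
model <- lm(mpg ~ ., data = mtcars)

# Default measure: one p-value per coefficient.
p_values <- summary(model)[["coefficients"]][, 4]

threshold <- 0.15
do_not_remove <- "(Intercept)"

# Coefficients whose measure exceeds the threshold are removal candidates,
# except those protected by do_not_remove.
removable <- setdiff(names(p_values)[p_values > threshold], do_not_remove)
print(removable)
```

With direction = "backward", select_covariates() would presumably act on candidates of this kind step by step, refitting the model after each change.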

direction

The direction of variable selection. Options include "backward", "forward", or "both". Defaults to "both".

addable_coefs

A vector of coefficients that can be added during forward selection. Defaults to all coefficients in the model.

measure_fn

A function that takes a model as its argument and returns the values to be compared with threshold. It can also take two models and compare them: in a forward step it is called as measure_fn(candidate_model, current_selected_model), and in a backward step as measure_fn(current_selected_model, candidate_model). Defaults to the p-values from the coefficient table of the model summary.
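To see what the default measure produces, the function from the Usage section can be applied directly to an lm fit. This base R sketch does not call the package; default_measure is a name chosen here for illustration.

```r
# Default measure: column 4 of the coefficient table, Pr(>|t|) for lm.
default_measure <- function(x) summary(x)[["coefficients"]][, 4]

model <- lm(mpg ~ ., data = mtcars)
p_values <- default_measure(model)

# A named numeric vector, one entry per coefficient (intercept included).
print(round(p_values, 3))
```

A custom measure_fn with one argument should return values in the same shape, named by coefficient, unless measure_one_at_time is set to TRUE.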

measure_one_at_time

Logical. If TRUE, measure_fn is applied to each variable individually during forward and backward steps. Set this to TRUE if measure_fn returns a single value per model, for example if measure_fn is AIC.
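A scalar measure such as AIC describes a whole model rather than individual coefficients, so each candidate has to be fitted and scored separately. The sketch below performs one backward pass by hand in base R. It illustrates the idea only and is not the package's implementation; the helper drop_one_aic is hypothetical.

```r
model <- lm(mpg ~ ., data = mtcars)

# AIC returns a single number for the whole model.
current_aic <- AIC(model)

# Hypothetical helper: refit without one term and score the candidate.
drop_one_aic <- function(term, fit) {
  candidate <- update(fit, as.formula(paste(". ~ . -", term)))
  AIC(candidate)
}

terms_in_model <- attr(terms(model), "term.labels")
candidate_aic <- vapply(terms_in_model, drop_one_aic, numeric(1), fit = model)

# Terms whose removal lowers AIC relative to the current model.
print(names(candidate_aic)[candidate_aic < current_aic])
```

Comparing each candidate's score with the current model's score resembles passing a function as threshold, as the AICc example does, though how the package evaluates that function internally is an assumption here.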

minimize_only

Logical. If TRUE, the backward model update minimizes measure_fn instead of maximizing it.

max_steps

The maximum number of steps for the variable selection process. Defaults to 1000.

return_step_results

Logical. If TRUE, the function returns a list containing the final fitted model and a log of the selection steps. Defaults to FALSE.

do_not_remove

A character vector specifying variables that should not be removed during backward selection. Defaults to "(Intercept)".

...

Extra arguments to stats::update().
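Because these arguments are forwarded to stats::update(), anything update() accepts for the model class can be supplied, as with data = mtcars in the AICc example. The base R sketch below shows update() on its own, first changing the formula and then passing an extra named argument to the refit; the object names are illustrative.

```r
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Drop one term; "." stands for the existing left- and right-hand sides.
smaller <- update(model, . ~ . - qsec)

# Extra named arguments replace or add to the original call, e.g. subset.
manual_only <- update(model, subset = am == 1)

print(formula(smaller))
print(nobs(manual_only))
```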

Examples

model <- lm(mpg ~ ., data = mtcars)
select_covariates(model)

## measure_fn with two parameters

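lrt <- function(model1, model2) {
  # Likelihood-ratio statistic; [1L] extracts the bare numeric log-likelihood
  lrt_stat <- 2 * (logLik(model1)[1L] - logLik(model2)[1L])
  # Upper-tail chi-squared p-value on 1 degree of freedom
  pchisq(lrt_stat, df = 1, lower.tail = FALSE)
}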

select_covariates(model, measure_fn = lrt)

## AICc selection

AICc <- function(model) {
  loglike <- logLik(model)
  # Number of estimated parameters and number of observations
  df <- attr(loglike, "df")
  nobs <- attr(loglike, "nobs")
  aic <- -2 * as.numeric(loglike) + 2 * df

  # Small-sample correction added to AIC
  aicc <- aic + (2 * (df^2) + 2 * df) / (nobs - df - 1)

  return(aicc)
}

selection <- select_covariates(model,
  measure_fn = AICc,
  threshold = AICc,
  measure_one_at_time = TRUE,
  minimize_only = TRUE,
  direction = "both",
  data = mtcars
)