discrim (version 0.1.1)

discrim_regularized: General Interface for Regularized Discriminant Models

Description

discrim_regularized() generates a specification of a regularized discriminant analysis (RDA) model before fitting.

Usage

discrim_regularized(
  mode = "classification",
  frac_common_cov = NULL,
  frac_identity = NULL
)

# S3 method for discrim_regularized
update(
  object,
  frac_common_cov = NULL,
  frac_identity = NULL,
  fresh = FALSE,
  ...
)

Arguments

mode

A single character string for the type of model. The only possible value for this model is "classification".

frac_common_cov, frac_identity

Numeric values between zero and one.

object

A regularized discriminant model specification.

fresh

A logical for whether the arguments should be modified in-place or replaced wholesale.

...

Not used for update().

Engine Details

Engines may have pre-set default arguments when executing the model fit call. For this type of model, the template of the fit call is:

discrim_regularized() %>% 
  set_engine("klaR") %>% 
  translate()

## Regularized Discriminant Model Specification (classification)
## 
## Computational engine: klaR 
## 
## Model fit template:
## klaR::rda(formula = missing_arg(), data = missing_arg())

The standardized parameter names in parsnip can be mapped to their original names in each engine that has main parameters. Each engine typically has a different default value (shown in parentheses) for each parameter.

parsnip          klaR
frac_common_cov  lambda (varies)
frac_identity    gamma (varies)

Details

The model is from Friedman (1989) and can create LDA models, QDA models, and regularized mixtures of the two. It does not conduct feature selection. The main arguments for the model are:

  • frac_common_cov: The fraction of the regularized covariance matrix that is based on the LDA model (i.e., computed from all classes). A value of 1 is the linear discriminant analysis assumption while a value near zero assumes that there should be separate covariance matrices for each class.

  • frac_identity: The fraction of the final, class-specific covariance matrix that is the identity matrix.

See klaR::rda() for the equations that define these parameters.
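As a sketch of how the two fractions combine, assuming the form given in the klaR::rda() documentation, write \lambda for frac_common_cov, \gamma for frac_identity, \hat\Sigma_k for the covariance estimate of class k, \hat\Sigma for the pooled estimate, and p for the number of predictors (the symbols are chosen here for illustration):

```latex
\begin{aligned}
\hat\Sigma_k(\lambda) &= (1 - \lambda)\,\hat\Sigma_k + \lambda\,\hat\Sigma \\
\hat\Sigma_k(\lambda, \gamma) &= (1 - \gamma)\,\hat\Sigma_k(\lambda)
  + \gamma\,\frac{1}{p}\,\operatorname{tr}\!\left[\hat\Sigma_k(\lambda)\right] I
\end{aligned}
```

Under this form, (\lambda = 1, \gamma = 0) gives the LDA covariance, (\lambda = 0, \gamma = 0) gives the QDA covariances, and \gamma = 1 replaces each matrix with a scaled identity. Consult klaR::rda() for the authoritative definitions.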

These arguments are converted to their specific names at the time that the model is fit. Other options and arguments can be set using set_engine(). If left to their defaults here (NULL), the values are taken from the underlying model functions. If parameters need to be modified, update() can be used in lieu of recreating the object from scratch.

For discrim_regularized(), the mode will always be "classification".
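To make the roles of the two fractions concrete, the following base R sketch blends a class-specific covariance matrix with a pooled one. The helper regularized_cov() and the example matrices are hypothetical and are not part of discrim or klaR; the blending rule is an assumption based on the form documented for klaR::rda(), so treat it as an illustration rather than the engine's implementation.

```r
# Hypothetical helper (not part of discrim or klaR): blend covariance matrices
# in the way the two fractions are described in this section.
regularized_cov <- function(cov_class, cov_pooled, frac_common_cov, frac_identity) {
  stopifnot(frac_common_cov >= 0, frac_common_cov <= 1,
            frac_identity >= 0, frac_identity <= 1)
  p <- nrow(cov_class)
  # Step 1: shrink the class-specific matrix toward the pooled (LDA) matrix.
  blended <- (1 - frac_common_cov) * cov_class + frac_common_cov * cov_pooled
  # Step 2: shrink toward an identity matrix scaled by the average variance
  # (assumed scaling; see klaR::rda() for the actual definition).
  (1 - frac_identity) * blended +
    frac_identity * (sum(diag(blended)) / p) * diag(p)
}

# Made-up 2 x 2 covariance matrices, for illustration only.
cov_class  <- matrix(c(4, 1, 1, 2), nrow = 2)
cov_pooled <- matrix(c(2, 0, 0, 2), nrow = 2)

# frac_common_cov = 1 uses only the pooled matrix (the LDA assumption).
lda_like <- regularized_cov(cov_class, cov_pooled, frac_common_cov = 1, frac_identity = 0)

# frac_common_cov = 0 keeps the class-specific matrix (the QDA assumption).
qda_like <- regularized_cov(cov_class, cov_pooled, frac_common_cov = 0, frac_identity = 0)

# Intermediate values give a regularized mixture of the two.
mixed <- regularized_cov(cov_class, cov_pooled, frac_common_cov = 0.5, frac_identity = 0.5)
```

Comparing lda_like, qda_like, and mixed shows how the off-diagonal terms shrink as either fraction increases.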

References

Friedman, J.H. (1989). Regularized Discriminant Analysis. Journal of the American Statistical Association 84, 165-175.

Examples

# NOT RUN {
library(discrim)

# Grid of predictor values over which to draw the class boundary
parabolic_grid <-
  expand.grid(X1 = seq(-5, 5, length.out = 100),
              X2 = seq(-5, 5, length.out = 100))

rda_mod <-
  discrim_regularized(frac_common_cov = .5, frac_identity = .5) %>%
  set_engine("klaR") %>%
  fit(class ~ ., data = parabolic)

parabolic_grid$rda <-
  predict(rda_mod, parabolic_grid, type = "prob")$.pred_Class1

library(ggplot2)
ggplot(parabolic, aes(x = X1, y = X2)) +
  geom_point(aes(col = class), alpha = .5) +
  geom_contour(data = parabolic_grid, aes(z = rda), col = "black", breaks = .5) +
  theme_bw() +
  theme(legend.position = "top") +
  coord_equal()


model <- discrim_regularized(frac_common_cov = 0.1)
model
update(model, frac_common_cov = 1)
# }
