penaltyParameter: Compute the penalty parameter for the model.

Description

Find the best penalty parameter \(C\) for the generalized distance weighted discrimination (DWD) model.

Usage

penaltyParameter(X, y, expon, rmzeroFea = 1, scaleFea = 1)

Value

A number which represents the best penalty parameter for the generalized DWD model.

Arguments

X: A \(d\) x \(n\) matrix of \(n\) training samples with \(d\) features.
y: A vector of length \(n\) of training labels. Each element of y is either -1 or 1.
expon: A positive number representing the exponent \(q\) of the residual \(r_i\) in the generalized DWD model. Common choices are expon = 1, 2, 4.
rmzeroFea: Switch for removing zero features from the data matrix. The default is 1 (zero features are removed).
scaleFea: Switch for scaling the features in the data matrix so that they have roughly similar magnitudes. The default is 1 (features are scaled).
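
The input layout described above (features in rows, samples in columns, labels in {-1, 1}) is easy to get wrong, so a small sanity check may help before calling the function. The following is only a sketch: `check_dwd_inputs` is a hypothetical helper written for illustration and is not part of the package, and the data are synthetic.

```r
# Hypothetical helper (not part of the package): verifies that the inputs
# follow the layout described in the Arguments section.
check_dwd_inputs <- function(X, y, expon) {
  stopifnot(is.matrix(X) || inherits(X, "Matrix"))  # dense or sparse matrix
  stopifnot(length(y) == ncol(X))                   # one label per column (sample)
  stopifnot(all(y %in% c(-1, 1)))                   # labels must be -1 or 1
  stopifnot(is.numeric(expon), length(expon) == 1, expon > 0)
  invisible(TRUE)
}

# Synthetic data: d features (rows) by n samples (columns).
set.seed(1)
d <- 5
n <- 20
X <- matrix(rnorm(d * n), nrow = d, ncol = n)
y <- rep(c(-1, 1), each = n / 2)

check_dwd_inputs(X, y, expon = 1)
```

If the data are stored with samples in rows, as is common in R, transposing with `t(X)` before the call gives the \(d\) x \(n\) orientation expected here.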

Author

Xin-Yee Lam, J.S. Marron, Defeng Sun, and Kim-Chuan Toh

Details

Empirically, the best parameter is inversely proportional to the typical distance between different samples raised to the power \(expon + 1\). It also depends on the sample size \(n\) and the feature dimension \(d\).
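
In symbols, this relationship can be sketched as follows. The notation is introduced here for illustration only and is not taken from the package: \(\delta\) stands for the typical distance between samples, \(q\) for expon, and \(\kappa(n, d)\) for an unspecified factor that depends on the sample size and feature dimension.

\[
C \;\propto\; \frac{\kappa(n, d)}{\delta^{\,q+1}}
\]

For instance, with expon = 1 and \(n\) and \(d\) held fixed, doubling the typical distance would reduce the suggested \(C\) by a factor of \(2^{2} = 4\).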

References

Lam, X.Y., Marron, J.S., Sun, D.F., and Toh, K.C. (2018) "Fast algorithms for large scale generalized distance weighted discrimination", Journal of Computational and Graphical Statistics, forthcoming.
https://arxiv.org/abs/1604.05473

Examples

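# load the package that provides penaltyParameter()
# (assumed here to be DWDLargeR; adjust if it is installed under another name)
library(DWDLargeR)
# load the data
data("mushrooms")
# calculate the best penalty parameter
C <- penaltyParameter(mushrooms$X, mushrooms$y, expon = 1)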