A number which represents the best penalty parameter for the generalized DWD model.
Arguments
X
A \(d\) x \(n\) matrix of \(n\) training samples with \(d\) features.
y
A vector of length \(n\) of training labels. Each element of y is either -1 or 1.
expon
A positive number representing the exponent \(q\) of the residual \(r_i\) in the generalized DWD model. Common choices are expon = 1, 2, or 4.
rmzeroFea
Switch for removing zero features from the data matrix. The default is 1 (remove zero features).
scaleFea
Switch for scaling the features in the data matrix, so that all features have roughly similar magnitude. The default is 1 (scale features).
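The input layout and the two preprocessing switches can be illustrated with a small base-R sketch. This is not the package's code: the toy data, the variable names (keep, Xr, Xs), and in particular the scaling rule (dividing each feature by its largest absolute value) are assumptions made only for illustration.

```r
# Toy data: d = 4 features (rows) and n = 6 samples (columns).
set.seed(1)
X <- matrix(rnorm(24), nrow = 4, ncol = 6)
X[3, ] <- 0                    # make feature 3 identically zero
y <- c(1, 1, 1, -1, -1, -1)    # labels in {-1, 1}, one per column of X

# Hypothetical stand-in for rmzeroFea = 1: drop features that are all zero.
keep <- rowSums(abs(X)) > 0
Xr <- X[keep, , drop = FALSE]

# Hypothetical stand-in for scaleFea = 1: divide each feature (row) by its
# largest absolute value. The actual scaling rule is an assumption here.
Xs <- Xr / apply(abs(Xr), 1, max)

dim(Xs)                        # features x samples after preprocessing
```

Because features are stored as rows, both steps operate row-wise; the labels y are unaffected because no samples (columns) are removed.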
Author
Xin-Yee Lam, J.S. Marron, Defeng Sun, and Kim-Chuan Toh
Details
The best parameter is empirically found to be inversely proportional to the typical distance between different samples raised to the power \(q + 1\), where \(q\) is the value of expon.
It also depends on the sample size \(n\) and the feature dimension \(d\).
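Written as a formula, the relationship has roughly the following shape. The symbols \(C\) (the returned penalty parameter), \(\rho\) (the typical distance between samples) and \(\kappa(n, d)\) (an unspecified factor depending on sample size and dimension) are introduced here only for illustration; the exact form of \(\kappa\) is not stated in this page.

\[
C \;\approx\; \frac{\kappa(n, d)}{\rho^{\,q+1}}, \qquad q = \text{expon}.
\]

Under this reading, with \(n\) and \(d\) held fixed, doubling \(\rho\) divides the parameter by \(2^{2} = 4\) when expon = 1 and by \(2^{5} = 32\) when expon = 4.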
References
Lam, X.Y., Marron, J.S., Sun, D.F., and Toh, K.C. (2018)
"Fast algorithms for large scale generalized distance weighted discrimination", Journal of Computational and Graphical Statistics, forthcoming. https://arxiv.org/abs/1604.05473