QuanDA fits a quantile-regression-based discriminant with label jittering. For each candidate quantile level \(\tau\), the binary labels are jittered (adding \(U(0,1)\)), a penalized quantile regression is fit multiple times, and the coefficient vectors are averaged. The best \(\tau\) is selected by AUC.
quanda(
x,
y,
lambda = 10^(seq(1, -4, length.out = 30)),
lam2 = 0.01,
n_rep = 10,
tau_window = 0.05,
nfolds = 5,
maxit = 10000,
eps = 1e-07,
maxit_cv = 10000,
eps_cv = 1e-05
)
An object of class "quanda" with elements:
Numeric vector of length \(p+1\) (intercept first).
Numeric vector of candidate \(\tau\) values.
Chosen \(\tau\).
Vector of AUCs across \(\tau\).
The matched call.
x: A numeric matrix of predictors with \(n\) rows (observations) and \(p\) columns (features).
y: A binary response vector of length \(n\) with values 0 or 1.
lambda: Optional numeric vector of penalty values in decreasing order, so that lambda[1] is the largest. If NULL, a default sequence will be generated from the data.
lam2: Numeric, secondary penalty (ridge/elastic term) passed to hdqr. Default 0.01.
n_rep: Integer, number of jittering repetitions (averaged). Default 10.
tau_window: Width around the class rate to explore quantiles. Candidate \(\tau\) values are \(b + \{-w,\ldots,w\}\) in steps of 0.01, clipped to \([0,1]\), where \(b\) is the class rate and \(w\) is tau_window.
Default 0.05.
nfolds: Integer, number of CV folds used by cv_z(). Default 5.
maxit, eps, maxit_cv, eps_cv: Controls for inner optimizers and CV helper.
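The candidate grid described for tau_window can be sketched in a few lines of base R. This is an illustration only, not the package's internal code: the helper name tau_grid is hypothetical, the class rate \(b\) is assumed to be mean(y) (the proportion of 1s), and dropping the duplicates created by clipping is a choice made here rather than documented behavior.

```r
# Hypothetical helper (not part of the package): candidate tau values
# b + {-w, ..., w} in steps of `step`, clipped to [0, 1].
tau_grid <- function(y, tau_window = 0.05, step = 0.01) {
  b <- mean(y)                      # ASSUMPTION: class rate = proportion of 1s
  k <- round(tau_window / step)     # number of steps on each side of b
  taus <- b + seq(-k, k) * step     # integer steps avoid floating-point drift
  taus <- pmin(pmax(taus, 0), 1)    # clip to [0, 1]
  unique(round(taus, 10))           # ASSUMPTION: drop duplicates from clipping
}

y_demo <- c(0, 0, 0, 1, 1)          # class rate 0.4
grid_demo <- tau_grid(y_demo, tau_window = 0.05)
```

With a class rate of 0.4 and a window of 0.05, the grid runs from 0.35 to 0.45 in steps of 0.01. A wider window adds more quantile-regression fits, so the cost of a call grows with tau_window.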
We jitter labels via \(z_i = y_i + U_i\), where \(U_i \sim \mathrm{Unif}(0,1)\),
fit penalized quantile regression at multiple \(\tau\), average coefficients over n_rep jitters,
compute AUCs on the original \((x,y)\), and pick the \(\tau\) that maximizes AUC.
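The jitter, average, and select-by-AUC loop above can be sketched with base R alone. The sketch makes several assumptions: auc_rank, jitter_average, and ols_fit are hypothetical names, the fitter is left pluggable, and the stand-in fitter is ordinary least squares (which ignores \(\tau\)) rather than the penalized quantile regression the package uses, so only the control flow is representative.

```r
# Rank-based (Mann-Whitney) AUC for binary labels y in {0, 1}.
auc_rank <- function(score, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  r <- rank(score)                  # average ranks handle ties
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Average coefficients over n_rep jittered copies of the labels.
# `fit_fun(x, z, tau)` is a hypothetical stand-in that must return a
# coefficient vector of length p + 1 (intercept first).
jitter_average <- function(x, y, tau, fit_fun, n_rep = 10) {
  coefs <- replicate(n_rep, {
    z <- y + runif(length(y))       # z_i = y_i + U_i, U_i ~ Unif(0, 1)
    fit_fun(x, z, tau)
  })
  rowMeans(coefs)                   # one averaged coefficient vector
}

# ASSUMPTION: illustration-only fitter (least squares, ignores tau).
# A real run would plug in a penalized quantile regression here.
ols_fit <- function(x, z, tau) {
  unname(lm.fit(cbind(1, x), z)$coefficients)
}

set.seed(1)
n <- 200
x_sim <- matrix(rnorm(n * 2), n, 2)
y_sim <- as.numeric(x_sim[, 1] + rnorm(n) > 0)

taus <- c(0.45, 0.50, 0.55)
aucs <- sapply(taus, function(tau) {
  beta <- jitter_average(x_sim, y_sim, tau, ols_fit)
  score <- drop(cbind(1, x_sim) %*% beta)   # linear score on original x
  auc_rank(score, y_sim)                    # AUC against original labels
})
tau_best <- taus[which.max(aucs)]
```

The AUC is computed against the original labels, not the jittered ones, which matches the description above. Jittering is only used to make the response continuous for the regression step.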
data(breast)
X <- as.matrix(X)
y <- as.numeric(as.character(y))
y[y == -1] <- 0  # recode labels from -1 to 0 so that y takes values 0 or 1
fit <- quanda(X, y)
pred <- predict(fit, tail(X))