This function trains linear logistic regression models with HMC in restricted Gibbs sampling. It also makes predictions for test cases if X_ts is provided.
htlr_fit(
X_tr,
y_tr,
fsel = 1:ncol(X_tr),
stdzx = TRUE,
ptype = c("t", "ghs", "neg"),
sigmab0 = 2000,
alpha = 1,
s = -10,
eta = 0,
iters_h = 1000,
iters_rmc = 1000,
thin = 1,
leap_L = 50,
leap_L_h = 5,
leap_step = 0.3,
hmc_sgmcut = 0.05,
initial_state = "lasso",
keep.warmup.hist = FALSE,
silence = TRUE,
rep.legacy = TRUE,
alpha.rda = 0.2,
lasso.lambda = seq(0.05, 0.01, by = -0.01),
X_ts = NULL,
predburn = NULL,
predthin = 1
)
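A minimal end-to-end sketch of a call (the simulated data, seed, and shortened chain lengths below are illustrative, and assume the HTLR package providing htlr_fit is attached):

library(HTLR)

## Simulate a small two-class problem (illustrative data only)
set.seed(1)
n <- 100; p <- 20
X <- matrix(rnorm(n * p), n, p)
y <- 1L + as.integer(X[, 1] - X[, 2] + rnorm(n) > 0)  # labels coded 1 and 2

## Fit with the default Student-t prior and shortened chains
fit <- htlr_fit(X_tr = X, y_tr = y, ptype = "t",
                iters_h = 500, iters_rmc = 500, thin = 1)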
A list of fitting results. If X_ts is not provided, the list is an object with S3 class htlr.fit.
X_tr: Input matrix, of dimension nobs by nvars; each row is an observation vector.
y_tr: Vector of response variables. Must be coded as non-negative integers, e.g., 1, 2, ..., C for C classes; label 0 is also allowed.
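For instance, a factor response can be recoded into the required integer labels (the variable names here are illustrative):

y_raw <- factor(c("control", "case", "case", "control"))  # illustrative factor response
y_tr  <- as.integer(y_raw)                                # maps the C levels to 1, ..., C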
fsel: Subset of features selected before fitting, such as by univariate screening.
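One possible univariate screening step, keeping the ten features most correlated with the response (a sketch; X and y are the training data from the usage example above, and the cutoff of 10 is arbitrary):

## Rank features by absolute correlation with the (numeric) response
scores <- abs(cor(X, as.numeric(y)))              # one score per column of X
fsel   <- order(scores, decreasing = TRUE)[1:10]  # keep the 10 highest-ranked features
fit    <- htlr_fit(X_tr = X, y_tr = y, fsel = fsel)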
stdzx: Logical; if TRUE, the original feature values are standardized to have mean = 0 and sd = 1.

ptype: The prior to be applied to the model. Either "t" (Student-t, default), "ghs" (horseshoe), or "neg" (normal-exponential-gamma).

sigmab0: The sd of the normal prior for the intercept.

alpha: The degrees of freedom of the t/ghs/neg prior for coefficients.

s: The log scale of the prior (logw) for coefficients.

eta: The sd of the normal prior for logw. When set to 0, logw is fixed; otherwise, logw is given a normal prior and updated during sampling.
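For example, the default eta = 0 keeps logw fixed, while a positive eta lets logw be sampled (a sketch; the specific values, and X and y, are illustrative):

## logw fixed at s = -10 (the default behaviour)
fit_fixed   <- htlr_fit(X_tr = X, y_tr = y, s = -10, eta = 0)

## logw given a normal prior with sd 5 and updated during sampling
fit_sampled <- htlr_fit(X_tr = X, y_tr = y, s = -10, eta = 5)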
iters_h: A positive integer specifying the number of warmup (burn-in) iterations.

iters_rmc: A positive integer specifying the number of iterations after warmup.

thin: A positive integer specifying the period for saving samples.

leap_L: The length of the leapfrog trajectory in the sampling phase.

leap_L_h: The length of the leapfrog trajectory in the burn-in phase.

leap_step: The stepsize adjustment multiplied by the second-order partial derivatives of the log posterior.

hmc_sgmcut: Coefficients smaller than this threshold are fixed in each HMC updating step.
initial_state: The initial state of the Markov chain; can be a previously fitted fithtlr object, a user-supplied initial state vector, or a character string matching one of the following (see the sketch after this list):

"lasso" - (Default) Use a Lasso initial state with lambda chosen by cross-validation. Users may specify their own candidate lambda values via the optional argument lasso.lambda. Further customized Lasso initial states can be generated by lasso_deltas.

"bcbcsfrda" - Use an initial state generated by package BCBCSF (Bias-corrected Bayesian classification). Further customized BCBCSF initial states can be generated by bcbcsf_deltas. WARNING: This type of initial state can be used for continuous features such as gene expression profiles, but it should not be used for categorical features such as SNP profiles.

"random" - Use random initial values sampled from N(0, 1).
keep.warmup.hist: Warmup iterations are not recorded by default; set to TRUE to keep them.

silence: Logical; set to FALSE to track MCMC sampling iterations.

rep.legacy: Logical; if TRUE, the output produced in HTLR versions up to legacy-3.1-1 is reproduced. This is typically slower than the non-legacy mode on multi-core machines.
alpha.rda: A user-supplied alpha value for bcbcsf_deltas when setting up the BCBCSF initial state. Default: 0.2.

lasso.lambda: A user-supplied lambda sequence for lasso_deltas when setting up the Lasso initial state. Default: {.01, .02, ..., .05}. Ignored if rep.legacy is set to TRUE.
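A sketch of supplying a custom lambda grid for the Lasso initial state (values are illustrative, X and y as above; rep.legacy must be FALSE for lasso.lambda to take effect):

fit <- htlr_fit(X_tr = X, y_tr = y,
                initial_state = "lasso",
                lasso.lambda  = seq(0.10, 0.01, by = -0.01),
                rep.legacy    = FALSE)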
X_ts: Test data on which predictions are to be made.
predburn, predthin: For prediction based on X_ts (when supplied), the first predburn Markov chain (super)iterations will be discarded, and only every predthin-th of the remaining iterations is used for inference.
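A sketch of fitting and predicting in one call (the held-out split and the predburn and predthin values are illustrative; X and y as in the usage example above):

## Hold out 20 cases and ask for predictions on them
test_idx <- 1:20
fit_pred <- htlr_fit(X_tr = X[-test_idx, ], y_tr = y[-test_idx],
                     X_ts  = X[test_idx, ],
                     predburn = 100, predthin = 2)
## The returned results then also include predictions for the rows of X_ts.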
Longhai Li and Weixin Yao (2018). Fully Bayesian Logistic Regression with Hyper-Lasso Priors for High-dimensional Feature Selection. Journal of Statistical Computation and Simulation, 88(14), 2827-2851.