Finds the tuning parameter value that yields the smallest BIC.
MGLMtune(
formula,
data,
dist,
penalty,
lambdas,
ngridpt,
warm.start = TRUE,
keep.path = FALSE,
display = FALSE,
init,
weight,
penidx,
ridgedelta,
maxiters = 150,
epsilon = 1e-05,
regBeta = FALSE,
overdisp
)
formula
an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. The response has to be on the left-hand side of ~.
data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data when using function MGLMtune, the variables are taken from environment(formula), typically the environment from which MGLMtune is called.
dist
a description of the distribution to fit. See dist for the details.
penalty
penalty type for the regularization term. Can be chosen from "sweep", "group", or "nuclear". See MGLMsparsereg for the description of each penalty type.
lambdas
an optional vector of the penalty values to tune. If missing, the vector of penalty values will be set inside the function. ngridpt must be provided if lambdas is missing.
ngridpt
an optional numeric variable specifying the number of grid points to tune. If lambdas is given, ngridpt will be ignored. Otherwise, the maximum \(\lambda\) is determined from the data. The smallest \(\lambda\) is set to \(1/n\), where \(n\) is the sample size.
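As a rough illustration of the grid described for ngridpt, the sketch below builds a decreasing sequence from a maximum \(\lambda\) down to \(1/n\). The value of lambda_max is a made-up placeholder (MGLMtune determines the real maximum from the data), and the log-scale spacing is an assumption, since the description above does not say how the interior points are spaced.

```r
# Hypothetical illustration of a tuning grid; not MGLMtune's internal code.
n <- 50                 # sample size
ngridpt <- 10           # number of grid points requested
lambda_max <- 20        # placeholder: the real maximum is determined from the data
lambda_min <- 1 / n     # smallest lambda, as stated in the ngridpt description

# Assumption: grid points are equally spaced on the log scale.
lambdas <- exp(seq(log(lambda_max), log(lambda_min), length.out = ngridpt))
```

A vector built this way could also be supplied through the lambdas argument, in which case ngridpt is ignored.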
warm.start
an optional logical variable to specify whether to use a warm start at each tuning grid point. If warm.start=TRUE, the fitted sparse regression coefficients will be used as the initial value when fitting the sparse regression at the next tuning grid point.
keep.path
an optional logical variable controlling whether to output the whole solution path. The default is keep.path=FALSE. If keep.path=TRUE, the sparse regression result at each grid point will be kept, and saved in the output object select.list.
display
an optional logical variable to specify whether to show each tuning step.
init
an optional matrix of initial values for the parameter estimates. Its dimensions should be compatible with the data. See dist for details of the dimensions for each distribution.
weight
an optional vector of weights assigned to each row of the data. Should be NULL or a numeric vector. Could be a variable from the data, or a variable from environment(formula) with the length equal to the number of rows of the data. If weight=NULL, equal weights of ones will be assigned.
penidx
a logical vector indicating the variables to be penalized. The default value is rep(TRUE, p), which means all predictors are subject to regularization. If X contains an intercept, use penidx=c(FALSE,rep(TRUE,p-1)).
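To make the intercept case concrete, here is a small sketch that builds a design matrix whose first column is an intercept and the matching penidx vector. The data are simulated and the dimensions are arbitrary; only the penidx expression comes from the description above.

```r
# Design matrix with an intercept in the first column and p - 1 predictors.
set.seed(1)
n <- 50
p <- 4                                   # total columns, including the intercept
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))

# Leave the intercept unpenalized; penalize every other column.
penidx <- c(FALSE, rep(TRUE, p - 1))
```

The vector has one entry per column of X, so its length must equal p.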
ridgedelta
an optional numeric controlling the behavior of Nesterov's accelerated proximal gradient method. The default value is \(\frac{1}{pd}\).
maxiters
an optional numeric controlling the maximum number of iterations. The default value is maxiters=150.
epsilon
an optional numeric controlling the stopping criterion. The algorithm terminates when the relative change in the objective values of two successive iterates is less than epsilon. The default value is epsilon=1e-5.
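The stopping rule for epsilon, together with the maxiters cap, can be sketched on a toy problem. Everything below is illustrative: the objective, the fixed step size, and the exact form of the relative change (including the +1 in the denominator) are assumptions, not the package's actual code.

```r
# Toy objective and gradient (hypothetical; stands in for the penalized likelihood).
f <- function(x) (x - 3)^2
grad <- function(x) 2 * (x - 3)

epsilon <- 1e-5
maxiters <- 150
x <- 0
obj_old <- f(x)
converged <- FALSE
for (iter in seq_len(maxiters)) {
  x <- x - 0.25 * grad(x)          # fixed step size, chosen for this toy problem
  obj_new <- f(x)
  # Assumed form of the relative change; the +1 guards against division by zero.
  rel_change <- abs(obj_new - obj_old) / (abs(obj_old) + 1)
  if (rel_change < epsilon) {
    converged <- TRUE
    break
  }
  obj_old <- obj_new
}
```

If the loop ends without converged becoming TRUE, the iteration limit was reached first; raising maxiters or loosening epsilon are the two levers in that case.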
regBeta
an optional logical variable used when running negative multinomial regression (dist="NegMN"). regBeta controls whether to run regression on the over-dispersion parameter. The default is regBeta=FALSE.
overdisp
an optional numerical variable used only when fitting a sparse negative multinomial model with regBeta=FALSE. overdisp gives the over-dispersion value for all the observations. The default value is estimated using negative-multinomial regression. When dist="MN", "DM", "GDM" or regBeta=TRUE, the value of overdisp is ignored.
select
the final sparse regression result, using the optimal tuning parameter.
path
a data frame with degrees of freedom and BICs at each lambda.
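The selection rule behind select can be sketched directly on a data frame shaped like path. The numbers below are invented for illustration, and the column names Lambda, Dof and BIC are assumptions; check the names in the object actually returned.

```r
# Made-up path values for illustration only; column names are assumptions.
path <- data.frame(
  Lambda = c(8, 4, 2, 1, 0.5),
  Dof    = c(0, 5, 10, 15, 30),
  BIC    = c(412.7, 380.2, 365.9, 371.4, 402.3)
)

# The optimal tuning parameter is the one with the smallest BIC.
best <- path[which.min(path$BIC), ]
best$Lambda
```

Plotting BIC against Lambda for the same data frame is a quick way to see whether the minimum sits in the interior of the grid or at one of its ends.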
# NOT RUN {
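library(MGLM)
set.seed(118)
n <- 50   # sample size
p <- 10   # number of predictors
d <- 5    # number of response categories
m <- rbinom(n, 100, 0.8)             # batch size for each observation
X <- matrix(rnorm(n * p), n, p)
alpha <- matrix(0, p, d)
alpha[c(1, 3, 5), ] <- 1             # only predictors 1, 3 and 5 have an effect
Alpha <- exp(X %*% alpha)
Y <- rdirmn(size=m, alpha=Alpha)     # simulate Dirichlet-multinomial responses
# Tune the sweep penalty over 10 grid points and select by BIC
sweep <- MGLMtune(Y ~ 0 + X, dist="DM", penalty="sweep", ngridpt=10)
show(sweep)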
# }