Fit sparse regression in multivariate generalized linear models.
MGLMsparsereg(
formula,
data,
dist,
lambda,
penalty,
weight,
init,
penidx,
maxiters = 150,
ridgedelta,
epsilon = 1e-05,
regBeta = FALSE,
overdisp
)MGLMsparsereg.fit(
Y,
X,
dist,
lambda,
penalty,
weight,
init,
penidx,
maxiters = 150,
ridgedelta,
epsilon = 1e-05,
regBeta = FALSE,
overdisp
)
an object of class formula
(or one that can be coerced
to that class): a symbolic description of the model to be fitted.
The response has to be on the left hand side of ~.
an optional data frame, list or environment (or object coercible by
as.data.frame
to a data frame) containing the variables in the model.
If not found in data
when using function MGLMsparsereg
, the variables
are taken from environment(formula)
, typically the environment from
which MGLMsparsereg
is called.
a description of the error distribution to fit. See dist
for details.
penalty parameter.
penalty type for the regularization term. Can be chosen from "sweep"
,
"group"
, or "nuclear"
. See Details for the description of each penalty type.
an optional vector of weights assigned to each row of the data.
Should be NULL
or a numeric vector. Could be a variable from data
,
or a variable from environment(formula)
with the length equal to
the number of rows of the data.
If weight=NULL
, equal weights of ones will be assigned.
an optional matrix of initial value of the parameter estimates.
Should have the compatible dimension with the data. See dist
for
details of the dimensions in each distribution.
a logical vector indicating the variables to be penalized. The default value is rep(TRUE, p)
, which means all predictors are subject to regularization. If X
contains intercept, use penidx=c(FALSE,rep(TRUE,p-1))
.
an optional numeric controlling the maximum number of iterations. The default value is maxiters=150.
an optional numeric controlling the behavior of the Nesterov's accelerated proximal gradient method. The default value is \(\frac{1}{pd}\).
an optional numeric controlling the stopping criterion. The algorithm terminates when the relative change in the objective values of two successive iterates is less then epsilon
.
The default value is epsilon=1e-5
.
an optional logical variable used when running negative multinomial regression (dist="NegMN"
).
regBeta
controls whether to run regression on the over-dispersion parameter.
The default is regBeta=FALSE
.
an optional numerical variable used only when fitting sparse negative multinomial
model dist="NegMN"
and regBeta=FALSE
. overdisp
gives the over dispersion value
for all the observations. The default value is estimated using negative-multinomial regression. When dist="MN", "DM", "GDM"
or regBeta=TRUE
, the value of overdisp
is ignored.
a matrix containing the multivariate categorical response data.
Rows of the matrix represent observations, while columns are the different
categories. Rows and columns of all zeros are automatically removed when
dist="MN"
, "DM"
, or "GDM"
.
design matrix (including intercept).
Number of rows of the matrix should match that of Y
.
Returns an object of class "MGLMsparsereg"
. An object of class "MGLMsparsereg"
is a list containing at least the following components:
coefficients
the estimated matrix of regression coefficients.
logL
the final loglikelihood value.
AIC
Akaike information criterion.
BIC
Bayesian information criterion.
Dof
degrees of freedom of the estimated parameter.
iter
number of iterations used.
maxlambda
the maxmum tuning parameter such that
the estimated coefficients are not all zero. This value is returned only
when the tuning parameter lambda
given to the function is large enough
such that all the parameter estimates are zero; otherwise, maxlambda
is not computed.
call
a matched call.
data
the data used to fit the model: a list of the predictor matrix
and the response matrix.
penalty
the penalty chosen when running the penalized regression.
In general, we consider regularization problem $$ \min_B h(B) = -l(B)+ J(B), $$ where \(l(B)\) is the loglikelihood function and \(J(B)\) is the regularization function.
Sparsity in the individual elements of the parameter matrix \(B\) is achieved
by the lasso penalty (dist="sweep"
)
$$
J(B) = \lambda \sum_{k\in penidx} \sum_{j=1}^d \|B_{kj}\|
$$
Sparsity in the rows of the regression parameter matrix \(B\) is achieved
by the group penalty (dist="group"
)
$$
J(B) = \lambda \sum_{k \in penidx} \|B_{k \cdot}\|_2,
$$
where \(\|v\|_2\) is the \(l_2\) norm of a vector \(v\). In other words,
\(\|B_{k\cdot}\|_2\) is the \(l_2\) norm of the \(k\)-th row of the
parameter matrix \(B\).
Sparsity in the rank of the parameter matrix \(B\) is achieved by the nuclear norm penalty (dist="nuclear"
)
$$
J(B) = \lambda \|B\|_*= \lambda \sum_{i=1}^{min(p, d)} \sigma_i(B),
$$
where \(\sigma_i(B)\) are the singular values of the parameter matrix \(B\).
The nuclear norm \(\|B\|_*\) is a convex relaxation of \(rank(B)=\|\sigma(B)\|_0\).
See dist
for details about distributions.
# NOT RUN {
## Generate Dirichlet Multinomial data
dist <- "DM"
n <- 100
p <- 15
d <- 5
m <- runif(n, min=0, max=25) + 25
set.seed(134)
X <- matrix(rnorm(n*p),n, p)
alpha <- matrix(0, p, d)
alpha[c(1,3, 5), ] <- 1
Alpha <- exp(X%*%alpha)
Y <- rdirmn(size=m, alpha=Alpha)
## Tuning
ngridpt <- 10
p <- ncol(X)
d <- ncol(Y)
pen <- 'nuclear'
spfit <- MGLMsparsereg(formula=Y~0+X, dist=dist, lambda=Inf, penalty=pen)
# }
Run the code above in your browser using DataLab