Function that estimates a group-regularized elastic net model.
gren(x, y, m=rep(1, nrow(x)), unpenalized=NULL, partitions=NULL, alpha=0.5,
lambda=NULL, intercept=TRUE, monotone=NULL, psel=TRUE, compare=TRUE,
posterior=FALSE, nfolds=nrow(x), foldid=NULL, trace=TRUE,
init=list(lambdag=NULL, mu=NULL, sigma=NULL, chi=NULL, ci=NULL),
control=list(epsilon=0.001, maxit=500, maxit.opt=1000, maxit.vb=100))
x: feature data as either a numeric matrix or a data.frame of numeric variables.
y: response as either a numeric vector of length nrow(x) containing the binomial/binary successes, or a matrix with nrow(x) rows and two columns, where the first column contains the binomial/binary failures and the second column the binomial/binary successes (see the sketch after this argument list).
m: numeric of length nrow(x) containing the number of Bernoulli trials.
unpenalized: optional numeric matrix or data.frame of numeric unpenalized covariates with nrow(x) rows.
partitions: list containing the (possibly multiple) partitions of the features. Every list element corresponds to one partition, where every partition is a numeric of length ncol(x) containing the group ids of the features.
alpha: proportion of L1 penalty as a numeric of length 1.
lambda: global penalty parameter. The default NULL results in estimation by cross-validation.
intercept: logical indicating whether an intercept should be included.
monotone: list of two logical vectors of length length(partitions). The first vector, monotone, indicates whether the corresponding partition's penalty multipliers should be estimated monotonically; the second vector, decreasing, indicates whether the monotone penalty multipliers are decreasing in group number.
psel: either a numeric vector indicating the number of features to select, or a logical. If TRUE, feature selection is done by letting glmnet determine the penalty parameter sequence.
compare: logical; if TRUE, a regular non-group-regularized model is estimated for comparison.
posterior: if TRUE, the full variational Bayes posterior is returned.
nfolds: numeric of length 1 with the number of folds used in the cross-validation of the global lambda. The default is nrow(x).
foldid: optional numeric vector of length nrow(x) with the fold assignments of the observations.
trace: if TRUE, progress of the algorithm is printed.
init: optional list containing the starting values of the iterative algorithm. See Details for more information.
control: a list of algorithm control parameters. See Details for more information.
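For binomial data with more than one Bernoulli trial per observation, y can be supplied as a two-column matrix together with m. A minimal sketch (assuming x, n, and partitions are set up as in the Examples below; the number of trials and the success probability are arbitrary):

m <- rep(5, n)                        # five Bernoulli trials per observation
successes <- rbinom(n, m, 0.5)        # simulated number of successes
y <- cbind(m - successes, successes)  # first column failures, second column successes
fit <- gren(x, y, m, partitions=partitions)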
The function returns an S3 list object of class gren with the following components:
call: the function call that produced the output.
alpha: proportion of L1 penalty as a numeric of length 1.
lambda: global penalty parameter as a numeric. Estimated by cross-validation if lambda=NULL.
lambdag.seq: list with the full sequence of penalty multipliers over the iterations.
lambdag: list with the final estimates of the penalty multipliers.
vb.post: list with the variational posterior parameters \(\mu_j\), \(\sigma_{ij}\), \(c_i\), and \(\chi_j\).
freq.model: frequentist elastic net model as output of the glmnet call. NULL if psel=FALSE.
iter: list with the number of iterations of the lambdag estimation, the number of optimisation iterations of lambdag, and the number of variational Bayes iterations.
conv: list of logicals indicating convergence of the lambdag sequence, the optimisation steps, and the variational Bayes iterations.
args: list with the input arguments of the gren call.
This is the main function of the package that estimates a group-regularized elastic net regression. The elastic net penalty's proportion of L1-norm penalisation is determined by alpha: alpha close to 0 gives a more ridge-like penalty, while alpha close to 1 gives a more lasso-like penalty. The algorithm is a two-step procedure: first, a global lambda penalty is estimated by cross-validation. Next, the group-wise lambda multipliers are estimated by an EM algorithm. The EM algorithm consists of: (i) an expectation step in which the expected marginal likelihood of the penalty multipliers is iteratively approximated by a variational Bayes EM algorithm and (ii) a maximisation step in which the approximate expected marginal likelihood is maximised with respect to the penalty multipliers. After convergence of the algorithm, an (optional) frequentist elastic net model is fit using the estimated penalty multipliers by setting psel=TRUE or by setting psel to a numeric vector.
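For example, a fixed number of features can be selected with the group-adapted penalties by passing a numeric psel. A minimal sketch (assuming x, y, m, and partitions are set up as in the Examples below):

fit.sel <- gren(x, y, m, partitions=partitions, alpha=0.5, psel=10)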
The user may speed up the procedure by specifying initial values for the EM algorithm in init. init is a list that contains:
lambdag: initial values for the \(\lambda_g\) in a list of length length(partitions).
mu: initial values for the \(\mu_j\) in a numeric vector of length ncol(x) + ncol(unpenalized) + intercept.
chi: initial values for the \(\chi_j\) in a numeric vector of length ncol(x).
ci: initial values for the \(c_i\) in a numeric vector of length nrow(x).
sigma: initial values for the \(\Sigma_{ij}\) in a numeric matrix with ncol(x) rows and columns.
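As an illustration, starting values for the penalty multipliers of a single partition with two groups could be supplied as follows (a sketch; it assumes each lambdag element holds one multiplier per group, and the remaining init components keep their NULL defaults and are initialised by the algorithm):

fit.init <- gren(x, y, m, partitions=partitions,
                 init=list(lambdag=list(c(1, 1)), mu=NULL, sigma=NULL,
                           chi=NULL, ci=NULL))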
control is a list with parameters to control the estimation procedure. It consists of the following components:
epsilon: numeric with the relative convergence tolerance. Default is epsilon=0.001.
maxit: whole number giving the maximum number of iterations to update the lambdag. Default is maxit=500.
maxit.opt: whole number giving the maximum number of iterations to numerically maximise the lambdag. Maximisation occurs at every iteration. Default is maxit.opt=1000.
maxit.vb: whole number giving the maximum number of iterations to update the variational parameters mu, sigma, chi, and ci. One full update sequence is performed per iteration. Default is maxit.vb=100.
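For example, a run with a stricter convergence tolerance and more penalty multiplier updates might look as follows (a sketch; the remaining control components are set to their defaults):

fit.ctrl <- gren(x, y, m, partitions=partitions,
                 control=list(epsilon=1e-04, maxit=1000, maxit.opt=1000,
                              maxit.vb=100))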
Münch, M.M., Peeters, C.F.W., van der Vaart, A.W., and van de Wiel, M.A. (2018). Adaptive group-regularized logistic elastic net regression. arXiv:1805.00389v1 [stat.ME].
# NOT RUN {
## Create data
p <- 1000
n <- 100
set.seed(2018)
x <- matrix(rnorm(n*p), ncol=p, nrow=n)
beta <- c(rnorm(p/2, 0, 0.1), rnorm(p/2, 0, 1))
m <- rep(1, n)
y <- rbinom(n, m, as.numeric(1/(1 + exp(-x %*% as.matrix(beta)))))
partitions <- list(groups=rep(c(1, 2), each=p/2))
## estimate model
fit.gren <- gren(x, y, m, partitions=partitions)
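## Inspect the estimated group-wise penalty multipliers and the frequentist
## elastic net fit (non-NULL because psel=TRUE by default); element names
## are as documented in the output components above.
fit.gren$lambdag
fit.gren$freq.model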
# }