This function fits a multi-response pliable lasso model over a path of regularization values.
admm_MADMMplasso(
beta0,
theta0,
beta,
beta_hat,
theta,
rho1,
X,
Z,
max_it,
W_hat,
XtY,
y,
N,
e.abs,
e.rel,
alpha,
lambda,
alph,
svd.w,
tree,
my_print,
invmat,
gg = 0.2
)
predicted values for the ADMM part
beta0: estimated beta_0 coefficients, of size 1 by ncol(y)
beta: matrix of estimated beta coefficients, ncol(X) by ncol(y)
BETA_hat: matrix of estimated beta and theta coefficients, (ncol(X)+ncol(X) by ncol(Z)) by ncol(y)
theta0: matrix of estimated theta_0 coefficients, ncol(Z) by ncol(y)
theta: array of estimated theta coefficients, ncol(X) by ncol(Z) by ncol(y)
converge: did the algorithm converge?
Y_HAT: predicted response, nrow(X) by ncol(y)
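To make the returned shapes concrete, the following base-R sketch builds coefficient objects with the dimensions listed above and forms a prediction with the same shape as Y_HAT. It assumes the standard pliable lasso form, in which response d is predicted from an intercept, a Z main effect, an X main effect, and X-by-Z interaction terms. The helper name predict_sketch and the toy sizes are hypothetical and are not part of the package.

```r
# Toy dimensions (assumed for illustration): N samples, p predictors,
# K modifying variables, D responses.
set.seed(1)
N <- 20; p <- 4; K <- 2; D <- 3
X <- matrix(rnorm(N * p), N, p)
Z <- matrix(rbinom(N * K, 1, 0.5), N, K)

# Coefficient objects with the documented shapes.
beta0  <- matrix(0.5, 1, D)              # 1 by ncol(y)
theta0 <- matrix(0.1, K, D)              # ncol(Z) by ncol(y)
beta   <- matrix(rnorm(p * D), p, D)     # ncol(X) by ncol(y)
theta  <- array(0, dim = c(p, K, D))     # ncol(X) by ncol(Z) by ncol(y)
theta[1, 1, ] <- 1                       # one active interaction per response

# Hypothetical helper: prediction under the assumed pliable lasso form.
predict_sketch <- function(beta0, theta0, beta, theta, X, Z) {
  D <- ncol(beta)
  y_hat <- matrix(0, nrow(X), D)
  for (d in seq_len(D)) {
    # Intercept plus main effects of Z and X for response d.
    main <- beta0[1, d] + Z %*% theta0[, d] + X %*% beta[, d]
    # Interaction term: sum over j of X[, j] * (Z %*% theta[j, , d]).
    theta_d <- matrix(theta[, , d], nrow = ncol(X))
    inter <- rowSums(X * (Z %*% t(theta_d)))
    y_hat[, d] <- main + inter
  }
  y_hat
}

Y_HAT <- predict_sketch(beta0, theta0, beta, theta, X, Z)
dim(Y_HAT)  # nrow(X) by ncol(y)
```

Setting every entry of theta to zero reduces the sketch to a model with main effects only, which is a quick way to check the interaction part in isolation.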
beta0: a vector of length ncol(y) of estimated beta_0 coefficients
theta0: a matrix of the initial theta_0 coefficients, ncol(Z) by ncol(y)
beta: a matrix of the initial beta coefficients, ncol(X) by ncol(y)
beta_hat: a matrix of the initial beta and theta coefficients, (ncol(X)+ncol(X) by ncol(Z)) by ncol(y)
theta: an array of initial theta coefficients, ncol(X) by ncol(Z) by ncol(y)
rho1: the Lagrange variable for the ADMM, which is usually included as rho in the MADMMplasso call.
X: N by p matrix of predictors
Z: N by K matrix of modifying variables. The elements of Z may represent quantitative or categorical variables, or a mixture of the two. Categorical variables should be coded by 0-1 dummy variables: for a k-level variable, one can use either k or k-1 dummy variables.
max_it: maximum number of iterations in the loop for one lambda during the ADMM optimization
W_hat: N by (p+(p by nz)) matrix of the main and interaction predictors. This is generated internally when MADMMplasso is called, or by using the function generate_my_w.
XtY: a matrix formed by multiplying the transpose of X by y.
y: N by D matrix of responses. The X and Z variables are centered in the function. We recommend that X and Z also be standardized before the call.
N: nrow(X)
e.abs: absolute error for the ADMM
e.rel: relative error for the ADMM
alpha: mixing parameter. When the goal is to include more interactions, alpha should be very small, and vice versa.
lambda: user-specified lambda_3 values.
alph: an overrelaxation parameter in [1, 1.8]. The implementation is borrowed from Stephen Boyd's MATLAB code.
svd.w: singular value decomposition of W
tree: the results from the hierarchical clustering of the response matrix. The easy way to obtain this is by using the function tree_parms, which gives a default clustering. However, the user can decide on a specific structure and then input a tree that follows that structure.
my_print: should information from each ADMM iteration be printed along the way? This prints the dual and primal residuals.
invmat: a list of length ncol(y), each element containing the C_d part of equation 32 in the paper
gg: penalty terms for the tree structure for lambda_1 and lambda_2 for the ADMM call.
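Several of the arguments in the usage above (W_hat, XtY, N, svd.w and the starting coefficients) are derived from X, Z and y rather than chosen freely. The base-R sketch below shows one way to prepare them with the documented shapes. The column layout used for W_hat is an assumption made for illustration (main effects first, then each X column multiplied by the columns of Z); the package's generate_my_w may order or scale the columns differently, so that function is the safer choice in practice. Starting all coefficients at zero is likewise an assumption.

```r
set.seed(1)
n <- 30; p <- 5; K <- 2; D <- 3
X <- scale(matrix(rnorm(n * p), n, p))          # standardized predictors
Z <- matrix(rbinom(n * K, 1, 0.5), n, K)        # 0-1 modifying variables
y <- matrix(rnorm(n * D), n, D)                 # N by D responses

# Assumed layout for W_hat: [X, X[,1]*Z, X[,2]*Z, ...], N by (p + p*K).
interactions <- do.call(cbind, lapply(seq_len(p), function(j) X[, j] * Z))
W_hat <- cbind(X, interactions)

XtY   <- crossprod(X, y)     # t(X) %*% y, p by D
svd.w <- svd(W_hat)          # singular value decomposition of W_hat
N     <- nrow(X)             # number of observations

# Zero starting values with the documented dimensions (an assumption).
beta0    <- matrix(0, 1, D)
theta0   <- matrix(0, K, D)
beta     <- matrix(0, p, D)
beta_hat <- matrix(0, p + p * K, D)
theta    <- array(0, dim = c(p, K, D))

dim(W_hat)
```

The remaining arguments (tree, invmat, gg and the tolerances) depend on package internals and are not reconstructed here.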