d.spls.GL: Dual Sparse Partial Least Squares (Dual-SPLS) regression for the group lasso norms

Description

The function performs dimension reduction with the group lasso norms. Three norms are available, where $G$ is the number of groups, the vectors $w_g$ hold the coordinates of $w$ for the variables belonging to the group $g$, and $\alpha_g$, $\lambda_g$ and $\gamma_g$ are all positive scalars.

Norm A (generalized norm): $\Omega_g(w)=\|w_g\|_2+ \lambda_g \|w_g\|_1$ where $\Omega(w)=\sum_{g} \alpha_g \Omega_g(w)=1 \textrm{ and } \sum_{g=1}^G \alpha_g=1$,
Norm B (particular case): $\Omega(w)=\|w\|_2+\sum_{g=1}^G \lambda_g\|w_g\|_1$,
Norm C (particular case): $\Omega(w)=\sum_{g=1}^G \alpha_g \|w \|_2+\sum_{g=1}^G \lambda_g \|w_g \|_1$ where $\sum_{g=1}^G \alpha_g=\sum_{g=1}^G \gamma_g=1$ and $\Omega(w_g)=\gamma_g$.

Dual-SPLS for the group lasso norms is designed for situations where the predictor variables can be divided into distinct, meaningful groups. Each group is constrained by an independent threshold, as in the dual sparse lasso methodology: each $w_g$ is collinear to a vector $z_{\nu_g}$ built from the coordinates of $z$ and constrained by the threshold $\nu_g$.

Three variants are defined here, depending on how the groups are combined in the global norm and on the weights assigned to each group. They all give the same result as the lasso norm for $G=1$:

Norm A is the generalized norm of the group lasso. It applies the lasso norm to each group individually while constraining the overall norm. Moreover, the Euclidean norm of each $w_g$ is computed while minimizing the root mean square error of prediction,
Norm B is a particular case and a genuine alternative similar to the lasso-like norm,
Norm C is another particular case that lets the user define the weights for each group.
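
The definitions of norms A and B can be made concrete by evaluating them on a small vector. The sketch below is only an illustration: the helpers `l1`, `l2`, `norm_A` and `norm_B` are hypothetical and are not part of dual.spls, and the values of `alpha` and `lambda` are chosen arbitrarily rather than computed by the method.

```r
# Hypothetical helpers (not part of dual.spls) that evaluate the group norms
l1 <- function(v) sum(abs(v))      # ||v||_1
l2 <- function(v) sqrt(sum(v^2))   # ||v||_2

# Norm A: Omega(w) = sum_g alpha_g * (||w_g||_2 + lambda_g * ||w_g||_1)
norm_A <- function(w, indG, alpha, lambda) {
  groups <- sort(unique(indG))
  sum(sapply(seq_along(groups), function(k) {
    wg <- w[indG == groups[k]]
    alpha[k] * (l2(wg) + lambda[k] * l1(wg))
  }))
}

# Norm B: Omega(w) = ||w||_2 + sum_g lambda_g * ||w_g||_1
norm_B <- function(w, indG, lambda) {
  groups <- sort(unique(indG))
  l2(w) + sum(sapply(seq_along(groups), function(k) {
    lambda[k] * l1(w[indG == groups[k]])
  }))
}

# Toy vector with two groups of two variables each (arbitrary values)
w    <- c(3, -4, 0, 1)
indG <- c(1, 1, 2, 2)

omega_A <- norm_A(w, indG, alpha = c(0.5, 0.5), lambda = c(1, 2))
omega_B <- norm_B(w, indG, lambda = c(1, 2))

# With a single group (G = 1, alpha = 1) both reduce to ||w||_2 + lambda * ||w||_1
one_group <- rep(1, length(w))
omega_A1  <- norm_A(w, one_group, alpha = 1, lambda = 2)
omega_B1  <- norm_B(w, one_group, lambda = 2)
```

In the toy example the first group contributes $5 + 1 \cdot 7$ and the second $1 + 2 \cdot 1$ to norm A before weighting by $\alpha_g$. The single-group calls illustrate why the variants coincide with the lasso-like norm when $G=1$.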

Usage

d.spls.GL(X,y,ncp,ppnu,indG,gamma=NULL,norm="A",verbose=FALSE)

Value

A list with the following elements:

Xmean: the mean vector of the predictors matrix X.
scores: the matrix of dimension (n,ncp) where n is the number of observations. The scores represent the observations in the new component basis computed by the compression step of the Dual-SPLS.
loadings: the matrix of dimension (p,ncp) that represents the Dual-SPLS components.
Bhat: the matrix of dimension (p,ncp) that regroups the regression coefficients for each component.
intercept: the vector of length ncp representing the intercept values for each component.
fitted.values: the matrix of dimension (n,ncp) that represents the predicted values of y.
residuals: the matrix of dimension (n,ncp) that represents the residuals corresponding to the difference between the responses and the fitted values.
lambda: the matrix of dimension (G,ncp) collecting the parameters of sparsity $\lambda_g$ used to fit the model at each iteration and for each group.
alpha: the matrix of dimension (G,ncp) collecting the constraint parameters $\alpha_g$ used to fit the model at each iteration and for each group when the norm chosen is B or C.
zerovar: the matrix of dimension (G,ncp) representing the number of variables shrunk to zero per component and per group.
PP: the vector of length G specifying the number of variables in each group.
ind_diff0: the list of ncp elements giving, for each component, the indices of the non-null regression coefficients.
type: a character specifying the Dual-SPLS norm used. In this case it is either GLA, GLB or GLC.
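
The elements zerovar and PP can be combined to read off the achieved sparsity per group. The sketch below uses mock values shaped like the documented output (two groups, three components); the numbers are made up for illustration and are not results from d.spls.GL.

```r
# Mock values with the documented shapes (G = 2 groups, ncp = 3); not real results
PP <- c(50, 100)                      # number of variables in each group
zerovar <- matrix(c(49, 90,           # component 1: zeros in group 1, group 2
                    48, 88,           # component 2
                    47, 85),          # component 3
                  nrow = 2)           # filled column-wise: rows = groups

# Proportion of variables shrunk to zero, per group (rows) and component (columns)
prop_zero <- sweep(zerovar, 1, PP, "/")

# Number of non-null coefficients per component, over all groups
n_nonzero <- sum(PP) - colSums(zerovar)
```

On a fitted model, `prop_zero` would be expected to stay close to the requested ppnu for each group, and `n_nonzero` should match the lengths of the elements of ind_diff0.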

Arguments

X: a numeric matrix of predictor values of dimension (n,p). Each row represents one observation and each column one predictor variable.
y: a numeric vector or a one column matrix of responses. It represents the response variable for each observation.
ncp: a positive integer. ncp is the number of Dual-SPLS components.
ppnu: a real value in $[0,1]$, or a vector of such values with one entry per group. ppnu is the desired proportion of variables to shrink to zero for each component and for each group.
indG: a numeric vector of length p giving the group index of each predictor variable (each column of X).
gamma: a numeric vector of the norm $\Omega$ of each $w_g$ in case norm="C".
norm: a character specifying the norm chosen between A, B and C. Default value is A.
verbose: a Boolean value indicating whether or not to display the iterations steps. Default value is FALSE.
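
As an illustration of how indG and ppnu fit together, the sketch below builds both from a vector of group sizes. The helper `make_group_inputs` is hypothetical (it is not part of dual.spls), and it assumes that a scalar ppnu means the same proportion for every group.

```r
# Hypothetical helper (not part of dual.spls): build and check grouping inputs
make_group_inputs <- function(group_sizes, ppnu) {
  G <- length(group_sizes)
  # one group label per predictor column, in column order
  indG <- rep(seq_len(G), times = group_sizes)
  # assumption: a scalar ppnu is applied to every group
  if (length(ppnu) == 1) ppnu <- rep(ppnu, G)
  stopifnot(length(ppnu) == G, all(ppnu >= 0 & ppnu <= 1))
  list(indG = indG, ppnu = ppnu)
}

# Two groups of 50 and 100 variables, same sparsity target for both
inputs <- make_group_inputs(group_sizes = c(50, 100), ppnu = 0.9)
```

The resulting `inputs$indG` has one entry per column of X, which is the layout used in the Examples section.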

Author

Louna Alsouki, François Wahl

Details

The resulting solution for $w$, and hence for the coefficients vector, has in the case of d.spls.GL a simple closed-form expression (ref), deriving from the fact that for each group $g$, $w_g$ is collinear to the vector $$z_{\nu,g}=\textrm{sign}({z_g})(|z_g|-\nu_g)_+.$$ Here, for each group $g$, $\nu_g$ is the threshold such that a proportion ppnu of the absolute values of the coordinates of $z_g$ are at most $\nu_g$, so that these coordinates are shrunk to zero. The norms differ in the value of the threshold for each group, that is, in the expression of $\nu_g$ (see reference for details).
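
The group-wise soft-thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's internal code: the helpers `soft_threshold` and `group_soft_threshold` are hypothetical, the threshold $\nu_g$ is taken as an empirical quantile of $|z_g|$, and the final rescaling of each $w_g$ (which is where the three norms differ) is left out.

```r
# Soft-thresholding operator: sign(z) * max(|z| - nu, 0)
soft_threshold <- function(z, nu) sign(z) * pmax(abs(z) - nu, 0)

# Hypothetical helper (not part of dual.spls): threshold each group separately
group_soft_threshold <- function(z, indG, ppnu) {
  groups <- sort(unique(indG))
  # assumption: a scalar ppnu is applied to every group
  if (length(ppnu) == 1) ppnu <- rep(ppnu, length(groups))
  z_nu <- numeric(length(z))
  for (k in seq_along(groups)) {
    idx <- which(indG == groups[k])
    # nu_g: order statistic of |z_g| such that about ppnu[k] of the
    # coordinates of the group are at or below it (assumed choice)
    rank_g <- max(1, round(ppnu[k] * length(idx)))
    nu_g <- sort(abs(z[idx]))[rank_g]
    z_nu[idx] <- soft_threshold(z[idx], nu_g)
  }
  z_nu
}

# Toy data: 10 variables in group 1, 20 in group 2
set.seed(42)
z    <- rnorm(30)
indG <- c(rep(1, 10), rep(2, 20))

z_nu <- group_soft_threshold(z, indG, ppnu = c(0.8, 0.5))

# Number of coordinates set to zero in each group
zeros_per_group <- tapply(z_nu == 0, indG, sum)
```

With continuous data and no ties, this zeroes 8 of the 10 coordinates in the first group and 10 of the 20 in the second, matching the requested proportions 0.8 and 0.5.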

Examples


### load dual.spls library
library(dual.spls)
oldpar <- par(no.readonly = TRUE)

### two predictor matrices
### parameters
n <- 100
p <- c(50,100)
nondes <- c(20,30)
sigmaondes <- c(0.05,0.02)
data <- d.spls.simulate(n=n,p=p,nondes=nondes,sigmaondes=sigmaondes)

X <- data$X
X1 <- X[,(1:p[1])]
X2 <- X[,(p[1]+1):sum(p)]
y <- data$y

indG <-c(rep(1,p[1]),rep(2,p[2]))

#fitting the model
ncp <- 10
ppnu <- c(0.99,0.9)

# norm A
mod.dsplsA <- d.spls.GL(X=X,y=y,ncp=ncp,ppnu=ppnu,indG=indG,norm="A",verbose=TRUE)
n <- dim(X)[1]
p <- dim(X)[2]

str(mod.dsplsA)

### plotting the observed values VS predicted values
plot(y,mod.dsplsA$fitted.values[,6], xlab="Observed values", ylab="Predicted values",
 main="Observed VS Predicted for 6 components")
points(-1000:1000,-1000:1000,type='l')

### plotting the regression coefficients

i=6
nz=mod.dsplsA$zerovar[,i]
plot(1:dim(X)[2],mod.dsplsA$Bhat[,i],type='l',
    main=paste(" Dual-SPLS (GLA), ncp =", i, " #0coef =", nz[1], "/", dim(X1)[2]
    , " #0coef =", nz[2], "/", dim(X2)[2]),
    ylab='',xlab='' )
inonz=which(mod.dsplsA$Bhat[,i]!=0)
points(inonz,mod.dsplsA$Bhat[inonz,i],col='red',pch=19,cex=0.5)
legend("topright", legend ="non null values", bty = "n", cex = 0.8, col = "red",pch=19)

# norm B
mod.dsplsB <- d.spls.GL(X=X,y=y,ncp=ncp,ppnu=ppnu,indG=indG,norm="B",verbose=TRUE)

str(mod.dsplsB)

### plotting the observed values VS predicted values
plot(y,mod.dsplsB$fitted.values[,6], xlab="Observed values", ylab="Predicted values",
main="Observed VS Predicted for 6 components")
points(-1000:1000,-1000:1000,type='l')

### plotting the regression coefficients

i=6
nz=mod.dsplsB$zerovar[,i]
plot(1:dim(X)[2],mod.dsplsB$Bhat[,i],type='l',
    main=paste(" Dual-SPLS (GLB), ncp =", i, " #0coef =", nz[1], "/", dim(X1)[2]
    , " #0coef =", nz[2], "/", dim(X2)[2]),
    ylab='',xlab='' )
inonz=which(mod.dsplsB$Bhat[,i]!=0)
points(inonz,mod.dsplsB$Bhat[inonz,i],col='red',pch=19,cex=0.5)

legend("topright", legend ="non null values", bty = "n", cex = 0.8, col = "red",pch=19)

# norm C
mod.dsplsC <- d.spls.GL(X=X,y=y,ncp=ncp,ppnu=ppnu,indG=indG,gamma=c(0.5,0.5),norm="C",verbose=TRUE)
n <- dim(X)[1]
p <- dim(X)[2]

str(mod.dsplsC)

### plotting the observed values VS predicted values
plot(y,mod.dsplsC$fitted.values[,6], xlab="Observed values", ylab="Predicted values",
main="Observed VS Predicted for 6 components")
points(-1000:1000,-1000:1000,type='l')

### plotting the regression coefficients

i=6
nz=mod.dsplsC$zerovar[,i]
plot(1:dim(X)[2],mod.dsplsC$Bhat[,i],type='l',
    main=paste(" Dual-SPLS (GLC), ncp =", i, " #0coef =", nz[1], "/", dim(X1)[2]
    , " #0coef =", nz[2], "/", dim(X2)[2]),
    ylab='',xlab='' )
inonz=which(mod.dsplsC$Bhat[,i]!=0)
points(inonz,mod.dsplsC$Bhat[inonz,i],col='red',pch=19,cex=0.5)
legend("topright", legend ="non null values", bty = "n", cex = 0.8, col = "red",pch=19)

par(oldpar)
