
plsRglm (version 0.3.3)

PLS_glm_kfoldcv: Partial least squares regression glm models with kfold cross validation

Description

This function implements k-fold cross-validation on complete or incomplete datasets for partial least squares regression generalized linear models.

Usage

PLS_glm_kfoldcv(dataY, dataX, nt = 2, limQ2set = 0.0975, modele = "pls", family = NULL, K = nrow(dataX), NK = 1, grouplist = NULL, random = FALSE, scaleX = TRUE, scaleY = NULL, keepcoeffs = FALSE, keepfolds = FALSE, keepdataY = TRUE, keepMclassed=FALSE, tol_Xi = 10^(-12))

Arguments

dataY
response (training) dataset
dataX
predictor(s) (training) dataset
nt
number of components to be extracted
limQ2set
limit value for the Q2
modele
name of the PLS glm model to be fitted ("pls", "pls-glm-gaussian", "pls-glm-logistic", "pls-glm-polr").
family
glm family; not used at the moment.
K
number of groups
NK
number of times the group division is made
grouplist
to specify the members of the K groups
random
should the K groups be made randomly
scaleX
scale the predictor(s): must be set to TRUE for modele="pls" and is recommended for the PLS glm models.
scaleY
scale the response: yes/no. Ignored, since scaling is not always possible for glm responses.
keepcoeffs
shall the coefficients for each model be returned
keepfolds
shall the groups' composition be returned
keepdataY
shall the observed values of the response corresponding to each predicted value be returned
keepMclassed
shall the number of misclassified observations be returned (currently unavailable)
tol_Xi
minimal value for Norm2(Xi) and $\mathrm{det}(pp' \times pp)$ if there are any missing values in dataX. It defaults to $10^{-12}$

Value

  • results_kfolds: list of NK elements. Each element of the list sums up the results for a group division: a list of K matrices with the predicted values of the response for 1 to nt components, one matrix per group (see the sketch below).
  • folds: list of NK elements. Each element of the list sums up the information for a group division: a list of K vectors giving the indices of the observations that make up each of the K groups (returned when keepfolds=TRUE).
  • dataY_kfolds: list of NK elements. Each element of the list sums up the results for a group division: a list of K matrices with the observed values of the response for each group (returned when keepdataY=TRUE).
  • call: the call of the function
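
To make the structure of these components concrete, here is a minimal sketch based on the Cornell data used in the Examples section; the argument values (nt=3, K=6, NK=2, random=TRUE, keepfolds=TRUE) are only illustrative choices.

data(Cornell)
res <- PLS_glm_kfoldcv(dataY=Cornell[,8], dataX=Cornell[,1:7], nt=3,
                       modele="pls-glm-gaussian", K=6, NK=2,
                       random=TRUE, keepfolds=TRUE)
str(res$results_kfolds, max.level=2)  # NK elements, each a list of K prediction matrices
str(res$folds, max.level=2)           # composition of the K groups for each division
str(res$dataY_kfolds, max.level=2)    # observed responses matching the predictions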

Details

Predicts one group with the K-1 other groups. Leave-one-out cross-validation is thus obtained for K==nrow(dataX).
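
As a minimal sketch of the leave-one-out special case (using the pine data that also appears in the Examples; nt=4 is an arbitrary choice):

data(pine)
# K equal to the number of rows: each group holds a single observation (leave-one-out CV)
loo <- PLS_glm_kfoldcv(dataY=pine[,11], dataX=pine[,1:10], nt=4,
                       modele="pls-glm-gaussian", K=nrow(pine))
kfolds2CVinfos_glm(loo)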

References

Nicolas Meyer, Myriam Maumy-Bertrand and Frédéric Bertrand (2010). Comparaison de la régression PLS et de la régression logistique PLS : application aux données d'allélotypage. Journal de la Société Française de Statistique, 151(2), pages 1-18. http://smf4.emath.fr/Publications/JSFdS/151_2/pdf/sfds_jsfds_151_2_1-18.pdf

See Also

kfolds2coeff, kfolds2Pressind, kfolds2Press, kfolds2Mclassedind, kfolds2Mclassed and kfolds2CVinfos_glm to extract and transform results from kfold cross validation.
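
For instance, the PRESS-based helpers listed above can be applied to a cross-validated object in the same way as the Chi-square ones used in the Examples; a sketch, assuming bbb is the modele="pls" object fitted on the Cornell data in the Examples below:

kfolds2Pressind(bbb)  # PRESS contribution of each fold
kfolds2Press(bbb)     # PRESS summed over the folds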

Examples

data(Cornell)
XCornell<-Cornell[,1:7]
yCornell<-Cornell[,8]
bbb <- PLS_glm_kfoldcv(dataY=yCornell,dataX=XCornell,nt=10,NK=1,modele="pls")
kfolds2CVinfos_glm(bbb)

PLS_glm_kfoldcv(dataY=yCornell,dataX=XCornell,nt=3,modele="pls-glm-gaussian",K=12)
PLS_glm_kfoldcv(dataY=yCornell,dataX=XCornell,nt=3,modele="pls-glm-gaussian",K=6,NK=2,random=TRUE,keepfolds=TRUE)$results_kfolds
PLS_glm_kfoldcv(dataY=yCornell,dataX=XCornell,nt=3,modele="pls-glm-gaussian",K=6,NK=2,random=FALSE,keepfolds=TRUE)$results_kfolds

bbb2 <- PLS_glm_kfoldcv(dataY=yCornell,dataX=XCornell,nt=10,modele="pls-glm-gaussian",keepcoeffs=TRUE)

#For Jackknife computations
kfolds2coeff(bbb2)
boxplot(kfolds2coeff(bbb2)[,1])

kfolds2Chisqind(bbb2)
kfolds2Chisq(bbb2)
kfolds2CVinfos_glm(bbb2)
PLS_v1(log(yCornell),XCornell,10,typeVC="standard")$CVinfos
rm(list=c("XCornell","yCornell","bbb","bbb2"))


data(pine)
Xpine<-pine[,1:10]
ypine<-pine[,11]
bbb <- PLS_glm_kfoldcv(dataY=ypine,dataX=Xpine,nt=10,modele="pls-glm-gaussian",K=10,keepcoeffs=TRUE,keepfolds=FALSE)

#For Jackknife computations
kfolds2coeff(bbb)
boxplot(kfolds2coeff(bbb)[,1])

kfolds2Chisqind(bbb)
kfolds2Chisq(bbb)
kfolds2CVinfos_glm(bbb)
PLS_v1(log(ypine),Xpine,10,typeVC="standard")$CVinfos

XpineNAX21 <- Xpine
XpineNAX21[1,2] <- NA

bbb2 <- PLS_glm_kfoldcv(dataY=ypine,dataX=XpineNAX21,nt=10,modele="pls-glm-gaussian",K=10,keepcoeffs=TRUE,keepfolds=FALSE)

#For Jackknife computations
kfolds2coeff(bbb2)
boxplot(kfolds2coeff(bbb2)[,1])

kfolds2Chisqind(bbb2)
kfolds2Chisq(bbb2)
kfolds2CVinfos_glm(bbb2)
PLS_v1(log(ypine),XpineNAX21,10,typeVC="standard")$CVinfos
rm(list=c("Xpine","XpineNAX21","ypine","bbb","bbb2"))


data(aze_compl)
Xaze_compl<-aze_compl[,2:34]
yaze_compl<-aze_compl$y
bbb <- PLS_glm_kfoldcv(yaze_compl,Xaze_compl,nt=10,K=10,modele="pls")
kfolds2coeff(bbb)
bbb2 <- PLS_glm_kfoldcv(yaze_compl,Xaze_compl,nt=10,K=10,modele="pls-glm-logistic")
kfolds2CVinfos_v2(bbb,MClassed=TRUE)
kfolds2CVinfos_v2(bbb2,MClassed=TRUE)
kfolds2coeff(bbb2)

kfolds2Chisqind(bbb2)
kfolds2Chisq(bbb2)
kfolds2CVinfos_glm(bbb2)
rm(list=c("Xaze_compl","yaze_compl","bbb","bbb2"))
