Computation of the AIC and Mallows's Cp for PLSR models

Usage
aicplsr(
X, y, nlv, algo = NULL,
meth = c("cg", "div", "cov"),
correct = TRUE, B = 50,
print = FALSE, ...)
Value

crit : dataframe with the estimated degrees of freedom (df) and the aic, cp1 and cp2 criterion values for the models with 0 to nlv latent variables.

delta : dataframe with the differences between the estimated values of each criterion and its minimum over the models.

opt : vector with the optimal number of latent variables in the model (i.e. minimizing the aic, cp1 and cp2 values).
Arguments

X : A n x p matrix or dataframe of training observations (X-data).

y : A vector of length n of training responses (univariate y-data).

nlv : The maximal number of latent variables (LVs) to consider in the model.

algo : A function implementing a PLS algorithm. Defaults to NULL (plskern is used).

meth : Method used for estimating the model complexity (degrees of freedom df): "cg" (dfplsr_cg), "cov" (dfplsr_cov) or "div" (dfplsr_div). See the sketch below the argument list.

correct : Logical. If TRUE (default), the AICc correction is applied to the criteria.

B : For meth = "div": the number of observations in the data receiving a perturbation (see dfplsr_div). For meth = "cov": the number of bootstrap replications (see dfplsr_cov).

print : Logical. If TRUE, fitting information is printed.

... : Optional arguments to pass to algo.
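For illustration, a hedged sketch of non-default settings (X and y stand for any training matrix and response vector already in the workspace; the values of nlv and B are arbitrary):

## Hypothetical calls illustrating the meth and B arguments;
## X (n x p matrix) and y (length-n vector) are assumed given.
res_cg  <- aicplsr(X, y, nlv = 20, meth = "cg")           # CG-based df estimate
res_div <- aicplsr(X, y, nlv = 20, meth = "div", B = 30)  # perturb 30 observations
res_cov <- aicplsr(X, y, nlv = 20, meth = "cov", B = 50)  # 50 bootstrap replications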
Details

For a model with a latent variables, aicplsr calculates

    AIC(a) = n * log(SSR(a) / n) + 2 * (df(a) + 1)

    Cp(a) = SSR(a) / n + 2 * df(a) * s2 / n

where SSR(a) is the sum of squared residuals for the model with a latent variables, df(a) the estimated model complexity (number of degrees of freedom), s2 an estimate of the irreducible error variance and n the number of training observations.

By default (argument correct), the small-sample-size correction (the so-called AICc) is applied to AIC and Cp to reduce their bias.

The function returns two estimates of Cp (cp1 and cp2), which differ by the estimate used for the error variance s2. The model complexity df is estimated by the method set in argument meth. A numeric sketch of the formulas is given below.
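As a worked illustration of the formulas above, a minimal numeric sketch (made-up values of SSR, df and s2; this is not the internal code of aicplsr):

## Evaluate the AIC and Cp formulas for one model, with made-up inputs
n <- 100; ssr <- 25; df <- 8; s2 <- 0.2
aic <- n * log(ssr / n) + 2 * (df + 1)  # AIC(a)
cp <- ssr / n + 2 * df * s2 / n         # Cp(a)
c(aic = aic, cp = cp)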
References

Burnham, K.P., Anderson, D.R., 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. Springer, New York, NY, USA.
Burnham, K.P., Anderson, D.R., 2004. Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociological Methods & Research 33, 261-304. https://doi.org/10.1177/0049124104268644
Efron, B., 2004. The Estimation of Prediction Error. Journal of the American Statistical Association 99, 619-632. https://doi.org/10.1198/016214504000000692
Eubank, R.L., 1999. Nonparametric Regression and Spline Smoothing, 2nd ed, Statistics: Textbooks and Monographs. Marcel Dekker, Inc., New York, USA.
Hastie, T., Tibshirani, R.J., 1990. Generalized Additive Models, Monographs on Statistics and Applied Probability. Chapman and Hall/CRC, New York, USA.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York.
Hastie, T., Tibshirani, R., Wainwright, M., 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press.
Hurvich, C.M., Tsai, C.-L., 1989. Regression and Time Series Model Selection in Small Samples. Biometrika 76, 297-307. https://doi.org/10.2307/2336663
Lesnoff, M., Roger, J.M., Rutledge, D.N., Submitted. Monte Carlo methods for estimating Mallows's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data. Journal of Chemometrics.
Mallows, C.L., 1973. Some Comments on Cp. Technometrics 15, 661-675. https://doi.org/10.1080/00401706.1973.10489103
Ye, J., 1998. On Measuring and Correcting the Effects of Data Mining and Model Selection. Journal of the American Statistical Association 93, 120-131. https://doi.org/10.1080/01621459.1998.10474094
Zuccaro, C., 1992. Mallows' Cp Statistic and Model Selection in Multiple Linear Regression. International Journal of Market Research 34, 1-10. https://doi.org/10.1177/147078539203400204
Examples

data(cassav)
Xtrain <- cassav$Xtrain
ytrain <- cassav$ytrain

## Compute the criteria for models with 0 to 25 latent variables
nlv <- 25
res <- aicplsr(Xtrain, ytrain, nlv = nlv)
names(res)
headm(res$crit)

## Plot the estimated df and the criterion curves
z <- res$crit
oldpar <- par(mfrow = c(1, 4))
plot(z$df[-1], type = "b", main = "df")
plot(z$aic[-1], type = "b", main = "AIC")
plot(z$cp1[-1], type = "b", main = "Cp1")
plot(z$cp2[-1], type = "b", main = "Cp2")
par(oldpar)
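A possible follow-up, sketched under the assumption that res$opt is a vector named after the criteria, is to refit the model with the AIC-optimal number of LVs:

## Assumes res$opt has components named "aic", "cp1" and "cp2"
nlvopt <- res$opt["aic"]
fm <- plskern(Xtrain, ytrain, nlv = nlvopt)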