fregre.pc.cv: Functional penalized PC regression with scalar response using selection of number of PC components

Description

Functional regression with scalar response, selecting the number of (penalized) principal components (PC) through cross-validation. The algorithm selects the PCs that best estimate the response. The selection is performed by cross-validation (CV) or Model Selection Criteria (MSC). The functional regression is then computed using the best selection of principal components.

Usage

fregre.pc.cv(fdataobj, y, kmax=8,  lambda = 0, P = c(1, 0, 0), 
    criteria = "SIC",weights=rep(1,len=n),...)

Arguments

fdataobj

fdata class object.

y

Scalar response with length n.

kmax

The maximum number of principal components to consider for the model, given as an integer (recommended) or as a sequence of integers; see Details.

lambda

Vector with the amounts of penalization. Default value is 0, i.e. no penalization is used. If lambda=TRUE the algorithm computes a sequence of lambda values.

P

The vector of coefficients to define the penalty matrix object. For example, if P=c(1,0,0), ridge regression is computed, and if P=c(0,0,1), penalized regression is computed penalizing the second derivative (curvature).

criteria

Type of cross-validation (CV) or Model Selection Criteria (MSC) applied. Possible values are "CV", "AIC", "AICc", "SIC" and "SICc"; Details also gives the penalty for "HQIC".

weights

...

Further arguments passed to fregre.pc or fregre.pls
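
To make the P argument more concrete, the following base R sketch builds a discretized penalty matrix from a coefficient vector, using finite differences as a stand-in for derivatives. This is an assumption made for illustration: penalty_matrix is a hypothetical helper, not a function of the package, and the package's internal penalty may be constructed differently.

```r
# Illustrative only: `penalty_matrix` is a hypothetical helper, not part of fda.usc.
# P[d + 1] weights the penalty on the d-th order differences (d = 0, 1, 2, ...).
penalty_matrix <- function(P, p) {
  pen <- matrix(0, p, p)
  for (d in seq_along(P) - 1) {
    if (P[d + 1] == 0) next
    # d = 0: identity (ridge); d > 0: d-th order difference operator
    D <- if (d == 0) diag(p) else diff(diag(p), differences = d)
    pen <- pen + P[d + 1] * crossprod(D)
  }
  pen
}

ridge_pen <- penalty_matrix(c(1, 0, 0), 4)  # identity matrix: ridge penalty
curv_pen  <- penalty_matrix(c(0, 0, 1), 4)  # second-difference (curvature) penalty
```

In this sketch the curvature penalty leaves constant and linear coefficient patterns unpenalized, which is the usual motivation for penalizing the second derivative instead of the coefficients themselves.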

Value

Return:

fregre.pc

Fitted regression object by the best (pc.opt) components.

pc.opt

Index of PC components selected.

MSC.min

Minimum Model Selection Criteria (MSC) value for the (pc.opt) components.

MSC

Minimum Model Selection Criteria (MSC) value for kmax components.

Details

The algorithm selects the best principal components pc.opt from the first kmax PCs and (optionally) the best penalization parameter lambda.opt from a sequence of non-negative numbers lambda. If kmax is an integer (the default, and recommended), the procedure is as follows (see example 1):

1. Calculate the best principal component (pc.order[1]) among the kmax components using fregre.pc.
2. Calculate the second-best principal component (pc.order[2]) among the remaining (kmax-1) components using fregre.pc, and calculate the criteria value of the two principal components.
3. The process (points 1 and 2) is repeated until the kmax-th principal component (pc.order[kmax]).
4. The process (points 1, 2 and 3) is repeated for each lambda value.
5. The method selects the principal components (pc.opt=pc.order[1:k.min]) and (optionally) the lambda parameter with minimum MSC criteria.
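
The forward selection described in the points above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: it uses prcomp and lm on simulated data instead of fregre.pc, ignores the lambda loop, and scores each step with the leave-one-out CV error (computed through the hat-matrix shortcut for linear models). The names pcv, scores, crit and the simulated data are all illustrative.

```r
# Simulated stand-in for discretized curves and a response driven by PCs 3 and 1
set.seed(1)
n <- 60; kmax <- 5
X <- matrix(rnorm(n * 20), n, 20)
scores <- prcomp(X)$x[, 1:kmax]              # first kmax PC scores
y <- 2 * scores[, 3] - scores[, 1] + rnorm(n, sd = 0.5)

# Leave-one-out CV error of a linear fit, via the hat-matrix shortcut
pcv <- function(fit) mean((resid(fit) / (1 - hatvalues(fit)))^2)

pc.order <- integer(0)
crit <- numeric(kmax)
for (k in seq_len(kmax)) {
  candidates <- setdiff(seq_len(kmax), pc.order)
  # Score each remaining component when added to those already chosen
  cand.crit <- sapply(candidates, function(j)
    pcv(lm(y ~ scores[, c(pc.order, j), drop = FALSE])))
  pc.order <- c(pc.order, candidates[which.min(cand.crit)])
  crit[k] <- min(cand.crit)
}

k.min <- which.min(crit)          # number of components with minimum criterion
pc.opt <- pc.order[1:k.min]       # selected components, in order of entry
```

Because components enter in order of usefulness and not in order of explained variance, pc.opt need not be the first k.min components; in this simulation the two informative components (3 and 1) are picked before the others.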

If kmax is a sequence of integers, the procedure is as follows (see example 2):

1. The method selects the best principal components with minimum MSC criteria by stepwise regression, using fregre.pc in each step.
2. The process (point 1) is repeated for each lambda value.
3. The method selects the principal components (pc.opt=pc.order[1:k.min]) and (optionally) the lambda parameter with minimum MSC criteria.

Finally, the functional PC regression between the functional explanatory variable $X(t)$ and the scalar response $Y$ is computed using the best selection of PCs pc.opt and ridge parameter rn.opt. The selection criterion is either cross-validation (CV) or a Model Selection Criterion (MSC):

Predictive Cross-Validation, criteria="CV":

$PCV(k_n)=\frac{1}{n}\sum_{i=1}^{n}{\Big(y_i -\hat{y}_{(-i,k_n)} \Big)^2}$

Model Selection Criteria:

$MSC(k_n)=\log \left[ \frac{1}{n}\sum_{i=1}^{n}{\Big(y_i-\hat{y}_i\Big)^2} \right] +p_n\frac{k_n}{n}$

with the penalty $p_n$ given by:

$p_n=\frac{\log(n)}{n}$, criteria="SIC" (by default)
$p_n=\frac{\log(n)}{n-k_n-2}$, criteria="SICc"
$p_n=2$, criteria="AIC"
$p_n=\frac{2n}{n-k_n-2}$, criteria="AICc"
$p_n=\frac{2\log(\log(n))}{n}$, criteria="HQIC"

where criteria is an argument that controls the type of validation used in the selection of the smoothing parameter kmax $=k_n$ and the penalization parameter lambda $=\lambda$.
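
As a worked illustration of the Model Selection Criteria, the base R function below evaluates $MSC(k_n)$ with the penalties $p_n$ exactly as they are printed in this section. The helper name msc and its arguments are hypothetical and not part of the package; the package may compute these quantities differently internally.

```r
# Hypothetical helper following the formulas as printed in Details.
# y: observed response; fitted: fitted values; k: number of components k_n.
msc <- function(y, fitted, k,
                criteria = c("SIC", "SICc", "AIC", "AICc", "HQIC")) {
  criteria <- match.arg(criteria)
  n <- length(y)
  pn <- switch(criteria,
    SIC  = log(n) / n,
    SICc = log(n) / (n - k - 2),
    AIC  = 2,
    AICc = 2 * n / (n - k - 2),
    HQIC = 2 * log(log(n)) / n)
  # log of the mean squared residual plus the penalty term p_n * k_n / n
  log(mean((y - fitted)^2)) + pn * k / n
}

y.obs <- c(1, 2, 3, 4)
msc(y.obs, y.obs - 1, k = 2, criteria = "AIC")  # log(1) + 2 * 2 / 4 = 1
```

Comparing the value across candidate k (and, optionally, across lambda values) and keeping the minimum mirrors the selection step described above.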

References

Febrero-Bande, M., Oviedo de la Fuente, M. (2012). Statistical Computing in Functional Data Analysis: The R Package fda.usc. Journal of Statistical Software, 51(4), 1-28. http://www.jstatsoft.org/v51/i04/

Examples


library(fda.usc)
data(tecator)
x <- tecator$absorp.fdata[1:129]
y <- tecator$y$Fat[1:129]
# kmax as an integer, no penalization
res.pc1 <- fregre.pc.cv(x, y, 8)
# kmax as an integer, 2nd derivative penalization
res.pc2 <- fregre.pc.cv(x, y, 8, lambda = TRUE, P = c(0, 0, 1))
# kmax as a sequence of integers, ridge regression
res.pc3 <- fregre.pc.cv(x, y, 1:8, lambda = TRUE, P = 1)
