cv.ff.interaction: Cross-validation for function-on-function regression models with specified main effects and two-way interaction terms

Description

This function performs cross-validation and builds the final model, using the signal compression approach, for the following linear function-on-function regression model with main effects, two-way interaction effects and quadratic effects. Let $\{X_i(s), 1\le i\le p\}$ be $p$ potential functional predictors. The model is given by $$Y(t)= \mu(t)+\sum_{i \in M}\int_{a_i}^{b_i} X_i(s)\beta_i(s,t)ds+\sum_{(i,j) \in I}\int_{a_i}^{b_i}\int_{a_j}^{b_j} X_i(u)X_j(v)\gamma_{ij}(u,v,t)dudv+\epsilon(t),$$ where $\mu(t)$ is the intercept function. The index set $M$ of main effects is a subset of $\{1,...,p\}$, and the index set $I$ of interactions and quadratic effects is a subset of the collection of all possible pairs $\{(i,j), 1\le i\le j\le p\}$; a pair with $i=j$ corresponds to a quadratic effect. The functions $\{\beta_i(s,t), i \in M\}$ and $\{\gamma_{ij}(u,v,t),(i,j)\in I\}$ are the corresponding coefficient functions, and $\epsilon(t)$ is the noise function.

Usage

cv.ff.interaction(X, Y, t.x, t.y, main.effect, interaction.effect=NULL,
       adaptive=FALSE, s.n.basis=40, t.n.basis=40, inter.n.basis=20,
       basis.type.x="Bspline", basis.type.y="Bspline", K.cv=5, 
       upper.comp=8, thresh=0.01)

Arguments

X

a list of $p$ potential functional predictors. Its $i$-th element is the $n\times m_i$ data matrix for the $i$-th potential functional predictor $X_i(s)$, where $n$ is the sample size and $m_i$ is the number of observation time points for $X_i(s)$.

Y

the $n\times m$ data matrix for the functional response $Y(t)$, where $n$ is the sample size and $m$ is the number of observation time points for $Y(t)$.

t.x

a list of length $p$. Its $i$-th element is the vector of observation time points of the $i$-th functional predictor $X_i(s)$, $1\le i\le p$.

t.y

the vector of observation time points of the functional response $Y(t)$.

main.effect

a vector of indices for main effects. It is a subset of $\{1,2,...,p\}$.

interaction.effect

a matrix with two columns. Each row of this matrix specifies the index pair $(i,j)$ of a two-way interaction ($i<j$) or a quadratic effect ($i=j$). Default is NULL.

adaptive

a logical value indicating whether to use the adaptive penalty, which has different smoothness tuning parameters for different target functions (see Details). Default is FALSE.

s.n.basis

the number of basis functions used for estimating the functions $\psi_{ik}(s)$ (see Details). Default is 40.

t.n.basis

the number of basis functions used for estimating the functions $w_{k}(t)$. Default is 40.

inter.n.basis

the number of one-dimensional basis functions used to construct the tensor product basis functions for estimating the functions $\phi_{ijk}(u,v)$. Default is 20.

basis.type.x

the type of basis functions used for estimating $\psi_{ik}(s)$. Only "Bspline" (default) and "Fourier" are supported.

basis.type.y

the type of basis functions used for estimating $w_{k}(t)$. Only "Bspline" (default) and "Fourier" are supported.

K.cv

the number of CV folds. Default is 5.

upper.comp

the upper bound for the maximum number of components to be calculated. Default is 8.

thresh

a number between 0 and 1 used to determine the maximum number of components to be calculated. This maximum number lies between 1 and upper.comp above. The optimal number of components is chosen between 1 and this maximum number, together with the other tuning parameters, by cross-validation. A smaller thresh value leads to a larger maximum number of components and a longer running time. A larger thresh value needs less running time, but may miss some important components and lead to a larger prediction error. Default is 0.01.
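
The input shapes described in the arguments above can be checked before calling the function. The following is a minimal base-R sketch on simulated data; the object names (X.sim, t.x.sim, all.pairs and so on) and the random curves are illustrative assumptions and are not part of the package.

```r
set.seed(1)
n   <- 30             # sample size
p   <- 3              # number of potential functional predictors
m.x <- c(25, 40, 30)  # number of observation time points m_i per predictor
m.y <- 20             # number of observation time points of the response

# t.x: a list of length p, one vector of observation time points per predictor
t.x.sim <- lapply(m.x, function(m) seq(0, 1, length.out = m))
# t.y: the vector of observation time points of the response
t.y.sim <- seq(0, 1, length.out = m.y)

# X: a list of p matrices, the i-th one being n x m_i (one row per curve)
X.sim <- lapply(seq_len(p), function(i) {
  matrix(rnorm(n * m.x[i]), nrow = n, ncol = m.x[i])
})
# Y: the n x m response matrix
Y.sim <- matrix(rnorm(n * m.y), nrow = n, ncol = m.y)

# main.effect: a subset of 1..p
main.sim <- c(1, 2)

# interaction.effect: a two-column matrix with one row per pair (i, j), i <= j.
# Here: all interactions and quadratic effects among predictors 1 and 2.
grid <- expand.grid(i = main.sim, j = main.sim)
all.pairs <- as.matrix(grid[grid$i <= grid$j, ])
dimnames(all.pairs) <- NULL

# Consistency checks on the assembled inputs
stopifnot(
  length(X.sim) == length(t.x.sim),
  all(vapply(X.sim, nrow, integer(1)) == nrow(Y.sim)),
  all(vapply(X.sim, ncol, integer(1)) == lengths(t.x.sim)),
  ncol(Y.sim) == length(t.y.sim),
  all(main.sim %in% seq_len(p)),
  ncol(all.pairs) == 2,
  all(all.pairs[, 1] <= all.pairs[, 2])
)

# The objects would then be passed on as (not run here):
# cv.ff.interaction(X.sim, Y.sim, t.x.sim, t.y.sim,
#                   main.effect = main.sim, interaction.effect = all.pairs)
```

Here all.pairs has the rows (1,1), (1,2) and (2,2), the same specification as rbind(c(1,1), c(1,2), c(2,2)).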

Value

An object of the "cv.ff.interaction" class, which is used in the function pred.ff.interaction for prediction and getcoef.ff.interaction for extracting the estimated coefficient functions.

fitted_model

a list for internal use.

y_penalty_inv

a list for internal use.

X

the input X.

Y

the input Y.

x.smooth.params

a list for internal use.

y.smooth.params

a list for internal use.

basis.types

a vector including basis.type.x and basis.type.y.

Details

This method uses the decompositions of the coefficient functions $$\beta_i(s,t)=\sum_{k=1}^\infty\psi_{ik}(s)w_k(t), \quad i\in M,$$ and $$\gamma_{ij}(u,v,t)=\sum_{k=1}^\infty\phi_{ij,k}(u,v)w_k(t), \quad (i,j)\in I.$$

For each $k>0$, the functions $\{\{\psi_{ik}(s), i\in M\}, \{\phi_{ij,k}(u,v), (i,j)\in I\}\}$ are estimated by solving a generalized functional eigenvalue problem with the nonadaptive penalty $$\lambda\sum_{i\in M} \{|| \psi_{ik}||^2+ \tau ||\psi''_{ik}||^2\} + \lambda\sum_{(i,j)\in I} \{|| \phi_{ij,k}||^2+ \tau (||\partial_{uu}\phi_{ij,k}||^2+||\partial_{uv}\phi_{ij,k}||^2+||\partial_{vv}\phi_{ij,k}||^2)\} $$ or the adaptive penalty $$\lambda\sum_{i\in M} \{\omega_{ik}^{(0)}||\psi_{ik}||^2+ \tau \omega_{ik}^{(2)}||\psi''_{ik}||^2\} + \lambda\sum_{(i,j)\in I} \{\omega_{ij,k}^{(00)}|| \phi_{ij,k}||^2+ \tau (\omega_{ij,k}^{(20)}||\partial_{uu}\phi_{ij,k}||^2+\omega_{ij,k}^{(11)}||\partial_{uv}\phi_{ij,k}||^2+\omega_{ij,k}^{(02)}||\partial_{vv}\phi_{ij,k}||^2)\}. $$

Then $\{w_{k}(t), k>0\}$ are estimated by regressing $Y(t)$ on $\{\hat{z}_{1},..., \hat{z}_{K}\}$ with the nonadaptive penalty $\kappa \sum_{k=0}^K \|w''_k\|^2$ or the adaptive penalty $\kappa \sum_{k=0}^K \omega_k^{(t)}\|w''_k\|^2$, where $\kappa$ is the smoothness tuning parameter. Here $$\hat{z}_{k}= \sum_{i \in M}\int_{a_i}^{b_i} X_{i}(s)\hat{\psi}_{ik}(s)ds +\sum_{(i,j) \in I}\int_{a_i}^{b_i}\int_{a_j}^{b_j} X_{i}(u)X_{j}(v)\hat{\phi}_{ij,k}(u,v)dudv,$$ which is then centered around its sample mean.
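
The role of the decomposition can be illustrated numerically: when $\beta_i(s,t)$ is a finite sum of products $\psi_{ik}(s)w_k(t)$, the main-effect integral reduces to $\sum_k z_k w_k(t)$, so the predictors enter only through the scalar scores $z_k$. The base-R sketch below checks this identity and evaluates one interaction score. It assumes trapezoidal quadrature on the observation grids, and the helpers trap.weights and double.integral as well as the chosen $\psi$, $w$ and $\phi$ are hypothetical illustrations, not the package's internal code.

```r
# Trapezoidal quadrature weights on a grid (an assumed quadrature rule)
trap.weights <- function(grid) {
  d <- diff(grid)
  (c(d, 0) + c(0, d)) / 2
}

# Approximates, for each curve (row), the double integral of
# Xi(u) * Xj(v) * phi(u, v) over the two grids
double.integral <- function(Xi, Xj, phi, w.u, w.v) {
  rowSums((Xi %*% (w.u * phi)) * sweep(Xj, 2, w.v, "*"))
}

set.seed(2)
n      <- 10
s.grid <- seq(0, 1, length.out = 101)
t.grid <- seq(0, 1, length.out = 50)
w.s    <- trap.weights(s.grid)

X1 <- matrix(rnorm(n * length(s.grid)), nrow = n)  # n curves X_1(s)

# Two hypothetical components (K = 2)
psi <- cbind(sin(pi * s.grid), cos(pi * s.grid))   # psi_{1k}(s), one column per k
w   <- cbind(t.grid, t.grid^2)                     # w_k(t), one column per k

# beta_1(s, t) = sum_k psi_{1k}(s) w_k(t), evaluated on the grid
beta <- psi %*% t(w)

# Direct evaluation of the integral of X_1(s) beta_1(s, t) over s
direct <- X1 %*% (w.s * beta)

# Compressed evaluation: scores z_k first, then sum_k z_k w_k(t)
z          <- X1 %*% (w.s * psi)
compressed <- z %*% t(w)
stopifnot(isTRUE(all.equal(direct, compressed)))

# Interaction score for a hypothetical phi(u, v) = u * v with X_i = X_j = X_1
phi     <- outer(s.grid, s.grid)
z.inter <- double.integral(X1, X1, phi, w.s, w.s)

# Scores are centered around their sample means
z.centered <- scale(z, center = TRUE, scale = FALSE)
```

Both evaluations agree up to floating-point error, which is why only the $n\times K$ score matrix, rather than the full curves, is needed when regressing $Y(t)$ on the components.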

References

Ruiyan Luo and Xin Qi (2018) Interaction model and model selection for function-on-function regression, Journal of Computational and Graphical Statistics. https://doi.org/10.1080/10618600.2018.1514310

Examples

# NOT RUN {
 
################################################################################# 
# Example: interaction function-on-function model with 
#          specified effects 
############################################################################### 


ptm <- proc.time()
library(FRegSigCom)
data(ocean)
Y=ocean[[1]]
Y.train=Y[1:50,]
Y.test=Y[-(1:50),]
t.y=seq(0,1, length.out = ncol(Y))
X.list=list()
X.train.list=list()
X.test.list=list()
t.x.list=list()
for(i in 1:4)
{
  X.list[[i]]=ocean[[i+1]]
  X.train.list[[i]]=X.list[[i]][1:50,]
  X.test.list[[i]]=X.list[[i]][-(1:50),]
  t.x.list[[i]]=seq(0,1, length.out = ncol(X.list[[i]]))
}
main.effect=1:2
inter.effect=rbind(c(1,1), c(1,2), c(2,2))
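# Fit with the adaptive penalty; effects are passed by name for clarity
fit.fix.adaptive=cv.ff.interaction(X.train.list, Y.train, t.x.list, t.y,
           main.effect=main.effect, interaction.effect=inter.effect,
           adaptive=TRUE)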
Y.pred=pred.ff.interaction(fit.fix.adaptive,  X.test.list)

# Mean squared prediction error on the test curves
error <- mean((Y.pred-Y.test)^2)
cat("prediction error =", error, "\n")
print(proc.time()-ptm)

# }
