FRegSigCom (version 0.3.0)

cv.sof.spike: Cross-validation for linear scalar-on-function regression for highly densely observed spiky functional data

Description

This function performs cross-validation and builds the final model for highly densely observed spiky data, using the signal compression approach, for the following linear scalar-on-function regression model: $$Y= \mu+\sum_{i=1}^p\int_{a_i}^{b_i}X_i(s)\beta_i(s)ds+\varepsilon,$$ where \(\mu\) is the intercept, \(\{X_i(s),1\le i\le p\}\) are the \(p\) functional predictors and \(\{\beta_i(s),1\le i\le p\}\) are their corresponding coefficient functions, \(p\) is a positive integer, and \(\varepsilon\) is the random noise.

We require that all the sample curves of each functional predictor are observed on a common dense grid of time points, but the grid may differ across predictors.

Usage

cv.sof.spike(X, Y, t.x, K.cv = 10, upper.level = 10)

Arguments

X

a list of length \(p\), the number of functional predictors. Its \(i\)-th element is the \(n\times m_i\) data matrix for the \(i\)-th functional predictor \(X_i(s)\), where \(n\) is the sample size and \(m_i\) is the number of observation time points for \(X_i(s)\).

Y

an \(n\) dimensional vector of the observed values for the response, where \(n\) is the sample size.

t.x

a list of length \(p\). Its \(i\)-th element is the vector of observation time points of the \(i\)-th functional predictor \(X_i(s)\), \(1\le i\le p\).

K.cv

the number of CV folds. Default is 10.

upper.level

the upper bound for the maximum wavelet resolution level. The optimal maximum resolution level is chosen between 1 and upper.level, together with the other tuning parameters, by cross-validation.
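The required layout of X, Y, and t.x can be illustrated with simulated data. This is a hypothetical sketch using base R only; the sample size, grid sizes, and signal below are arbitrary choices for illustration, not values from the package:

```r
# Simulate p = 2 functional predictors observed on different dense grids
set.seed(1)
n <- 50                 # sample size
m <- c(256, 128)        # number of observation points for each predictor
t.x <- lapply(m, function(mi) seq(0, 1, length.out = mi))

# Each element of X is an n x m_i matrix: one row per sample curve
X <- lapply(t.x, function(tt) {
  outer(rnorm(n), sin(2 * pi * tt)) +
    matrix(rnorm(n * length(tt), sd = 0.1), nrow = n)
})

# Scalar response: here driven (arbitrarily) by the first predictor
Y <- rowMeans(X[[1]]) + rnorm(n, sd = 0.1)

# The lists X and t.x and the vector Y now match the arguments
# expected by cv.sof.spike(X, Y, t.x)
str(X, max.level = 1)
```

With these objects in hand, the call would be `fit.cv <- cv.sof.spike(X, Y, t.x)`, as shown in the Examples section below with real data.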

Value

An object of the "cv.sof.spike" class, which is passed to the function pred.sof.spike for prediction.

mu

the estimated intercept.

coef

a list of \(p\) vectors, where the \(i\)-th vector contains the estimated values of the slope coefficient function \(\beta_i(s)\) at the observation time points in t.x.

...

the optimal tuning parameters selected by cross-validation.

Details

This method uses a wavelet basis to expand \(X_i(s)\) and \(\beta_i(s)\) (\(1 \le i \le p\)), and estimates the expansion coefficients of the \(\beta_i(s)\)'s by a penalized least squares method with penalty $$\lambda\sum_{i=1}^p \{\sum_{j=0}^{J_1}\{2^{-2\alpha e^{-(j-\tau)/\alpha}}2^{2\alpha j}||b_{ij}||^2+ \kappa ||b_i||^2\}\},$$ where \(b_{ij}\) denotes the vector of wavelet coefficients of \(\beta_i(s)\) at the \(j\)-th resolution level, and \(b_{i}\) is the vector concatenating all the \(b_{ij}\) (\(0\le j \le J_1\)).

References

Xin Qi and Ruiyan Luo (manuscript). Functional regression for highly densely observed functional data with novel regularity.

Examples

# NOT RUN {

##########################################################################
# Example: scalar-on-function for highly-densely observed curves
##########################################################################


ptm <- proc.time()
library(FRegSigCom)
data(Pork)
X <- Pork$X
Y <- Pork$Y
ntrain <- 40 # in the paper, 80 observations are used as training data
# observation time points of x(t), rescaled to the range [0, 1]
t.x.list <- list(seq(0, 1, length.out = dim(X)[2]))
train.index <- sample(1:dim(X)[1], ntrain)
X.train <- X.test <- list()

X.train[[1]] <- X[train.index, ]
X.test[[1]] <- X[-train.index, ]
Y.train <- Y[train.index]
Y.test <- Y[-train.index]

fit.cv <- cv.sof.spike(X.train, Y.train, t.x.list)
Y.pred <- pred.sof.spike(fit.cv, X.test)
pred.error <- mean((Y.pred - Y.test)^2)

cat("pred.error =", pred.error, "\n")

print(proc.time() - ptm)


# }