rob.ff.reg: Robust function-on-function regression

Description

This function fits classical and robust function-on-function regression models of the form $$ Y(t) = \sum_{m=1}^M \int X_m(s) \beta_m(s,t) ds + \epsilon(t),$$ where $Y(t)$ denotes the functional response, $X_m(s)$ denotes the $m$-th functional predictor, $\beta_m(s,t)$ denotes the $m$-th bivariate regression coefficient function, and $\epsilon(t)$ is the error function.

Usage

rob.ff.reg(Y, X, model = c("full", "selected"), emodel = c("classical", "robust"),
 fmodel = c("MCD", "MLTS", "MM", "S", "tau"), nbasisY = NULL, nbasisX = NULL,
 gpY = NULL, gpX = NULL, ncompY = NULL, ncompX = NULL)

Value

A list object with the following components:

data: A list of matrices including the original functional response and functional predictors.
fitted.values: An $n \times p$-dimensional matrix containing the fitted values of the functional response.
residuals: An $n \times p$-dimensional matrix containing the residual functions.
fpca.results: A list object containing the functional principal component analysis results of the functional response and functional predictor variables.
model.details: A list object containing model details, such as number of basis functions, number of principal components, and grid points used for each functional variable.

Arguments

Y: An $n \times p$-dimensional matrix containing the observations of the functional response $Y(t)$, where $n$ is the sample size and $p$ denotes the number of grid points for $Y(t)$.
X: A list consisting of $M$ functional predictors $X_m(s), 1\le m\le M$. Each element of X is an $n \times p_m$-dimensional matrix containing the observations of the $m$-th functional predictor $X_m(s)$, where $n$ is the sample size and $p_m$ denotes the number of grid points for $X_m(s)$.
model: Model to be fitted. Possibilities are "full" and "selected".
emodel: Method to be used for functional principal component decomposition. Possibilities are "classical" and "robust".
fmodel: Fitting model used to estimate the function-on-function regression model. Possibilities are "MCD", "MLTS", "MM", "S", and "tau".
nbasisY: An integer value specifying the number of B-spline basis expansion functions to be used to approximate the functional principal components for the response variable $Y(t)$. If NULL, then $\min(20, p/4)$ B-spline basis expansion functions are used.
nbasisX: A vector with length $M$. Its $m$-th value denotes the number of B-spline basis expansion functions to be used to approximate the functional principal components for the $m$-th functional predictor $X_m(s)$. If NULL, then $\min(20, p_m/4)$ B-spline basis expansion functions are used for each functional predictor, where $p_m$ denotes the number of grid points for $X_m(s)$.
gpY: A vector containing the grid points of the functional response $Y(t)$. If NULL, then $p$ equally spaced time points in the interval [0, 1] are used.
gpX: A list with length $M$. The $m$-th element of gpX is a vector containing the grid points of the $m$-th functional predictor $X_m(s)$. If NULL, then $p_m$ equally spaced time points in the interval [0, 1] are used for the $m$-th functional predictor.
ncompY: An integer specifying the number of functional principal components to be computed for the functional response $Y(t)$. If NULL, then the number of principal components is chosen so that at least 95% of the variation is explained.
ncompX: A vector with length $M$. Its $m$-th value denotes the number of functional principal components to be computed for the $m$-th functional predictor $X_m(s)$. If NULL, then, for each functional predictor, the number of principal components is chosen so that at least 95% of the variation is explained.
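
The NULL defaults for the grid points and the number of components can be illustrated with a small base R sketch. The helpers default_grid and choose_ncomp are hypothetical names introduced here and are not part of the package. The sketch assumes that "at least 95% explained variation" means the smallest number of components whose cumulative share of variance reaches 0.95; the package may resolve ties or rounding differently.

```r
# Hypothetical helpers mirroring the documented defaults (not package code).

# Default grid: p equally spaced points in [0, 1].
default_grid <- function(p) seq(0, 1, length.out = p)

# Assumed rule: smallest number of components whose cumulative
# share of variance is at least `threshold`.
choose_ncomp <- function(eigenvalues, threshold = 0.95) {
  share <- cumsum(eigenvalues) / sum(eigenvalues)
  which(share >= threshold)[1]
}

gpY_default <- default_grid(101)   # what gpY = NULL would imply for p = 101

ev <- c(6, 3, 0.6, 0.4)            # made-up eigenvalues, in decreasing order
K <- choose_ncomp(ev)              # cumulative shares: 0.60, 0.90, 0.96, 1.00
```

With these made-up eigenvalues the first two components explain 90% of the variation, so a third component is needed to reach the 95% threshold.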

Author

Ufuk Beyaztas and Han Lin Shang

Details

When a function-on-function regression model is fitted via functional principal component analysis, both the functional response $Y(t)$ and the functional predictors $X_m(s), 1\le m\le M$ are first decomposed by the functional principal component analysis method: $$Y(t) = \bar{Y}(t) + \sum_{k=1}^K \nu_k \phi_k(t),$$ $$X_m(s) = \bar{X}_m(s) + \sum_{l=1}^{K_m} \xi_{ml} \psi_{ml}(s),$$ where $\bar{Y}(t)$ and $\bar{X}_m(s)$ are the mean functions, $\phi_k(t)$ and $\psi_{ml}(s)$ are the weight functions, and $\nu_k = \int (Y(t) - \bar{Y}(t)) \phi_k(t) dt$ and $\xi_{ml} = \int (X_m(s) - \bar{X}_m(s)) \psi_{ml}(s) ds$ are the principal component scores for the functional response and the $m$-th functional predictor, respectively. Assume that the $m$-th bivariate regression coefficient function admits the expansion $$\beta_m(s,t) = \sum_{k=1}^K \sum_{l=1}^{K_m} b_{mkl} \phi_k(t) \psi_{ml}(s),$$ where $b_{mkl} = \int \int \beta_m(s,t) \phi_k(t) \psi_{ml}(s) dt ds$. Then, the following multiple regression model is obtained for the functional response: $$\hat{Y}(t) = \bar{Y}(t) + \sum_{k=1}^K \left( \sum_{m=1}^M \sum_{l=1}^{K_m} b_{mkl} \xi_{ml} \right) \phi_k(t).$$
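
As an illustration of the steps above, the following base R sketch carries out a classical (least-squares) version on toy data with a single predictor. It is not the package's implementation: it assumes equally spaced grids, replaces the B-spline basis expansion with a plain singular value decomposition of the centred data matrices, and ignores quadrature weights (a constant scale factor that cancels when the curves are reconstructed). The names pca, fy, fx, B and Y_hat, and the simulated data, are made up for the example.

```r
set.seed(1)
n <- 50; p <- 30; q <- 25
s_grid <- seq(0, 1, length.out = q)   # grid for X(s)
t_grid <- seq(0, 1, length.out = p)   # grid for Y(t)

# Toy data: X and Y are driven by the same two random coefficients.
a <- matrix(rnorm(n * 2), n, 2)
X <- a %*% rbind(sin(2 * pi * s_grid), cos(2 * pi * s_grid))
Y <- a %*% rbind(sin(pi * t_grid), t_grid^2) +
  matrix(rnorm(n * p, sd = 0.05), n, p)

# Discretised PCA: centre the curves, then take the first K right
# singular vectors as weight functions and project to get the scores.
pca <- function(Z, K) {
  mu <- colMeans(Z)
  Zc <- sweep(Z, 2, mu)
  v <- svd(Zc, nu = 0, nv = K)$v
  list(mean = mu, phi = v, scores = Zc %*% v)
}
fy <- pca(Y, 2)   # nu_k and phi_k(t)
fx <- pca(X, 2)   # xi_l and psi_l(s)

# Least-squares regression of the response scores on the predictor
# scores; no intercept is needed because the scores are centred.
B <- solve(crossprod(fx$scores), crossprod(fx$scores, fy$scores))

# Fitted curves: mean function plus predicted scores times phi_k(t).
Y_hat <- sweep(fx$scores %*% B %*% t(fy$phi), 2, fy$mean, "+")
```

The matrix B plays the role of the coefficients $b_{mkl}$ for a single predictor ($M = 1$). A robust fit would replace both the decomposition step and the least-squares step with robust counterparts.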

If model = "full", then, all the functional predictor variables are used in the model.

If model = "selected", then, only the significant functional predictor variables determined by the forward variable selection procedure of Beyaztas and Shang (2021) are used in the model.

If emodel = "classical", then, the least-squares method is used to estimate the function-on-function regression model.

If emodel = "robust", then, the robust functional principal component analysis of Bali et al. (2011) along with the method specified in fmodel is used to estimate the function-on-function regression model.

If fmodel = "MCD", then, the minimum covariance determinant estimator of Rousseeuw et al. (2004) is used to estimate the function-on-function regression model.

If fmodel = "MLTS", then, the multivariate least trimmed squares estimator of Agullo et al. (2008) is used to estimate the function-on-function regression model.

If fmodel = "MM", then, the MM estimator of Kudraszow and Maronna (2011) is used to estimate the function-on-function regression model.

If fmodel = "S", then, the S estimator of Bilodeau and Duchesne (2000) is used to estimate the function-on-function regression model.

If fmodel = "tau", then, the tau estimator of Ben et al. (2006) is used to estimate the function-on-function regression model.

References

J. Agullo and C. Croux and S. V. Aelst (2008), "The multivariate least-trimmed squares estimator", Journal of Multivariate Analysis, 99(3), 311-338.

M. G. Ben and E. Martinez and V. J. Yohai (2006), "Robust estimation for the multivariate linear model based on a $\tau$ scale", Journal of Multivariate Analysis, 97(7), 1600-1622.

U. Beyaztas and H. L. Shang (2021), "A partial least squares approach for function-on-function interaction regression", Computational Statistics, 36(2), 911-939.

J. L. Bali and G. Boente and D. E. Tyler and J.-L. Wang (2011), "Robust functional principal components: A projection-pursuit approach", The Annals of Statistics, 39(6), 2852-2882.

M. Bilodeau and P. Duchesne (2000), "Robust estimation of the SUR model", The Canadian Journal of Statistics, 28(2), 277-288.

N. L. Kudraszow and R. A. Maronna (2011), "Estimates of MM type for the multivariate linear model", Journal of Multivariate Analysis, 102(9), 1280-1292.

P. J. Rousseeuw and K. V. Driessen and S. V. Aelst and J. Agullo (2004), "Robust multivariate regression", Technometrics, 46(3), 293-305.

Examples


sim.data <- generate.ff.data(n.pred = 5, n.curve = 200, n.gp = 101)
Y <- sim.data$Y
X <- sim.data$X
gpY <- seq(0, 1, length.out = 101) # grid points of Y
gpX <- rep(list(seq(0, 1, length.out = 101)), 5) # grid points of Xs
model.MM <- rob.ff.reg(Y = Y, X = X, model = "full", emodel = "robust",
                       fmodel = "MM", gpY = gpY, gpX = gpX)
