ffpc: Construct a PC-based function-on-function regression term

Description

Defines a term $\int X_i(s)\beta(s,t)ds$ for inclusion in an mgcv::gam-formula (or bam or gamm or gamm4:::gamm) as constructed by pffr. In contrast to ff, ffpc does an FPCA decomposition $X_i(s) \approx \sum^K_{k=1} \xi_{ik} \Phi_k(s)$ using fpca.sc. Because $$\int X_i(s)\beta(s,t)ds \approx \sum^K_{k=1} \xi_{ik} \int \Phi_k(s) \beta(s,t) ds = \sum^K_{k=1} \xi_{ik} \tilde \beta_k(t),$$ the function-on-function term can be represented as a sum of $K$ univariate functions $\tilde \beta_k(t)$ in $t$, each multiplied by the FPC scores $\xi_{ik}$. The truncation parameter $K$ is chosen as described in fpca.sc. To reduce model complexity, the $\tilde \beta_k(t)$ all share a single joint smoothing parameter (in mgcv, they get the same id).
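
The identity above can be checked numerically. The following is a minimal base-R sketch, not the ffpc implementation: a plain SVD stands in for fpca.sc, integrals are replaced by Riemann sums on an equidistant grid, and the names beta_fun, t_grid and T_len are hypothetical choices made for illustration only.

```r
# Numerical check of the FPC expansion using base R only.
# ASSUMPTIONS: svd() stands in for fpca.sc; beta_fun is a hypothetical surface.
set.seed(1)
n <- 30; S <- 40; T_len <- 25
s <- seq(0, 1, length.out = S)
t_grid <- seq(0, 1, length.out = T_len)

# Rank-3 functional covariate X_i(s) = sum_k a_ik * g_k(s), centered per s.
G <- cbind(sin(pi * s), cos(pi * s), s^2)             # S x 3
X <- matrix(rnorm(n * 3), n, 3) %*% t(G)              # n x S
X <- scale(X, center = TRUE, scale = FALSE)

beta_fun <- function(s, t) cos(pi * s) * sin(pi * t)  # hypothetical beta(s, t)
beta_st <- outer(s, t_grid, beta_fun)                 # S x T_len

# Left-hand side: Riemann sum for int X_i(s) beta(s, t) ds.
lhs <- (X / S) %*% beta_st                            # n x T_len

# Eigenfunctions Phi_k, scaled so that (1/S) * sum_s Phi_k(s)^2 = 1.
K <- 3
Phi <- svd(X)$v[, 1:K] * sqrt(S)                      # S x K
xi <- (X / S) %*% Phi                                 # n x K scores xi_ik
beta_tilde <- (t(Phi) / S) %*% beta_st                # K x T_len, rows = betatilde_k(t)

# Right-hand side: sum_k xi_ik * betatilde_k(t).
rhs <- xi %*% beta_tilde
max_abs_diff <- max(abs(lhs - rhs))
```

Because the simulated curves have exact rank 3 and K = 3, both sides agree up to floating-point error; with K smaller than the rank of the process the expansion is only an approximation.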

Usage

ffpc(X, yind, xind = seq(0, 1, length = ncol(X)),
    splinepars = list(bs = "ps", m = c(2, 1), k = 8),
    center = TRUE,
    decomppars = list(pve = 0.99, useSymm = TRUE),
    npc.max = 15)

Arguments

X

an n by ncol(xind) matrix of function evaluations $X_i(s_{i1}),\dots, X_i(s_{iS})$; $i=1,\dots,n$.

yind

matrix (or vector) of indices of evaluations of $Y_i(t)$

xind

matrix (or vector) of indices of evaluations of $X_i(s)$, defaults to seq(0, 1, length = ncol(X)).

splinepars

optional arguments supplied to the basistype-term. Defaults to a cubic B-spline with first difference penalties and 8 basis functions for each $\tilde \beta_k(t)$.

center

center X so that the mean of $X(s)$ is zero for each index value $s$?

decomppars

parameters for the FPCA performed with fpca.sc.

npc.max

maximal number $K$ of FPCs to use, regardless of decomppars; defaults to 15

Value

a list containing the necessary information to construct a term to be included in a mgcv::gam-formula.

Details

Using this instead of ff() can be beneficial if the covariance operator of the $X_i(s)$ has low effective rank (i.e., if $K$ is small). If the covariance operator of the $X_i(s)$ is of (very) high rank, i.e., if $K$ is large, ffpc() will not be very efficient.
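
A rough way to gauge the effective rank before choosing between ff() and ffpc() is to count how many principal components are needed to reach a given proportion of variance explained. The sketch below uses a raw SVD of the centered data; it is only an approximation of what fpca.sc does (which estimates and smooths the covariance), and n_pc_for_pve is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: smallest number of components whose eigenvalues
# reach the requested proportion of variance explained (PVE).
# ASSUMPTION: raw SVD of the centered data, not the smoothed FPCA of fpca.sc.
n_pc_for_pve <- function(X, pve = 0.99) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  ev <- svd(Xc)$d^2
  which(cumsum(ev) / sum(ev) >= pve)[1]
}

set.seed(2)
s <- seq(0, 1, length.out = 50)
# Low-rank curves: combinations of two smooth components.
X_low <- matrix(rnorm(40 * 2), 40, 2) %*% rbind(sin(2 * pi * s), cos(2 * pi * s))
# High-rank curves: independent noise at every grid point.
X_high <- matrix(rnorm(40 * 50), 40, 50)

k_low <- n_pc_for_pve(X_low)
k_high <- n_pc_for_pve(X_high)
```

Here k_low is small while k_high exceeds the default npc.max of 15, which is the situation in which ffpc() is expected to be inefficient.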

Please see pffr for details on model specification and implementation. ffpc() IS AN EXPERIMENTAL FEATURE AND NOT WELL TESTED YET -- USE AT YOUR OWN RISK.

Examples


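library(refund)
library(splines)  # provides bs()
set.seed(1122)
n <- 55
S <- 60
T <- 50
s <- seq(0, 1, length.out = S)
t <- seq(0, 1, length.out = T)
df <- 10
B <- bs(s, df = df)
X <- t(B %*% matrix(rt(n * df, df = 3), nrow = df))
# test2 is not defined on this page; it must be a vectorized function of (s, t)
# that returns the true coefficient surface beta(s, t).
beta.st <- outer(s, t, test2)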

y <- (1/S*X)%*%beta.st

data <- list(y=y, X=X)
# set number of FPCs to true rank of process for this example:
m.pc <- pffr(y ~ c(1) + 0 + ffpc(X, yind=t, decomppars=list(npc=df)),
        data=data, yind=t)
summary(m.pc)
m.ff <- pffr(y ~ c(1) + 0 + ff(X, yind=t), data=data, yind=t)
summary(m.ff)

# plot implied coefficient surfaces:
# set all FPC scores to 1 so that each term column equals betatilde_k(t):
betatilde <- mgcv::predict.gam(m.pc, newdata=data.frame(t.vec=t, X.PC1=1, X.PC2=1,
        X.PC3=1, X.PC4=1, X.PC5=1, X.PC6=1, X.PC7=1, X.PC8=1, X.PC9=1,
        X.PC10=1), type="terms")
layout(t(1:3))
persp(t, s, t(beta.st), theta=30, phi=40, main="Truth")
plot(m.ff, select=1, pers=TRUE, zlim=range(beta.st), theta=30, phi=40,
        main="ff()")
# betatilde_k (t) = \int Phi_k(s) beta(s,t) ds \approx 1/S t(Phi_k) %*% beta(s,t)
# --> beta(s,t) = S * Phi %*% [betatilde_1(t) ... betatilde_K(t)]
persp(t, s, t(S * (m.pc$pffr$ffpc[[1]]$PCMat %*% t(betatilde))), zlim=range(beta.st),
   theta=30, phi=40, main="ffpc()")
