fpca.face: Functional principal component analysis with fast covariance estimation

Description

A fast implementation of the sandwich smoother (Xiao et al., 2013) for covariance matrix smoothing. Pooled generalized cross validation at the data level is used for selecting the smoothing parameter.

Usage

fpca.face(
  Y = NULL,
  ydata = NULL,
  Y.pred = NULL,
  argvals = NULL,
  pve = 0.99,
  npc = NULL,
  var = FALSE,
  simul = FALSE,
  sim.alpha = 0.95,
  center = TRUE,
  knots = 35,
  p = 3,
  m = 2,
  lambda = NULL,
  alpha = 1,
  search.grid = TRUE,
  search.length = 100,
  method = "L-BFGS-B",
  lower = -20,
  upper = 20,
  control = NULL
)

Arguments

Y, ydata

the user must supply either Y, a matrix of functions observed on a regular grid, or a data frame ydata representing irregularly observed functions. See Details.

Y.pred

if desired, a matrix of functions to be approximated using the FPC decomposition.

argvals

numeric; function argument.

pve

proportion of variance explained: used to choose the number of principal components.

npc

how many smooth SVs to try to extract, if NA (the default) the hard thresholding rule of Gavish and Donoho (2014) is used (see Details, References).

var

logical; should an estimate of standard error be returned?

simul

logical; if TRUE curves will we simulated using Monte Carlo to obtain an estimate of the sim.alpha quantile at each argval; ignored if var == FALSE

sim.alpha

numeric; if simul==TRUE, quantile to estimate at each argval; ignored if var == FALSE

center

logical; center Y so that its column-means are 0? Defaults to TRUE

knots

number of knots to use or the vectors of knots; defaults to 35

integer; the degree of B-splines functions to use

integer; the order of difference penalty to use

lambda

smoothing parameter; if not specified smoothing parameter is chosen using optim or a grid search

alpha

numeric; tuning parameter for GCV; see parameter gamma in gam

search.grid

logical; should a grid search be used to find lambda? Otherwise, optim is used

search.length

integer; length of grid to use for grid search for lambda; ignored if search.grid is FALSE

method

method to use; see optim

lower

see optim

upper

see optim

control

see optim

Value

A list with components

Yhat - If Y.pred is specified, the smooth version of Y.pred. Otherwise, if Y.pred=NULL, the smooth version of Y.
scores - matrix of scores
mu - mean function
npc - number of principal components
efunctions - matrix of eigenvectors
evalues - vector of eigenvalues

if var == TRUE additional components are returned

sigma2 - estimate of the error variance
VarMats - list of covariance function estimate for each subject
diag.var - matrix containing the diagonals of each matrix in
crit.val - list of estimated quantiles; only returned if simul == TRUE

References

Xiao, L., Li, Y., and Ruppert, D. (2013). Fast bivariate P-splines: the sandwich smoother, Journal of the Royal Statistical Society: Series B, 75(3), 577-599.

Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.

Examples

Run this code

# NOT RUN {
#### settings
I <- 50 # number of subjects
J <- 3000 # dimension of the data
t <- (1:J)/J # a regular grid on [0,1]
N <- 4 #number of eigenfunctions
sigma <- 2 ##standard deviation of random noises
lambdaTrue <- c(1,0.5,0.5^2,0.5^3) # True eigenvalues
  
case = 1
### True Eigenfunctions
  
if(case==1) phi <- sqrt(2)*cbind(sin(2*pi*t),cos(2*pi*t),
                                sin(4*pi*t),cos(4*pi*t))
if(case==2) phi <- cbind(rep(1,J),sqrt(3)*(2*t-1),
                          sqrt(5)*(6*t^2-6*t+1),
                         sqrt(7)*(20*t^3-30*t^2+12*t-1))

###################################################
########     Generate Data            #############
###################################################
xi <- matrix(rnorm(I*N),I,N);
xi <- xi %*% diag(sqrt(lambdaTrue))
X <- xi %*% t(phi); # of size I by J
Y <- X + sigma*matrix(rnorm(I*J),I,J)

results <- fpca.face(Y,center = TRUE, argvals=t,knots=100,pve=0.99)
###################################################
####               FACE                ########
###################################################  
Phi <- results$efunctions
eigenvalues <- results$evalues

for(k in 1:N){
  if(Phi[,k] %*% phi[,k]< 0) 
    Phi[,k] <- - Phi[,k]
}

### plot eigenfunctions
par(mfrow=c(N/2,2))
seq <- (1:(J/10))*10
for(k in 1:N){
  plot(t[seq],Phi[seq,k]*sqrt(J),type="l",lwd = 3, 
       ylim = c(-2,2),col = "red",
       ylab = paste("Eigenfunction ",k,sep=""),
       xlab="t",main="FACE")
  
  lines(t[seq],phi[seq,k],lwd = 2, col = "black")
}
# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

Value

References

See Also

Examples