fpca.sc: Functional principal components analysis by smoothed covariance

Description

Decomposes functional observations using functional principal components analysis. A mixed model framework is used to estimate scores and obtain variance estimates.

Usage

fpca.sc(Y, Y.pred=NULL, nbasis = 10, pve = .99, npc = NULL, var = FALSE, 
        simul = FALSE, sim.alpha = .95, useSymm = FALSE, makePD = FALSE, center=TRUE)

Arguments

Y

matrix of observed functions for which estimates and covariance matrices are desired.

Y.pred

if desired, a matrix of observed functions that will be estimated using the FPC decomposition of Y.

nbasis

number of splines used in the estimation of the mean function and the bivariate smoothing of the covariance matrix.

pve

proportion of variance explained used to choose the number of principal components to be included in the expansion.

npc

prespecified value for the number of principal components to be included in the expansion (if given, this overrides 'pve').

var

TRUE or FALSE indicating whether model-based estimates for the variance of FPCA expansions should be computed.

simul

TRUE or FALSE, indicating whether critical values for simultaneous confidence intervals should be estimated.

sim.alpha

level of the simultaneous intervals. The default of .95 suggests this is the nominal coverage (95% intervals) rather than a tail probability.

useSymm

TRUE or FALSE, indicating whether to do the smoothing based on the upper triangular part of the empirical covariance or not. Can save computation time for large data and allows for covariance surfaces that are very peaked on th

makePD

TRUE or FALSE, indicating whether to enforce positive definiteness of the estimated surface.

center

TRUE or FALSE, indicating whether to subtract the (smoothed) mean function from Y or not.
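
As an illustration of what enforcing positive definiteness (the makePD argument) can mean for a covariance estimate, the sketch below floors the eigenvalues of a symmetric matrix at a small positive value. This is only one simple approach, shown under assumed inputs; it is not necessarily the method fpca.sc uses internally, and the matrix K and the threshold eps are hypothetical.

```r
# Hypothetical symmetric "covariance" estimate with one negative eigenvalue
K <- matrix(c(1, 2,
              2, 1), nrow = 2, byrow = TRUE)

eig <- eigen(K, symmetric = TRUE)

# Floor the eigenvalues at a small positive value (eps is an assumed tolerance)
eps  <- 1e-6
vals <- pmax(eig$values, eps)

# Rebuild the matrix from the adjusted spectrum and symmetrize to remove
# floating-point asymmetry
K_pd <- eig$vectors %*% diag(vals) %*% t(eig$vectors)
K_pd <- (K_pd + t(K_pd)) / 2
```

The rebuilt matrix keeps the original eigenvectors, so only the directions with non-positive variance are altered.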

Value

Yhat

If Y.pred is specified, the smooth version of Y.pred. Otherwise, if Y.pred=NULL, the smooth version of Y.

scores

$n \times npc$ matrix of estimated principal component scores.

mu

estimated mean function (or a vector of zeroes if center=FALSE).

efunctions

$d \times npc$ matrix of estimated eigenfunctions of the functional covariance operator, i.e., the FPC basis functions.

evalues

estimated eigenvalues of the covariance operator, i.e., variances of FPC scores.

npc

number of FPCs: either the supplied npc, or the minimum number of basis functions needed to explain proportion pve of the variance in the observed curves.

sigma2

estimated measurement error variance.

diag.var

the diagonal elements of the covariance matrices for each estimated curve.

VarMats

a list containing the estimated covariance matrices for each curve in Y.

crit.val

estimated critical values for constructing simultaneous confidence intervals.
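
The rule behind npc (the smallest number of components whose cumulative share of variance reaches pve) can be sketched in a few lines of base R. The eigenvalues below are made up for illustration, and the exact comparison used inside fpca.sc may differ in detail (for example, strict versus non-strict inequality).

```r
# Hypothetical eigenvalues, sorted in decreasing order
evalues <- c(5.0, 2.5, 1.5, 0.7, 0.2, 0.1)
pve     <- 0.95

# Cumulative proportion of variance explained by the leading components
cum_pve <- cumsum(evalues) / sum(evalues)

# Smallest number of components reaching the requested proportion
npc <- which(cum_pve >= pve)[1]
npc  # 4: the first three components explain 90%, the first four 97%
```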

Details

This function computes an FPC decomposition for a collection of observed curves. The curves may be sparsely observed or measured with error. After the decomposition is estimated, a mixed model framework is used to estimate curve-specific scores and variances.

FPCA via kernel smoothing of the covariance function, with the diagonal treated separately, was proposed in Staniswalis and Lee (1998) and much extended by Yao et al. (2005), who introduced the "PACE" method. fpca.sc uses penalized splines to smooth the covariance function, as developed by Di et al. (2009) and Goldsmith et al. (2013).
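
To make the mixed-model step concrete, the sketch below computes best linear unbiased predictors (BLUPs) of the scores for a single sparsely observed curve, together with the fitted curve and its pointwise variance. It is a minimal illustration under assumed inputs: the basis matrix Phi, the variances lambda and sigma2, and the mean mu are hypothetical stand-ins for the quantities fpca.sc would estimate, and the function's internal computations may differ in scaling and detail.

```r
set.seed(1)

# All quantities below are hypothetical stand-ins for estimated ones
d      <- 50                                 # number of grid points
t_grid <- seq(0, 1, length.out = d)
Phi    <- cbind(sin(2 * pi * t_grid),        # d x npc matrix of basis functions
                cos(2 * pi * t_grid))
lambda <- c(4, 1)                            # score variances (eigenvalues)
sigma2 <- 0.25                               # measurement error variance
mu     <- rep(0, d)                          # mean function

# Simulate one curve and keep only 15 of its 50 grid points
xi  <- rnorm(2, sd = sqrt(lambda))
y   <- mu + as.vector(Phi %*% xi) + rnorm(d, sd = sqrt(sigma2))
obs <- sort(sample(d, 15))

# BLUP of the scores given the observed points:
#   (Phi_o' Phi_o + sigma2 * Lambda^{-1})^{-1} Phi_o' (y_o - mu_o)
Phi_o  <- Phi[obs, , drop = FALSE]
A      <- crossprod(Phi_o) + sigma2 * diag(1 / lambda)
scores <- solve(A, crossprod(Phi_o, y[obs] - mu[obs]))

# Fitted curve on the full grid and its pointwise variance
yhat      <- mu + as.vector(Phi %*% scores)
score_var <- sigma2 * solve(A)               # conditional covariance of the scores
diag_var  <- rowSums((Phi %*% score_var) * Phi)
```

Here yhat plays the role of one row of Yhat and diag_var the corresponding row of diag.var. Because the prediction uses only the observed points, the same formula applies to curves observed on different subsets of the grid.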

References

Di, C., Crainiceanu, C., Caffo, B., and Punjabi, N. (2009). Multilevel functional principal component analysis. Annals of Applied Statistics, 3, 458-488.

Goldsmith, J., Greven, S., and Crainiceanu, C. (2013). Corrected confidence bands for functional data using principal components. Biometrics, 69(1), 41-51.

Staniswalis, J. G., and Lee, J. J. (1998). Nonparametric regression analysis of longitudinal data. Journal of the American Statistical Association, 93, 1403-1418.

Yao, F., Mueller, H.-G., and Wang, J.-L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100, 577-590.

Examples

library(refund)  # provides fpca.sc and the cd4 data
data(cd4)

Fit.MM = fpca.sc(cd4, var = TRUE, simul = TRUE)

# for one subject, examine curve estimate, pointwise and simultaneous intervals
EX = 1
EX.MM =  cbind(Fit.MM$Yhat[EX,], 
      Fit.MM$Yhat[EX,] + 1.96 * sqrt(Fit.MM$diag.var[EX,]), 
      Fit.MM$Yhat[EX,] - 1.96 * sqrt(Fit.MM$diag.var[EX,]),
      Fit.MM$Yhat[EX,] + Fit.MM$crit.val[EX] * sqrt(Fit.MM$diag.var[EX,]),
      Fit.MM$Yhat[EX,] - Fit.MM$crit.val[EX] * sqrt(Fit.MM$diag.var[EX,]))


par(mfrow=c(1,3))

# plot data for one subject, with curve and interval estimates
d = as.numeric(colnames(cd4))
plot(d[which(!is.na(cd4[EX,]))], cd4[EX,which(!is.na(cd4[EX,]))], type = 'o', pch = 19,
  cex=.75, ylim = range(0, 3400), xlim = range(d), xlab = "Months since seroconversion", 
    lwd = 1.2, ylab = "Total CD4 Cell Count", main = "Est. & CI - Sampled Data")

matpoints(d, EX.MM, col = 4, type = 'l', lwd = c(2, 1, 1, 1, 1), lty = c(1,1,1,2,2))

# plot estimated mean function
plot(d, Fit.MM$mu, type = 'l', xlab = "Months since seroconversion",
  ylim = range(0, 3400), ylab = "Total CD4 Cell Count", main = "Est. Mean Function")

# plot the first estimated basis function
plot(d, Fit.MM$efunctions[,1], type = 'l', xlab = "Months since seroconversion",
  ylab = "Total CD4 Cell Count", main = "First Est. Basis Function")
