Decomposes functional observations using functional principal components analysis. A mixed model framework is used to estimate scores and obtain variance estimates.

```
fpca.sc(
Y = NULL,
ydata = NULL,
Y.pred = NULL,
argvals = NULL,
random.int = FALSE,
nbasis = 10,
pve = 0.99,
npc = NULL,
var = FALSE,
simul = FALSE,
sim.alpha = 0.95,
useSymm = FALSE,
makePD = FALSE,
center = TRUE,
cov.est.method = 2,
integration = "trapezoidal"
)
```

Y, ydata

the user must supply either `Y`, a matrix of functions observed on a regular grid, or a data frame `ydata` representing irregularly observed functions. See Details.

Y.pred

if desired, a matrix of functions to be approximated using the FPC decomposition.

argvals

the argument values of the function evaluations in `Y`; defaults to an equidistant grid from 0 to 1.

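random.int

logical: if `TRUE`, the mean is estimated by `gamm4` with random intercepts; if `FALSE` (the default), the mean is estimated by `gam`, treating all the data as independent.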

nbasis

number of B-spline basis functions used for estimation of the mean function and bivariate smoothing of the covariance surface.

pve

proportion of variance explained: used to choose the number of principal components.

npc

prespecified value for the number of principal components (if given, this overrides `pve`).

var

`TRUE` or `FALSE` indicating whether model-based estimates for the variance of FPCA expansions should be computed.

simul

logical: should critical values be estimated for simultaneous confidence intervals?

sim.alpha

1 - coverage probability of the simultaneous intervals.

useSymm

logical, indicating whether to smooth only the upper triangular part of the naive covariance (when `cov.est.method==2`). This can save computation time for large data sets, and allows for covariance surfaces that are very peaked on the diagonal.

makePD

logical: should positive definiteness be enforced for the covariance surface estimate?

center

logical: should an estimated mean function be subtracted from `Y`? Set to `FALSE` if you have already demeaned the data using your favorite mean function estimate.

cov.est.method

covariance estimation method. If set to `1`, a one-step method that applies a bivariate smooth to the \(y(s_1)y(s_2)\) values. This can be very slow. If set to `2` (the default), a two-step method that obtains a naive covariance estimate which is then smoothed.

integration

quadrature method for numerical integration; only `'trapezoidal'` is currently supported.
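
As a rough illustration of how `pve` and `npc` interact, the base-R sketch below picks the smallest number of components whose cumulative share of the positive eigenvalues reaches `pve`. The helper `choose_npc` and the eigenvalues are hypothetical and not part of the package; the function's internal rule may differ in detail.

```r
# Hypothetical helper (not part of the package): smallest number of
# components whose cumulative proportion of variance reaches `pve`.
choose_npc <- function(evalues, pve = 0.99) {
  evalues <- evalues[evalues > 0]            # keep positive eigenvalues only
  cum_pve <- cumsum(evalues) / sum(evalues)  # cumulative proportion explained
  which(cum_pve >= pve)[1]                   # first index reaching the target
}

# Made-up eigenvalues, for illustration only
ev <- c(4, 2, 1, 0.5, 0.5)

npc_90 <- choose_npc(ev, pve = 0.90)  # cumulative shares: 0.5, 0.75, 0.875, 0.9375, 1
npc_99 <- choose_npc(ev, pve = 0.99)
c(npc_90, npc_99)
```

Supplying `npc` directly skips this selection step, which is why it overrides `pve`.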

An object of class `fpca` containing:

Yhat

FPC approximation (projection onto leading components) of `Y.pred` if specified, or else of `Y`.

Y

the observed data.

scores

\(n \times npc\) matrix of estimated FPC scores.

mu

estimated mean function (or a vector of zeroes if `center==FALSE`).

efunctions

\(d \times npc\) matrix of estimated eigenfunctions of the functional covariance, i.e., the FPC basis functions.

evalues

estimated eigenvalues of the covariance operator, i.e., variances of FPC scores.

npc

number of FPCs: either the supplied `npc`, or the minimum number of basis functions needed to explain proportion `pve` of the variance in the observed curves.

argvals

argument values of eigenfunction evaluations.

sigma2

estimated measurement error variance.

diag.var

diagonal elements of the covariance matrices for each estimated curve.

VarMats

a list containing the estimated covariance matrices for each curve in `Y`.

crit.val

estimated critical values for constructing simultaneous confidence intervals.
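
To make the relationship between the returned pieces concrete, the sketch below rebuilds an FPC approximation from a mean vector, a score matrix, and an eigenfunction matrix. All objects here are made-up stand-ins built in base R, not output of `fpca.sc`; the point is only the shape of the computation (mean plus scores times transposed eigenfunctions).

```r
# Toy stand-ins for components of a fitted object; all values are made up.
argvals    <- seq(0, 1, length.out = 5)
mu         <- rep(1, 5)                                # mean function on the grid (length d)
efunctions <- cbind(sqrt(2) * sin(2 * pi * argvals),
                    sqrt(2) * cos(2 * pi * argvals))   # d x npc matrix
scores     <- rbind(c(1, 0), c(0.5, -1), c(0, 2))      # n x npc matrix

# Truncated expansion: add the mean to every row of scores %*% t(efunctions)
Yhat <- sweep(scores %*% t(efunctions), 2, mu, "+")
dim(Yhat)  # n rows (curves) by d columns (grid points)
```

With a real fit, the same expression applied to the fitted mean, scores, and eigenfunctions should reproduce the FPC approximation up to numerical error, assuming the approximation is for the same curves the scores were estimated from.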

This function computes an FPC decomposition for a set of observed curves, which may be sparsely observed and/or measured with error. A mixed model framework is used to estimate curve-specific scores and variances.
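
In symbols, the decomposition can be written along the following lines. The notation (\(K\) for the number of components, \(t_{ij}\) for the \(j\)-th observation point of curve \(i\)) is introduced here for illustration and follows the usual conventions of the cited papers rather than anything fixed by the function itself.

\[
Y_i(t_{ij}) = \mu(t_{ij}) + \sum_{k=1}^{K} \xi_{ik}\,\phi_k(t_{ij}) + \varepsilon_{ij},
\qquad \xi_{ik} \sim N(0, \lambda_k), \quad \varepsilon_{ij} \sim N(0, \sigma^2).
\]

Treating the scores \(\xi_{ik}\) as random effects, the standard mixed-model (best linear unbiased) predictor of the score vector for curve \(i\) is

\[
\hat{\xi}_i = \Lambda \Phi_i^\top \left( \Phi_i \Lambda \Phi_i^\top + \sigma^2 I \right)^{-1} \left( Y_i - \mu_i \right),
\]

where \(\Phi_i\) holds the eigenfunctions evaluated at the observation points of curve \(i\), \(\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_K)\), and \(Y_i\) and \(\mu_i\) are the observed values and mean at those points. In practice the unknown quantities are replaced by their estimates.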

FPCA via kernel smoothing of the covariance function, with the diagonal treated separately, was proposed in Staniswalis and Lee (1998) and much extended by Yao et al. (2005), who introduced the 'PACE' method. `fpca.sc` uses penalized splines to smooth the covariance function, as developed by Di et al. (2009) and Goldsmith et al. (2013).
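
The first step of a two-step covariance estimate can be sketched in base R as follows. This is an illustration under simplifying assumptions, not the package's code: the data are simulated, the pointwise sample mean stands in for a smoothed mean function, and the smoothing step itself is omitted. The diagonal is set aside because it is inflated by measurement error.

```r
set.seed(1)
n <- 40; d <- 15
argvals <- seq(0, 1, length.out = d)

# Simulated curves: one random component plus noise, with some values missing
Y <- outer(rnorm(n), sin(2 * pi * argvals)) +
  matrix(rnorm(n * d, sd = 0.2), n, d)
Y[sample(n * d, 60)] <- NA

# Center with the pointwise mean (a stand-in for a smoothed mean estimate)
mu_hat <- colMeans(Y, na.rm = TRUE)
Yc <- sweep(Y, 2, mu_hat)

# Average cross-products over the curves observed at both grid points
obs <- (!is.na(Yc)) * 1       # 1 where observed, 0 where missing
Yc0 <- Yc
Yc0[is.na(Yc0)] <- 0          # missing values contribute nothing to the sums
counts  <- crossprod(obs)     # number of curves observed at each pair of points
G_naive <- crossprod(Yc0) / counts
G_naive[counts == 0] <- NA    # pairs never observed together

# Off-diagonal part that would be passed to a bivariate smoother
G_offdiag <- G_naive
diag(G_offdiag) <- NA
```

A smoother applied to `G_offdiag` would then give the covariance surface whose eigendecomposition yields the eigenfunctions and eigenvalues.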

The functional data must be supplied as either

- an \(n \times d\) matrix `Y`, each row of which is one functional observation, with missing values allowed; or
- a data frame `ydata`, with columns `'.id'` (which curve the point belongs to, say \(i\)), `'.index'` (function argument such as time point \(t\)), and `'.value'` (observed function value \(Y_i(t)\)).
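
The two input formats carry the same information. As a minimal sketch, assuming a toy matrix and grid invented for illustration, a matrix `Y` can be reshaped into the long `ydata` layout in base R like this (dropping unobserved points is a choice made here, not a documented requirement):

```r
# Toy 3 x 4 matrix of curves on a common grid, with one missing value
argvals <- c(0, 0.25, 0.5, 1)
Y <- matrix(c(1,  2, 3, 4,
              2, NA, 4, 5,
              0,  1, 1, 2), nrow = 3, byrow = TRUE)

# Long format: one row per (curve, argument, value) triple
ydata <- data.frame(
  .id    = as.vector(row(Y)),            # which curve the point belongs to
  .index = argvals[as.vector(col(Y))],   # function argument
  .value = as.vector(Y)                  # observed function value
)
ydata <- ydata[!is.na(ydata$.value), ]             # drop unobserved points
ydata <- ydata[order(ydata$.id, ydata$.index), ]   # sort by curve, then argument
nrow(ydata)
```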

Di, C., Crainiceanu, C., Caffo, B., and Punjabi, N. (2009).
Multilevel functional principal component analysis. *Annals of Applied
Statistics*, 3, 458--488.

Goldsmith, J., Greven, S., and Crainiceanu, C. (2013). Corrected confidence
bands for functional data using principal components. *Biometrics*,
69(1), 41--51.

Staniswalis, J. G., and Lee, J. J. (1998). Nonparametric regression
analysis of longitudinal data. *Journal of the American Statistical
Association*, 93, 1403--1418.

Yao, F., Mueller, H.-G., and Wang, J.-L. (2005). Functional data analysis
for sparse longitudinal data. *Journal of the American Statistical
Association*, 100, 577--590.

```
library(refund)    # provides fpca.sc() and the cd4 data
library(ggplot2)
library(reshape2)

data(cd4)

Fit.MM = fpca.sc(cd4, var = TRUE, simul = TRUE)

Fit.mu = data.frame(mu = Fit.MM$mu,
                    d = as.numeric(colnames(cd4)))
Fit.basis = data.frame(phi = Fit.MM$efunctions,
                       d = as.numeric(colnames(cd4)))

## for one subject, examine curve estimate, pointwise and simultaneous intervals
EX = 1
EX.MM = data.frame(
  fitted = Fit.MM$Yhat[EX,],
  ptwise.UB = Fit.MM$Yhat[EX,] + 1.96 * sqrt(Fit.MM$diag.var[EX,]),
  ptwise.LB = Fit.MM$Yhat[EX,] - 1.96 * sqrt(Fit.MM$diag.var[EX,]),
  simul.UB = Fit.MM$Yhat[EX,] + Fit.MM$crit.val[EX] * sqrt(Fit.MM$diag.var[EX,]),
  simul.LB = Fit.MM$Yhat[EX,] - Fit.MM$crit.val[EX] * sqrt(Fit.MM$diag.var[EX,]),
  d = as.numeric(colnames(cd4)))

## plot data for one subject, with curve and interval estimates
EX.MM.m = melt(EX.MM, id = 'd')
ggplot(EX.MM.m, aes(x = d, y = value, group = variable,
                    color = variable, linetype = variable)) +
  geom_path() +
  scale_linetype_manual(values = c(fitted = 1, ptwise.UB = 2, ptwise.LB = 2,
                                   simul.UB = 3, simul.LB = 3)) +
  scale_color_manual(values = c(fitted = 1, ptwise.UB = 2, ptwise.LB = 2,
                                simul.UB = 3, simul.LB = 3)) +
  labs(x = 'Months since seroconversion', y = 'Total CD4 Cell Count')

## plot estimated mean function
ggplot(Fit.mu, aes(x = d, y = mu)) +
  geom_path() +
  labs(x = 'Months since seroconversion', y = 'Total CD4 Cell Count')

## plot the first two estimated basis functions
Fit.basis.m = melt(Fit.basis, id = 'd')
ggplot(subset(Fit.basis.m, variable %in% c('phi.1', 'phi.2')),
       aes(x = d, y = value, group = variable, color = variable)) +
  geom_path()

## input a dataframe instead of a matrix
nid <- 20
nobs <- sample(10:20, nid, replace = TRUE)
ydata <- data.frame(
  .id = rep(1:nid, nobs),
  .index = round(runif(sum(nobs), 0, 1), 3))
ydata$.value <- unlist(tapply(ydata$.index, ydata$.id,
  function(x) runif(1, -.5, .5) + dbeta(x, runif(1, 6, 8), runif(1, 3, 5))))

Fit.MM = fpca.sc(ydata = ydata, var = TRUE, simul = FALSE)
```