50% off | Unlimited Data & AI Learning
Get 50% off unlimited learning

fPASS (version 1.0.0)

fpca_sc: Functional principal components analysis by smoothed covariance

Description

[Stable]

This function computes a FPC decomposition for a set of observed curves, which may be sparsely observed and/or measured with error. A mixed model framework is used to estimate curve-specific scores and variances.

Usage

fpca_sc(
  Y = NULL,
  ydata = NULL,
  Y.pred = NULL,
  argvals = NULL,
  random.int = FALSE,
  nbasis = 10,
  pve = 0.95,
  npc = NULL,
  useSymm = FALSE,
  makePD = FALSE,
  center = TRUE,
  cov.est.method = 2,
  integration = "trapezoidal"
)

Value

An object of class fpca containing:

Yhat

FPC approximation (projection onto leading components) of Y.pred if specified, or else of Y.

Y

the observed data

scores

n×npc matrix of estimated FPC scores.

mu

estimated mean function (or a vector of zeroes if center==FALSE).

efunctions

d×npc matrix of estimated eigenfunctions of the functional covariance, i.e., the FPC basis functions.

evalues

estimated eigenvalues of the covariance operator, i.e., variances of FPC scores.

npc

number of FPCs: either the supplied npc, or the minimum number of basis functions needed to explain proportion pve of the variance in the observed curves.

argvals

argument values of eigenfunction evaluations

sigma2

estimated measurement error variance.

diag.var

diagonal elements of the covariance matrices for each estimated curve.

VarMats

a list containing the estimated covariance matrices for each curve in Y.

crit.val

estimated critical values for constructing simultaneous confidence intervals.

Arguments

Y, ydata

the user must supply either Y, a matrix of functions observed on a regular grid, or a data frame ydata representing irregularly observed functions. See Details.

Y.pred

if desired, a matrix of functions to be approximated using the FPC decomposition.

argvals

the argument values of the function evaluations in Y, defaults to a equidistant grid from 0 to 1.

random.int

If TRUE, the mean is estimated by gamm4 with random intercepts. If FALSE (the default), the mean is estimated by gam treating all the data as independent.

nbasis

number of B-spline basis functions used for estimation of the mean function and bivariate smoothing of the covariance surface.

pve

proportion of variance explained: used to choose the number of principal components.

npc

prespecified value for the number of principal components (if given, this overrides pve).

useSymm

logical, indicating whether to smooth only the upper triangular part of the naive covariance (when cov.est.method==2). This can save computation time for large data sets, and allows for covariance surfaces that are very peaked on the diagonal.

makePD

logical: should positive definiteness be enforced for the covariance surface estimate?

center

logical: should an estimated mean function be subtracted from Y? Set to FALSE if you have already demeaned the data using your favorite mean function estimate.

cov.est.method

covariance estimation method. If set to 1, a one-step method that applies a bivariate smooth to the y(s1)y(s2) values. This can be very slow. If set to 2 (the default), a two-step method that obtains a naive covariance estimate which is then smoothed.

integration

quadrature method for numerical integration; only 'trapezoidal' is currently supported.

Author

Salil Koner salil.koner@duke.edu

Details

This function is emulated from the refund::fpca.sc() function where the estimation of covariance surface and the eigenfunctions are exactly as that of refund::fpca.sc(), but it rectifies the computational intricacies involved in the estimation of shrinkage scores, and fixes the issue of NA values in the score estimation when the measurement error variance is estimated to be zero. Moreover, since this function is written purely for the purpose of using it in the Extract_Eigencomp_fDA() function, where we do not need the usage of the arguments var and simul and sim.alpha at all, we have deleted those arguments in the fpca_sc() function.

The functional data must be supplied as either

  • an n×d matrix Y, each row of which is one functional observation, with missing values allowed; or

  • a data frame ydata, with columns '.id' (which curve the point belongs to, say i), '.index' (function argument such as time point t), and '.value' (observed function value Yi(t)).

References

Di, C., Crainiceanu, C., Caffo, B., and Punjabi, N. (2009). Multilevel functional principal component analysis. Annals of Applied Statistics, 3, 458--488.

Goldsmith, J., Greven, S., and Crainiceanu, C. (2013). Corrected confidence bands for functional data using principal components. Biometrics, 69(1), 41--51.

Staniswalis, J. G., and Lee, J. J. (1998). Nonparametric regression analysis of longitudinal data. Journal of the American Statistical Association, 93, 1403--1418.

Yao, F., Mueller, H.-G., and Wang, J.-L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100, 577--590.

Examples

Run this code

if(rlang::is_installed("refund")){
  library(refund)
  data(cd4)
  Fit.MM = fpca_sc(refund::cd4, pve = 0.95)
}

# input a data frame instead of a matrix
nid <- 20
nobs <- sample(10:20, nid, rep=TRUE)
ydata <- data.frame(
    .id = rep(1:nid, nobs),
    .index = round(runif(sum(nobs), 0, 1), 3))
ydata$.value <- unlist(tapply(ydata$.index,
                              ydata$.id,
                              function(x)
                                  runif(1, -.5, .5) +
                                  dbeta(x, runif(1, 6, 8), runif(1, 3, 5))
                              )
                       )

Fit.MM = fpca_sc(ydata=ydata)


Run the code above in your browser using DataLab