pfr: Penalized Functional Regression

Description

Implements various approaches to penalized scalar-on-function regression. These techniques include Penalized Functional Regression (Goldsmith et al., 2011), Longitudinal Penalized Functional Regression (Goldsmith, et al., 2012), Functional Principal Component Regression (Reiss and Ogden, 2007), Partially Empirical Eigenvectors for Regression (Randolph et al., 2012), Functional Generalized Additive Models (McLean et al., 2013), and Variable-Domain Functional Regression (Gellar et al., 2014). This function is a wrapper for mgcv's gam and its siblings to fit models with a scalar (but not necessarily continuous) response.

Usage

pfr(formula = NULL, fitter = NA, method = "REML", ...)

Arguments

formula

a formula that could contain any of the following special terms: lf(), af(), lf.vd(), peer(), fpc(), or re(); also mgcv's s(), te(), or t2().

fitter

the name of the function used to estimate the model. Defaults to gam if the matrix of functional responses has less than 2e5 data points and to bam if not. "gamm" (see gamm) and "gamm4" (see gamm4) are valid options as well.

method

The smoothing parameter estimation method. Default is "REML". For options, see gam.

...

additional arguments that are valid for gam or bam. These include data and family to specify the input data and outcome family, as well as many options to control the estimation.

Value

A fitted pfr-object, which is a gam-object with some additional information in a $pfr-element. If fitter is "gamm" or "gamm4", only the $gam part of the returned list is modified in this way.

Warning

Binomial responses should be specified as a numeric vector rather than as a matrix or a factor.

References

Goldsmith, J., Bobb, J., Crainiceanu, C., Caffo, B., and Reich, D. (2011). Penalized functional regression. Journal of Computational and Graphical Statistics, 20(4), 830-851.

Goldsmith, J., Crainiceanu, C., Caffo, B., and Reich, D. (2012). Longitudinal penalized functional regression for cognitive outcomes on neuronal tract measurements. Journal of the Royal Statistical Society: Series C, 61(3), 453-469.

Reiss, P. T., and Ogden, R. T. (2007). Functional principal component regression and functional partial least squares. Journal of the American Statistical Association, 102, 984-996.

Randolph, T. W., Harezlak, J, and Feng, Z. (2012). Structured penalties for functional linear models - partially empirical eigenvectors for regression. Electronic Journal of Statistics, 6, 323-353.

McLean, M. W., Hooker, G., Staicu, A.-M., Scheipl, F., and Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23 (1), pp. 249-269. Available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982924/.

Gellar, J. E., Colantuoni, E., Needham, D. M., and Crainiceanu, C. M. (2014). Variable-Domain Functional Regression for Modeling ICU Data. Journal of the American Statistical Association, 109(508): 1425-1439.

Examples

Run this code

# NOT RUN {
# See lf(), lf.vd(), af(), fpc(), and peer() for additional examples

data(DTI)
DTI1 <- DTI[DTI$visit==1 & complete.cases(DTI),]
par(mfrow=c(1,2))

# Fit model with linear functional term for CCA
fit.lf <- pfr(pasat ~ lf(cca, k=30, bs="ps"), data=DTI1)
plot(fit.lf, ylab=expression(paste(beta(t))), xlab="t")
# }
# NOT RUN {
# Alternative way to plot
bhat.lf <- coef(fit.lf, n=101)
bhat.lf$upper <- bhat.lf$value + 1.96*bhat.lf$se
bhat.lf$lower <- bhat.lf$value - 1.96*bhat.lf$se
matplot(bhat.lf$cca.argvals, bhat.lf[,c("value", "upper", "lower")],
        type="l", lty=c(1,2,2), col=1,
        ylab=expression(paste(beta(t))), xlab="t")

# Fit model with additive functional term for CCA, using tensor product basis
fit.af <- pfr(pasat ~ af(cca, Qtransform=TRUE, k=c(7,7)), data=DTI1)
plot(fit.af, scheme=2, xlab="t", ylab="cca(t)", main="Tensor Product")
plot(fit.af, scheme=2, Qtransform=TRUE,
     xlab="t", ylab="cca(t)", main="Tensor Product")

# Change basistype to thin-plate regression splines
fit.af.s <- pfr(pasat ~ af(cca, basistype="s", Qtransform=TRUE, k=50),
                data=DTI1)
plot(fit.af.s, scheme=2, xlab="t", ylab="cca(t)", main="TPRS", rug=FALSE)
plot(fit.af.s, scheme=2, Qtransform=TRUE,
     xlab="t", ylab="cca(t)", main="TPRS", rug=FALSE)

# Visualize bivariate function at various values of x
par(mfrow=c(2,2))
vis.pfr(fit.af, xval=.2)
vis.pfr(fit.af, xval=.4)
vis.pfr(fit.af, xval=.6)
vis.pfr(fit.af, xval=.8)

# Include random intercept for subject
DTI.re <- DTI[complete.cases(DTI$cca),]
DTI.re$ID <- factor(DTI.re$ID)
fit.re <- pfr(pasat ~ lf(cca, k=30) + re(ID), data=DTI.re)
coef.re <- coef(fit.re)
par(mfrow=c(1,2))
plot(fit.re)

# FPCR_R Model
fit.fpc <- pfr(pasat ~ fpc(cca), data=DTI.re)
plot(fit.fpc)

# PEER Model with second order difference penalty
DTI.use <- DTI[DTI$case==1,]
DTI.use <- DTI.use[complete.cases(DTI.use$cca),]
fit.peer <- pfr(pasat ~ peer(cca, argvals=seq(0,1,length=93),
                             integration="riemann", pentype="D"), data=DTI.use)
plot(fit.peer)
# }

Run the code above in your browser using DataLab