Defines a term \(\int_{T}F(X_i(t),t)\,dt\) for inclusion in an mgcv::gam-formula
(or bam, gamm, or gamm4:::gamm) as constructed by pfr, where \(F(x,t)\) is an
unknown smooth bivariate function and \(X_i(t)\) is a functional predictor on
the closed interval \(T\). See smooth.terms for a list of bivariate basis and
penalty options; the default is a tensor product basis with marginal cubic
regression splines for estimating \(F(x,t)\).
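For orientation, the role of this term in the functional generalized additive model of McLean et al. (2014) can be written out as below. The link function \(g\), the intercept \(\beta_0\), and the quadrature weights \(L_{ij}\) are notation introduced here for illustration; the weights play the role of the entries of the L argument, and \(t_{ij}\) are the evaluation points given by argvals.

```latex
g\{E(Y_i \mid X_i)\} = \beta_0 + \int_{T} F(X_i(t), t)\,dt
\approx \beta_0 + \sum_{j=1}^{J} L_{ij}\, F(X_i(t_{ij}), t_{ij})
```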
af(
X,
argvals = NULL,
xind = NULL,
basistype = c("te", "t2", "s"),
integration = c("simpson", "trapezoidal", "riemann"),
L = NULL,
presmooth = NULL,
presmooth.opts = NULL,
Xrange = range(X, na.rm = T),
Qtransform = FALSE,
...
)
X
functional predictors, typically expressed as an N by J matrix, where N is the
number of subjects (rows) and J is the number of evaluation points (columns).
May include missing/sparse functions, which are indicated by NA values.
Alternatively, can be an object of class "fd"; see fd.
argvals
indices of evaluation of X, i.e. \((t_{i1},\ldots,t_{iJ})\) for subject \(i\).
May be entered as either a length-J vector or an N by J matrix. Indices may be
unequally spaced. Entering as a matrix allows for different observation times
for each subject. If NULL, defaults to an equally spaced grid between 0 and 1
(or within X$basis$rangeval if X is an fd object).
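As a small illustration of the two accepted shapes, the sketch below builds a common grid as a length-J vector and subject-specific grids as an N by J matrix. The sizes N and J and the object names are made up for this example.

```r
# Hypothetical sizes: N subjects, J evaluation points per curve
N <- 5
J <- 11

# (a) One grid shared by all subjects: a length-J vector,
#     here equally spaced on [0, 1].
argvals_common <- seq(0, 1, length.out = J)

# (b) Subject-specific, unequally spaced grids: an N by J matrix
#     with one sorted row of observation times per subject.
set.seed(1)
argvals_by_subject <- t(replicate(N, sort(runif(J))))

dim(argvals_by_subject)  # N rows, J columns
```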
xind
same as argvals; it will not be supported in the next version of refund.
integration
method used for numerical integration. Defaults to Simpson's rule ("simpson")
for calculating the entries of L. Alternatively, and for non-equidistant grids,
use "trapezoidal" or "riemann".
L
an optional N by ncol(argvals) matrix giving the weights for the numerical
integration over t. If present, overrides integration.
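To make the role of the integration weights concrete, here is a base-R sketch of how quadrature weights for a single grid could be computed and stacked into an N by J matrix. The helper names trapezoid_weights and simpson_weights are hypothetical and the formulas are the textbook rules; refund's internal weight computation may differ in detail.

```r
# Textbook trapezoidal weights; valid for unequally spaced grids.
trapezoid_weights <- function(t) {
  J <- length(t)
  d <- diff(t)  # the J - 1 interval widths
  c(d[1], d[-1] + d[-(J - 1)], d[J - 1]) / 2
}

# Composite Simpson weights; assumes an equally spaced grid with odd J.
simpson_weights <- function(t) {
  J <- length(t)
  stopifnot(J >= 3, J %% 2 == 1)
  h <- (t[J] - t[1]) / (J - 1)
  h / 3 * c(1, rep(c(4, 2), length.out = J - 2), 1)
}

t_grid <- seq(0, 1, length.out = 11)
f <- t_grid^2  # the integral of t^2 over [0, 1] is 1/3

sum(trapezoid_weights(t_grid) * f)  # close to 1/3
sum(simpson_weights(t_grid) * f)    # Simpson's rule is exact for quadratics

# Stack the same weights for N subjects into an N by J matrix,
# the layout described for the L argument.
N <- 4
L <- matrix(simpson_weights(t_grid), nrow = N, ncol = length(t_grid),
            byrow = TRUE)
```

A matrix built this way could be passed as the L argument when none of the built-in integration rules fits the sampling design.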
presmooth
string indicating the method to be used for preprocessing the functional
predictor prior to fitting. Options are fpca.sc, fpca.face, fpca.ssvd,
fpca.bspline, and fpca.interpolate. Defaults to NULL, indicating no
preprocessing. See create.prep.func.
presmooth.opts
list of options passed to the preprocessing method; see create.prep.func.
Xrange
numeric; range to use when specifying the marginal basis for the x-axis. It may
be desirable to widen this slightly beyond the default of range(X) when
predicting for future observed curves that may take values outside of range(X).
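One way to widen the default range is to pad it by a small fraction of its width, as sketched below. The 5% margin and the simulated matrix are arbitrary choices for illustration.

```r
set.seed(42)
X <- matrix(rnorm(20 * 15), nrow = 20)   # stand-in for an N by J predictor matrix

default_range <- range(X, na.rm = TRUE)  # the default shown in the usage above
pad <- 0.05 * diff(default_range)        # arbitrary 5% margin on each side
wide_range <- default_range + c(-pad, pad)

# wide_range could then be supplied as af(X, Xrange = wide_range)
```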
Qtransform
logical; should the functional predictor be transformed using the empirical
cdf, applying a quantile transformation to each column of X prior to fitting?
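The idea behind the quantile transformation can be sketched in base R: each column of X (one evaluation point) is replaced by its empirical cdf values, so the transformed predictor lies in (0, 1]. This only illustrates the concept; the helper name quantile_transform is hypothetical, and refund's internal handling (for example, of new data at prediction time) may differ.

```r
# Column-wise empirical cdf transform of an N by J matrix.
quantile_transform <- function(X) {
  apply(X, 2, function(col) ecdf(col)(col))
}

set.seed(7)
X <- matrix(rexp(30 * 6), nrow = 30)  # skewed stand-in data
U <- quantile_transform(X)

range(U)  # values fall in (0, 1]
dim(U)    # same shape as X
```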
A list with the following entries:
call
a "call" to te (or s, t2) using the appropriately constructed covariate and weight matrices.
argvals
the argvals argument supplied to af
L
the matrix of weights used for the integration
xindname
the name used for the functional predictor variable in the formula used by mgcv
tindname
the name used for the argvals variable in the formula used by mgcv
Lname
the name used for the L variable in the formula used by mgcv
presmooth
the presmooth argument supplied to af
Xrange
the Xrange argument supplied to af
prep.func
a function that preprocesses data based on the preprocessing method specified in presmooth. See create.prep.func
McLean, M. W., Hooker, G., Staicu, A.-M., Scheipl, F., and Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23 (1), pp. 249-269. Available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982924/.
pfr, lf, mgcv's linear.functional.terms; see pfr for examples
# NOT RUN {
data(DTI)
## only consider first visit and cases (no PASAT scores for controls)
DTI1 <- DTI[DTI$visit==1 & DTI$case==1,]
DTI2 <- DTI1[complete.cases(DTI1),]

## fit FGAM using FA measurements along corpus callosum
## as functional predictor with PASAT as response
## using 8 cubic B-splines for marginal bases with third
## order marginal difference penalties
## specifying gamma > 1 enforces more smoothing when using
## GCV to choose smoothing parameters
fit1 <- pfr(pasat ~ af(cca, k=c(8,8), m=list(c(2,3), c(2,3)),
                       presmooth="bspline", bs="ps"),
            method="GCV.Cp", gamma=1.2, data=DTI2)
plot(fit1, scheme=2)
vis.pfr(fit1)

## af term for the cca measurements plus an lf term for the rcst measurements
## leave out 10 samples for prediction
test <- sample(nrow(DTI2), 10)
fit2 <- pfr(pasat ~ af(cca, k=c(7,7), m=list(c(2,2), c(2,2)), bs="ps",
                       presmooth="fpca.face") +
                    lf(rcst, k=7, m=c(2,2), bs="ps"),
            method="GCV.Cp", gamma=1.2, data=DTI2[-test,])
par(mfrow=c(1,2))
plot(fit2, scheme=2, rug=FALSE)
vis.pfr(fit2, select=1, xval=.6)
pred <- predict(fit2, newdata = DTI2[test,], type='response', PredOutOfRange = TRUE)
sqrt(mean((DTI2$pasat[test] - pred)^2))

## Try to predict the binary response disease status (case or control)
## using the quantile transformed measurements from the rcst tract
## with a smooth component for a scalar covariate that is pure noise
DTI3 <- DTI[DTI$visit==1,]
DTI3 <- DTI3[complete.cases(DTI3$rcst),]
z1 <- rnorm(nrow(DTI3))
fit3 <- pfr(case ~ af(rcst, k=c(7,7), m = list(c(2, 1), c(2, 1)), bs="ps",
                      presmooth="fpca.face", Qtransform=TRUE) +
                   s(z1, k = 10),
            family="binomial", select=TRUE, data=DTI3)
par(mfrow=c(1,2))
plot(fit3, scheme=2, rug=FALSE)
abline(h=0, col="green")

# 4 versions: fit with/without Qtransform, plotted with/without Qtransform
fit4 <- pfr(case ~ af(rcst, k=c(7,7), m = list(c(2, 1), c(2, 1)), bs="ps",
                      presmooth="fpca.face", Qtransform=FALSE) +
                   s(z1, k = 10),
            family="binomial", select=TRUE, data=DTI3)
par(mfrow=c(2,2))
zlms <- c(-7.2,4.3)
plot(fit4, select=1, scheme=2, main="QT=FALSE", zlim=zlms, xlab="t", ylab="rcst")
plot(fit4, select=1, scheme=2, Qtransform=TRUE, main="QT=FALSE", rug=FALSE,
     zlim=zlms, xlab="t", ylab="p(rcst)")
plot(fit3, select=1, scheme=2, main="QT=TRUE", zlim=zlms, xlab="t", ylab="rcst")
plot(fit3, select=1, scheme=2, Qtransform=TRUE, main="QT=TRUE", rug=FALSE,
     zlim=zlms, xlab="t", ylab="p(rcst)")

vis.pfr(fit3, select=1, plot.type="contour")
# }