semipar.mp: Massively parallel semiparametric regression

Description

Fits a possibly very large number of semiparametric models by quadratically penalized least squares. The model may include a combination of parametric terms, smooth terms, varying-coefficient terms, and simple random effect structures.
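
As a sketch of the criterion named above, under notation assumed here rather than taken from this page ($y_v$ for the $v$th column of Y, $X$ for a model matrix shared by all $V$ fits, $P$ for a symmetric nonnegative-definite penalty matrix, and $\lambda > 0$ for the tuning parameter), quadratically penalized least squares for a single response can be written as:

```latex
\hat{\beta}_v
  = \arg\min_{\beta} \; \| y_v - X\beta \|^2 + \lambda\, \beta^\top P \beta
  = (X^\top X + \lambda P)^{-1} X^\top y_v ,
\qquad v = 1, \dots, V,
```

provided the inverse exists. The matrix $(X^\top X + \lambda P)^{-1} X^\top$ does not depend on $v$, so applying it to the whole matrix Y yields all $V$ coefficient vectors in one step; this is the generic reason such fits scale to many responses, and it is not a statement about the internals of semipar.mp.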

Usage

semipar.mp(formula, Y, lsp, data = NULL, range.basis = NULL,
  knots = "quantile", rm.constr = FALSE, random = NULL,
  store.reml = FALSE, store.fitted = FALSE)

Arguments

formula

a formula object such as "~ x1 + sf(x2) + sf(x2, effect = x3)", where x1 is a linear (parametric) predictor, x2 is a predictor on which the responses depend smoothly, and x3 is a predictor whose effect is modeled as varying smoothly with x2 (a varying-coefficient term).

Y

an $n \times V$ response matrix, where $V$ is the number of models fitted in parallel, e.g., voxels in neuroimaging applications.

lsp

vector of candidate log tuning parameters ($\log(\lambda)$).
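
To illustrate how a grid of candidate values is used, the following base-R sketch fits a penalized regression to every column of a response matrix at each candidate $\log(\lambda)$. It does not call semipar.mp or reproduce its internals: the B-spline basis from splines::bs, the second-difference penalty, the simulated data, and the use of the natural logarithm are all assumptions made for the example. The residual sum of squares is reported only to show the shape of the result; the REML criterion mentioned under store.reml is not computed here.

```r
# Base-R sketch only; all object names below are hypothetical.
library(splines)                       # ships with R

set.seed(1)
n <- 50; V <- 4
x <- sort(runif(n))
X <- bs(x, df = 8, intercept = TRUE)   # assumed B-spline model matrix (n x 8)
D <- diff(diag(8), differences = 2)    # second-difference operator
P <- crossprod(D)                      # assumed quadratic penalty matrix
Y <- matrix(sin(2 * pi * x), n, V) + matrix(rnorm(n * V, sd = 0.3), n, V)

lsp <- seq(-10, 5, length.out = 6)     # candidate log(lambda) values

# One linear solve per candidate gives the coefficients for all V responses.
rss <- sapply(lsp, function(l) {
  B <- solve(crossprod(X) + exp(l) * P, crossprod(X, Y))
  colSums((Y - X %*% B)^2)
})
dim(rss)   # V rows (responses) by length(lsp) columns (candidates)
```

Each row of rss traces one response across the grid, so a per-response choice of tuning parameter amounts to picking one column per row according to whatever criterion is used.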

data

an optional data frame containing the variables in the model.

range.basis

a numeric vector of length 2 defining the interval over which the B-spline basis is created. If NULL, it will be set as the range of the variable to be evaluated by the basis.

knots

knot placement for the B-spline bases. The default, "quantile", gives knots at equally spaced quantiles of the data. The alternative, "equispaced", gives equally spaced knots.
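
The difference between the two placements can be sketched in base R as follows. The number of interior knots (n.knots), the skewed simulated predictor, and the treatment of the boundary points are assumptions for illustration; this page does not say how semipar.mp counts or positions boundary knots.

```r
# Base-R sketch only; names and the knot count are hypothetical.
set.seed(1)
x <- rexp(200)                 # a right-skewed predictor
n.knots <- 5                   # assumed number of interior knots

# Drop the two end points so that only interior positions remain.
inner <- function(v) v[-c(1, length(v))]

# "quantile": knots at equally spaced quantiles of the data
knots.quantile <- quantile(x, inner(seq(0, 1, length.out = n.knots + 2)),
                           names = FALSE)

# "equispaced": knots equally spaced over the range of the data
knots.equispaced <- inner(seq(min(x), max(x), length.out = n.knots + 2))

round(rbind(knots.quantile, knots.equispaced), 2)
```

With skewed data the quantile-based knots concentrate where observations are dense, whereas equispaced knots can leave some intervals with few observations.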

rm.constr

logical: should the constraints be removed for varying-coefficient models?

random

a formula or a matrix for random effects.

store.reml

logical: should the pointwise REML criterion at each grid point be included in the output? FALSE by default, as this output can be very large.

store.fitted

logical: should the fitted values be included in the output? FALSE by default.

Value

An object of class "semipar.mp", which is also of class "qplsc.mp" but includes the following additional elements:
where.sf, where.nsf

vectors or scalars identifying where the smooth and non-smooth terms, respectively, appear in the model formula.

list.all

a list of lists, one for each term of the model; see Details.

formula

model formula.

Y

response matrix.

lsp

candidate values for the log smoothing parameter.

data

the supplied data frame, if any.

Details

The basic approach to massively parallel smoothing is described in Reiss et al. (2014). Although simple mixed-effect models are available, semipar.mix.mp is generally preferable for mixed models with a single smooth term.

Each element of list.all corresponding to a parametric term of the model is a list with components modmat, penmat, pen.order, start, and end. For each nonparametric (smooth) term, the same five components are included, plus basis, argvals, effect, k, and norder.

References

Reiss, P. T., Huang, L., Chen, Y.-H., Huo, L., Tarpey, T., and Mennes, M. (2014). Massively parallel nonparametric regression, with an application to developmental brain mapping. Journal of Computational and Graphical Statistics, 23(1), 232--248.

Examples

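n <- 32
Ys <- matrix(0, n, 5)   # n x V response matrix with V = 5
for (i in 1:n) Ys[i, ] <- -2:2 + rnorm(5, i^2, i^0.5) + sin(i)
x1 <- rnorm(n, 0, 5)
x2 <- 1:n + runif(n, 1, 20)
# linear term in x1, smooth term in x2, 30 candidate log smoothing parameters
semipar.obj <- semipar.mp(~ x1 + sf(x2, k = 10), Y = Ys,
                          lsp = seq(5, 50, length.out = 30))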
