bhistx: Base-learners for Functional Covariates

Description

Base-learners that fit historical functional effects that can be used with the tensor product, as, e.g., hbistx(...) %X% bolsc(...), to form interaction effects (Ruegamer et al., 2018). For expert use only! May show unexpected behavior compared to other base-learners for functional data!

Usage

bhistx(
  x,
  limits = "s

Arguments

object of type hmatrix containing time, index and functional covariate; note that timeLab in the hmatrix-object must be equal to the name of the time-variable in timeformula in the FDboost-call

limits

defaults to "s<=t" for an historical effect with s<=t; either one of "s<t" or "s<=t" for [l(t), u(t)] = [T1, t]; otherwise specify limits as a function for integration limits [l(t), u(t)]: function that takes \(s\) as the first and t as the second argument and returns TRUE for combinations of values (s,t) if \(s\) falls into the integration range for the given \(t\).

standard

the historical effect can be standardized with a factor. "no" means no standardization, "time" standardizes with the current value of time and "lenght" standardizes with the lenght of the integral

intFun

specify the function that is used to compute integration weights in s over the functional covariate \(x(s)\)

inS

historical effect can be smooth, linear or constant in s, which is the index of the functional covariates x(s).

inTime

historical effect can be smooth, linear or constant in time, which is the index of the functional response y(time).

knots

either the number of knots or a vector of the positions of the interior knots (for more details see bbs).

boundary.knots

boundary points at which to anchor the B-spline basis (default the range of the data). A vector (of length 2) for the lower and the upper boundary knot can be specified.

degree

degree of the regression spline.

differences

a non-negative integer, typically 1, 2 or 3. Defaults to 1. If differences = k, k-th-order differences are used as a penalty (0-th order differences specify a ridge penalty).

trace of the hat matrix for the base-learner defining the base-learner complexity. Low values of df correspond to a large amount of smoothing and thus to "weaker" base-learners.

lambda

smoothing parameter of the penalty, computed from df when df is specified.

penalty

by default, penalty="ps", the difference penalty for P-splines is used, for penalty="pss" the penalty matrix is transformed to have full rank, so called shrinkage approach by Marra and Wood (2011)

check.ident

use checks for identifiability of the effect, based on Scheipl and Greven (2016); see Brockhaus et al. (2017) for identifiability checks that take into account the integration limits

Value

Equally to the base-learners of package mboost:

An object of class blg (base-learner generator) with a dpp function (dpp, data pre-processing).

The call of dpp returns an object of class bl (base-learner) with a fit function. The call to fit finally returns an object of class bm (base-model).

Details

bhistx implements a base-learner for functional covariates with flexible integration limits l(t), r(t) and the possibility to standardize the effect by 1/t or the length of the integration interval. The effect is stand * int_{l(t)}^{r_{t}} x(s)beta(t,s) ds. The base-learner defaults to a historical effect of the form \(\int_{T1}^{t} x_i(s)beta(t,s) ds\), where \(T1\) is the minimal index of \(t\) of the response \(Y(t)\). bhistx can only be used if \(Y(t)\) and \(x(s)\) are observd over the same domain \(s,t \in [T1, T2]\). The base-learner bhistx can be used to set up complex interaction effects like factor-specific historical effects as discussed in Ruegamer et al. (2018).

Note that the data has to be supplied as a hmatrix object for model fit and predictions.

References

Brockhaus, S., Melcher, M., Leisch, F. and Greven, S. (2017): Boosting flexible functional regression models with a high number of functional historical effects, Statistics and Computing, 27(4), 913-926.

Marra, G. and Wood, S.N. (2011): Practical variable selection for generalized additive models. Computational Statistics & Data Analysis, 55, 2372-2387.

Ruegamer D., Brockhaus, S., Gentsch K., Scherer, K., Greven, S. (2018). Boosting factor-specific functional historical models for the detection of synchronization in bioelectrical signals. Journal of the Royal Statistical Society: Series C (Applied Statistics), 67, 621-642.

Scheipl, F., Staicu, A.-M. and Greven, S. (2015): Functional Additive Mixed Models, Journal of Computational and Graphical Statistics, 24(2), 477-501. https://arxiv.org/abs/1207.5947

Scheipl, F. and Greven, S. (2016): Identifiability in penalized function-on-function regression models. Electronic Journal of Statistics, 10(1), 495-526.

Examples

Run this code

# NOT RUN {
if(require(refund)){
## simulate some data from a historical model
## the interaction effect is in this case not necessary
n <- 100
nygrid <- 35
data1 <- pffrSim(scenario = c("int", "ff"), limits = function(s,t){ s <= t }, 
                n = n, nygrid = nygrid)
data1$X1 <- scale(data1$X1, scale = FALSE) ## center functional covariate                  
dataList <- as.list(data1)
dataList$tvals <- attr(data1, "yindex")

## create the hmatrix-object
X1h <- with(dataList, hmatrix(time = rep(tvals, each = n), id = rep(1:n, nygrid), 
                             x = X1, argvals = attr(data1, "xindex"), 
                             timeLab = "tvals", idLab = "wideIndex", 
                             xLab = "myX", argvalsLab = "svals"))
dataList$X1h <- I(X1h)   
dataList$svals <- attr(data1, "xindex")
## add a factor variable 
dataList$zlong <- factor(gl(n = 2, k = n/2, length = n*nygrid), levels = 1:3)  
dataList$z <- factor(gl(n = 2, k = n/2, length = n), levels = 1:3)

## do the model fit with main effect of bhistx() and interaction of bhistx() and bolsc()
mod <- FDboost(Y ~ 1 + bhistx(x = X1h, df = 5, knots = 5) + 
               bhistx(x = X1h, df = 5, knots = 5) %X% bolsc(zlong), 
              timeformula = ~ bbs(tvals, knots = 10), data = dataList)
              
## alternative parameterization: interaction of bhistx() and bols()
mod <- FDboost(Y ~ 1 + bhistx(x = X1h, df = 5, knots = 5) %X% bols(zlong), 
              timeformula = ~ bbs(tvals, knots = 10), data = dataList)

# }
# NOT RUN {
  # find the optimal mstop over 5-fold bootstrap (small example to reduce run time)
  cv <- cvrisk(mod, folds = cv(model.weights(mod), B = 5))
  mstop(cv)
  mod[mstop(cv)]
  
  appl1 <- applyFolds(mod, folds = cv(rep(1, length(unique(mod$id))), type = "bootstrap", B = 5))

 # plot(mod)
# }
# NOT RUN {
}

# }

Run the code above in your browser using DataLab