Fits an additive partial linear model to data using an approximate Bayesian inference technique based on penalized regression splines and Laplace approximations. Smooth additive terms are specified as a linear combination of a large number of cubic B-splines. To counterbalance the roughness of the fit, a discrete penalty on neighboring spline coefficients is imposed in the spirit of Eilers and Marx (1996). The error of the model is assumed to be Gaussian with zero mean and finite variance. The optimal amount of smoothing is determined by a grid-based exploration of the posterior penalty space when the number of smooth terms is small to moderate. When the dimension of the penalty space is large, the optimal smoothing parameter is chosen to be the value that maximizes the (log-)posterior of the penalty vector.
amlps(formula, data, K = 30, penorder = 2, cred.int = 0.95)

An object of class amlps containing several components from
the fit. Details can be found in amlps.object. Details on
the output printed by amlps can be found in
print.amlps. Fitted smooth terms can be visualized with the
plot.amlps routine.
A formula object where the ~ operator separates the response
from the covariates of the linear part z1, z2, ... and the smooth
terms. A smooth term is specified with the notation sm(.).
For instance, the formula y ~ z1+sm(x1)+sm(x2) specifies an
additive model of the form E(y)=b0+b1z1+f1(x1)+f2(x2), where
b0, b1 are the regression coefficients of the linear part and
f1(.) and f2(.) are smooth functions of the continuous
covariates x1 and x2, respectively.
Optional. A data frame to match the variable names provided in formula.
A positive integer specifying the number of cubic B-spline
functions in the basis used to model the smooth terms. Default is
K = 30 and allowed values are 15 <= K <= 60. The same basis
dimension is used for each smooth term in the model. Also, the
computational cost to fit the model increases with K.
The order of the finite-difference penalty applied to the coefficients of contiguous B-splines (see the sketch after this argument list). Can be either 2 for a second-order penalty (the default) or 3 for a third-order penalty.
The level of the pointwise credible interval to be computed for the coefficients in the linear part of the model.
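As an illustration of the finite-difference penalty named in the penalty order argument above (a minimal sketch under standard P-spline conventions, not the internal code of amlps), a penalty matrix acting on K contiguous B-spline coefficients can be built as:

K <- 30
penorder <- 2
D <- diff(diag(K), differences = penorder) # finite-difference operator of order penorder
P <- crossprod(D)                          # penalty matrix t(D) %*% D on contiguous coefficients

The roughness penalty on a vector of spline coefficients theta then takes the usual quadratic form lambda * t(theta) %*% P %*% theta, with lambda the smoothing parameter.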
Oswaldo Gressani oswaldo_gressani@hotmail.fr.
The B-spline basis used to approximate a smooth additive component
is computed with the function cubicbs. The lower (upper)
bound of the B-spline basis is taken to be the minimum (maximum) value of
the covariate associated to the smooth. For identifiability
purposes, the B-spline matrices (computed over the observed covariates)
are centered. The centering consists is subtracting from each column of the
B-spline matrix, the corresponding column average of another B-spline matrix
computed on a fine grid of equidistant values in the domain of the smooth
term. A hierarchical Gamma prior is imposed on the roughness penalty vector
and Jeffreys' prior is imposed on the precision of the error. A
Newton-Raphson algorithm is used to compute the mode of the
(log-)posterior of the penalty vector. The algorithm uses
analytically derived versions of the gradient and Hessian. When the number
of smooth terms in the model is smaller than or equal to 4, a grid-based
strategy is used for posterior exploration of the penalty space. Above that
threshold, the optimal amount of smoothness is determined by the maximum
a posteriori value of the penalty vector. This strategy keeps the
computational burden of fitting the model relatively low while preserving
good statistical performance.
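To make the centering step concrete, the sketch below builds a cubic B-spline matrix at the observed covariate values and subtracts, column by column, the column means of the same basis evaluated on a fine equidistant grid. It is an illustration under standard P-spline conventions and does not reproduce the internals of amlps or cubicbs.

library(splines)

bbase <- function(x, lower, upper, K) {
  ndx <- K - 3                                     # number of knot intervals
  dx <- (upper - lower) / ndx
  knots <- seq(lower - 3 * dx, upper + 3 * dx, by = dx)
  splineDesign(knots, x, ord = 4, outer.ok = TRUE) # n x K cubic B-spline matrix
}

x <- runif(200)
B <- bbase(x, min(x), max(x), K = 30)              # basis at observed covariate values
xfine <- seq(min(x), max(x), length.out = 1000)
Bfine <- bbase(xfine, min(x), max(x), K = 30)      # basis on a fine equidistant grid
Bc <- sweep(B, 2, colMeans(Bfine))                 # centered B-spline matrix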
Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2): 89-121.
Fan, Y. and Li, Q. (2003). A kernel-based method for estimating additive partially linear models. Statistica Sinica, 13(3): 739-762.
Gressani, O. and Lambert, P. (2018). Fast Bayesian inference using Laplace approximations in a flexible promotion time cure model based on P-splines. Computational Statistics & Data Analysis, 124: 151-167.
Opsomer, J. D. and Ruppert, D. (1999). A root-n consistent backfitting estimator for semiparametric additive modeling. Journal of Computational and Graphical Statistics, 8(4): 715-732.
cubicbs, amlps.object,
print.amlps, plot.amlps
### Classic simulated data example (with simgamdata)
set.seed(17)
sim.data <- simgamdata(setting = 2, n = 200, dist = "gaussian", scale = 0.4)
data <- sim.data$data # Simulated data frame
# Fit model
fit <- amlps(y ~ z1 + z2 + sm(x1) + sm(x2), data = data, K = 15)
fit
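# Visualize the fitted smooth terms (see plot.amlps); this follow-up
# assumes the default arguments of the plot method are sufficient
plot(fit)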