ps: Specifying a smooth term in the gcrq formula.

Description

Function used to define the smooth term (via P-splines) within the gcrq formula. The function actually does not evaluate a (spline) smooth, but simply it passes relevant information to proper fitter functions.

Usage

ps(..., lambda = -1, d = 3, by=NULL, ndx = NULL, deg = 3, knots=NULL,
    monotone = 0, concave = 0, var.pen = NULL, pen.matrix=NULL, dropc=TRUE, 
    center=TRUE, K=NULL, decom=FALSE, constr.fit=TRUE, shared.pen=FALSE, 
    st=FALSE, ad=0)

Value

The function simply returns the covariate with added attributes relevant to smooth term.

Arguments

...

The covariate supposed to have a nonlinear relationship with the quantile curve(s) being estimated. A B-spline is built, and a (difference) penalty is applied. In growth charts this variable is typically the age. If the covariate is a factor, category-specific coefficients are estimated subject to a lasso penalty. See the last example in ?gcrq. A matrix of (continuous) covariates can be also supplied to perfom variable selection (among its columns).

lambda

A supplied smoothing parameter for the smooth term. If it is negative scalar, the smoothing parameter is estimated iteratively as discussed in Muggeo et al. (2021). If a positive scalar, it represents the actual smoothing parameter. If it is a vector, cross validation is performed to select the `best' value. See Details in gcrq.

d

The difference order of the penalty. Default to 3 Ignored if pen.matrix is supplied.

by

if different from NULL, a numeric or factor variable of the same dimension as the covariate in ... If numeric the elements multiply the smooth (i.e. a varying coefficient model); if factor, a smooth is fitted for each factor level. Usually the variable by is also included as main effect in the formula, see examples in gcrq. When by includes a factor, the formula should include the model intecept, i.e. y~g+ps(x,by=g) and not y~ 0+g+ps(x,by=g).

ndx

The number of intervals of the covariate range used to build the B-spline basis. Non-integer values are rounded by round(). If NULL, default, it is taken \(min(n/4,9)\) (versions <=1.1-0 it was \(min(n/4,40)\), the empirical rule of Ruppert). It could be reduced further (but no less than 5 or 6, say) if the sample size is not large and the default value leads to some error in the fitting procedure, see section Note in gcrq. Likewise, if the underlying relationship is strongly nonlinear, ndx could be increased. The returned basis wil have `ndx+deg-1' (if dropc=TRUE) basis functions.

deg

The degree of the spline polynomial. Default to 3. The B-spline basis is composed by ndx+deg basis functions and if dropc=TRUE the first column is removed for identifiability (and the model intercept is estimated without any penalty).

knots

The knots locations. If NULL, equispaced knots are set. Note if predictions outside the observed covariate range have to be computed (via predict.gcrq), the knots should be set enought outside the observed range.

monotone

Numeric value to set up monotonicity restrictions on the first derivative of fitted smooth function

'0' = no constraint (default);
'1' = non-decreasing smooth function;
'-1' = non-increasing smooth function.

concave

Numeric value to set up monotonicity restrictions on the second derivative of fitted smooth function

'0' = no constraint (default);
'1' = concave smooth function;
'-1' = convex smooth function.

var.pen

A character indicating the varying penalty. See Details.

pen.matrix

if provided, a penalty matrix \(A\), say, such that the penalty in the objective function, apart from the smoothing parameter, is \(||Ab||_1\) where \(b\) is the spline coefficient vector being penalized.

dropc

logical. Should the first column of the B-spline basis be dropped for the basis identifiability? Default to TRUE. Note, if dropc=FALSE is set, it is necessary to omit the model intercept AND not to center the basis, i.e. center=FALSE. Alternatively, both a full basis and the model intercept may be included by adding a small ridge penalty via lambda.ridge>0.

center

logical. If TRUE the smooth effects are 'centered' over the covariate values, i.e. \(\sum_i \hat{f}(x_i)=0\).

K

A scalar tuning the selection of wiggliness of the smoothed curve when \(\lambda\) has to be estimated (i.e. lambda<0 is set). The larger K, the smoother the curve. Simulations suggest K=2 for the smoothing, and K=log(n/p^(2/3)) for variable selection and random intercepts (p is the number of variables or number of subjects). See details.

decom

logical. If TRUE, the B-spline \(B\) (with a \(d\) order difference penalty) is decomposed into truncated power functions namely unpenalized polynomial terms up to degree d-1, and additional terms \(Z= B D'(DD')^{-1}\). Only the coefficients of \(Z\) are penalized via an identity matrix, i.e. a lasso penalty. Currently decom=TRUE does not work with shape (monotonicity and concavity) restrictions and noncrossing constraints.

constr.fit

logical. If monotone or concave are different from 0, constr.fit=TRUE means that these constraints are set on the fitted quantiles rather than on the spline coefficients.

shared.pen

logical. If TRUE and the smooth is a VC term with a factor specified in by, the smooths in each level of the factor share the same smoothing parameter.

st

logical. If TRUE the variable(s) are standardized via the scale() function. Typically used for variable selection via lasso, i.e. when a matrix of covariates is passed in ps().

ad

a positive number to carry out a form of adaptive lasso. More specifically, at each step of the iterative algorithm, the penalty is \(\lambda\sum_jw_j|\beta_j|\) where \(w_j=|\tilde{\beta}_j|^\mathtt{-ad}\) and \(\tilde{\beta}_j\) are estimates coming from the previous iteration with a different value of \(\lambda\). ad=0 means the standard lasso and ad=1 adaptive lasso (with weights updated during the iterative process.

Author

Vito M. R. Muggeo

Details

If a numeric variable has been supplied, ps() builds a B-spline basis with number of columns equal to ndx+deg (or length(knots)-deg-1). However, unless dropc=FALSE is specified, the first column is removed for identifiability, and the spline coefficients are penalized via differences of order d; d=0 leads to a penalty on the coefficients themselves. If pen.matrix is supplied, d is ignored. Since versions 1.5-0 and 1.6-0, a factor or matrix can be supplied.

lambda is the tuning parameter, fixed or to be estimated. When lambda=0 an unpenalized (and typically wiggly) fit is obtained, and as lambda increases the curve gets smoother till a d-1 degree polynomial. At 'intermediate' lambda values, the fitted curve is a piecewise polynomial of degree d-1.

It is also possible to put a varying penalty via the argument var.pen. Namely for a constant smoothing (var.pen=NULL) the penalty is \(\lambda\sum_k |\Delta^d_k|\) where \(\Delta^d_k\) is the k-th difference (of order d) of the spline coefficients. For instance if \(d=1\), \(|\Delta^1_k|=|b_k-b_{k-1}|\) where the \(b_k\) are the spline coefficients. When a varying penalty is set, the penalty becomes \(\lambda\sum_k |\Delta_k^d| w_k\) where the weights \(w_k\) depend on var.pen; for instance var.pen="((1:k)^2)" results in \(w_k=k^2\). See models m6 and m6a in the examples of gcrq.

If decom=TRUE, the smooth can be plotted with or without the fixed part, see overall.eff in the function plot.gcrq.

References

Muggeo VMR, Torretta F, Eilers PHC, Sciandra M, Attanasio M (2021). Multiple smoothing parameters selection in additive regression quantiles, Statistical Modelling, 21, 428-448.

For a general discussion on using B-spline and penalties in regression model see

Eilers PHC, Marx BD. (1996) Flexible smoothing with B-splines and penalties. Statistical Sciences, 11:89-121.

Examples

Run this code

##see ?gcrq

##gcrq(y ~ ps(x),..) #it works (default: center = TRUE, dropc = TRUE)
##gcrq(y ~ 0 + ps(x, center = TRUE, dropc = FALSE)) #it does NOT work
##gcrq(y ~ 0 + ps(x, center = FALSE, dropc = FALSE)) #it works

Run the code above in your browser using DataLab