
dlnm (version 1.1.0)

crossbasis: Generate a cross-basis matrix for a DLNM

Description

Generate the basis functions for the two spaces of predictor and lags, choosing among a set of possible bases. Then, these functions are combined in order to create the related cross-basis matrix, which can be included in a model formula to fit a distributed lag non-linear model (DLNM).
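
As a rough illustration of how the two sets of basis functions combine, the sketch below builds a small cross-basis by hand in base R. It is a conceptual example only, not the package's implementation: the simulated vector x, the helper lag_matrix and the particular bases chosen are assumptions made for the illustration.

```r
library(splines)

set.seed(1)
x <- rnorm(50)      # hypothetical predictor series, ordered in time
maxlag <- 3

# Basis for the space of the predictor: natural cubic spline, 2 df, no intercept
bvar <- ns(x, df = 2)

# Basis for the space of lags: 2nd-degree polynomial in lag 0..maxlag,
# with intercept (columns 1, lag, lag^2)
blag <- outer(0:maxlag, 0:2, "^")

# Hypothetical helper: column l+1 holds v lagged by l observations
lag_matrix <- function(v, maxlag) {
  n <- length(v)
  sapply(0:maxlag, function(l) c(rep(NA, l), v[seq_len(n - l)]))
}

# Combine: each predictor basis variable is lagged, then projected
# onto the lag basis; the resulting blocks are bound column-wise
cb <- do.call(cbind, lapply(seq_len(ncol(bvar)), function(j) {
  lag_matrix(as.numeric(bvar[, j]), maxlag) %*% blag
}))

dim(cb)   # rows = observations, columns = (predictor df) x (lag df)
```

The first maxlag rows are NA because the lagged values are not available at the start of the series.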

Usage

crossbasis(var, vartype="ns", vardf=1, vardegree=1, varknots=NULL,
	varbound=range(var), varint=FALSE, cen=TRUE, cenvalue=mean(var),
	maxlag=0, lagtype="ns", lagdf=1, lagdegree=1, lagknots=NULL,
	lagbound=c(0,maxlag), lagint=TRUE)

## S3 method for class 'crossbasis':
summary(object, ...)

Arguments

var
the predictor variable, defined as a numeric vector of ordered observations.
vartype, lagtype
type of basis. See Details below for the list of possible choices.
vardf, lagdf
dimension of the basis, equivalent to number of degrees of freedom spent to specify the relationship in each space. They depend on knots if provided, or on degree for type="poly".
vardegree, lagdegree
degree of polynomial. Used only for type equal to "bs" (degree of the piecewise polynomial for the B-spline) or "poly" (degree of the polynomial).
varknots, lagknots
location of the knots for the basis. They specify the position of the internal knots for "ns" and "bs", the cut-off points for "strata" (defining right-open intervals) and the threshold(s)/cut-off points for "hthr", "lthr" and "dthr".
varbound, lagbound
boundary knots (sometimes called external knots). Used only for type equal to "ns" and "bs".
varint, lagint
logical. If TRUE and df>1, an 'intercept' is included in the basis. The default values should not be changed: see Warnings below.
cen
logical. If TRUE, the basis functions for the space of predictor are centered. See Note below.
cenvalue
centering value, used as a reference point for the predicted effects.
maxlag
a positive value defining the maximum lag.
object
an object of class "crossbasis".
...
additional arguments to be passed to summary.
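
The relationship between knots, degree, intercept and the dimension of the basis can be checked directly on the underlying spline functions. The sketch below uses ns and bs from the splines package rather than crossbasis itself; the vector x and the knot positions are assumptions chosen for the illustration.

```r
library(splines)

x <- seq(0, 30, length.out = 200)   # hypothetical predictor values
knots <- c(10, 20)                  # two internal knots

# Natural cubic spline: dimension is length(knots) + 1 + int
ncol(ns(x, knots = knots))                     # no intercept
ncol(ns(x, knots = knots, intercept = TRUE))   # with intercept

# B-spline: dimension is length(knots) + degree + int
ncol(bs(x, knots = knots, degree = 2))
```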

Value

A matrix object of class "crossbasis" which can be included in a model formula in order to fit a DLNM. It contains the attributes crossdf (global number of degrees of freedom) and range (range of the original vector of observations). Additional attributes correspond to the arguments of crossbasis and explicitly give type, df, degree, knots, bound, cen, cenvalue and maxlag for the corresponding basis (with prefix var- or lag-), for use by crosspred.

The function summary.crossbasis returns a summary of the cross-basis matrix and the related attributes, and can be used to check the options chosen for the bases in the two dimensions.

Warnings

It is strongly recommended not to include an intercept in the basis for var: if the model used to fit the data also includes an intercept, some of the cross-basis variables will be excluded. Conversely, an intercept should always be included in the basis for the space of lags when lagtype is equal to "ns", "bs", "strata" or "poly".
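
The exclusion of variables can be reproduced with a plain spline basis outside of dlnm. The following sketch, which assumes simulated data and uses bs from the splines package in place of a cross-basis, shows that a basis with its own intercept is collinear with the model intercept, so one coefficient cannot be estimated.

```r
library(splines)

set.seed(42)
x <- runif(100, 0, 30)                        # hypothetical predictor
y <- rpois(100, lambda = exp(1 + 0.02 * x))   # hypothetical counts

# Basis with its own intercept, plus the model intercept: rank-deficient
fit_int <- glm(y ~ bs(x, df = 4, intercept = TRUE), family = quasipoisson())

# Basis without intercept: all coefficients are estimable
fit_noint <- glm(y ~ bs(x, df = 3), family = quasipoisson())

coef(fit_int)     # one basis coefficient is reported as NA
coef(fit_noint)
```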

Details

The value in type defines the basis for each space (predictor and lags). It must be one of:

"ns"
natural cubic B-splines (constrained to be linear beyond the boundary knots). Specified by knots (internal knots) and bound (boundary or external knots). See the function ns for additional information. If knots is provided, the dimension df is set to length(knots)+1+int. An intercept is included if int=TRUE. The transformed variables can be centered at cenvalue.
"bs"
B-splines characterized by degree (degree of the piecewise polynomial). Specified by knots (internal knots) and bound (boundary or external knots). See the function bs for additional information. If knots is provided, the dimension df is set to length(knots)+degree+int; if not, df must be higher than degree+int. An intercept is included if int=TRUE. The transformed variables can be centered at cenvalue.
"strata"
strata variables (dummy parameterization) determined by internal cut-off values specified in knots, which represent the lower boundaries of the right-open intervals. Intervals containing no observations are automatically discarded. If knots is provided, the dimension df is set to length(knots)+int. A dummy variable for the reference stratum (the first one by default) is included if int=TRUE, generating a full rank basis. Never centered.
"poly"
polynomial with power specified by degree. The dimension df is set to degree+int. An intercept, corresponding to a vector of 1's (the power 0 of the polynomial), is included if int=TRUE. The transformed variables can be centered at cenvalue.
"integer"
strata variables (dummy parameterization) for each integer value, expressly created to specify an unconstrained function in the space of lags. df is set automatically to the number of integer values minus 1 plus int. A dummy variable for the reference stratum (the first one by default) is included if int=TRUE, generating a full rank basis. Never centered.
"hthr", "lthr"
high and low threshold parameterization, with a linear relationship above or below the threshold, respectively, and flat otherwise. The threshold is chosen by knots: if more than one is provided, a piecewise linear relationship is applied above the first knot or below the last one, respectively, with the slope changing at each further knot. df is automatically set to length(knots)+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. Never centered.
"dthr"
double threshold parameterization (2 independent linear relationships above the second and below the first threshold, flat between them). The thresholds are chosen by knots. If only one is provided, the threshold is unique (V-model). If more than 2 are provided, the first and the last ones are chosen. df is automatically set to 2+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. Never centered.
"lin"
linear relationship (untransformed apart from optional centering). df is automatically set to 1+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. It can be centered at cenvalue.

Some arguments are changed automatically when their combination is not sensible, or set to NULL if not required. For a detailed overview of the options, see vignette("dlnmOverview").

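The threshold and strata parameterizations described in Details can be sketched in base R as follows. This is only an illustration of the shapes involved: the values in x, the thresholds 10 and 20, and the sign convention used for the low threshold are assumptions, and the code does not reproduce the internals of crossbasis.

```r
x <- c(-5, 0, 10, 20, 25, 30)   # hypothetical predictor values

# High threshold at 20: flat below, linear above
hthr <- pmax(x - 20, 0)

# Low threshold at 10: linear below, flat above
lthr <- pmax(10 - x, 0)

# Double threshold: two independent linear terms, flat between 10 and 20
dthr <- cbind(low = lthr, high = hthr)

# Strata with cut-offs at 10 and 20, defining right-open intervals
strata <- cut(x, breaks = c(-Inf, 10, 20, Inf), right = FALSE)

# Dummy variables without the reference stratum (int = FALSE),
# so the dimension is length(knots) = 2
dummies <- model.matrix(~ strata)[, -1]

cbind(x, hthr, lthr)
table(strata)
```
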
References

Armstrong, B. Models for the relationship between ambient temperature and daily mortality. Epidemiology. 2006, 17(6):624-31.

See Also

crosspred, crossplot

Examples

# Example 1. See crosspred and crossplot for other examples

### simple DLM for the effect of PM10 on mortality up to 15 days of lag
### space of predictor: linear effect for PM10
### space of predictor: 5df natural cubic spline for temperature
### lag function: 4th degree polynomial for PM10
### lag function: strata intervals at lag 0 and 1-3 for temperature

data(chicagoNMMAPS)
basis.pm <- crossbasis(chicagoNMMAPS$pm10, vartype="lin", lagtype="poly",
	lagdegree=4,cen=FALSE,maxlag=15)
basis.temp <- crossbasis(chicagoNMMAPS$temp, vardf=5, lagtype="strata",
	lagknots=1, cenvalue=21, maxlag=3)
summary(basis.pm)
summary(basis.temp)
model <- glm(death ~  basis.pm + basis.temp, family=quasipoisson(), chicagoNMMAPS)
pred.pm <- crosspred(basis.pm, model, at=0:20)

crossplot(pred.pm,"slices",var=10,
	title="Effect of a 10-unit increase in PM10 along lags")
# overall effect for a 10-unit increase in PM over 15 days of lag, with CI
pred.pm$allRRfit["10"]
cbind(pred.pm$allRRlow, pred.pm$allRRhigh)["10",]
crossplot(pred.pm, "overall", ylim=c(0.99,1.04), label="PM10", ci="lines",
	title="Overall effect of PM10 over 15 days of lag")

### See the vignette 'dlnmOverview' for a detailed explanation of this example
