dlnm (version 0.3.0)

crossbasis: Generate a cross-basis matrix for a DLNM

Description

Generate the basis functions for the two spaces of predictor and lags, choosing among a set of possible bases. Then, these functions are combined in order to create the related cross-basis matrix, which can be included in a model formula to fit a distributed lag non-linear model (DLNM).
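The combination step described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: lag_matrix and sketch_crossbasis are hypothetical helpers, and the two bases are built here with ns and poly rather than through the crossbasis arguments.

```r
library(splines)

# Hypothetical helper: n x (maxlag + 1) matrix whose column l + 1 holds x lagged by l
lag_matrix <- function(x, maxlag) {
  n <- length(x)
  sapply(0:maxlag, function(l) c(rep(NA_real_, l), x[seq_len(n - l)]))
}

# Hypothetical helper: lag each predictor basis variable, then project the
# lagged values onto the lag basis; one block of columns per basis variable
sketch_crossbasis <- function(basisvar, basislag) {
  maxlag <- nrow(basislag) - 1
  blocks <- lapply(seq_len(ncol(basisvar)), function(j) {
    lag_matrix(basisvar[, j], maxlag) %*% basislag
  })
  do.call(cbind, blocks)
}

set.seed(1)
x <- rnorm(100, mean = 20, sd = 5)
maxlag <- 5

basisvar <- ns(x, df = 3)                         # space of predictor, no intercept
basislag <- cbind(1, poly(0:maxlag, degree = 2))  # space of lags, with intercept

cb <- sketch_crossbasis(basisvar, basislag)
dim(cb)  # rows = length(x), columns = ncol(basisvar) * ncol(basislag)
```

The first maxlag rows of cb are NA because the lagged values are not available there; a model fitted on such a matrix would drop those observations.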

Usage

crossbasis(var, vartype="ns", vardf=1, varknots=NULL, varbound=range(var),
	varint=FALSE, cen=TRUE, cenvalue=mean(var), maxlag=0, lagtype="ns",
	lagdf=1, lagknots=NULL, lagbound=c(0,maxlag), lagint=TRUE)

## S3 method for class 'crossbasis':
summary(object, ...)

Arguments

var
the predictor variable, defined as a numeric vector of ordered observations.
vartype, lagtype
type of basis. See Details below for the list of possible choices.
vardf, lagdf
dimension of the basis (number of degrees of freedom spent to specify the relationship in each space). They depend on varknots and lagknots if provided.
varknots, lagknots
knot locations for the basis. They specify the position of the knots for "ns" and "bs", the cut-off points for "strata" (defining right-open intervals) and the threshold(s) for "lthr", "hthr" and "thr".
varbound, lagbound
boundary knots (sometimes called external knots). Used only for type equal to "ns" and "bs".
varint, lagint
logical. If TRUE, an 'intercept' is included in the basis. The default values should not be changed: see Warnings below.
cen
logical. If TRUE, the basis functions for the space of predictor are centered. See Note below.
cenvalue
centering value, used as a reference point for the predicted effects.
maxlag
a positive value defining the maximum lag.
object
an object of class "crossbasis".
...
additional arguments to be passed to summary.
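
The role of cen and cenvalue can be illustrated with a one-dimensional basis. The sketch below assumes that centering means subtracting the basis evaluated at cenvalue from every row; it uses ns and predict from the splines package and is not taken from the crossbasis source.

```r
library(splines)

x <- seq(0, 30, by = 0.5)
cenvalue <- 21

b <- ns(x, df = 4)                  # uncentered basis for the predictor
b_at_cen <- predict(b, cenvalue)    # the same basis evaluated at the centering value

# Subtract the reference row from every row of the basis
b_centered <- sweep(unclass(b), 2, as.numeric(b_at_cen))

# The row corresponding to cenvalue is now (numerically) zero
round(b_centered[x == cenvalue, ], 12)
```

With a basis centered this way, the contribution of the predictor to the linear predictor is zero at cenvalue, so fitted effects read as contrasts against that reference point.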

Value

A matrix object of class "crossbasis" which can be included in a model formula in order to fit a DLNM. It contains the attributes crossdf (global number of degrees of freedom) and range (range of the original vector of observations). Additional attributes correspond to the arguments of crossbasis and explicitly give type, df, knots, bound, cen, cenvalue and maxlag for the corresponding basis (with prefix var- or lag-), for use by crosspred. The function summary.crossbasis returns a summary of the cross-basis matrix and the related attributes, and can be used to check the options chosen for the bases in the two dimensions.

Warnings

It is strongly recommended to avoid the inclusion of an intercept in the basis for var, otherwise the presence of the additional intercept (when included) in the model used to fit the data will cause some of the cross-basis variables to be excluded. Conversely, an intercept should always be included in the basis for the space of lags when lagtype is equal to "ns", "bs", "strata" or "poly".
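
The collinearity behind this warning can be reproduced with a plain spline basis. The sketch below uses simulated data and bs from the splines package instead of a cross-basis, so it only illustrates the mechanism: a basis that contains its own intercept spans the constant, and one coefficient becomes inestimable once the model intercept is also present.

```r
library(splines)

set.seed(42)
x <- runif(300, 0, 30)
y <- rpois(300, lambda = 10)        # simulated counts, unrelated to x

b_int   <- bs(x, df = 5, intercept = TRUE)   # basis including an intercept
b_noint <- bs(x, df = 5, intercept = FALSE)  # basis without intercept

# The rows of a B-spline basis with intercept sum to 1,
# which duplicates the intercept column of the model
range(rowSums(b_int))

fit_int   <- glm(y ~ b_int,   family = quasipoisson())
fit_noint <- glm(y ~ b_noint, family = quasipoisson())

sum(is.na(coef(fit_int)))    # number of coefficients dropped as aliased
sum(is.na(coef(fit_noint)))
```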

Details

The value in type defines the basis for each space (predictor and lags). Here type, df, knots, bound and int stand for the corresponding arguments with prefix var- or lag-. The type must be one of:

"ns", "bs"
cubic splines with and without the natural constraint, respectively. Specified by knots (internal knots) and bound (boundary or external knots). See the functions ns and bs for additional information. An intercept is included if int=TRUE. The transformed variables can be centered at cenvalue.
"strata"
strata variables (dummy parameterization) determined by internal cut-off values specified in knots, which represent the lower boundaries of the right-open intervals. A dummy variable for the reference stratum (the first one by default) is included if int=TRUE, generating a full rank basis. Never centered.
"poly"
polynomial with degree equal to df-int. An intercept, corresponding to a vector of 1's (the power 0 of the polynomial), is included if int=TRUE. The transformed variables can be centered at cenvalue.
"integer"
strata variables (dummy parameterization) for each integer value, expressly created to specify an unconstrained function in the space of lags. df is set automatically to the number of integer values minus 1 plus int. A dummy variable for the reference stratum (the first one by default) is included if int=TRUE, generating a full rank basis. Never centered.
"hthr", "lthr"
high and low threshold parameterization (linear relationship above or below the threshold, respectively, flat otherwise). The threshold is chosen by knots: if more than one value is provided, the last one is used for "hthr" and the first one for "lthr". df is automatically set to 1+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. Never centered.
"thr"
double threshold parameterization (2 independent linear relationships above the second and below the first threshold, flat between them). The thresholds are chosen by knots. If only one is provided, the threshold is unique (V-model). If more than 2 are provided, the first 2 are chosen. df is automatically set to 2+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. Never centered.
"lin"
linear relationship (untransformed apart from optional centering). df is automatically set to 1+int. An intercept (corresponding to a vector of 1's) is included if int=TRUE. It can be centered at cenvalue.

The values in knots, if provided, are automatically ordered and made unique, and they determine the value of df (equal to length(knots)+int for "strata", length(knots)+int+3 for "bs" and length(knots)+int+1 for "ns" and "poly"). The value of df is fixed for the other types. If knots are not provided, varknots are placed at equally spaced quantiles, and lagknots at equally spaced values on the log scale of lags.

Some arguments may be changed automatically when their combination is not sensible, or set to NULL when not required.

For a detailed overview of the options, see: vignette("dlnmOverview", package = "dlnm")
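
A few of the rules above can be checked or sketched with base R and the splines package. The helpers strata_basis, hthr_basis and lthr_basis are hypothetical names introduced only for this illustration; in particular, the sign convention of the low-threshold term is an assumption, and none of this is the package's own code.

```r
library(splines)

x <- seq(0, 30, length.out = 200)
k <- c(10, 20)                      # two internal knots

# Spline bases: length(knots) + 1 columns for "ns", length(knots) + 3 for "bs",
# plus one more column when an intercept is requested
df_ns     <- ncol(ns(x, knots = k))
df_ns_int <- ncol(ns(x, knots = k, intercept = TRUE))
df_bs     <- ncol(bs(x, knots = k))
df_bs_int <- ncol(bs(x, knots = k, intercept = TRUE))
c(df_ns, df_ns_int, df_bs, df_bs_int)

# Hypothetical helper: dummy variables for right-open strata whose lower
# boundaries are the cut-offs in knots; the first stratum is the reference
strata_basis <- function(x, knots, int = FALSE) {
  knots <- sort(unique(knots))
  idx <- findInterval(x, knots)     # 0 means below the first cut-off
  b <- outer(idx, seq_along(knots), "==") * 1
  if (int) b <- cbind(as.numeric(idx == 0), b)
  b
}

# Hypothetical helpers: linear above the last knot / below the first knot, flat otherwise
# (the positive sign of the low-threshold term is an assumption)
hthr_basis <- function(x, knots) pmax(x - max(knots), 0)
lthr_basis <- function(x, knots) pmax(min(knots) - x, 0)

# Lag strata "0" and "1-3": one cut-off at lag 1, with intercept
lag_strata <- strata_basis(0:3, knots = 1, int = TRUE)
lag_strata
```

The last call mirrors the lag structure used for temperature in the example below (lagtype="strata", lagknots=1, maxlag=3), where the default lagint=TRUE adds the dummy for the reference stratum.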

References

Armstrong, B. Models for the relationship between ambient temperature and daily mortality. Epidemiology. 2006, 17(6):624-31.

See Also

crosspred, crossplot

Examples

# Example 1. See crosspred and crossplot for other examples

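# load the packages and prepare the dataset
# (initDB and readCity are assumed here to come from the NMMAPSlite package)
library(dlnm)
library(NMMAPSlite)
initDB()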
data <- readCity("chic", collapseAge=TRUE)
data$temp <- (data$tmpd-32)*5/9
data$pm10 <- with(data, pm10tmean+pm10mtrend)

### simple DLM for the effect of PM10 on mortality up to 15 days of lag
### space of predictor: linear effect for PM10
### space of predictor: 5df natural cubic spline for temperature
### lag function: 4th degree polynomial for PM10
### lag function: strata intervals at lag 0 and 1-3 for temperature

basis.pm <- crossbasis(data$pm10, vartype="lin", lagtype="poly",
	lagdf=5,cen=FALSE,maxlag=15)
basis.temp <- crossbasis(data$temp, vardf=5, lagtype="strata",
	lagknots=1, cenvalue=21, maxlag=3)
summary(basis.pm)
summary(basis.temp)
model <- glm(death ~  basis.pm + basis.temp, family=quasipoisson(), data)
pred.pm <- crosspred(basis.pm, model, at=0:20)

crossplot(pred.pm,"slices",var=10,
	title="Effect of a 10-unit increase in PM10 along lags")
# overall effect for a 10-unit increase in PM over 15 days of lag, with CI
pred.pm$allRRfit["10"]
cbind(pred.pm$allRRlow, pred.pm$allRRhigh)["10",]
crossplot(pred.pm, "overall", ylim=c(0.99,1.04), label="PM10",
	title="Overall effect of PM10 over 15 days of lag")

### See the vignette 'dlnmOverview' for a detailed explanation of this example
