mpr: Fitting a Multi-Parameter Regression (MPR) model.

Description

Fits a Multi-Parameter Regression (MPR) model using a Newton-type algorithm via the nlm function.

Usage

mpr(formula, data, family = "Weibull", init, iterlim = 1000, ...)

Arguments

formula

a two-sided formula object with the response on the left hand side of the ~ operator and a list of one-sided formula objects on the right hand side (one for each regression component in the mpr model). The response must be a right-censored survival object as returned by the Surv function. See “Details” for more information on the structure of the formula within the mpr function, as it differs from that of standard regression models.

data

an optional data.frame containing the variables in the model. If missing, the variables are taken from the environment from which mpr is called.

family

the name of the parametric distribution to be used in the model. See distributions for the list of distributions currently available.

init

an optional vector of initial values for the optimisation routine. If missing, default values are used. One may also set init="random" to randomly generate initial values.

iterlim

a positive integer specifying the maximum number of iterations to be performed before the optimisation procedure is terminated. This is supplied to nlm.

...

additional arguments to be passed to nlm.

Value

mpr returns an object of class “mpr”.

The function summary (i.e., summary.mpr) can be used to obtain and print a summary of the results. The generic accessor function coefficients extracts the list of regression coefficient vectors. One can also apply predict (i.e., predict.mpr) to predict various quantities from the fitted mpr model. A stepwise variable selection procedure has been implemented for mpr models - see stepmpr.

An object of class mpr is a list containing the following components:

model

a data.frame containing useful information about the fitted model with the following headings:

family: the chosen distribution.
npar: number of estimated parameters in the fitted model.
loglike: value of the log-likelihood.
aic: value of the AIC (Akaike Information Criterion).
bic: value of the BIC (Bayesian Information Criterion).
code: an integer indicating why the Newton optimisation procedure terminated (for more details on this stop-code see nlm) where, in particular, “1” means “relative gradient is close to zero”.

coefficients

a list whose elements are named vectors of coefficients (one vector per regression component).

vcov

the variance-covariance matrix for the estimates.

gradient

the values of the (negative) score functions from nlm.

ncomp

the number of regression components in the model, i.e., the number of distributional parameters in the underlying distribution.

formula

the formula supplied.

xvars

a record of the names of all variables (i.e., covariates) used in fitting.

xlevels

a record of the levels of any factors (i.e., categorical variables) used in fitting.

call

the matched call.
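As a brief illustration of how the components listed above might be used, the sketch below pulls a few of them out of a fitted object. The helpers se_from_vcov and aic_from_loglike are hypothetical names introduced here for illustration (they are not part of the mpr package), and the AIC helper uses the standard definition, which is assumed to match the value stored in the model component. The fitting step only runs if the mpr and survival packages are installed.

```r
# Hypothetical helper: standard errors as square roots of the diagonal
# of a variance-covariance matrix.
se_from_vcov <- function(v) sqrt(diag(v))

# Hypothetical helper: standard AIC definition, -2 * loglik + 2 * npar.
aic_from_loglike <- function(loglike, npar) -2 * loglike + 2 * npar

if (requireNamespace("mpr", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  library(mpr)
  veteran <- survival::veteran

  mod <- mpr(Surv(time, status) ~ list(~ trt, ~ trt), data = veteran)

  print(mod$model)               # family, npar, loglike, aic, bic, code
  print(mod$coefficients)        # one named vector per regression component
  print(se_from_vcov(mod$vcov))  # standard errors of all estimates

  # Compare the stored AIC with the standard formula (assumed to agree).
  print(c(stored = mod$model$aic,
          recomputed = aic_from_loglike(mod$model$loglike, mod$model$npar)))
}
```

In practice summary(mod) is the more convenient route; the direct access shown here is mainly useful when the values are needed programmatically.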

Details

Multi-Parameter Regression (MPR) models are generated by allowing multiple distributional parameters to depend on covariates, for example, both the scale and shape parameters. This is in contrast to the more typical approach where covariates enter a model only through one distributional parameter. As these standard models have a single regression component, we may refer to them as Single Parameter Regression (SPR) models and, clearly, they are special cases of MPR models. The parameter through which covariates enter such SPR models may be referred to as the “interest” parameter since it generally has some specific subject-matter importance. However, this standard approach neglects other parameters which may also be important in describing the phenomenon at hand. The MPR approach generalises the standard SPR approach by viewing all distributional parameters as interest parameters in which covariate effects can be investigated.

In the context of survival analysis (currently the focus of the mpr package), the Weibull model is one of the most popular parametric models. Its hazard function is given by $$ h(t) = \lambda \gamma t^{\gamma - 1} $$ where $\lambda > 0$, the scale parameter, controls the overall magnitude of $h(t)$ and $\gamma > 0$, the shape parameter, controls its time evolution. In the standard SPR Weibull model, $\lambda$ depends on covariates via $\log \lambda = x^T \beta$ leading to a proportional hazards (PH) model. The MPR model generalises this by allowing both parameters to depend on covariates as follows $$\log \lambda = x^T \beta$$ $$\log \gamma = z^T \alpha$$ where $x$ and $z$ are the scale and shape covariate vectors (which may or may not contain covariates in common) and $\beta$ and $\alpha$ are the corresponding regression coefficients.
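To make the difference between the SPR and MPR Weibull models concrete, the following base R sketch evaluates the hazard above for a single binary covariate. The coefficient values and the function names weibull_hazard and mpr_hazard are illustrative assumptions, not estimates or functions from the mpr package. It shows that the hazard ratio is constant over time when only the scale depends on the covariate, but varies with time once the shape depends on it too.

```r
# Weibull hazard: h(t) = lambda * gamma * t^(gamma - 1)
weibull_hazard <- function(t, lambda, gamma) {
  lambda * gamma * t^(gamma - 1)
}

# Hazard with log-links on both parameters for one covariate x.
# beta and alpha are (intercept, slope) vectors for scale and shape.
mpr_hazard <- function(t, x, beta, alpha) {
  lambda <- exp(beta[1] + beta[2] * x)
  gamma  <- exp(alpha[1] + alpha[2] * x)
  weibull_hazard(t, lambda, gamma)
}

# Illustrative (made-up) coefficient values.
beta  <- c(-2.0, 0.5)
alpha <- c(0.1, -0.3)

t_grid <- c(0.5, 1, 2, 5)

# MPR: the covariate acts on both scale and shape.
hr_mpr <- mpr_hazard(t_grid, 1, beta, alpha) /
  mpr_hazard(t_grid, 0, beta, alpha)

# SPR (PH) special case: the shape slope is set to zero.
alpha_ph <- c(alpha[1], 0)
hr_ph <- mpr_hazard(t_grid, 1, beta, alpha_ph) /
  mpr_hazard(t_grid, 0, beta, alpha_ph)

print(rbind(t = t_grid, hr_mpr = hr_mpr, hr_ph = hr_ph))
```

In the PH case every entry of hr_ph equals exp(beta[2]), whereas the entries of hr_mpr change with t.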

Note that the log-link is used above to ensure positivity of the parameters. More generally, we may have $$g_1(\lambda) = x^T \beta$$ $$g_2(\gamma) = z^T \alpha$$ where $g_1(\cdot)$ and $g_2(\cdot)$ are appropriate link functions. The mpr function does not allow the user to alter these link functions but, rather, uses the following default link functions: log-link (for parameters which must be positive) and identity-link (for parameters which are unconstrained). Although the two-parameter Weibull distribution is discussed here (due to its popularity), other distributions may have additional shape parameters, for example, $$g_3(\rho) = w^T \tau$$ where $w$ and $\tau$ are the vectors of covariates and regression coefficients for this additional shape component. See distributions for further details on the distributions currently available.
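The fitting idea described above can be sketched in base R for the two-parameter Weibull case. This is a minimal illustration under stated assumptions, not the package's internal code: the data are simulated, the names negloglik and init are chosen here for illustration, and the right-censored log-likelihood is written in the usual form (the sum over observations of status times log h(t) minus the cumulative hazard, with cumulative hazard $\lambda t^{\gamma}$). Both parameters use a log-link and the negative log-likelihood is minimised with nlm.

```r
set.seed(1)

# Simulated data (illustrative only): one binary covariate x.
n <- 200
x <- rbinom(n, 1, 0.5)
true_lambda <- exp(-1 + 0.5 * x)
true_gamma  <- exp(0.2 - 0.4 * x)

# Inverse-transform sampling from S(t) = exp(-lambda * t^gamma).
t_event <- (-log(runif(n)) / true_lambda)^(1 / true_gamma)
t_cens  <- runif(n, 0, 10)               # independent censoring times
time    <- pmin(t_event, t_cens)
status  <- as.numeric(t_event <= t_cens) # 1 = event, 0 = censored

# Negative log-likelihood; par = (beta0, beta1, alpha0, alpha1).
negloglik <- function(par) {
  log_lambda <- par[1] + par[2] * x      # log-link for the scale
  log_gamma  <- par[3] + par[4] * x      # log-link for the shape
  gamma <- exp(log_gamma)
  log_h <- log_lambda + log_gamma + (gamma - 1) * log(time)
  H     <- exp(log_lambda) * time^gamma  # cumulative hazard
  -sum(status * log_h - H)
}

init <- rep(0, 4)
fit  <- nlm(negloglik, p = init, hessian = TRUE, iterlim = 1000)

fit$estimate  # estimated (beta0, beta1, alpha0, alpha1)
fit$code      # nlm stop-code; see ?nlm for its meaning
```

Working on the log scale keeps $\lambda$ and $\gamma$ positive for any real-valued coefficients, so the optimiser can run unconstrained.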

The structure of the formula within the mpr function is, for example, Surv(time, status) ~ list(~ x1 + x2, ~ x1) which clearly generalises the typical formula used in standard models (i.e., those with only one regression component) in the sense that the right hand side is a list of one-sided formula objects. Note the requirement that the ~ operator precedes each element within the list. Specifically, the example shown here represents the case where the covariates x1 and x2 appear in the first regression component, $\lambda$, and the covariate x1 appears in the second regression component, $\gamma$. If there were a third regression component, $\rho$, then there would be an additional component in the list, for example, Surv(time, status) ~ list(~ x1 + x2, ~ x1, ~ x1). The mpr function also accepts more typical two-sided formula objects, such as Surv(time, status) ~ x1 + x2, which imply that the terms on the right hand side appear in each of the regression components.
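The list-of-formulas structure can be inspected with base R tools, which may help when checking which covariates enter which component. The sketch below is an assumption-laden illustration of the structure only; it is not how mpr parses the formula internally, and the object names are chosen here for illustration. Because a formula is stored unevaluated, neither Surv nor the covariates need to exist for this to run.

```r
# The example formula: x1 and x2 in the first component, x1 in the second.
f <- Surv(time, status) ~ list(~ x1 + x2, ~ x1)

rhs <- f[[3]]                   # the unevaluated call list(~ x1 + x2, ~ x1)
components <- as.list(rhs)[-1]  # drop the leading function name `list`

# Covariate names appearing in each regression component.
component_vars <- lapply(components, all.vars)

length(components)  # number of regression components supplied
component_vars
```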

Examples


# NOT RUN {
# load the packages providing mpr() and Surv()
library(mpr)
library(survival)

# Veterans' administration lung cancer data
data(veteran, package="survival")
head(veteran)

# treatment variable, "trt", in scale (lambda) and shape (gamma)
# components of a Weibull model
mpr(Surv(time, status) ~ list(~ trt, ~ trt), data=veteran, family="Weibull")

# same as first model
mpr(Surv(time, status) ~ trt, data=veteran, family="Weibull")

# now with "celltype" also appearing in the scale
mpr(Surv(time, status) ~ list(~ trt + celltype, ~ trt), data=veteran,
    family="Weibull")

# trt in scale only (this is a PH Weibull model)
mpr(Surv(time, status) ~ list(~ trt, ~ 1), data=veteran, family="Weibull")

# trt in all three components (scale and two shape parameters) of a Burr model
mpr(Surv(time, status) ~ list(~ trt, ~ trt, ~ trt), data=veteran,
    family="Burr")

# use of summary
mod1 <- mpr(Surv(time, status) ~ list(~ trt, ~ trt), data=veteran)
summary(mod1)
# }
