tram: Stratified Linear Transformation Models

Description

Likelihood inference for stratified linear transformation models, including linear shift-scale transformation models.

Usage

tram(formula, data, subset, weights, offset, cluster, na.action = na.omit, 
     distribution = c("Normal", "Logistic", "MinExtrVal", "MaxExtrVal",
                      "Exponential", "Cauchy", "Laplace"), 
     frailty = c("None", "Gamma", "InvGauss", "PositiveStable"),
     transformation = c("discrete", "linear", "logarithmic", "smooth"), 
     LRtest = TRUE, prob = c(0.1, 0.9), support = NULL, 
     bounds = NULL, add = c(0, 0), order = 6, 
     negative = TRUE, remove_intercept = TRUE, 
     scale = TRUE, scale_shift = FALSE, extrapolate = FALSE, 
     log_first = FALSE, sparse_nlevels = Inf,
     model_only = FALSE, constraints = NULL, ...)
tram_data(formula, data, subset, weights, offset, cluster, na.action = na.omit)

Value

An object of class tram inheriting from mlt.

Arguments

formula: an object of class "formula": a symbolic description of the model structure to be fitted. The details of model specification are given under Details and in the package vignette.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula).
subset: an optional vector specifying a subset of observations to be used in the fitting process.
weights: an optional vector of case weights to be used in the fitting process. Should be NULL or a numeric vector. If present, the weighted log-likelihood is maximised.
offset: this can be used to specify an _a priori_ known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases.
cluster: optional factor with a cluster ID employed for computing clustered covariances.
na.action: a function which indicates what should happen when the data contain NAs. The default is set to na.omit.
distribution: character specifying how the transformation function is mapped into probabilities. Available choices include the cumulative distribution functions of the standard normal, the standard logistic and the standard minimum extreme value distribution.
frailty: character specifying the addition of a frailty term, that is, a random component added to the linear predictor of the model, with specific distribution (Gamma, inverse Gaussian, positive stable).
transformation: character specifying the complexity of the response transformation. For discrete responses, one parameter is assigned to each level (except the last one); for continuous responses, linear, log-linear and smooth (parameterised as a Bernstein polynomial) functions are implemented.
LRtest: logical specifying if a likelihood-ratio test for the null of all coefficients in the linear predictor being zero shall be performed.
prob: two probabilities giving quantiles of the response defining the support of a smooth Bernstein polynomial (if transformation = "smooth").
support: a vector of two elements; the support of a smooth Bernstein polynomial (if transformation = "smooth").
bounds: an interval defining the bounds of a real sample space.
add: add these values to the support before generating a grid via mkgrid.
order: integer >= 1 defining the order of the Bernstein polynomial (if transformation = "smooth").
negative: logical defining the sign of the linear predictor.
remove_intercept: logical defining if the intercept shall be removed from the linear shift predictor in favour of a (typically implicit) intercept in the baseline transformation. If FALSE, the linear shift predictor has an intercept (unless -1 is added to the formula) but the baseline transformation is centered. For linear transformation models, this does not change the in-sample log-likelihood. For shift-scale transformation models, using FALSE ensures that centering of variables in the linear shift predictor does not affect the corresponding estimates and standard errors. Note that linear scale predictors are always fitted without intercept.
scale: logical defining if variables in the linear predictor shall be scaled. Scaling is used internally for model estimation; rescaled coefficients are reported in model output.
scale_shift: a logical choosing between two different model types in the presence of a scaling term, see ctm.
extrapolate: logical defining the behaviour of the Bernstein transformation function outside support. The default FALSE is to extrapolate linearly without requiring the second derivative of the transformation function to be zero at support. If TRUE, this additional constraint is respected.
sparse_nlevels: integer; use a sparse model matrix if the number of levels of an ordered factor is at least as large as sparse_nlevels.
log_first: logical; if TRUE, a Bernstein polynomial is defined on the log-scale.
model_only: logical, if TRUE the unfitted model is returned.
constraints: additional constraints on regression coefficients in the linear predictor of the form lhs %*% coef(object) >= rhs, where lhs and rhs can be specified as a character (as in glht) or by a matrix lhs (assuming rhs = 0), or as a list containing the two elements lhs and rhs.
...: additional arguments.

Details

The model formula is of the form y | s ~ x | z where y is an at least ordered response variable, s are the variables defining strata and x defines the linear predictor. Optionally, z defines a scaling term (see ctm). y ~ x defines a model without strata (but with a response-varying intercept function) and y | s ~ 0 sets up response-varying coefficients for all variables in s.
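The formula variants described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes that packages tram and mlbench are installed and that the installed tram version supports scaling terms; the choice of variables (rm, lstat, chas) and the object names are purely illustrative.

```r
library("tram")
data("BostonHousing2", package = "mlbench")

## y ~ x: linear shift predictor, no strata
m_shift <- BoxCox(cmedv ~ rm + lstat, data = BostonHousing2)

## y | s ~ x: stratum-specific baseline transformation (strata given by chas)
m_strat <- BoxCox(cmedv | chas ~ rm + lstat, data = BostonHousing2)

## y ~ x | z: shift predictor in rm and lstat, scaling term in lstat
## (assumption: a tram version with support for scaling terms)
m_scale <- BoxCox(cmedv ~ rm + lstat | lstat, data = BostonHousing2)

## compare the in-sample log-likelihoods of the three specifications
logLik(m_shift)
logLik(m_strat)
logLik(m_scale)
```

Comparing the log-likelihoods gives a rough impression of how much the stratification or the scaling term adds relative to the plain shift model.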

The two functions tram and tram_data are not intended to be called directly by users. Instead, functions Coxph (Cox proportional hazards models), Survreg (parametric survival models), Polr (models for ordered categorical responses), Lm (normal linear models), BoxCox (non-normal linear models) or Colr (continuous outcome logistic regression) allow direct access to the corresponding models.
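As a hedged sketch of this indirect use, the code below fits two models through the user-level interfaces rather than through tram itself. It assumes package tram is installed and uses the cars data shipped with R; the object names and the choice order = 3 are illustrative only. Additional arguments such as order are assumed to be passed on to tram via the dots argument.

```r
library("tram")

## normal linear model, fitted by maximum likelihood via tram
m_lm <- Lm(dist ~ speed, data = cars)

## same linear predictor, smooth (Bernstein) response transformation;
## 'order' is assumed to be forwarded to tram through '...'
m_bc <- BoxCox(dist ~ speed, data = cars, order = 3)

## both objects carry the tram class and inherit from mlt
class(m_lm)
class(m_bc)

## in-sample log-likelihoods of the two models
logLik(m_lm)
logLik(m_bc)
```

The more flexible transformation in the BoxCox fit comes at the price of additional parameters, so the two log-likelihoods are not directly comparable without accounting for model complexity.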

The model class and the specific models implemented in tram are explained in the vignette of package tram. The underlying theory of most likely transformations is presented in Hothorn et al. (2018); computational and modelling aspects in more complex situations are discussed by Hothorn (2020).

References

Torsten Hothorn, Lisa Moest, Peter Buehlmann (2018), Most Likely Transformations, Scandinavian Journal of Statistics, 45(1), 110-134, doi:10.1111/sjos.12291.

Torsten Hothorn (2020), Most Likely Transformations: The mlt Package, Journal of Statistical Software, 92(1), doi:10.18637/jss.v092.i01.

Sandra Siegfried, Lucas Kook, Torsten Hothorn (2023), Distribution-Free Location-Scale Regression, The American Statistician, doi:10.1080/00031305.2023.2203177.

Examples

  ### load the package providing BoxCox
  library("tram")
  data("BostonHousing2", package = "mlbench")

  ### unconstrained regression coefficients
  ### BoxCox calls tram internally
  m1 <- BoxCox(cmedv ~ chas + crim + zn + indus + nox + 
               rm + age + dis + rad + tax + ptratio + b + lstat, 
               data = BostonHousing2)

  ### now with two constraints on regression coefficients
  m2 <- BoxCox(cmedv ~ chas + crim + zn + indus + nox + 
               rm + age + dis + rad + tax + ptratio + b + lstat, 
               data = BostonHousing2, 
               constraints = c("crim >= 0", "chas1 + rm >= 1.5"))
  coef(m1)
  coef(m2)

  ### the same two constraints in matrix form: K %*% coef >= c(0, 1.5)
  K <- matrix(0, nrow = 2, ncol = length(coef(m2)))
  colnames(K) <- names(coef(m2))
  K[1, "crim"] <- 1
  K[2, c("chas1", "rm")] <- 1
  m3 <- BoxCox(cmedv ~ chas + crim + zn + indus + nox + 
               rm + age + dis + rad + tax + ptratio + b + lstat, 
               data = BostonHousing2, 
               constraints = list(K, c(0, 1.5)))
  all.equal(coef(m2), coef(m3))
