regress: General Regression for an Arbitrary Functional

Description

Produces point estimates, interval estimates, and p values for an arbitrary functional (mean, geometric mean, proportion, odds, hazard) of a variable of class integer or numeric when regressed on an arbitrary number of covariates. Multiple-partial F-tests can be specified using the U function.

Usage

regress(
  fnctl,
  formula,
  data,
  intercept = TRUE,
  weights = rep(1, nrow(data.frame(data))),
  subset = rep(TRUE, nrow(data.frame(data))),
  robustSE = TRUE,
  conf.level = 0.95,
  exponentiate = fnctl != "mean",
  replaceZeroes,
  useFdstn = TRUE,
  suppress = FALSE,
  na.action,
  method = "qr",
  qr = TRUE,
  singular.ok = TRUE,
  contrasts = NULL,
  init = NULL,
  ties = "efron",
  offset,
  control = list(...),
  ...
)

Value

An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.

Arguments

fnctl: a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include "mean", "geometric mean", "odds", "rate", "hazard".
formula: an object of class formula as might be passed to lm, glm, or coxph. Functions of variables, specified using dummy or polynomial, may also be included in formula.
data: a data frame, matrix, or other data structure with matching names to those entered in formula.
intercept: a logical value indicating whether an intercept is included in the model. Default value is TRUE for all functionals. The intercept may also be removed by including "-1" in formula. If "-1" is present in formula but intercept = TRUE is specified, the model will be fit without an intercept. Note that when fnctl = "hazard", the intercept is always set to FALSE because Cox proportional hazards regression models do not explicitly estimate an intercept.
weights: vector indicating optional weights for weighted regression.
subset: vector indicating a subset to be used for all inference.
robustSE: a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.
conf.level: a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.
exponentiate: a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.
replaceZeroes: if not FALSE, this indicates a value to be used in place of zeroes when computing a geometric mean. If TRUE, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used. Defaults to TRUE when fnctl = "geometric mean". This parameter is always FALSE for all other values of fnctl.
useFdstn: a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as it would be in a linear regression model.
suppress: if TRUE, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned - one with the original unexponentiated coefficients, and one with the exponentiated coefficients.
na.action, qr, singular.ok, offset, contrasts, control: optional arguments that are passed through to lm or glm.
method: the method to be used in fitting the model. The default value for fnctl = "mean" and fnctl = "geometric mean" is "qr", and the default value for fnctl = "odds" and fnctl = "rate" is "glm.fit". This argument is passed into the lm() or glm() function, respectively. You may optionally specify method = "model.frame", which returns the model frame and does no fitting.
init: a numeric vector of initial values for the regression parameters for the hazard regression. Default initial value is zero for all variables.
ties: a character string describing the method for breaking ties in hazard regression. Only "efron", "breslow", or "exact" is accepted. See the documentation for this argument in the survival::coxph function for more details. Defaults to "efron".
...: additional arguments to be passed to the lm function call.
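
The replaceZeroes = TRUE rule described above (substitute one-half the lowest nonzero value) can be sketched in base R. This is only an illustration of the documented rule, not the package's internal code; the helper name replace_zeroes_sketch is hypothetical.

```r
# Hypothetical helper illustrating the documented replaceZeroes rule;
# it is not part of the package.
replace_zeroes_sketch <- function(y, value = NULL) {
  if (is.null(value)) {
    # Default rule: one-half the lowest nonzero value
    value <- min(y[y != 0], na.rm = TRUE) / 2
  }
  # Replace exact zeroes only; missing values are left untouched
  y[!is.na(y) & y == 0] <- value
  y
}

y <- c(0, 2, 4, 0, 8)
y_fixed <- replace_zeroes_sketch(y)  # zeroes are replaced by half of 2
log(y_fixed)                         # all finite, so a geometric mean is defined
```

Passing a number as value mimics supplying a numeric replaceZeroes instead of TRUE.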

Details

Regression models include linear regression (for the "mean" functional), logistic regression with logit link (for the "odds" functional), Poisson regression with log link (for the "rate" functional), linear regression of a log-transformed outcome (for the "geometric mean" functional), and Cox proportional hazards regression (for the "hazard" functional).
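
To make the "geometric mean" case concrete, the sketch below fits a linear regression of a log-transformed outcome in base R and exponentiates the results. It uses the built-in mtcars data as a stand-in and classical (non-robust) standard errors, so it illustrates the model being described rather than reproducing regress output.

```r
# Built-in mtcars data; mpg is strictly positive, so no zero replacement is needed.
fit_log <- lm(log(mpg) ~ wt, data = mtcars)

# Exponentiated coefficients: the intercept is the fitted geometric mean at wt = 0,
# and the slope is the ratio of geometric means per one-unit difference in wt.
gm_ratios <- exp(coef(fit_log))
gm_ratios

# Exponentiated confidence limits, based on classical standard errors
exp(confint(fit_log, level = 0.95))
```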

Currently, for the hazard functional, only `coxph` syntax is supported; in other words, using the `dummy`, `polynomial`, and `U` functions will result in an error when `fnctl = "hazard"`.

Note that the only possible link function in `regress` with `fnctl = "odds"` is the logit link. Similarly, the only possible link function in `regress` with `fnctl = "rate"` is the log link.
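
For reference, the two link functions named above correspond to the following base R glm calls. This is a sketch on the built-in mtcars data, assuming (as the method argument description suggests) that the underlying fit is an ordinary glm; it does not use regress itself.

```r
# Logistic regression with logit link: am is a 0/1 outcome in mtcars
fit_odds <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)
exp(coef(fit_odds))  # odds at wt = 0 (intercept) and odds ratio per unit of wt

# Poisson regression with log link: carb is a count in mtcars
fit_rate <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)
exp(coef(fit_rate))  # rate at wt = 0 (intercept) and rate ratio per unit of wt
```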

Objects created using the U function can also be passed in. If the U call involves a partial formula of the form ~ var1 + var2, then regress will return a multiple-partial F-test involving var1 and var2. If an F-statistic will already be calculated regardless of the U specification, then any naming convention specified via name ~ var1 will be ignored. The multiple partial tests must be the last terms specified in the model (i.e. no other predictors can follow them).
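
A multiple-partial F-test compares a full model with a reduced model that omits the terms being tested jointly. The base R sketch below shows the classical version of that comparison on the built-in mtcars data. Because regress defaults to robustSE = TRUE, its test statistic will generally differ from this one; the sketch only illustrates the idea.

```r
# Reduced model omits the terms under test; full model includes them.
fit_reduced <- lm(mpg ~ wt, data = mtcars)
fit_full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Classical multiple-partial F-test of hp and qsec jointly (2 numerator df)
f_test <- anova(fit_reduced, fit_full)
f_test
```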

Examples

# Loading dataset
data(mri)

# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)

# Linear regression of atrophy on sex and height and their interaction, 
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)

# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)

# Cox regression of survival on age
library(survival)
regress("hazard", Surv(obstime, death) ~ age, data = mri)