Produces point estimates, interval estimates, and p values for an arbitrary
functional (mean, geometric mean, proportion, odds, hazard) of a
variable of class integer or numeric when
regressed on an arbitrary number of covariates. Multiple-partial F-tests can
be specified using the U function.
regress(
fnctl,
formula,
data,
intercept = TRUE,
weights = rep(1, nrow(data.frame(data))),
subset = rep(TRUE, nrow(data.frame(data))),
robustSE = TRUE,
conf.level = 0.95,
exponentiate = fnctl != "mean",
replaceZeroes,
useFdstn = TRUE,
suppress = FALSE,
na.action,
method = "qr",
qr = TRUE,
singular.ok = TRUE,
contrasts = NULL,
init = NULL,
ties = "efron",
offset,
control = list(...),
...
)
An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.
a character string indicating
the functional (summary measure of the distribution) for which inference is
desired. Choices include "mean", "geometric mean",
"odds", "rate", "hazard".
an object of class formula as might be passed to
lm, glm, or coxph. Functions of variables, specified using dummy
or polynomial, may also be included in formula.
a data frame, matrix, or other data structure with matching
names to those entered in formula.
a logical value
indicating whether an intercept exists or not. Default value is TRUE for all
functionals. Intercept may also be removed if a "-1" is present in formula. If "-1"
is present in formula but intercept = TRUE is specified, the model will fit
without an intercept. Note that when fnctl = "hazard", the intercept is always set to
FALSE because Cox proportional hazards regression models do not explicitly estimate
an intercept.
vector indicating optional weights for weighted regression.
vector indicating a subset to be used for all inference.
a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.
a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.
a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.
if not
FALSE, this indicates a value to be used in place of zeroes when
computing a geometric mean. If TRUE, a value equal to one-half the
lowest nonzero value is used. If a numeric value is supplied, that value is
used. Defaults to TRUE when fnctl = "geometric mean". This parameter
is always FALSE for all other values of fnctl.
a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as it would be in a linear regression model.
if TRUE, and a model which requires exponentiation
(for instance, regression on the geometric mean) is computed, then a table
with only the exponentiated coefficients and confidence interval is
returned. Otherwise, two tables are returned - one with the original
unexponentiated coefficients, and one with the exponentiated coefficients.
optional arguments that are passed through to lm or
glm.
the method to be used in fitting the model. The default value for
fnctl = "mean" and fnctl = "geometric mean" is "qr", and the default value for
fnctl = "odds" and fnctl = "rate" is "glm.fit". This argument is passed into the
lm() or glm() function, respectively. You may optionally specify method = "model.frame", which
returns the model frame and does no fitting.
a numeric vector of initial values for the regression parameters for the hazard regression. Default initial value is zero for all variables.
a character string describing the method for breaking ties in hazard regression.
Only "efron", "breslow", or "exact" is accepted.
See the documentation for this argument in the survival::coxph function for more details.
Defaults to "efron".
additional arguments to be passed to the lm function call
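The zero-replacement rule described for replaceZeroes can be sketched in base R. This is only an illustration of the documented default (one-half the lowest nonzero value), not the package's internal code; the vectors y and x are hypothetical data invented for the example.

```r
# Hypothetical outcome containing zeroes (illustrative data only)
y <- c(0, 2, 4, 0, 8, 1)
x <- c(1, 2, 3, 4, 5, 6)

# Documented default: replace zeroes with one-half the lowest nonzero value
replacement <- min(y[y > 0]) / 2
y_filled <- ifelse(y == 0, replacement, y)

# Geometric mean regression is a linear model on the log-transformed outcome
fit <- lm(log(y_filled) ~ x)

# Exponentiated slope: multiplicative change in the geometric mean per unit of x
ratio_per_unit_x <- exp(coef(fit)[["x"]])
print(ratio_per_unit_x)
```

Without the replacement step, log(0) would give -Inf and the linear model could not be fitted.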
Regression models include linear regression (for the "mean" functional), logistic regression with logit link (for the "odds" functional), Poisson regression with log link (for the "rate" functional), linear regression of a log-transformed outcome (for the "geometric mean" functional), and Cox proportional hazards regression (for the "hazard" functional).
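As a rough guide to the model families listed above, the following sketch fits the base R analogue of each functional. It uses the built-in mtcars data purely for illustration (an assumption; it is not the mri data), and it shows only the underlying model type, not what regress does internally with standard errors or output formatting.

```r
# Base R analogues of the models named above, fitted to the built-in mtcars
# data (chosen for illustration only).
fits <- list(
  "mean"           = lm(mpg ~ wt, data = mtcars),
  "geometric mean" = lm(log(mpg) ~ wt, data = mtcars),
  "odds"           = glm(am ~ wt, family = binomial(link = "logit"), data = mtcars),
  "rate"           = glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)
)
# The "hazard" functional corresponds to survival::coxph(), which is left out
# here so that the sketch needs no packages beyond base R.

# Class of each fitted object: "lm" for the linear models, "glm" for the others
model_class <- sapply(fits, function(f) class(f)[1])
print(model_class)
```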
Currently, for the hazard functional, only `coxph` syntax is supported; in other words, using the `dummy`, `polynomial`,
or `U` functions will result in an error when `fnctl = "hazard"`.
Note that the only possible link function in `regress` with `fnctl = "odds"` is the logit link. Similarly, the only possible link function in `regress` with `fnctl = "rate"` is the log link.
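Because the logit and log links are fixed, exponentiated coefficients read as odds ratios and rate ratios respectively. The sketch below shows that back-transformation in base R on the built-in mtcars data (an illustrative assumption). It uses model-based standard errors and a normal-quantile Wald interval, so the interval will generally differ from what regress reports with robustSE = TRUE or useFdstn = TRUE.

```r
# Logit link: exponentiated slope is an odds ratio
odds_fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)
odds_ratio <- exp(coef(odds_fit)[["wt"]])

# Log link: exponentiated slope is a rate ratio
rate_fit <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)
rate_ratio <- exp(coef(rate_fit)[["wt"]])

# Wald interval built on the link scale, then exponentiated
conf_level <- 0.95
est <- coef(odds_fit)[["wt"]]
se  <- sqrt(diag(vcov(odds_fit)))[["wt"]]   # model-based, not robust
z   <- qnorm(1 - (1 - conf_level) / 2)
odds_ratio_ci <- exp(est + c(-1, 1) * z * se)

print(c(estimate = odds_ratio, lower = odds_ratio_ci[1], upper = odds_ratio_ci[2]))
```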
Objects created using the
U function can also be passed in. If the
U call involves a partial formula of the form
~ var1 + var2, then regress will return a multiple-partial
F-test involving var1 and var2. If an F-statistic will already be
calculated regardless of the U specification,
then any naming convention specified via name ~ var1 will be ignored.
The multiple partial tests must be the last terms specified in the model (i.e. no other predictors can
follow them).
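The idea behind a multiple-partial F-test can be sketched in base R by comparing nested models. This is the classical, model-based version, shown on the built-in mtcars data (an illustrative assumption). It is not the U function itself, and its numbers need not match those from regress, which by default uses robust standard errors.

```r
# Reduced model omits the terms being tested; full model includes them
reduced <- lm(mpg ~ wt, data = mtcars)
full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Joint test that the coefficients of hp and qsec are both zero
partial_f <- anova(reduced, full)

f_stat  <- partial_f$F[2]
p_value <- partial_f$`Pr(>F)`[2]
num_df  <- partial_f$Df[2]   # number of coefficients tested jointly

print(partial_f)
```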
# Loading dataset
data(mri)
# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)
# Linear regression of atrophy on sex and height and their interaction,
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)
# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)
# Cox regression of survival on age
library(survival)
regress("hazard", Surv(obstime, death)~age, data=mri)