Produces point estimates, interval estimates, and p values for an arbitrary functional (mean, geometric mean, proportion, odds, hazard) of a variable of class integer or numeric when regressed on an arbitrary number of covariates. Multiple partial F-tests can be specified using the U function.
regress(
fnctl,
formula,
data,
intercept = TRUE,
weights = rep(1, nrow(data.frame(data))),
subset = rep(TRUE, nrow(data.frame(data))),
robustSE = TRUE,
conf.level = 0.95,
exponentiate = fnctl != "mean",
replaceZeroes,
useFdstn = TRUE,
suppress = FALSE,
na.action,
method = "qr",
qr = TRUE,
singular.ok = TRUE,
contrasts = NULL,
init = NULL,
ties = "efron",
offset,
control = list(...),
...
)
An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.
fnctl: a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include "mean", "geometric mean", "odds", "rate", and "hazard".
formula: an object of class formula as might be passed to lm, glm, or coxph. Functions of variables, specified using dummy or polynomial, may also be included in formula.
data: a data frame, matrix, or other data structure with names matching those entered in formula.
intercept: a logical value indicating whether an intercept is included in the model. The default is TRUE for all functionals. The intercept is also removed if "-1" is present in formula; if "-1" is present in formula but intercept = TRUE is specified, the model is fit without an intercept. Note that when fnctl = "hazard", intercept is always set to FALSE because Cox proportional hazards regression models do not explicitly estimate an intercept.
weights: vector indicating optional weights for weighted regression.
subset: vector indicating a subset to be used for all inference.
robustSE: a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.
conf.level: a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.
exponentiate: a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.
replaceZeroes: if not FALSE, this indicates a value to be used in place of zeroes when computing a geometric mean. If TRUE, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used. Defaults to TRUE when fnctl = "geometric mean". This parameter is always FALSE for all other values of fnctl.
useFdstn: a logical indicator that the F distribution should be used for test statistics instead of the chi-squared distribution, even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as they would be in a linear regression model.
suppress: if TRUE, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned: one with the original unexponentiated coefficients, and one with the exponentiated coefficients.
na.action, qr, singular.ok, offset, contrasts, control: optional arguments that are passed to the functionality of lm or glm.
method: the method to be used in fitting the model. The default value for fnctl = "mean" and fnctl = "geometric mean" is "qr", and the default value for fnctl = "odds" and fnctl = "rate" is "glm.fit". This argument is passed into the lm() or glm() function, respectively. You may optionally specify method = "model.frame", which returns the model frame and does no fitting.
init: a numeric vector of initial values for the regression parameters for the hazard regression. The default initial value is zero for all variables.
ties: a character string describing the method for breaking ties in hazard regression. Only "efron", "breslow", or "exact" is accepted. See the documentation for this argument in the survival::coxph function for more details. Defaults to "efron".
...: additional arguments to be passed to the lm function call.
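To make the robustSE and useFdstn descriptions above concrete, the following base R sketch computes a Huber-White sandwich variance by hand and compares F-based and chi-squared-based p values for the same Wald statistics. It is only an illustration: the built-in mtcars data and the simplest (HC0) form of the sandwich estimator are assumptions made here, and regress may apply a different finite-sample correction internally.

```r
# Illustrative only: HC0 sandwich variance for an ordinary linear model.
# The mtcars data and the HC0 variant are assumptions made for this sketch.
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)          # n x p design matrix
e <- residuals(fit)             # length-n residual vector

bread <- solve(crossprod(X))    # (X'X)^{-1}
meat  <- crossprod(X * e)       # X' diag(e^2) X: each row of X scaled by its residual
vcov_robust <- bread %*% meat %*% bread
robust_se <- sqrt(diag(vcov_robust))

# Wald statistics, one numerator degree of freedom each
wald <- (coef(fit) / robust_se)^2

# F reference distribution: denominator df = sample size minus number of parameters
df_denom <- nrow(X) - ncol(X)
p_f <- pf(wald, df1 = 1, df2 = df_denom, lower.tail = FALSE)

# Chi-squared reference distribution for the same statistics
p_chisq <- pchisq(wald, df = 1, lower.tail = FALSE)

cbind(robust_se, p_f, p_chisq)
```

Because the F distribution has heavier tails than the chi-squared distribution, p_f is never smaller than p_chisq for the same statistic, and the two converge as the sample size grows.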
Regression models include linear regression (for the "mean" functional), logistic regression with logit link (for the "odds" functional), Poisson regression with log link (for the "rate" functional), linear regression of a log-transformed outcome (for the "geometric mean" functional), and Cox proportional hazards regression (for the "hazard" functional).
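As a rough illustration of the "geometric mean" case, the base R sketch below replaces zeroes with one-half the lowest nonzero value, fits a linear regression to the log-transformed outcome, and exponentiates the results. The toy vectors y and x are hypothetical, and the code mirrors the behavior described in this documentation rather than the internals of regress.

```r
# Hypothetical toy data: a nonnegative outcome containing one exact zero
y <- c(0, 3, 4, 9, 15, 40)
x <- c(1, 2, 3, 4, 5, 6)

# Replace zeroes with one-half the lowest nonzero value so log() is defined
replacement <- min(y[y > 0]) / 2
y_adj <- ifelse(y == 0, replacement, y)

# Linear regression of the log-transformed outcome
fit <- lm(log(y_adj) ~ x)

# Exponentiated coefficients: the slope is a ratio of geometric means
# per one-unit difference in x
ratio <- exp(coef(fit))
ci <- exp(confint(fit, level = 0.95))

cbind(estimate = ratio, ci)
```

Exponentiating the confidence limits rather than the standard errors keeps the interval on the ratio scale and guarantees positive bounds.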
Currently, for the hazard functional, only `coxph` syntax is supported; in other words, using the `dummy`, `polynomial`, and `U` functions will result in an error when `fnctl = "hazard"`.
Note that the only possible link function in `regress` with `fnctl = "odds"` is the logit link. Similarly, the only possible link function in `regress` with `fnctl = "rate"` is the log link.
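For readers more familiar with glm, the two link choices named above correspond to the following base R fits. This is a sketch for orientation only; the mtcars variables are stand-ins chosen here, and the standard errors shown by glm are model-based rather than the robust ones that regress reports by default.

```r
# Illustrative only: mtcars is used as a stand-in dataset.

# "odds" functional: logistic regression with the logit link
fit_odds <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
odds_ratio <- exp(coef(fit_odds))   # exponentiated slope is an odds ratio

# "rate" functional: Poisson regression with the log link
fit_rate <- glm(carb ~ wt, data = mtcars, family = poisson(link = "log"))
rate_ratio <- exp(coef(fit_rate))   # exponentiated slope is a rate ratio

odds_ratio
rate_ratio
```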
Objects created using the U function can also be passed in. If the U call involves a partial formula of the form ~ var1 + var2, then regress will return a multiple-partial F-test involving var1 and var2. If an F-statistic will already be calculated regardless of the U specification, then any naming convention specified via name ~ var1 will be ignored. The multiple partial tests must be the last terms specified in the model (i.e., no other predictors can follow them).
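The idea behind a multiple-partial F-test can be sketched in base R by comparing nested linear models. The example below is the classical (non-robust) version of the test on the built-in mtcars data, both of which are assumptions made for illustration; with robustSE = TRUE, regress would be expected to report a robust test instead, so the numbers need not match.

```r
# Illustrative only: classical multiple-partial F-test via nested models.
# Tests whether hp and qsec jointly add to a model that already contains wt.
reduced <- lm(mpg ~ wt, data = mtcars)
full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)

ftest <- anova(reduced, full)

num_df  <- ftest$Df[2]          # number of coefficients tested jointly
f_stat  <- ftest$F[2]           # F statistic for the joint test
p_value <- ftest$`Pr(>F)`[2]    # p value from the F distribution

c(num_df = num_df, F = f_stat, p = p_value)
```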
# Loading dataset
data(mri)
# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)
# Linear regression of atrophy on sex and height and their interaction,
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)
# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)
# Cox regression of survival on age
library(survival)
regress("hazard", Surv(obstime, death) ~ age, data = mri)