Given a main model, an outcome nuisance model and an exposure nuisance model, drgeeData extracts the model variables and matrices from a data.frame or an environment object. It also performs some data cleaning and error checking.
drgeeData(outcome, exposure,
oformula, eformula, iaformula = formula(~1),
olink = c("identity", "log", "logit"),
elink = c("identity", "log", "logit"),
data, subset = NULL,
estimation.method = c("dr", "o", "e"),
cond = FALSE, clusterid, clusterid.vcov)
The outcome as a variable or as a character string naming a variable in the data argument. If it is not found in the data argument, it will be searched for in the calling frame. If missing, the outcome is assumed to be the response of oformula.
The exposure as a variable or as a character string naming a variable in the data argument. If it is not found in the data argument, it will be searched for in the calling frame. If missing, the exposure is assumed to be the response of eformula.
An expression or formula for the outcome nuisance model. The outcome is identified as the response in this formula.
An expression or formula for the exposure nuisance model. The exposure is identified as the response in this formula.
An expression or formula where the RHS should contain the variables that "interact" with the exposure in the main model (i.e. are supposed to be multiplied with it) to create the terms associated with the parameters of interest. "1" will always be added. Default value is no interactions, i.e. formula(~1).
A character string naming the link function in the outcome nuisance model. Has to be "identity", "log" or "logit". Default is "identity".
A character string naming the link function in the exposure nuisance model. Has to be "identity", "log" or "logit". Default is "identity". When olink="logit", this is replaced by "logit".
A data frame or environment containing the variables in iaformula, oformula and eformula. If missing, variables are expected to be found in the calling frame.
An optional vector defining a subset of the data to be used.
A character string naming the desired estimation method. Choose "o" for O-estimation, "e" for E-estimation or "dr" for DR-estimation. Default is "dr".
A logical value indicating whether the nuisance models should have cluster-specific intercepts. If cond=TRUE, the design matrices for the nuisance models do not have an intercept. Requires a clusterid argument.
A cluster-defining variable or a character string naming a cluster-defining variable in the data argument. If it is not found in the data argument, it will be searched for in the calling frame. If missing, each observation will be considered to be a separate cluster. This argument is required when cond = TRUE.
A cluster-defining variable or a character string naming a cluster-defining variable in the data argument to be used for adding contributions from the same cluster. These clusters can be different from the clusters defined by clusterid. However, each cluster defined by clusterid needs to be contained in exactly one cluster defined by clusterid.vcov. This variable is useful when the clusters are hierarchical.
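The nesting requirement between clusterid and clusterid.vcov can be illustrated in base R. This is only a sketch of the condition described above, not the check drgeeData itself performs; the helper name is_nested and the toy vectors are hypothetical.

```r
# Hypothetical helper: TRUE if every cluster in `inner` lies in exactly
# one cluster of `outer` (the condition stated for clusterid and
# clusterid.vcov). Not part of the drgee package.
is_nested <- function(inner, outer) {
  # Count the distinct outer clusters seen within each inner cluster.
  n.outer <- tapply(outer, inner, function(g) length(unique(g)))
  all(n.outer == 1)
}

# Toy data: sibling pairs (inner clusters) nested in families (outer clusters).
clusterid      <- c(1, 1, 2, 2, 3, 3)
clusterid.vcov <- c("A", "A", "A", "A", "B", "B")
nested.ok <- is_nested(clusterid, clusterid.vcov)

# Counterexample: inner cluster 2 is split between "A" and "B".
bad.vcov   <- c("A", "A", "A", "B", "B", "B")
nested.bad <- is_nested(clusterid, bad.vcov)
```

Here nested.ok is TRUE and nested.bad is FALSE, because the second pair straddles two outer clusters.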
drgeeData returns an object of class drgeeData containing
The row numbers in the original data for the used rows (after subset selection and exclusions).
The original order of the observations.
The outcome matrix.
The exposure matrix.
The matrix of interactions defined in iaformula. This matrix will always contain a column with 1's.
The matrix of elementwise product(s) of a and each column in x.
The matrix of terms in the outcome nuisance model.
The matrix of terms in the exposure nuisance model.
The matrix of elementwise product(s) of y and each column in x.
A factor defining clusters. For independent observations, the number of levels equals the number of complete observations.
A string for the name of the cluster-defining variable.
A string for the name of the outcome.
A string for the name of the exposure.
A string vector for the variable names in x.
A string vector for the variable names in ax.
A string vector for the variable names in v.
A string vector for the variable names in z.
A string vector for the variable names in yx.
A character string naming the link function in the outcome nuisance model.
A character string naming the link function in the exposure nuisance model.
A logical value indicating whether cluster-specific intercepts should be assumed. If TRUE, there is no column for the intercept in v and z. Outcome-concordant clusters will also be removed.
The terms object corresponding to the outcome nuisance model.
The terms object corresponding to the exposure nuisance model.
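To make the matrices described above concrete, the following base R sketch rebuilds analogues of x, ax and yx by hand, together with a nuisance design matrix with and without its intercept column. The toy data frame and the names v.full and v.cond are assumptions for illustration; this does not reproduce the internals of drgeeData.

```r
set.seed(1)
n <- 5
d <- data.frame(y = rnorm(n), a = rbinom(n, 1, 0.5), z = rnorm(n))

# Interaction matrix for iaformula = ~ z; the intercept term supplies
# the column of 1's that is always present.
x <- model.matrix(~ z, data = d)

# Elementwise products of the exposure (and the outcome) with each
# column of x. A length-n vector times an n-row matrix is recycled
# down the columns, so every column is multiplied row by row.
ax <- d$a * x
yx <- d$y * x

# Design matrix for an outcome nuisance model y ~ z, and the same
# matrix without the intercept column, as described for cond = TRUE.
v.full <- model.matrix(y ~ z, data = d)
v.cond <- v.full[, colnames(v.full) != "(Intercept)", drop = FALSE]
```

The first column of ax equals the exposure itself, since it is the product of a with the column of 1's.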
drgeeData is called by drgee and gee to extract data from a data.frame or environment object. The data can then be used for O-estimation, E-estimation or DR-estimation. drgeeData uses model.frame and model.matrix to remove incomplete observations and to convert factors to dummy variables. It also checks the supplied data for errors or inconsistencies.
The class method summary.drgeeData produces strings for the formulas with terms referring to the columns in the produced design matrices.
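The roles of model.frame and model.matrix mentioned above can be seen in a small base R sketch: model.frame drops incomplete rows, and model.matrix expands a factor into dummy columns. The toy data and variable names are assumptions, and na.action is set explicitly so that the result does not depend on session options.

```r
# Toy data with one missing outcome (row 3) and a three-level factor.
d <- data.frame(
  y = c(1.2, 0.4, NA, 2.1, 0.9, 1.5),
  a = c(0, 1, 1, 0, 1, 0),
  g = factor(c("low", "mid", "high", "mid", "low", "high"))
)

# model.frame with na.omit removes the incomplete observation.
mf <- model.frame(y ~ a + g, data = d, na.action = na.omit)

# The row names of the model frame record which original rows were kept.
used.rows <- as.integer(rownames(mf))

# model.matrix converts the factor g to dummy variables, with the
# first level ("high", alphabetically) as the reference category.
mm <- model.matrix(y ~ a + g, data = mf)
colnames(mm)
```

Row 3 is excluded, and the design matrix has the columns (Intercept), a, glow and gmid.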
drgee, gee, model.frame and model.matrix.
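A possible call, following the usage shown above, is sketched below on simulated data. The data-generating model and the object names simdata and dd are assumptions made for illustration. The call is guarded so that the snippet still runs when the drgee package is not installed.

```r
# Simulated data (hypothetical): binary exposure a, continuous outcome y,
# one covariate z that affects both.
set.seed(1)
n <- 200
z <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * z))
y <- rnorm(n, mean = 1 + a + z)
simdata <- data.frame(y, a, z)

if (requireNamespace("drgee", quietly = TRUE)) {
  # outcome and exposure are omitted, so they are taken as the
  # responses of oformula and eformula; clusterid is omitted, so each
  # observation is treated as its own cluster.
  dd <- drgee::drgeeData(oformula = formula(y ~ z),
                         eformula = formula(a ~ z),
                         iaformula = formula(~ 1),
                         olink = "identity", elink = "logit",
                         data = simdata,
                         estimation.method = "dr")
  print(class(dd))
  print(summary(dd))
}
```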