extending-emmeans: Support functions for model extensions

Description

This page documents the methods that ref_grid calls. A user or package developer can add emmeans support for a model class by writing recover_data and emm_basis methods for that class.

Usage

recover_data(object, ...)
# S3 method for call
recover_data(object, trms, na.action, data = NULL,
  params = NULL, ...)
emm_basis(object, trms, xlev, grid, ...)
.recover_data(object, ...)
.emm_basis(object, trms, xlev, grid, ...)

Arguments

object

An object of the class that the new method supports.

...

Additional parameters that may be supported by the method.

trms

The terms component of object (typically with the response deleted, e.g. via delete.response).

na.action

Integer vector of indices of observations to ignore, or NULL if there are none.

data

Data frame. Usually, this is NULL. However, if non-null, this is used in place of the reconstructed dataset. It must have all of the predictors used in the model, and any factor levels must match those used in fitting the model.

params

Character vector giving the names of any variables in the model formula that are not predictors. An example would be a variable knots specifying the knots to use in a spline model.

xlev

Named list of factor levels (excluding ones coerced to factors in the model formula).

grid

A data.frame (provided by ref_grid) containing the predictor settings needed in the reference grid.

Value

The recover_data method must return a data.frame containing all the variables that appear as predictors in the model, and attributes "call", "terms", "predictors", and "responses". (recover_data.call will provide these attributes.)
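As a minimal sketch of this delegation, a recover_data method for a hypothetical class "mymod" might simply pass the stored call on to the method for class "call". The class name, and the assumption that the object stores its call, terms, and na.action the way an lm fit does, are illustrative only; the demonstration at the end runs only if emmeans is installed.

```r
# Hypothetical class "mymod", assumed to store `call` and `na.action`
# components and to have a terms() method, as an lm fit does.
recover_data.mymod <- function(object, ...) {
  fcall <- object$call
  trms <- delete.response(terms(object))
  # Delegate to the method for class "call", which supplies the
  # "call", "terms", "predictors", and "responses" attributes
  emmeans::recover_data(fcall, trms, object$na.action, ...)
}

# Illustration only: an lm fit relabelled as "mymod"
fit <- lm(breaks ~ wool + tension, data = warpbreaks)
class(fit) <- c("mymod", class(fit))

if (requireNamespace("emmeans", quietly = TRUE)) {
  rd <- recover_data.mymod(fit)
  str(attributes(rd)[c("predictors", "responses")])
}
```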

The emm_basis method should return a list with the following elements:

X: The matrix of linear functions over grid, having the same number of rows as grid and the number of columns equal to the length of bhat.
bhat: The vector of regression coefficients for fixed effects. This should include any NAs that result from rank deficiencies.
nbasis: A matrix whose columns form a basis for non-estimable functions of beta, or a 1x1 matrix of NA if there is no rank deficiency.
V: The estimated covariance matrix of bhat.
dffun: A function of (k, dfargs) that returns the degrees of freedom associated with sum(k * bhat).
dfargs: A list containing additional arguments needed for dffun.
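The elements listed above can be assembled as in the following sketch for a hypothetical class "mymod". It assumes a full-rank, lm-like fit whose coef, vcov, and df.residual methods give the fixed-effect quantities directly; a class with a different internal structure would need its own way of obtaining them. Only base R is used, so the function can be called directly to inspect what it returns.

```r
# Hypothetical class "mymod": assumed to be a full-rank, lm-like fit.
emm_basis.mymod <- function(object, trms, xlev, grid, ...) {
  # Model frame and design matrix evaluated over the reference grid
  m <- model.frame(trms, grid, na.action = na.pass, xlev = xlev)
  X <- model.matrix(trms, m, contrasts.arg = object$contrasts)
  bhat <- coef(object)
  V <- vcov(object)
  nbasis <- matrix(NA)                      # 1x1 NA matrix: no rank deficiency
  dfargs <- list(df = df.residual(object))
  dffun <- function(k, dfargs) dfargs$df    # same d.f. for every linear function
  list(X = X, bhat = bhat, nbasis = nbasis, V = V,
       dffun = dffun, dfargs = dfargs)
}

# Illustration only: an lm fit relabelled as "mymod", with a hand-built grid
fit <- lm(breaks ~ wool + tension, data = warpbreaks)
class(fit) <- c("mymod", class(fit))
trms <- delete.response(terms(fit))
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))
basis <- emm_basis.mymod(fit, trms, fit$xlevels, grid)
dim(basis$X)   # one row per grid row, one column per coefficient
```

In this sketch, basis$X %*% basis$bhat reproduces the fitted model's predictions over grid, which is a convenient sanity check when writing a new method.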

.recover_data and .emm_basis are hidden exported versions of recover_data and emm_basis, respectively. They run in emmeans's namespace, thus providing access to all existing methods.

Optional hooks

Some models may need something other than standard linear estimates and standard errors. If so, custom functions may be pointed to via the items misc$estHook, misc$vcovHook and misc$postGridHook. If just the name of the hook function is provided as a character string, then it is retrieved using get.

The estHook function should have arguments (object, do.se, tol, ...), where object is the emmGrid object, do.se is a logical flag for whether to return the standard error, and tol is the tolerance for assessing estimability. It should return a matrix with 3 columns: the estimates, standard errors (NA when do.se==FALSE), and degrees of freedom (NA for asymptotic). Its number of rows should equal the number of rows of object@linfct.

The vcovHook function should have arguments (object, tol, ...), with the same meanings as for estHook. It should return the covariance matrix for the estimates.

Finally, postGridHook, if present, is called at the very end of ref_grid; it takes one argument, the constructed object, and should return a suitably modified emmGrid object.
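To make the estHook contract concrete, here is a sketch of a function with the required arguments and return shape. It merely reproduces the standard linear estimates, assumes there are no NAs in bhat, and ignores tol; a real hook would substitute whatever computation the model needs. The name my_estHook, the default argument values, and the stand-in class mockGrid (used so the sketch runs without emmeans) are all hypothetical.

```r
# Sketch of an estHook: returns a 3-column matrix with one row per row
# of object@linfct. Assumes full rank and ignores `tol`.
my_estHook <- function(object, do.se = TRUE, tol = 1e-8, ...) {
  L <- object@linfct
  est <- as.numeric(L %*% object@bhat)
  se <- if (do.se) {
    sqrt(rowSums((L %*% object@V) * L))   # sqrt of diag(L V L')
  } else {
    rep(NA_real_, nrow(L))
  }
  df <- rep(NA_real_, nrow(L))            # NA signals asymptotic results
  cbind(estimate = est, SE = se, df = df)
}

# Stand-in S4 class (hypothetical) with only the slots used here;
# a real emmGrid object has these slots and more.
setClass("mockGrid",
         representation(linfct = "matrix", bhat = "numeric", V = "matrix"))
g <- new("mockGrid",
         linfct = rbind(c(1, 0), c(1, 1)),
         bhat = c(2, 3),
         V = diag(c(0.04, 0.09)))
res <- my_estHook(g)
res
```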

Details

To create a reference grid, the ref_grid function needs to reconstruct the data used in fitting the model, and then obtain a matrix of linear functions of the regression coefficients for a given grid of predictor values. These tasks are performed by calls to recover_data and emm_basis, respectively. A vignette giving details and examples is available via vignette("extending", "emmeans").

To extend emmeans's support to additional model types, one need only write S3 methods for these two functions. Most of the work for recover_data can be done by its method for class "call", given the terms component and the na.action data as additional arguments. Writing an emm_basis method is more involved, but the existing methods (e.g., emmeans:::emm_basis.lm) can serve as models. Certain recover_data and emm_basis methods are exported from emmeans. (To find out which, do methods("recover_data").) If your object is based on another model-fitting object, it may be that all that is needed is to call one of these exported methods and perhaps modify the results. Contact the developer if you need other methods exported.

If the model has a multivariate response, bhat needs to be “flattened” into a single vector, and X and V must be constructed consistently.
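One consistent way to do the flattening is sketched below for an "mlm" fit from lm, using base R only. The block-diagonal construction of X via kronecker is an illustrative choice that matches the column-major ordering of as.numeric(coef(fit)) and of vcov(fit); other orderings are possible as long as X, bhat, and V agree.

```r
# Bivariate response fitted with lm(), giving an "mlm" object
fit <- lm(cbind(mpg, disp) ~ wt + hp, data = mtcars)
k <- ncol(coef(fit))                      # number of responses
grid <- data.frame(wt = c(2, 3), hp = c(100, 150))

# Univariate design matrix over the grid
X0 <- model.matrix(delete.response(terms(fit)), grid)

# Column-major flattening: all coefficients for mpg, then all for disp
bhat <- as.numeric(coef(fit))
X <- kronecker(diag(k), X0)               # block-diagonal, one block per response
V <- vcov(fit)                            # ordered the same way as bhat

# Consistency check: X %*% bhat reproduces the stacked predictions
all.equal(as.numeric(X %*% bhat),
          as.numeric(predict(fit, newdata = grid)))
```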

In models where a non-full-rank result is possible (often, you can tell by seeing if there is a singular.ok argument in the model-fitting function), summary.emmGrid and its relatives check the estimability of each prediction, using the nonest.basis function in the estimability package.
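The following sketch shows what a rank deficiency looks like in an lm fit, using a small made-up data set in which x2 is exactly collinear with x1. The aliased coefficient appears as NA in bhat, and a basis for the non-estimable functions can be obtained from the fit's QR decomposition with nonest.basis; that last step runs only if the estimability package is installed.

```r
# Made-up data: x2 is an exact multiple of x1, so the model is rank-deficient
d <- data.frame(y = c(1.1, 1.9, 3.2, 3.8, 5.1, 6.3), x1 = 1:6)
d$x2 <- 2 * d$x1
fit <- lm(y ~ x1 + x2, data = d)

bhat <- coef(fit)    # the aliased coefficient for x2 is NA
print(bhat)

# Basis for the non-estimable functions, as would be returned in nbasis
if (requireNamespace("estimability", quietly = TRUE)) {
  nbasis <- estimability::nonest.basis(fit$qr)
  print(nbasis)
}
```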