Using a fitted model object, determine a reference grid for which least-squares means are defined. The resulting ref.grid
object encapsulates all the information needed to calculate LS means and make inferences on them.
ref.grid(object, at, cov.reduce = mean, mult.name, mult.levs,
options = get.lsm.option("ref.grid"), data, df, type,
transform = c("none", "response", "mu", "unlink", "log"),
nesting, ...)
.Last.ref.grid
An object produced by a supported model-fitting function, such as lm. Many models are supported. See models.
Optional named list of levels for the corresponding variables.
A function, logical value, or formula; or a named list of these. Each covariate not specified in at is reduced according to these specifications.
If a single function, it is applied to each covariate.
If logical and TRUE, mean is used. If logical and FALSE, it is equivalent to specifying function(x) sort(unique(x)), and these values are considered part of the reference grid; thus, it is a handy alternative to specifying these same values in at.
If a formula (which must be two-sided), then a model is fitted to that formula using lm; then in the reference grid, its response variable is set to the results of predict for that model, with the reference grid as newdata. (This is done after the reference grid is determined.) A formula is appropriate here when you think experimental conditions affect the covariate as well as the response.
If cov.reduce is a named list, then the above criteria are used to determine what to do with covariates named in the list. (However, formula elements do not need to be named, as those names are determined from the formulas' left-hand sides.) Any unresolved covariates are reduced using "mean".
Any cov.reduce specification for a covariate also named in at is ignored.
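The three kinds of covariate reduction described above can be sketched in base R. This is only an illustration of the ideas, not the package's internal code; the built-in ToothGrowth data and the names reduce_mean, reduce_all, aux_fit, and grid are assumptions chosen for the example.

```r
# Base-R sketch of the covariate reductions (illustrative, not lsmeans internals)
dose <- ToothGrowth$dose                 # a numeric covariate

# Like cov.reduce = mean (or TRUE): one representative value
reduce_mean <- mean(dose)

# Like cov.reduce = FALSE: every distinct observed value joins the grid
reduce_all <- sort(unique(dose))         # 0.5 1.0 2.0

# Like cov.reduce = dose ~ supp: fit an auxiliary model with lm, then set
# the covariate in the grid to its prediction from the other grid variables
aux_fit <- lm(dose ~ supp, data = ToothGrowth)
supp_levels <- levels(ToothGrowth$supp)
grid <- data.frame(supp = factor(supp_levels, levels = supp_levels))
grid$dose <- predict(aux_fit, newdata = grid)
```

Because ToothGrowth is balanced, the formula-based reduction here happens to give the same dose for both supplements; with unbalanced data the predicted covariate would differ by group.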
A named list of levels for the dimensions of a multivariate response. If there is more than one element, the combinations of levels are used, in expand.grid order. The (total) number of levels must match the number of dimensions. If mult.name is specified, this argument is ignored.
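To see what "expand.grid order" means for a list of levels, the following base-R sketch may help. The list mirrors the mult.levs value used in the examples for this page; the names mult_levs, combos, and n_combos are assumptions for illustration.

```r
# Combinations of levels in expand.grid order: the first element varies fastest
mult_levs <- list(T = LETTERS[1:2], U = letters[1:2])
combos <- expand.grid(mult_levs, stringsAsFactors = FALSE)
combos
#   T U
# 1 A a
# 2 B a
# 3 A b
# 4 B b

# The total number of combinations is what must match the number of
# dimensions of the multivariate response
n_combos <- prod(lengths(mult_levs))     # 4
```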
If non-NULL, a named list of arguments to pass to update, just after the object is constructed.
A data.frame to use to obtain information about the predictors (e.g. factor levels). If missing, then recover.data is used to attempt to reconstruct the data.
This is a courtesy shortcut, equivalent to specifying options(df = df). See update.
If provided, this is saved as the "predict.type" setting. See update.
If other than "none", the reference grid is reconstructed via regrid with the given transform argument. See Details.
If the model has nested fixed effects, this may be specified here via a named list specifying the nesting structure. Specifying nesting overrides the nesting structure that may be automatically detected. See Details.
An S4 object of class "ref.grid" (see ref.grid-class). These objects encapsulate everything needed to do calculations and inferences for least-squares means, and contain nothing that depends on the model-fitting procedure. As a side effect, the result is also saved as .Last.ref.grid (in the global environment, unless this variable is found in another position).
The reference grid consists of combinations of independent variables over which predictions are made. Least-squares means are defined as these predictions, or marginal averages thereof.
The grid is determined by first reconstructing the data used in fitting the model (see recover.data), or by using the data.frame provided in data. The default reference grid is determined by the observed levels of any factors, the ordered unique values of character-valued predictors, and the results of cov.reduce for numeric predictors. These may be overridden using at.
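As a rough picture of how such a grid comes together, the sketch below builds one by hand in base R: factor levels crossed with the mean of a numeric predictor, then predictions on that grid. It uses the built-in ToothGrowth data rather than the package's own machinery, and the names fit, grid, grid_at, and marginal are assumptions for illustration.

```r
# Hand-built analogue of a default reference grid (illustrative only)
fit <- lm(len ~ supp * dose, data = ToothGrowth)

# Observed factor levels crossed with the mean of the numeric predictor
grid <- expand.grid(
  supp = levels(ToothGrowth$supp),
  dose = mean(ToothGrowth$dose)
)
grid$prediction <- predict(fit, newdata = grid)

# Overriding the covariate values, in the spirit of at = list(dose = c(0.5, 2))
grid_at <- expand.grid(supp = levels(ToothGrowth$supp), dose = c(0.5, 2))
grid_at$prediction <- predict(fit, newdata = grid_at)

# Marginal averages of the predictions over supp, one per dose value
marginal <- tapply(grid_at$prediction, grid_at$dose, mean)
```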
Support for a particular class of object depends on the existence of recover.data and lsm.basis methods; see extending-lsmeans for details. The call methods("recover.data") will help identify these.
In certain models (e.g., results of glmer.nb), it is not possible to identify the original dataset. In such cases, we can work around this by setting data equal to the dataset used in fitting the model, or a suitable subset.
Only the complete cases in data are used, so it may be necessary to exclude some unused variables.
Using data can also help save computing, especially when the dataset is large. In any case, data must represent all factor levels used in fitting the model. It cannot be used as an alternative to at. (Note: If there is a pattern of NAs that caused one or more factor levels to be excluded when fitting the model, then data should also exclude those levels.)
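The points about complete cases and excluded factor levels can be checked with a small base-R sketch. The toy data frame and the names dat, model_vars, and dat_used are made up for illustration; the idea is only to show how one might prepare a data frame before passing it as data.

```r
# Toy data: an NA in y removes the only "b" row; 'unused' is not in the model
dat <- data.frame(
  y = c(1, 2, NA, 4, 5, 6),
  g = factor(c("a", "a", "b", "c", "c", "c")),
  unused = c(NA, 1, 1, 1, 1, 1)
)
fit <- lm(y ~ g, data = dat)        # row 3 is dropped, so level "b" is never fitted

# Keeping 'unused' would discard an extra row when taking complete cases
n_complete_all <- sum(complete.cases(dat))   # 4

# Keep only the model variables, then drop incomplete rows and empty levels
model_vars <- dat[, c("y", "g")]
dat_used <- droplevels(model_vars[complete.cases(model_vars), ])

# The levels now agree with those the model actually used
levels(dat_used$g)
fit$xlevels$g
```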
By default, the variance-covariance matrix for the fixed effects is obtained from object, usually via its vcov method. However, the user may override this via a vcov. argument, specifying a matrix or a function. If a matrix, it must be square and of the same dimension and parameter order as the fixed effects. If a function, it must return a suitable matrix when it is called with object as its only argument.
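A minimal base-R sketch of what a "suitable" replacement covariance function might look like follows. The function inflated_vcov and its factor of 1.5 are purely hypothetical, chosen only to show the required shape: a function of the fitted object alone that returns a square matrix in the same parameter order as the coefficients.

```r
fit <- lm(len ~ supp * dose, data = ToothGrowth)
V <- vcov(fit)                           # the default covariance matrix

# Hypothetical user-supplied function: takes the fitted object only
inflated_vcov <- function(object) 1.5 * vcov(object)
V_user <- inflated_vcov(fit)

# Checks one might run before supplying it: square, same parameter order
is_suitable <- is.matrix(V_user) &&
  nrow(V_user) == ncol(V_user) &&
  identical(rownames(V_user), names(coef(fit)))
```

A function of this shape is the kind of object the vcov. argument describes; a matrix such as V_user could presumably be passed directly as well.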
Nested factors: ref.grid tries to discern which factors are nested in other factors, but it is not always obvious, and if it misses some, the user must specify this structure via nesting, or later using update. Each member of nesting should be a character vector of the name(s) of grouping factors, and the name for that member should be that of the factor that is nested therein; for example, list(city = c("state", "country")). Having a nesting structure affects marginal averaging in lsmeans in that it is done separately for each level (or combination thereof) of the grouping factors.
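The effect of nesting on marginal averaging can be pictured with a small base-R sketch. The city and state names and the prediction values below are made up for illustration, and the code only mimics the idea of averaging within each level of the grouping factor; it is not how the package computes its results.

```r
# Hypothetical nesting specification: city is nested within state
nesting <- list(city = "state")

# Made-up predictions, one per city
pred <- data.frame(
  state = c("IA", "IA", "NE", "NE", "NE"),
  city = c("Ames", "Des Moines", "Lincoln", "Omaha", "Kearney"),
  prediction = c(10, 14, 20, 23, 26)
)

# Averaging over city is done separately within each level of its grouping factor
by_state <- tapply(pred$prediction, pred[[nesting$city]], mean)
by_state
# IA NE
# 12 23
```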
There is a subtle difference between specifying type = "response" and transform = "response". While the summary statistics for the grid itself are the same, subsequent use in lsmeans will yield different results if there is a response transformation. With type = "response", LS means are computed by averaging together predictions on the linear-predictor scale and then back-transforming to the response scale; while with transform = "response", the predictions are already on the response scale so that the LS means will be the arithmetic means of those response-scale predictions. To add further to the possibilities, geometric means of the response-scale predictions are obtainable via transform = "log", type = "response".
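The arithmetic behind these three choices can be worked through in base R. The two predicted values are hypothetical, and a logit scale stands in for whatever transformation the model uses; the variable names are assumptions for illustration.

```r
# Two hypothetical predictions on the response scale, and on the logit scale
p <- c(0.1, 0.5)
eta <- qlogis(p)

# Like type = "response": average on the linear-predictor scale, then back-transform
type_response <- plogis(mean(eta))       # 0.25

# Like transform = "response": arithmetic mean of response-scale predictions
transform_response <- mean(p)            # 0.3

# Like transform = "log", type = "response": geometric mean of those predictions
transform_log <- exp(mean(log(p)))       # sqrt(0.05), about 0.2236
```

The three summaries differ even for this tiny grid, which is why the choice matters once predictions are averaged.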
The most recent result of ref.grid, whether called directly or indirectly via lsmeans, lstrends, or some other function that calls one of these, is saved in the user's environment as .Last.ref.grid. This facilitates checking what reference grid was used, or reusing the same reference grid for further calculations. This automatic saving is enabled by default, but may be disabled via lsm.options(save.ref.grid = FALSE), and re-enabled by specifying TRUE.
See also summary and other methods for the returned objects. Reference grids are fundamental to lsmeans. See ref.grid-class for more on the ref.grid class. Supported models are detailed in models.
# NOT RUN {
require(lsmeans)
fiber.lm <- lm(strength ~ machine*diameter, data = fiber)
ref.grid(fiber.lm)
summary(ref.grid(fiber.lm))
ref.grid(fiber.lm, at = list(diameter = c(15, 25)))
# }
# NOT RUN {
# We could substitute the sandwich estimator vcovHAC(fiber.lm)
# as follows:
require(sandwich)
summary(ref.grid(fiber.lm, vcov. = vcovHAC))
# }
# NOT RUN {
# If we thought that the machines affect the diameters
# (admittedly not plausible in this example), then we should use:
ref.grid(fiber.lm, cov.reduce = diameter~machine)
# Multivariate example
MOats.lm <- lm(yield ~ Block + Variety, data = MOats)
ref.grid(MOats.lm, mult.name = "nitro")
# silly illustration of how to use 'mult.levs'
ref.grid(MOats.lm, mult.levs = list(T=LETTERS[1:2], U=letters[1:2]))
# }