lsmeans: Least-squares means (or predicted marginal means)

Description

Compute least-squares means (predicted marginal means) for specified factors or factor combinations in a linear model, and optionally comparisons or contrasts among them.

Usage

# S3 method for character
lsmeans(object, specs, ...)
## (used when 'specs' is 'character')
# S3 method for character.ref.grid
lsmeans(object, specs, by = NULL, 
    fac.reduce = function(coefs) apply(coefs, 2, mean), contr, 
    options = getOption("lsmeans")$lsmeans, weights, trend, ...)
## (used when 'object' is a 'ref.grid' and 'specs' is 'character')
    
# S3 method for list
lsmeans(object, specs, ...)
## (used when 'specs' is a 'list')
# S3 method for formula
lsmeans(object, specs, contr.list, trend, ...)
## (used when 'specs' is a 'formula')
lstrends(model, specs, var, delta.var = 0.01 * rng, data, 
   transform = c("none", "response"), ...)
lsmobj(bhat, V, levels, linfct, df = NA, post.beta = matrix(NA), ...)
pmmeans(...)
pmtrends(...)
pmmobj(...)

Arguments

object

An object of class ref.grid; or a fitted model object that is supported, such as the result of a call to lm or lmer. Many fitted-model objects are supported; see link{models} for details.

specs

A character vector specifying the names of the predictors over which LS-means are desired. specs may also be a formula or a list (optionally named) of valid specs. Use of formulas is described in the Details section below.

A character vector specifying the names of predictors to condition on.

fac.reduce

A function that combines the rows of a matrix into a single vector. This implements the ``marginal averaging'' aspect of least-squares means. The default is the mean of the rows. Typically if it is overridden, it would be some kind of weighted mean of the rows. If fac.reduce is nonlinear, bizarre results are likely, and LS means will not be interpretable. If the weights argument is non-missing, fac.reduce is ignored.

contr

A list of contrast coefficients to apply to the least-squares means -- or the root name of an .lsmc function that returns such coefficients. In addition, contr = "cld" is an alternative way to invoke the cld function. See contrast for more details on contrasts. NOTE: contr is ignored when specs is a formula.

contr.list

A named list of lists of contrast coefficients, as for contr. This is used only in the formula method; see Details below.

options

If non-NULL, a named list of arguments to pass to update, just after the object is constructed.

weights

Numeric vector, numeric matrix, or character string specifying weights to use in averaging predictions. If a vector, its length must equal the number of predictions to be averaged to obtain each least-squares mean. If a matrix, each row of the matrix is used in turn, wrapping back to the first row as needed. When in doubt about what is being averaged (or how many), first call with weights = "show.levels".)

If a string, it should partially match one of the following:

"equal": Use an equally weighted average.
"proportional": Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.
"outer": Weight in proportion to each individual factor's marginal frequencies. Thus, the weights for a combination of factors are the outer product of the one-factor margins
"cells": Weight according to the frequencies of the cells being averaged.
"flat": Give equal weight to all cells with data, and ignore empty cells.
"show.levels": This is a convenience feature for understanding what is being averaged over. Instead of a table of LS means, this causes the function to return a table showing the levels that are averaged over, in the order they appear.

Outer weights are like the 'expected' counts in a chi-square test of independence, and will yield the same results as those obtained by proportional averaging with one factor at a time. All except "cells" uses the same set of weights for each mean. In a model where the predicted values are the cell means, cell weights will yield the raw averages of the data for the factors involved. Using "flat" is similar to "cells", except nonempty cells are weighted equally and empty cells are ignored.

Note: If a nested structure exists (see the nests argument in ref.grid), then averaging is done separately over each nesting group; thus, these groups are potentially of different sizes. Accordingly, it is unsafe to specify numerical weights.

Note: If weights were used in fitting the model, then weight totals are used in place of frequencies in these schemes.

If weights is used, fac.reduce is ignored.

trend

Including this argument is an alternative way of calling lstrends with trend as its var argument and object as its model.

model

A supported model object (not a ref.grid).

var

Character giving the name of a variable with respect to which a difference quotient of the linear predictors is computed. In order for this to be useful, var should be a numeric predictor that interacts with at least one factor in specs. Then instead of computing least-squares means, we compute and compare the slopes of the var trend over levels of the specified other predictor(s). As in least-squares means, marginal averages are computed when some variables in the reference grid are excluded for the specification.

The user may specify some monotone function of one variable, e.g., var = "log(dose)". If so, the chain rule is applied. Note that, in this example, if model contains log(dose) as a predictor, we will be comparing the slopes estimated by that model, whereas specifying var = "dose" would perform a transformation of those slopes.

delta.var

The value of h to use in forming the difference quotient (f(x+h) - f(x))/h. Changing it (especially changing its sign) may be necessary to avoid numerical problems such as logs of negative numbers. The default value is 1/100 of the range of var over the dataset.

data

As in ref.grid, you may use this argument to supply the dataset used in fitting the model, for situations where it is not possible to reconstruct the data. Otherwise, leave it missing.

transform

In lstrends, if object has a response transformation, then specifying transform = "response" will cause lstrends to calculate the trends after back-transforming to the response scale. This is done using the chain rule, and standard errors are estimated via the delta method. With transform = "none" (the default), the trends are calculated on the scale of the linear predictor, without back-transforming it. This argument works similarly to the transform argument in ref.grid (but without a "log" option), in that the returned object is re-gridded to the new scale (see also regrid).

bhat

Numeric. Vector of regression coefficients.

Square matrix. Covariance matrix of bhat

levels

Named list or vector. Levels of factor(s) that define the estimates defined by linfct. If not a list, we assume one factor named "level"

linfct

Matrix. Linear functions of bhat for each combination of levels

Numeric or function with arguments (x,dfargs). If a number, that is used for the degrees of freedom. If a function, it should return the degrees of freedom for sum(x*bhat); if additional parameters are needed, include them in … as dfargs (not abbreviated).

post.beta

Matrix whose columns comprise a sample from the posterior distribution of the regression coefficients (so that typically, the column averages will be bhat). A 1 x 1 matrix of NA indicates that such a sample is unavailable.

…

Additional arguments passed to other methods or to ref.grid. For example, vcov. may be used to override the default covariance estimate, and some models allow additional options. Some models require data to be given explicitly. See the help pages for ref.grid and models. In addition, if the model formula contains references to variables that are not predictors, you must provide a params argument with a list of their names; see the example below for Oatsq.lm.

Value

When specs is a character vector or one-sided formula, an object of class lsmobj. A number of methods are provided for further analysis, including summary, confint, test, contrast, pairs, and cld.

When specs is a list or a formula having a left-hand side, the return value is an lsm.list object, which is simply a list of lsmobj objects. Methods for lsm.list objects are the same as those for lsmobj, but they apply to only one member of the list, determined by its which argument.

Side effect: When object is a model, a reference grid is constructed and it is saved as .Last.ref.grid in the user's environment (unless this is disabled via lsm.option(save.ref.grid = FALSE)). This makes it possible to check what reference grid was used, or to use it as the object in future lsmeans calls (and bypass reconstructing it). Similarly, lstrends also saves its reference grid (but for predicting difference quotients) as .Last.ref.grid.

Details

Least-squares means (also called predicted marginal means) are predictions from a linear model over a reference grid, or marginal averages thereof. They have been popularized by SAS (SAS Institute, 2012). The ref.grid function identifies/creates the reference grid upon which lsmeans is based.

For those who prefer the term “predicted marginal means”, courtesy wrappers pmmeans, pmtrends, and pmmobj are provided that behave identically to those that start with ls, except that estimates are relabeled accordingly (e.g., lsmean becomes pmmean).

If specs is a formula, it should be of the form ~ specs, ~ specs | by, contr ~ specs, or contr ~ specs | by. The formula is parsed and the variables therein are used as the arguments specs, by, and contr as indicated. The left-hand side is optional, but if specified it should be the name of a contrast family (e.g., pairwise) or of a sub-list of contr.list. Operators like * or : are necessary to delineate names in the formulas, but otherwise are ignored.

In the special case where the mean (or weighted mean) of all the predictions is desired, specify specs as ~ 1 or "1".

A number of standard contrast families are provided. They can be identified as functions having names ending in .lsmc -- use

ls("package:lsmeans", pat=".lsmc")

to list them. See the documentation for pairwise.lsmc and its siblings for details. You may write your own .lsmc function for custom contrasts.

The function lsmobj may be used to construct an object just like one returned by lsmeans from user-specified coefficients, covariance matrix, levels (or row labels), linear functions for each row, and degrees of freedom. After the object is constructed, it is updateed with any additional arguments in ….

References

SAS Institute Inc. (2012) Online documentation; Shared concepts; LSMEANS statement, http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_introcom_a0000003362.htm, accessed August 15, 2012.

Examples

Run this code

# NOT RUN {
require(lsmeans)

### Covariance example (from Montgomery Design (8th ed.), p.656)
# Uses supplied dataset 'fiber'
fiber.lm <- lm(strength ~ diameter + machine, data = fiber)

# adjusted means and comparisons, treating machine C as control
( fiber.lsm <- lsmeans (fiber.lm, "machine") )
contrast(fiber.lsm, "trt.vs.ctrlk")
# Or get both at once using
#     lsmeans (fiber.lm, "machine", contr = "trt.vs.ctrlk")


### Factorial experiment
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
( warp.lsm <- lsmeans (warp.lm,  ~ wool | tension,
    options = list(estName = "pred.breaks")) )
pairs(warp.lsm) # remembers 'by' structure
contrast(warp.lsm, method = "poly", by = "wool")


### Unbalanced split-plot example ###
#-- The imbalance is imposed deliberately to illustrate that
#-- the variance estimates become biased
require(nlme)
Oats.lme <- lme(yield ~ factor(nitro) + Variety, 
    random = ~1 | Block/Variety, 
    subset = -c(1,2,3,5,8,13,21,34,55), data = Oats)
(Oats.anal <- lsmeans(Oats.lme, list(poly ~ nitro, pairwise ~ Variety)))


### Issues with lists of specs
test(Oats.anal)                # Uses 1st element by default
confint(Oats.anal, which = 4)  # or confint(Oats.anal[[4]])

# Using 'pmmeans' wrapper ...
pmmeans(warp.lm, ~ wool, 
    options = list(infer = c(TRUE, TRUE), null = 22, side = ">"))

### Weights
# See what's being averaged over in the above
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = "show.levels")

# Give three times the weight to Marvellous
lsmeans(Oats.lme, ~ nitro, cov.reduce = FALSE, weights = c(1,3,1))

# Overall mean
lsmeans(Oats.lme, ~ 1, weights = "equal")
lsmeans(Oats.lme, "1", weights = "cells")


### Model with a quadratic trend for 'nitro'
# Also illustrates use of `params` argument to list non-predictors
deg = 2
Oatsq.lm <- lm(yield ~ Block + poly(nitro, deg) + Variety, data = Oats)
# Predictions at each unique 'nitro' value in the dataset
lsmeans(Oatsq.lm, ~ nitro, cov.reduce = FALSE, params = "deg")


### Trends
fiber.lm <- lm(strength ~ diameter*machine, data=fiber)
# Obtain slopes for each machine ...
( fiber.lst <- lstrends(fiber.lm, "machine", var="diameter") )
# ... and pairwise comparisons thereof
pairs(fiber.lst)

# Suppose we want trends relative to sqrt(diameter)...
lstrends(fiber.lm, ~ machine | diameter, var = "sqrt(diameter)", 
    at = list(diameter = c(20,30)))

# Given summary statistics for 4 cities computed elsewhere, 
# obtain multiple comparisons of their means using the
# Satterthwaite method
ybar <- c(47.6, 53.2, 88.9, 69.8)
s <-    c(12.1, 19.5, 22.8, 13.2)
n <-    c(44,   11,   37,   24)
se2 = s^2 / n
Satt.df <- function(x, dfargs)
    sum(x * dfargs$v)^2 / sum((x * dfargs$v)^2 / (dfargs$n - 1))
city.pmm <- pmmobj(bhat = ybar, V = diag(se2),
    levels = list(city = LETTERS[1:4]), linfct = diag(c(1,1,1,1)),
    df = Satt.df, dfargs = list(v = se2, n = n), estName = "mean")
city.pmm
contrast(city.pmm, "revpairwise")
    
    
# See also many other examples in documentation for 
# 'contrast', 'cld', 'glht', 'lsmip', 'ref.grid', 'MOats',
# 'nutrition', etc., and in the vignettes
# }

Run the code above in your browser using DataLab

State of Data and AI Literacy Report 2025