heplots (version 1.8.1)

boxM: Box's M-test for Homogeneity of Covariance Matrices

Description

boxM() performs Box's (1949) M-test for homogeneity of covariance matrices obtained from multivariate normal data according to one or more classification factors. The test compares the sum of the log determinants of the separate covariance matrices, each weighted by its degrees of freedom, to the log determinant of the pooled covariance matrix, analogous to a likelihood ratio test. The test statistic uses a chi-square approximation.
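The comparison described above can be sketched by hand in base R. This is a minimal illustration of the textbook form of Box's M with the usual chi-square scaling, not the heplots source code; the variable names (dfs, scale_c, X2, and so on) are chosen here for illustration only.

```r
# Hand computation of Box's M with base R (a sketch, not the package code).
Y     <- as.matrix(iris[, 1:4])
group <- iris$Species

p   <- ncol(Y)                       # number of response variables
g   <- nlevels(group)                # number of groups
dfs <- as.vector(table(group)) - 1   # within-group degrees of freedom, n_i - 1

# Group covariance matrices S_i and the pooled covariance matrix S_p
covs   <- lapply(split(as.data.frame(Y), group), cov)
pooled <- Reduce(`+`, Map(`*`, covs, dfs)) / sum(dfs)

# Natural log of each determinant
logdet <- sapply(covs, function(S) {
  as.numeric(determinant(S, logarithm = TRUE)$modulus)
})
logdet_pooled <- as.numeric(determinant(pooled, logarithm = TRUE)$modulus)

# M = (N - g) log|S_p| - sum_i (n_i - 1) log|S_i|
M <- sum(dfs) * logdet_pooled - sum(dfs * logdet)

# Box's scaling constant for the chi-square approximation
scale_c <- (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1)) *
  (sum(1 / dfs) - 1 / sum(dfs))

X2      <- (1 - scale_c) * M               # approximate chi-square statistic
df_chi  <- p * (p + 1) * (g - 1) / 2       # degrees of freedom
p_value <- pchisq(X2, df_chi, lower.tail = FALSE)

c(statistic = X2, df = df_chi, p.value = p_value)
```

If the package follows the same formulas, these values should agree with the statistic, parameter and p.value components returned by boxM(iris[, 1:4], iris$Species).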

Usage

boxM(Y, ...)

# S3 method for formula
boxM(Y, data, ...)

# S3 method for lm
boxM(Y, ...)

# S3 method for default
boxM(Y, group, ...)

# S3 method for boxM
print(x, ...)

# S3 method for boxM
summary(object, digits = getOption("digits") - 2, cov = FALSE, quiet = FALSE, ...)

Value

A list with class c("boxM", "htest") containing the following components:

statistic

the chi-square (approximate) statistic for Box's M test, where large values imply the covariance matrices differ.

parameter

the degrees of freedom for the test statistic.

p.value

the p-value of the test

ngroups

the number of levels of the group variable

cov

a list of the group covariance matrices, of length ngroups

pooled

the pooled covariance matrix

means

a matrix whose ngroups+1 rows are the means of the variables, followed by those for pooled data.

logDet

a vector of length ngroups+1 containing the natural logarithm of the determinant of each matrix in cov, followed by that for the pooled covariance matrix

df

a vector of the degrees of freedom for all groups, followed by that for the pooled covariance matrix

data.name

a character string giving the names of the data, as extracted from the call

method

the character string "Box's M-test for Homogeneity of Covariance Matrices"

Arguments

Y

The response variable matrix for the default method, or a "mlm" or "formula" object for a multivariate linear model. If Y is a linear-model object or a formula, the variables on the right-hand-side of the model must all be factors and must be completely crossed, e.g., A:B

...

Other arguments passed down

data

A data frame containing the variables in the model. Used only for the formula method.

group

A vector specifying the groups. Used only for the default method.

x

a class "boxM" object, for the print() method

object

A "boxM" object, result of a call to boxM

digits

Number of digits in printed output

cov

Logical; if TRUE, the covariance matrices for each group and the pooled covariance matrix are printed

quiet

Logical; if TRUE, suppress printed output

Author

The default method was taken from the biotools package, Anderson Rodrigo da Silva anderson.agro@hotmail.com

Generalized by Michael Friendly and John Fox

Details

As an object of class "boxM", a few methods are available: print.boxM(), summary.boxM() and plot.boxM().

There is no general provision as yet for handling missing data. Missing data are simply removed, with a warning.
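To avoid relying on the automatic removal and its warning, incomplete cases can be dropped explicitly before the test is run. The sketch below uses only base R; the NA values are injected purely for illustration, and the names Y_complete and group_complete are hypothetical.

```r
# Drop incomplete cases by hand before testing (base R only).
Y     <- iris[, 1:4]
group <- iris$Species

# Hypothetical missing values, inserted only to illustrate the filtering
Y[c(3, 77), "Sepal.Width"] <- NA

ok             <- complete.cases(Y, group)   # TRUE for rows with no NA
Y_complete     <- Y[ok, ]
group_complete <- droplevels(group[ok])      # drop any levels left empty

table(group_complete)                        # group sizes after filtering
```

The filtered objects can then be passed to the default method as boxM(Y_complete, group_complete).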

As well, the computation assumes that the covariance matrix for each group is non-singular, so that \(\log \det(S_i)\) can be calculated for each group. At the minimum, this requires that \(n_i > p\) for each group, where \(n_i\) is the group sample size and \(p\) is the number of response variables.
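The non-singularity requirement can be checked before running the test. The following base R sketch flags groups with too few observations and groups whose covariance matrix is rank deficient; it is an illustrative pre-check, not something boxM() is documented to do itself.

```r
# Pre-check: does every group have n_i > p and a full-rank covariance matrix?
Y     <- iris[, 1:4]
group <- iris$Species

p   <- ncol(Y)      # number of response variables
n_i <- table(group) # observations per group

# Groups that cannot yield a non-singular covariance matrix
too_small <- names(n_i)[n_i <= p]

# Numerical rank of each group covariance matrix
ranks <- sapply(split(Y, group), function(d) qr(cov(d))$rank)
rank_deficient <- names(ranks)[ranks < p]

list(too_small = too_small, rank_deficient = rank_deficient)
```

Both elements are empty for the iris data, so a log determinant can be computed for every group.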

Box's M test for a multivariate linear model is highly sensitive to departures from multivariate normality, just like the analogous univariate test. It is also adversely affected by unbalanced designs. Some people recommend ignoring the result unless it is very highly significant, e.g., p < .0001 or smaller.

In general, heterogeneity of covariance matrices can be more easily seen and understood by plotting the covariance ellipses using covEllipses.

The summary method prints a variety of additional statistics based on the eigenvalues of the covariance matrices. These are returned invisibly, as a list containing the following components:

logDet

the vector of log determinants

eigs

eigenvalues of the covariance matrices

eigstats

statistics computed on the eigenvalues for each covariance matrix:

product

the product of eigenvalues, \(\prod{\lambda_i}\)

sum

the sum of eigenvalues, \(\sum{\lambda_i}\)

precision

the average precision of eigenvalues, \(1/\sum(1/\lambda_i)\)

max

the maximum eigenvalue, \(\lambda_1\)
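The eigenvalue statistics listed above can be reproduced for a single covariance matrix with base R. This sketch follows the formulas as stated in this section; the object names (lambda, eigstats) are illustrative and the layout of the package's own return value may differ.

```r
# Eigenvalue-based summaries for one group covariance matrix (base R).
S <- cov(iris[iris$Species == "setosa", 1:4])

# Eigenvalues of a symmetric matrix, returned in decreasing order
lambda <- eigen(S, symmetric = TRUE, only.values = TRUE)$values

eigstats <- c(
  product   = prod(lambda),        # equals det(S)
  sum       = sum(lambda),         # equals the trace of S
  precision = 1 / sum(1 / lambda), # 1 / sum(1 / lambda_i), as defined above
  max       = max(lambda)          # largest eigenvalue, lambda_1
)
eigstats
```

The product equals the determinant, so log(eigstats["product"]) corresponds to the log determinant for that group.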

References

Box, G. E. P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.

Morrison, D. F. (1976). Multivariate Statistical Methods.

See Also

leveneTest carries out homogeneity of variance tests for univariate models with better statistical properties.

plot.boxM, a simple dot plot of the log determinants compared with that of the pooled covariance matrix, and also of other quantities computed from their eigenvalues

covEllipses plots covariance ellipses in variable space for several groups.

Examples

data(iris)

# default method, using `Y`, `group` 
res <- boxM(iris[, 1:4], iris[, "Species"])
res

# summary method gives details
summary(res)

# visualize (this is what is done in the plot method)
dets <- res$logDet
ng <- length(res$logDet)-1
dotchart(dets, xlab = "log determinant")
points(dets, seq_along(dets), cex=c(rep(1.5, ng), 2.5), pch=c(rep(16, ng), 15),
       col= c(rep("blue", ng), "red"))

# plot method gives confidence intervals for logDet
plot(res, gplabel="Species")

# formula method
boxM( cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
      data=iris)

### Skulls data
data(Skulls)

# lm method
skulls.mod <- lm(cbind(mb, bh, bl, nh) ~ epoch, data=Skulls)
skulls.boxM <- boxM(skulls.mod) |>
  print()
summary(skulls.boxM)