boxM: Box's M-test

Description

boxM performs Box's (1949) M-test for homogeneity of covariance matrices obtained from multivariate normal data classified by one or more factors. The test compares a weighted sum of the log determinants of the separate covariance matrices to the log determinant of the pooled covariance matrix, analogous to a likelihood ratio test. The test statistic is referred to a chi-square distribution as an approximation.
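
For reference, the statistic is usually written in the following textbook form. This page does not spell out the formula, so the notation is introduced here as an assumption: \(g\) groups, \(p\) response variables, group sizes \(n_i\) with \(N = \sum n_i\), group covariance matrices \(S_i\), and pooled covariance matrix \(S_p\).

\[
S_p = \frac{1}{N-g}\sum_{i=1}^{g}(n_i-1)\,S_i,
\qquad
M = (N-g)\,\log\det(S_p) \;-\; \sum_{i=1}^{g}(n_i-1)\,\log\det(S_i)
\]

\[
c = \left[\sum_{i=1}^{g}\frac{1}{n_i-1} - \frac{1}{N-g}\right]
    \frac{2p^2+3p-1}{6\,(p+1)\,(g-1)},
\qquad
X^2 = (1-c)\,M \;\approx\; \chi^2 \ \text{with}\ \frac{p\,(p+1)\,(g-1)}{2}\ \text{degrees of freedom}
\]

The weights \(n_i - 1\) are the within-group degrees of freedom, which is why the comparison involves a weighted sum of log determinants and not a simple product or sum.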

Usage

boxM(Y, ...)
# S3 method for formula
boxM(Y, data, ...)
# S3 method for lm
boxM(Y, ...)
# S3 method for default
boxM(Y, group, ...)
# S3 method for boxM
summary(object, 
     digits = getOption("digits"),
     cov=FALSE, quiet=FALSE, ...)

Arguments

Y

The response variable matrix for the default method, or an "mlm" or "formula" object for a multivariate linear model. If Y is a linear-model object or a formula, the variables on the right-hand side of the model must all be factors and must be completely crossed, e.g., A:B.

data

a numeric data.frame or matrix containing n observations of p variables; it is expected that n > p.

group

a factor defining groups, or a vector of length n doing the same.

object

a "boxM" object for the summary method

digits

number of digits to print for the summary method

cov

logical; if TRUE the covariance matrices are printed.

quiet

logical; if TRUE printing from the summary is suppressed

...

Arguments passed down to methods.

Value

A list with class c("htest", "boxM") containing the following components:

statistic

the value of the test statistic, which approximately follows a chi-square distribution.

parameter

the degrees of freedom of the chi-square distribution that the test statistic approximately follows.

p.value

the p-value of the test.

cov

a list containing the within covariance matrix for each level of grouping.

pooled

the pooled covariance matrix.

logDet

a vector containing the natural logarithm of the determinant of each matrix in cov, followed by the value for the pooled covariance matrix

means

a matrix of the means for all groups, followed by the grand means

df

a vector of the degrees of freedom for all groups, followed by that for the pooled covariance matrix

data.name

a character string giving the names of the data.

method

the character string "Box's M-test for Homogeneity of Covariance Matrices".

Details

As an object of class "htest", the statistical test is printed normally by default. As an object of class "boxM", a few methods are available.

There is no general provision as yet for handling missing data. Missing data are simply removed, with a warning.

In addition, the computation assumes that the covariance matrix for each group is non-singular, so that \(\log \det(S_i)\) can be calculated for each group. At a minimum, this requires that \(n > p\) for each group.
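
The computation can be sketched by hand in base R. This is a minimal illustration of the usual textbook form of the statistic, not necessarily the exact code path used inside boxM; all variable names (n_i, df_i, c_corr, and so on) are hypothetical and chosen for this example only.

```r
# Base-R sketch of Box's M on the iris data (illustrative; names are hypothetical)
Y     <- as.matrix(iris[, 1:4])
group <- iris$Species

p    <- ncol(Y)                     # number of response variables
n_i  <- as.vector(table(group))     # group sizes, in factor-level order
g    <- length(n_i)                 # number of groups
df_i <- n_i - 1                     # within-group degrees of freedom

# Each group needs more observations than variables for a non-singular S_i
stopifnot(all(n_i > p))

# Within-group covariance matrices and the df-weighted pooled matrix
covs   <- lapply(split(as.data.frame(Y), group), cov)
pooled <- Reduce(`+`, Map(function(S, d) d * S, covs, df_i)) / sum(df_i)

# Log determinant via determinant(), which avoids overflow in det()
log_det <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
logdets <- vapply(covs, log_det, numeric(1))

# M statistic, Box's scaling correction, and the chi-square approximation
M       <- sum(df_i) * log_det(pooled) - sum(df_i * logdets)
c_corr  <- (sum(1 / df_i) - 1 / sum(df_i)) *
           (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
chisq   <- (1 - c_corr) * M
df      <- p * (p + 1) * (g - 1) / 2
p_value <- pchisq(chisq, df, lower.tail = FALSE)

c(chisq = chisq, df = df, p.value = p_value)
```

The values of chisq, df and p_value should be comparable to the statistic, parameter and p.value components returned by boxM for the same data.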

Box's M test for a multivariate linear model is highly sensitive to departures from multivariate normality, just like the analogous univariate test. It is also adversely affected by unbalanced designs. Some people recommend ignoring the result unless it is very highly significant, e.g., p < .0001 or smaller.

The summary method prints a variety of additional statistics based on the eigenvalues of the covariance matrices. These are returned invisibly, as a list containing the following components:

logDet - log determinants
eigs - eigenvalues of the covariance matrices
eigstats - statistics computed on the eigenvalues for each covariance matrix:
    product - the product of the eigenvalues, \(\prod{\lambda_i}\)
    sum - the sum of the eigenvalues, \(\sum{\lambda_i}\)
    precision - the average precision of the eigenvalues, \(1/\sum(1/\lambda_i)\)
    max - the maximum eigenvalue, \(\lambda_1\)
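
The eigenvalue statistics listed above can be reproduced in base R as a rough sketch. The layout below (one column per group, pooled matrix omitted) is an assumption made for illustration and may differ from what the summary method actually returns.

```r
# Eigenvalue statistics for each group covariance matrix (illustrative layout)
covs <- lapply(split(iris[, 1:4], iris$Species), cov)

# One column of eigenvalues per group, sorted in decreasing order by eigen()
eigs <- sapply(covs, function(S)
  eigen(S, symmetric = TRUE, only.values = TRUE)$values)

eigstats <- rbind(
  product   = apply(eigs, 2, prod),     # equals det(S_i)
  sum       = colSums(eigs),            # equals the trace of S_i
  precision = 1 / colSums(1 / eigs),    # 1 / sum(1 / lambda_i)
  max       = apply(eigs, 2, max)       # largest eigenvalue, lambda_1
)
eigstats
```

Because the product of the eigenvalues equals the determinant, log(eigstats["product", ]) gives the same quantity as the log determinants compared by the test.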

References

Box, G. E. P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.

Morrison, D. F. (1976). Multivariate Statistical Methods.

Examples

# NOT RUN {
library(heplots)  # boxM() and the Skulls data are provided by the heplots package
data(iris)

# default method
res <- boxM(iris[, 1:4], iris[, "Species"])
res

summary(res)

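# visualize (this is what the plot method does)
dets <- res$logDet
ng <- length(dets) - 1   # number of groups; the last element is the pooled matrix
dotchart(dets, xlab = "log determinant")
# overlay larger symbols: blue circles for groups, a red square for the pooled matrix
points(dets, seq_along(dets),
       cex = c(rep(1.5, ng), 2.5),
       pch = c(rep(16, ng), 15),
       col = c(rep("blue", ng), "red"))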

# formula method
boxM(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data = iris)

### Skulls data
data(Skulls)
# lm method
skulls.mod <- lm(cbind(mb, bh, bl, nh) ~ epoch, data=Skulls)
boxM(skulls.mod)



# }
