na.test: Little's Missing Completely at Random (MCAR) Test

Description

This function performs Little's Missing Completely at Random (MCAR) test.

Usage

na.test(x, digits = 2, p.digits = 3, as.na = NULL, check = TRUE, output = TRUE)

Arguments

x

a matrix or data frame with incomplete data, where missing values are coded as NA.

digits

an integer value indicating the number of decimal places to be used for displaying results.

p.digits

an integer value indicating the number of decimal places to be used for displaying the p-value.

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis.

check

logical: if TRUE, argument specification is checked.

output

logical: if TRUE, output is shown.

Value

Returns an object of class misty.object, which is a list with the following entries: function call (call), type of analysis (type), matrix or data frame specified in x (data), specification of function arguments (args), and list with results (result).
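
As a minimal sketch of working with the returned object, assuming the misty package is installed and using only the arguments and list entries documented above, the printed output can be suppressed and the entries inspected directly:

```r
library(misty)

# Run the test without printing, keeping the returned misty.object.
res <- na.test(airquality, output = FALSE)

# Names of the entries in the returned list.
names(res)

# Entry holding the test results.
res$result
```

The internal layout of the result entry is not described on this page, so it is only displayed here rather than indexed further.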

Details

Little (1988) proposed a multivariate test of Missing Completely at Random (MCAR) that tests for mean differences on every variable in the data set across subgroups that share the same missing data pattern. The test compares the observed variable means for each pattern of missing data with the expected population means estimated using the expectation-maximization (EM) algorithm (i.e., EM maximum likelihood estimates). The test statistic is the sum of the squared standardized differences between the subsample means and the expected population means, weighted by the estimated variance-covariance matrix and the number of observations within each subgroup (Enders, 2010). Under the null hypothesis that data are MCAR, the test statistic asymptotically follows a chi-square distribution with \(\sum k_j - k\) degrees of freedom, where \(k_j\) is the number of complete (i.e., observed) variables for missing data pattern \(j\), and \(k\) is the total number of variables. A statistically significant result provides evidence against MCAR.
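
The degrees of freedom described above can be worked out by hand from the missing data patterns. The following base R sketch is illustrative only and is not the code used inside na.test; the object names (observed, pattern, k_j, k, df) are chosen here for the illustration.

```r
# Base R only; airquality ships with R's datasets package.
x <- airquality

# Response indicator matrix: TRUE where a value is observed.
observed <- !is.na(x)

# Encode each row's missing data pattern as a string such as "110111".
pattern <- apply(observed, 1, function(r) paste(as.integer(r), collapse = ""))

# Keep one row per distinct pattern and count its observed variables (k_j).
pattern_rows <- observed[!duplicated(pattern), , drop = FALSE]
k_j <- rowSums(pattern_rows)

# Degrees of freedom: sum of k_j over all patterns minus the number of variables k.
k <- ncol(x)
df <- sum(k_j) - k

length(k_j)  # number of distinct missing data patterns
df           # degrees of freedom of the chi-square reference distribution
```

The sketch assumes that no row is missing on every variable; how such rows are handled is not covered on this page.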

Note that Little's MCAR test has a number of problems (see Enders, 2010). First, the test does not identify the specific variables that violate MCAR, i.e., the test does not identify potential correlates of missingness (auxiliary variables). Second, the test is based on multivariate normality, i.e., under departures from the normality assumption the test might be unreliable unless the sample size is large, and it is not suitable for categorical variables. Third, the test investigates mean differences assuming that the missing data patterns share a common covariance matrix, i.e., the test cannot detect covariance-based deviations from MCAR stemming from a Missing at Random (MAR) or Missing Not at Random (MNAR) mechanism, because MAR and MNAR mechanisms can also produce missing data subgroups with equal means. Fourth, simulation studies suggest that Little's MCAR test suffers from low statistical power, particularly when the number of variables that violate MCAR is small, the relationship between the data and missingness is weak, or the data are MNAR (Thoemmes & Enders, 2007). Fifth, the test can only reject, but cannot prove, the MCAR assumption, i.e., a statistically non-significant result does not prove the null hypothesis that the data are MCAR. Finally, when the null hypothesis is not rejected the data may be MCAR or MNAR, while a statistically significant result indicates that missing data are MAR or MNAR, i.e., MNAR cannot be ruled out regardless of the result of the test.

References

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

Thoemmes, F., & Enders, C. K. (2007, April). A structural equation model for testing whether data are missing completely at random. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.

Little, R. J. A. (1988). A test of Missing Completely at Random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202. https://doi.org/10.2307/2290157

Examples

# NOT RUN {
na.test(airquality)
# }
