Performs the multisample crossmatch (MCM) test (Petrie, 2016).
Petrie(X1, X2, ..., dist.fun = stats::dist, dist.args = NULL, seed = 42)
An object of class htest with the following components:
Observed value of the test statistic
Asymptotic p value
Observed multisample edge-count
The alternative hypothesis
Description of the test
The dataset names
Standard deviation under the null
Expectation under the null
X1: First dataset as matrix or data.frame
X2: Second dataset as matrix or data.frame
...: Optionally more datasets as matrices or data.frames
dist.fun: Function for calculating a distance matrix on the pooled dataset (default: stats::dist, Euclidean distance).
dist.args: Named list of further arguments passed to dist.fun (default: NULL).
seed: Random seed (default: 42)
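As an illustration of these arguments, the following sketch compares three samples under the Manhattan distance by forwarding method = "manhattan" to stats::dist through dist.args. The third dataset X3 and the guard on exists("Petrie") are assumptions made for this example only; the call is skipped when the function or the nbpMatching package is not available.

```r
# Three simulated samples with 100 rows and 10 columns each (illustrative data)
set.seed(1)
X1 <- matrix(rnorm(1000), ncol = 10)
X2 <- matrix(rnorm(1000, mean = 0.5), ncol = 10)
X3 <- matrix(rnorm(1000, sd = 2), ncol = 10)

res <- NULL
# Only run when Petrie() is loaded and nbpMatching is installed
if (exists("Petrie") && requireNamespace("nbpMatching", quietly = TRUE)) {
  # dist.args is passed on to dist.fun, here stats::dist(..., method = "manhattan")
  res <- Petrie(X1, X2, X3, dist.fun = stats::dist,
                dist.args = list(method = "manhattan"), seed = 42)
  print(res)
}
```

Any function that returns a distance matrix on the pooled data could be supplied as dist.fun in the same way.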
Target variable? | Numeric? | Categorical? | K-sample? |
No | Yes | Yes | Yes |
The test is an extension of the Rosenbaum (2005) crossmatch test to multiple samples that uses the crossmatch count of all pairs of samples.
The observed cross-counts are calculated using the functions distancematrix and nonbimatch from the nbpMatching package.
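To make the notion of cross-counts concrete, here is a minimal base-R sketch that tabulates them for a given matching. The helper cross_counts, the toy labels and the hand-written matching are all hypothetical; in the actual test the matching would come from the optimal non-bipartite matching, and summing the counts over all pairs of samples is shown only as one plausible way to summarise them.

```r
# Sample membership of 8 pooled observations (hypothetical toy data)
labels <- c(1, 1, 2, 2, 3, 3, 1, 2)

# A perfect matching on the pooled observations, one matched pair per row
# (written by hand here; not the result of an optimal matching)
pairs <- rbind(c(1, 3), c(2, 7), c(4, 5), c(6, 8))

# Count matched pairs by the samples of their two members.
# Entry [i, j] with i < j holds the cross-count between samples i and j;
# the diagonal holds pairs matched within one sample.
cross_counts <- function(pairs, labels) {
  lab <- as.integer(factor(labels))
  K <- max(lab)
  A <- matrix(0L, K, K)
  for (r in seq_len(nrow(pairs))) {
    a <- lab[pairs[r, 1]]
    b <- lab[pairs[r, 2]]
    i <- min(a, b)
    j <- max(a, b)
    A[i, j] <- A[i, j] + 1L
  }
  A
}

A <- cross_counts(pairs, labels)
# Total number of pairs that join observations from different samples
total_cross <- sum(A[upper.tri(A)])
print(A)
print(total_cross)
```

In this toy matching three of the four pairs connect different samples, so total_cross is 3.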
High values of the multisample crossmatch statistic indicate similarity between the datasets. Thus, the test rejects the null hypothesis of equal distributions for low values of the test statistic.
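The lower-tail rejection rule can be sketched with a normal approximation. This is an illustration under assumptions, not the function's confirmed internals: the numbers are invented, and the use of pnorm presumes that the asymptotic p value is based on a standardised statistic built from the null expectation and null standard deviation.

```r
# Hypothetical values, for illustration only
observed <- 40   # observed multisample cross-count
mu0      <- 50   # expectation under the null
sd0      <- 5    # standard deviation under the null

# Standardise the observed count
z <- (observed - mu0) / sd0

# Few cross-matches speak against equal distributions,
# so the p value is taken from the lower tail
p_lower <- pnorm(z)
print(c(z = z, p = p_lower))
```

An observed count well below its null expectation gives a strongly negative z and therefore a small p value.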
Mukherjee, S., Agarwal, D., Zhang, N. R. and Bhattacharya, B. B. (2022). Distribution-Free Multisample Tests Based on Optimal Matchings With Applications to Single Cell Genomics. Journal of the American Statistical Association, 117(538), 627-638, doi:10.1080/01621459.2020.1791131
Rosenbaum, P. R. (2005). An Exact Distribution-Free Test Comparing Two Multivariate Distributions Based on Adjacency. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 67(4), 515-530.
Petrie, A. (2016). Graph-theoretic multisample tests of equality in distribution for high dimensional data. Computational Statistics & Data Analysis, 96, 145-158, doi:10.1016/j.csda.2015.11.003
Stolte, M., Kappenberg, F., Rahnenführer, J. and Bommert, A. (2024). Methods for quantifying dataset similarity: a review, taxonomy and comparison. Statistics Surveys, 18, 163-298, doi:10.1214/24-SS149
MMCM, Rosenbaum
# Draw some data
X1 <- matrix(rnorm(1000), ncol = 10)
X2 <- matrix(rnorm(1000, mean = 0.5), ncol = 10)
# Perform MCM test
if (requireNamespace("nbpMatching", quietly = TRUE)) {
  Petrie(X1, X2)
}