This package conducts a Direction-Projection-Permutation (DiProPerm) test. DiProPerm is a two-sample hypothesis test for comparing two high-dimensional distributions. The DiProPerm test is exact, i.e., the type I error is guaranteed to be controlled at the nominal level for any sample size. For more details see Wei et al. (2016).
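The three steps in the name (direction, projection, permutation) can be illustrated with a minimal base R sketch. It assumes the mean difference direction and the mean difference statistic, uses plain (not balanced) label shuffling, and takes the p-value as the proportion of permuted statistics at least as large as the observed one. The function diproperm_md_sketch and the simulated data are hypothetical and are not the package's implementation.

```r
# Hypothetical helper: a bare-bones DiProPerm using the mean difference (MD)
# direction and the MD statistic. An illustration, not the package's code.
diproperm_md_sketch <- function(X, y, B = 200) {
  stopifnot(is.matrix(X), length(y) == nrow(X), all(y %in% c(-1, 1)))

  md_stat <- function(X, labels) {
    # Direction: unit vector along the difference of the class means
    w <- colMeans(X[labels == 1, , drop = FALSE]) -
      colMeans(X[labels == -1, , drop = FALSE])
    w <- w / sqrt(sum(w^2))
    # Projection: one score per observation
    scores <- as.vector(X %*% w)
    # Univariate statistic: difference of the projected class means
    mean(scores[labels == 1]) - mean(scores[labels == -1])
  }

  obs <- md_stat(X, y)
  # Permutation: shuffle the labels, then redo direction and projection
  perm <- replicate(B, md_stat(X, sample(y)))

  # Assumed convention: proportion of permuted statistics >= observed
  list(obs_teststat = obs, perm_stats = perm, pvalue = mean(perm >= obs))
}

# Simulated data: class 1 is shifted by 1 in every coordinate
set.seed(1)
n <- 40
p <- 100
y_sim <- rep(c(-1, 1), each = n / 2)
X_sim <- matrix(rnorm(n * p), n, p)
X_sim[y_sim == 1, ] <- X_sim[y_sim == 1, ] + 1

res <- diproperm_md_sketch(X_sim, y_sim, B = 200)
res$pvalue
```

The direction is re-estimated for every permutation, so the permuted statistics reflect how much separation the classifier can find under the null hypothesis.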
DiProPerm(
  X,
  y,
  B = 1000,
  classifier = "dwd",
  univ.stat = "md",
  balance = TRUE,
  alpha = 0.05,
  cores = 2
)
X
An n x p data matrix.
y
A vector of n binary class labels, coded as -1 and 1.
B
The number of permutations for the DiProPerm test. The default is 1000.
classifier
A string designating the binary linear classifier. The default, classifier="dwd", is distance weighted discrimination (DWD), which fits a generalized DWD model using the genDWD function in the DWDLargeR package. The penalty parameter, C, in the genDWD function is calculated using the penaltyParameter function in DWDLargeR. The genDWD and penaltyParameter functions have several arguments, which are set to recommended default values. More details on the algorithm used to calculate the DWD solution can be found in Lam et al. (2018). Other options for the binary classifier are "md", the mean difference direction, and "svm", a support vector machine. The "svm" option uses the default implementation from the svm function.
univ.stat
A string indicating the univariate statistic used for the projection step. The default, univ.stat="md", is the mean difference.
balance
A logical indicating whether a balanced permutation design should be used. The default is TRUE.
alpha
A number indicating the level of significance. The default is 0.05.
cores
An integer indicating the number of cores to use for parallel processing. The default is 2. Parallel processing is currently available only on macOS and Ubuntu; on Windows, 1 core is used.
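As a small sketch of preparing inputs in the form described above, the following base R code turns a two-level factor into labels coded -1 and 1 and a numeric data frame into an n x p matrix. The objects groups and df are hypothetical stand-ins for real data, and the choice of which level becomes -1 is arbitrary.

```r
# Hypothetical raw data: a two-level factor and an all-numeric data frame
set.seed(2)
groups <- factor(rep(c("control", "treated"), times = c(6, 4)))
df <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))

# Code the first factor level as -1 and the second as 1
y <- ifelse(groups == levels(groups)[1], -1, 1)

# as.matrix() on an all-numeric data frame gives a numeric n x p matrix
X <- as.matrix(df)

# Basic checks: one label per row, and only -1 and 1 as labels
stopifnot(nrow(X) == length(y), all(y %in% c(-1, 1)))
table(y)
```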
A list containing:
X
The observed n x p data matrix.
y
The observed vector of n binary class labels, coded as -1 and 1.
obs_teststat
The observed univariate test statistic.
xw
Projection scores used to compute the specified univariate statistic.
w
The loadings of the binary classification.
Z
The Z score of the observed test statistic.
cutoff_value
The cutoff value to achieve an alpha level of significance.
pvalue
The p-value from the permutation test.
perm_dist
A list containing the permuted projection scores and permuted class labels for each permutation.
perm_stats
A B-dimensional vector of the permuted univariate test statistics.
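To show how summaries such as Z, cutoff_value, and pvalue can be derived from a permutation distribution, here is a short base R sketch. The formulas are common conventions and an assumption about the package, not a statement of its internals, and the numbers are simulated stand-ins rather than output from DiProPerm.

```r
# Simulated stand-ins for the perm_stats and obs_teststat components
set.seed(3)
perm_stats <- rnorm(1000, mean = 2, sd = 0.5)
obs_teststat <- 3.5
alpha <- 0.05

# Z score: observed statistic standardized by the permutation distribution
Z <- (obs_teststat - mean(perm_stats)) / sd(perm_stats)

# Cutoff: upper (1 - alpha) quantile of the permuted statistics
cutoff_value <- unname(quantile(perm_stats, probs = 1 - alpha))

# Empirical p-value: share of permuted statistics >= the observed one
pvalue <- mean(perm_stats >= obs_teststat)

c(Z = Z, cutoff_value = cutoff_value, pvalue = pvalue)
```

Under these conventions, the test rejects at level alpha when the observed statistic exceeds the cutoff.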
Lam, X. Y., Marron, J. S., Sun, D., & Toh, K.-C. (2018). Fast Algorithms for Large-Scale Generalized Distance Weighted Discrimination. Journal of Computational and Graphical Statistics, 27(2), 368–379. doi:10.1080/10618600.2017.1366915
Wei, S., Lee, C., Wichers, L., & Marron, J. S. (2016). Direction-Projection-Permutation for High-Dimensional Hypothesis Tests. Journal of Computational and Graphical Statistics, 25(2), 549–569. doi:10.1080/10618600.2015.1027773
# NOT RUN {
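# Load the example data set
data(mushrooms)
# Transpose mushrooms$X before passing it as the data matrix
X <- Matrix::t(mushrooms$X)
y <- mushrooms$y
# Run the test with a small number of permutations to keep the example fast
dpp <- DiProPerm(X = X, y = y, B = 10)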
# }