A data frame with 42 observations on the following 30 variables, which include information on whether or not each method fulfills the theoretical criteria of Stolte et al. (2024).
Some criteria are only fulfilled for certain parameter choices of the method ("Conditionally Fulfilled") or do not apply to the method.
NA values mean that there is no information available on whether or not the respective criterion is fulfilled.
Method
a character vector giving the reference or method name
Implementation
a character vector giving the function name of the implementation in the DataSimilarity package
Target.Inclusion
a character vector. Can the method handle datasets that include a target variable in a meaningful way?
Numeric
a character vector. Can the method handle numeric data?
Categorical
a character vector. Can the method handle categorical data?
Unequal.Sample.Sizes
a character vector. Can the method handle datasets of different sample sizes?
p.Larger.N
a character vector. Can the method handle datasets with more variables than observations?
Multiple.Samples
a character vector. Can the method handle \(k > 2\) datasets simultaneously?
Without.training
a character vector. Does the method work without holding out training data?
No.assumptions
a character vector. Does the method work without further assumptions?
No.parameters
a character vector. Does the method work without the specification or tuning of additional parameters?
Implemented
a character vector. Is the method implemented elsewhere? (NA if no other implementations are known)
Complexity
a character vector giving the computational complexity of the method.
Interpretable.units
a character vector. Can a one-unit increase of the output value be interpreted?
Lower.bound
a character vector. Are the output values bounded from below? If known, the lower bound is given.
Upper.bound
a character vector. Are the output values bounded from above? If known, the upper bound is given.
Rotation.invariant
a character vector. Is the method invariant to rotation of all datasets?
Location.change.invariant
a character vector. Is the method invariant to shifting all datasets?
Homogeneous.scale.invariant
a character vector. Is the method invariant to scaling all datasets?
Positive.definite
a character vector. Is the method positive definite, i.e. \(d(F_1, F_2) \ge 0\) and \(d(F_1, F_2) = 0 \Leftrightarrow F_1 = F_2\) for any two distributions \(F_1, F_2\)?
Symmetric
a character vector. Is the method symmetric, i.e. \(d(F_1, F_2) = d(F_2, F_1)\) for any two distributions \(F_1, F_2\)?
Triangle.inequality
a character vector. Does the method fulfill the triangle inequality, i.e. \(d(F_1, F_2) \le d(F_1, F_3) + d(F_3, F_2)\) for any three distributions \(F_1, F_2, F_3\)?
Consistency.N
a character vector. Is the corresponding test consistent for \(N\to\infty\)?
Consistency.p
a character vector. Is the corresponding test consistent for \(p\to\infty\)?
Number.Fulfilled
a numeric vector. Number of fulfilled criteria.
Number.Cond.Fulfilled
a numeric vector. Number of conditionally fulfilled criteria.
Number.Unfulfilled
a numeric vector. Number of unfulfilled criteria.
Number.NA
a numeric vector. Number of criteria for which it is unknown if they are fulfilled.
Class
a character vector. Class of the taxonomy of Stolte et al. (2024) that the method is assigned to based on its underlying idea.
Subclass
a character vector. Subclass of the taxonomy of Stolte et al. (2024) that the method is assigned to based on its underlying idea.
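As a rough illustration of how a table with this layout might be queried, the following sketch builds a small toy data frame with a few of the columns described above and derives per-method counts from them. The three rows, the value labels "Yes" and "No", and the helper count_value() are assumptions made up for this example; only "Conditionally Fulfilled" and NA are codings named in the description, and the real data may count a different set of criteria columns.

```r
# Toy stand-in for the documented data frame. The rows and the labels
# "Yes"/"No" are invented for illustration; they are NOT taken from the data.
toy <- data.frame(
  Method      = c("Method A", "Method B", "Method C"),
  Numeric     = c("Yes", "Yes", "No"),
  Categorical = c("Conditionally Fulfilled", "No", "Yes"),
  Symmetric   = c("Yes", NA, "Yes"),
  stringsAsFactors = FALSE
)

# Criteria columns considered in this toy example (hypothetical selection)
criteria <- c("Numeric", "Categorical", "Symmetric")

# Hypothetical helper: per row, count how many criteria equal `value`.
# NA entries are ignored by the comparison and counted separately below.
count_value <- function(df, cols, value) {
  unname(rowSums(df[, cols, drop = FALSE] == value, na.rm = TRUE))
}

toy$Number.Fulfilled      <- count_value(toy, criteria, "Yes")
toy$Number.Cond.Fulfilled <- count_value(toy, criteria, "Conditionally Fulfilled")
toy$Number.NA             <- unname(rowSums(is.na(toy[, criteria, drop = FALSE])))

# Methods that handle numeric data and have no criteria with unknown status
selected <- subset(toy, Numeric == "Yes" & Number.NA == 0,
                   select = c(Method, Number.Fulfilled))
print(selected)
```

The same subset() pattern should carry over to the real data frame once the actual value labels in its criteria columns have been checked, for example with table() on a single column.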