The median difference is defined as follows:
\(Z\) is the combined \(X\) and \(Y\) values into a single vector
or matrix.
Number of columns is the dimension, and these need to be equal
for \(X\) and \(Y\). Then, if the Laplacian kernel is used,
$$ m = \textnormal{median} \{ || x_i - x_j ||_1; \,\, i>j, \,\,
i=1, 2,\dots, n+m,\,\,\textnormal{ and } j=1, 2,\dots, i-1 \}, $$
where \( || z_i - z_j ||_1\) is the 1-norm, and so if the data
are d
-dimensional then
$$ || z_i - z_j ||_1 = \sum_{k=1}^{d} |z_{i,k} - z_{j,k}|. $$
If the Gaussian kernel is specified, then the square of the two-norm is
used.
The median heuristic is defined as beta = 1/m
.
Naive method will compute all distinct pairs, of which there are
\(N(N+1)/2\) differences. These are then sorted using
a \(O(N \log N)\) algorithm, so overall \(O(N^2 \log N)\).
The fast method is \(O(N \log N)\) is from Croux and Rousseeuw (1992),
which is based on Johnson and Mizoguchi (1978).