ojaMedian(X, alg = "evolutionary", sp = 1, na.action = na.fail,
control = ojaMedianControl(...), ...)
ojaMedianEvo(X, control = ojaMedianControl(...), ...)
ojaMedianGrid(X, control = ojaMedianControl(...), ...)
ojaMedianEx(X, control = ojaMedianControl(...), ...)
"exact", "evolutionary" and "grid". Default is "evolutionary". See Details.
ojaMedianControl and see its help page.
control.sigmaAdaption to 1. As a second possibility you could limit the number of subsets used to a small number. If you use all subsets, there are $\binom{n}{k}$ of them in total, where $n$ is the number of data points and $k$ is the dimension.
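To get a feeling for how quickly the number of subsets grows, the count can be computed with base R's `choose()`. This is only an illustrative sketch; the helper name `nSubsets` is hypothetical and not part of the package.

```r
# Number of k-subsets of n data points, i.e. "n choose k".
# nSubsets is a hypothetical helper, not a function of the package.
nSubsets <- function(n, k) choose(n, k)

nSubsets(100, 2)     # 4950
nSubsets(1000, 3)    # 166167000
nSubsets(1e6, 10)    # far too many to enumerate, so only a sample of subsets is feasible
```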
If you are interested in a precise solution, the following options have turned out to be useful:
initialSigma: 0.5, sigmaAdaptation: 20, adaptationFactor: 0.5, sigmaLog20Decrease: 10.
Tests have been made in the bivariate case, but these values should work for every dimension.
In the bivariate case it is possible to calculate the Oja median for more than $22 \times 10^6$ data points. In the 10-dimensional case the algorithm is still able to calculate an approximate solution
for $10^6$ data points.
Before the algorithm itself starts, we transform the data with ICS in order to get a more stable version (with respect to the location of the data) and to improve the quality of the approximation.
A further reason for this is to make the approximation affine invariant.
The third algorithm calculates the Oja median by means of a grid. The grid points are possible approximations of the Oja median. Every grid point is tested as a candidate for
the Oja median. If the test results are not unique, the algorithm takes a bigger sample of subsets into account and tests again. In comparison to the evolutionary algorithm it is
slower and less precise, so it might be useful only in special data situations. The algorithm constitutes an earlier heuristic solution to the Oja median problem and is included mainly for historical reasons.
The exact algorithm and the grid algorithm are also described in Ronkainen et al. (2002).
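The idea of treating grid points as candidate solutions can be illustrated with a naive brute-force sketch in the bivariate case. It is not the package's grid algorithm (which tests grid points on samples of subsets); it simply evaluates the Oja objective, the sum of the areas of all triangles formed by a candidate point and every pair of data points, at each grid point and keeps the smallest. The function name `ojaObjective2D`, the simulated data and the grid size are assumptions made for this illustration.

```r
# Oja objective in 2D: sum of the areas of the triangles spanned by the
# candidate point `theta` and every pair of data points (hypothetical helper).
ojaObjective2D <- function(theta, X) {
  pairs <- combn(nrow(X), 2)               # 2 x choose(n, 2) matrix of row indices
  a <- X[pairs[1, ], , drop = FALSE]
  b <- X[pairs[2, ], , drop = FALSE]
  # Triangle area = |det(a - theta, b - theta)| / 2
  det2 <- (a[, 1] - theta[1]) * (b[, 2] - theta[2]) -
          (a[, 2] - theta[2]) * (b[, 1] - theta[1])
  sum(abs(det2)) / 2
}

set.seed(1)
X <- matrix(rnorm(40), ncol = 2)           # 20 simulated bivariate observations

# Candidate points: a regular 25 x 25 grid over the range of the data.
gx <- seq(min(X[, 1]), max(X[, 1]), length.out = 25)
gy <- seq(min(X[, 2]), max(X[, 2]), length.out = 25)
grid <- as.matrix(expand.grid(gx, gy))

# Evaluate the objective at every grid point and keep the minimiser.
values <- apply(grid, 1, ojaObjective2D, X = X)
best <- grid[which.min(values), ]
best
```

The result is only as precise as the grid spacing, and the cost grows with the number of pairs, which is one way to see why a grid-based approach tends to be slow and imprecise.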
A lot of calculation time in the ojaMedian function might be spent on checking the input and transforming it. So if you do time-critical calculations, e.g. in loops, you might want to use the variants ojaMedianEx,
ojaMedianEvo or ojaMedianGrid. Please use these only if you know what you are doing, because there are no checks, just the .Call to the algorithm itself.
If the dimension of your data is too big or if there are too many observations, it is possible that the exact algorithm will crash R. On a common PC with a 32-bit operating system the following
combinations of dimension:amount will work fine: 2:1200, 3:300, 4:100, 5:63, 6:38, 7:24. Bigger datasets might be possible, depending on your system.
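A small guard can make it harder to request the exact algorithm on a dataset that is too large. The following is a minimal sketch that only encodes the limits listed above; the names `exactLimits` and `withinExactLimits` are hypothetical, and the limits themselves depend on your system.

```r
# Limits quoted above for a common 32-bit PC: dimension -> maximum number of observations.
exactLimits <- c("2" = 1200, "3" = 300, "4" = 100, "5" = 63, "6" = 38, "7" = 24)

# Hypothetical helper: TRUE if the data fall within the quoted limits.
withinExactLimits <- function(X, limits = exactLimits) {
  X <- as.matrix(X)
  k <- as.character(ncol(X))
  if (!k %in% names(limits)) return(FALSE)  # dimension not covered by the list
  nrow(X) <= limits[[k]]
}

withinExactLimits(matrix(0, nrow = 100, ncol = 2))   # TRUE
withinExactLimits(matrix(0, nrow = 301, ncol = 3))   # FALSE
```

One possible use is to pick the algorithm conservatively, e.g. `alg <- if (withinExactLimits(X)) "exact" else "evolutionary"`.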
There is a demo available which graphically demonstrates the Oja median in simple data situations in the bivariate case. To view the demo run demo(ojaMedianDemo).
data(biochem)
X <- as.matrix(biochem[,1:2])
ojaMedian(X)
ojaMedian(X, alg = "grid")
ojaMedian(X, alg = "exact")