The Dimodal package uses the spacing of data, the difference between order statistics, to detect and locate modes and the transitions between them. Within a mode the spacing is consistent, with stable values, while it increases at the anti-modes. The package contains parametric and non-parametric tests of features detected within the spacing, and can use any changepoint detectors installed on the system as a separate check.
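The spacing idea can be illustrated with base R alone. The following is a minimal sketch, not the package's own procedure: it assumes a synthetic two-mode sample (the mixture parameters and window widths are arbitrary choices for demonstration) and takes the simplest spacing, the difference between consecutive order statistics.

```r
# Synthetic bimodal sample; the mixture parameters are illustrative only.
set.seed(1)
x <- c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 5, sd = 1))

# Spacing: difference between consecutive order statistics.
xs <- sort(x)
spacing <- diff(xs)

# Midpoint of each interval, to place the spacing on the data axis.
mid <- (xs[-1] + xs[-length(xs)]) / 2

# Compare the typical spacing near the two modes with the spacing
# in the anti-mode between them.
in.mode <- abs(mid - 0) < 0.5 | abs(mid - 5) < 0.5
in.gap  <- mid > 2 & mid < 3
c(mode = median(spacing[in.mode]), gap = median(spacing[in.gap]))
```

The median spacing in the gap should come out noticeably larger than near the modes, which is the behavior the feature detectors look for.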
The package has three top-level commands.
Dimodal runs the modality analysis. It supports print,
summary, and plot methods.
Diopt provides a persistent database of options
controlling the feature detectors, tests, and display of results.
Ditrack displays the position and probability of features
as filter sizes change, to help in selecting the best size for analysis.
It has a plot method to graphically show the results.
The package has five feature detectors. They work with any data, not just the spacing.
find.runs identifies fuzzy runs, sequences of nearly-equal
numeric values or of equal discrete values or symbols.
find.peaks identifies local extrema, merging minor
peaks into larger ones.
find.flats identifies flat or consistent stretches of
values.
find.cpt is a majority voting scheme using external
changepoint algorithms to identify a common set of points where the behavior
of data changes.
find.level.sections is an inversion of the modehunt
changepoint detector and can be added to the voting list.
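For orientation, local extrema of the kind find.peaks reports can be located in base R from sign changes in the first difference. This is only a sketch of the basic idea on a made-up vector: it does not call find.peaks, does not merge minor peaks, and ignores extrema that fall on a plateau of equal values.

```r
# Illustrative data; any numeric vector would do.
x <- c(1, 3, 2, 5, 4, 4, 6, 1)

# Sign of the slope between neighboring points.
s <- sign(diff(x))

# A local maximum is a rise followed directly by a fall (+1 then -1),
# a local minimum a fall followed directly by a rise (-1 then +1).
# Plateaus (slope 0) are not handled by this simple rule.
maxima <- which(diff(s) == -2) + 1
minima <- which(diff(s) ==  2) + 1

maxima   # indices 2, 4, 7
minima   # index 3
```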
Dimodal includes three groups of tests to evaluate features.
Dipeak.test and Diflat.test are parametric
models of the peak and flat distributions after low-pass filtering.
Dipeak.critval and Diflat.critval provide
critical values of the peak (height) and flat (length) for a significance
level.
Dinrun.test and Dirunlen.test are runs-based
tests (up, down, equal trends) performed on the signed difference of a
signal. They include the Kaplansky-Riordan test of the number of runs and
a Markov chain model for the longest run.
Dipermht.test and Diexcurht.test are
bootstrap tests simulating a feature from the actual data. They include a
permutation test of runs and a general excursion test from the difference
of a signal.
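The quantities behind the runs-based tests can be sketched in base R. The example below is an illustration under assumed data, not the package's implementation: it forms the signed difference of a signal, run-length encodes it, and reads off the number of runs and the longest run. The Kaplansky-Riordan and Markov chain calculations that turn these statistics into probabilities are not reproduced here.

```r
# Illustrative signal; the values are made up.
x <- c(1, 2, 4, 4, 3, 1, 2, 5, 7, 7, 7, 6)

# Signed difference: +1 for a rise, -1 for a fall, 0 for no change.
s <- sign(diff(x))

# Run-length encode the signs to get the up, down, and equal runs.
r <- rle(s)

n.runs  <- length(r$lengths)   # number of runs: 6
longest <- max(r$lengths)      # length of the longest run: 3
```

Here the sign sequence is 1, 1, 0, -1, -1, 1, 1, 1, 0, 0, -1, giving six runs, the longest being the three consecutive rises.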
The kirkwood dataset contains the multi-modal distribution of
asteroid orbital radii, where modes identify families of asteroids within
the main belt and anti-modes the Kirkwood gaps cleared out by regular
perturbation from Jupiter.
The return value of each command, feature detector, and test is given an S3 class that supports printing, summarizing, and in some cases plotting. Links can be found in the return value section of each command.
The package includes several functions to help work with the results of the analysis.
midquantile uses piecewise linear segments to convert
quantiles of discrete or heavily quantized data back to data.
runs.as.rle converts the find.runs result
to the "rle" class.
select.peaks returns just the local maxima from the
extrema detected by find.peaks.
center.diw shifts the indices of features in the interval
spacing, normally located at the end of the interval, into the center, to
align with the low-pass features and actual data.
match.features uses distance and overlap criteria to
identify features found in both the low-pass and interval spacing.
shiftID.place moves the results of
find.peaks, find.flats, and
find.cpt to the original data grid and converts any indices
into raw values using the midquantile approximation.
Greg Kreider.
The package compiles by default with the PCG random number generator, written by Melissa O'Neill, for sampling during the excursion tests.
The kirkwood dataset is taken from the Lowell Observatory asteroid
ephemeris.