find.runs: Fuzzy run detector

Description

A run-length detector that treats nearly equal real values as matching and skips invalid values.

Usage

find.runs(x, feps)

Value

find.runs returns a list with elements:

runs: an integer vector of the same length as x, giving the length of the run starting at each point, or 0 if the point does not start a run
nskip: an integer vector of the same length as x, giving the number of skipped data points within the run
stats: a vector of three integer values: nrun, the total number of runs; maxrun, the length of the longest run; and nx, the length of the original data and of runs and nskip

Arguments

x: a vector
feps: relative difference, as a fraction strictly between 0 and 1, below which values are treated as the same

Details

This runs finder looks for sequences of real values that almost match, considering them as a run of same values. Two points match if the difference between their values as a fraction of their average is less than the threshold: |x[i] - x[j]| / ((|x[i]| + |x[j]|)/2) < feps. For each point in x the detector scans forward until a point fails to be close, with the run taken over the matching interval. The base of the comparison is always the first point in the run, and not a chain of adjacent values. The detector treats infinities as equal to each other but not to finite values and skips over any NA or NaN values while counting them. Values within double precision tolerance of each other always match, irrespective of the relative threshold.
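
The matching rule for two finite values can be sketched in a few lines of R. This is only an illustration of the formula above, not the package's C implementation: rel_match is a hypothetical name, and the sketch leaves out the handling of infinities, NA/NaN values, and the double precision tolerance, treating only exactly equal values as an automatic match.

```r
# Hypothetical helper, not part of Dimodal: the relative-difference test
# for two finite values, following the formula in the text.
rel_match <- function(a, b, feps) {
  # Identical values always match; this also avoids 0/0 when both are zero.
  if (a == b) return(TRUE)
  abs(a - b) / ((abs(a) + abs(b)) / 2) < feps
}

feps <- 0.05
rel_match(100, 104, feps)   # relative difference is about 0.039
rel_match(100, 106, feps)   # relative difference is about 0.058

# The base of the comparison is the first point of the run, not the
# previous point: 104 and 108 are close to each other, but 108 is
# compared against 100.
rel_match(104, 108, feps)   # adjacent values, about 0.038
rel_match(100, 108, feps)   # against the base, about 0.077
```

With these values a run based at 100 would cover 100 and 104 but stop before 108, even though each adjacent pair is within the threshold.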

Note that this relative difference depends on the data not averaging close to zero, especially at zero-crossing points, and is sensitive to a constant shift. The first is not a problem for spacing, which is always positive, but may require shifting other types of data. That would also help with the second issue.
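
A small worked example may make the sensitivity to zero crossings and constant shifts concrete. The helper rel_diff is a hypothetical name that simply evaluates the ratio from the formula above; the data values and the shift of 10 are chosen only for illustration.

```r
# Hypothetical helper: the relative difference used in the matching test.
rel_diff <- function(a, b) abs(a - b) / ((abs(a) + abs(b)) / 2)

# Two values straddling zero differ by 0.2 in absolute terms, but the
# ratio reaches 2, the largest value it can take.
rel_diff(0.1, -0.1)

# Shifting both values by a constant 10 keeps the absolute difference
# but reduces the ratio to about 0.02.
shift <- 10
rel_diff(0.1 + shift, -0.1 + shift)
```

The same pair of points would therefore fail to match at any feps below 1 before the shift and match at feps = 0.05 after it.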

For integer, logical, and character data, values must match exactly and feps is not used.

Other data types are not supported.

Although not exported from the Dimodal package, this detector may be useful outside the spacing analysis for any signal. Within Dimodal it is called by the run count and length tests and internally within the C code.

The detector returns two vectors, each with the same length as the original data. runs stores the length of the run starting at each data point and nskip the number of skipped elements within that run. To process the runs, start at index i=runs[1L], or if that is zero i=nskip[1L]. The next run starts at i + runs[i] + nskip[i]; continue until i goes beyond the length of the data.

The detector is O(n) in both time and memory, albeit with several passes through the data.
