# Ckmeans.1d.dp v4.3.0

Monthly downloads

## Optimal, Fast, and Reproducible Univariate Clustering

Fast, optimal, and reproducible weighted univariate
clustering by dynamic programming. Four types of problem including
univariate k-means, k-median, k-segments, and multi-channel
weighted k-means are solved with guaranteed optimality and
reproducibility. The core algorithm minimizes the sum of (weighted)
within-cluster distances using respective metrics. Its advantage
over heuristic clustering in efficiency and accuracy is pronounced
at a large number of clusters k. Weighted k-means can also process
time series to perform peak calling. Multi-channel weighted k-means
groups multiple univariate signals into k clusters. An auxiliary
function generates histograms that are adaptive to patterns in data.
This package provides a powerful set of tools for univariate data
analysis with guaranteed optimality, efficiency, and reproducibility.

## Readme

# README

The **Ckmeans.1d.dp** package provides fast, optimal, and reproducible univariate clustering by dynamic programming. It is practical to cluster millions of sample points with optional weights in seconds on a typical desktop computer using a single processor core.

Four types of problem including univariate $k$-means, $k$-median, $k$-segments, and multi-channel weighted $k$-means are solved with guaranteed optimality and reproducibility. The core algorithm minimizes the (weighted) sum of within-cluster distances using respective metrics. Its advantage over heuristic clustering in efficiency and accuracy is increasingly pronounced as the number of clusters $k$ increases. Weighted $k$-means can also optimally segment time series to perform peak calling. An auxiliary function generates histograms that are adaptive to patterns in data. This package provides a powerful set of tools for univariate data analysis with guaranteed optimality, efficiency, and reproducibility.

The Ckmeans.1d.dp algorithm clusters (weighted) univariate data given by a numeric vector $x$ into $k$ groups by dynamic programming (Wang and Song, 2011). It guarantees the optimality of clustering---the total of within-cluster sums of squares is always the minimum given the number of clusters $k$. In contrast, heuristic univariate clustering algorithms may be non-optimal or inconsistent from run to run. As unequal non-negative weights are supported for each point, the algorithm can also segment a time course using the time points as input and the values at each time point as weight. Utilizing the optimal clusters, a function can generate histograms adaptive to patterns in data.

Apart from the time for sorting $x$, the default weighted univariate clustering algorithm takes a runtime of $O(kn)$, linear in both sample size $n$ and the number of clusters $k$, using a new divide-and-conquer strategy integrating a previous theoretical result on matrix search (Aggarwal et al., 1987) and a novel in-place search space reduction method. The space complexity is $O(kn)$. This method is numerically stable.

This package provides a powerful alternative to heuristic clustering and also new functionality for weighted clustering, segmentation, and peak calling with guaranteed optimality.

## Functions in Ckmeans.1d.dp

Name | Description | |

Univariate Clustering | Optimal (Weighted) Univariate Clustering | |

plot.Ckmeans.1d.dp | Plot Optimal Univariate Clustering Results | |

MultiChannel.WUC | Optimal Multi-channel Weighted Univariate Clustering | |

plotBIC | Plot Bayesian Information Criterion as a Function of Number of Clusters | |

Univariate Segmentation | Optimal Univariate Segmentation | |

Ckmeans.1d.dp-package | Optimal, Fast, and Reproducible Univariate Clustering | |

plot.Cksegs.1d.dp | Plot Optimal Univariate Segmentation Results | |

ahist | Adaptive Histograms | |

print.Ckmeans.1d.dp | Print Optimal Univariate Clustering Results | |

print.Cksegs.1d.dp | Print Optimal Univariate Segmentation Results | |

No Results! |

## Vignettes of Ckmeans.1d.dp

Name | ||

Ckmeans.1d.dp.Rmd | ||

Weights.Rmd | ||

ahist.Rmd | ||

No Results! |

## Last month downloads

## Details

Type | Package |

Date | 2019-09-06 |

License | LGPL (>= 3) |

Encoding | UTF-8 |

LazyData | true |

LinkingTo | Rcpp |

NeedsCompilation | yes |

RdMacros | Rdpack |

VignetteBuilder | knitr |

Packaged | 2019-09-07 03:03:25 UTC; joesong |

Repository | CRAN |

Date/Publication | 2019-09-07 11:20:07 UTC |

#### Include our badge in your README

```
[![Rdoc](http://www.rdocumentation.org/badges/version/Ckmeans.1d.dp)](http://www.rdocumentation.org/packages/Ckmeans.1d.dp)
```