# Ckmeans.1d.dp v4.3.3

Monthly downloads

## Optimal, Fast, and Reproducible Univariate Clustering

Fast, optimal, and reproducible weighted univariate
clustering by dynamic programming. Four problem are solved, including
univariate k-means (Wang & Song 2011) <doi:10.32614/RJ-2011-015>
(Song & Zhong 2020) <doi:10.1093/bioinformatics/btaa613>, k-median,
k-segments, and multi-channel weighted k-means. Dynamic programming
is used to minimize the sum of (weighted) within-cluster distances
using respective metrics. Its advantage over heuristic clustering in
efficiency and accuracy is pronounced at a large number of clusters.
Weighted k-means can also process time series to perform peak
calling. Multi-channel weighted k-means groups multiple univariate
signals into k clusters. An auxiliary function generates histograms
adaptive to patterns in data. This package provides a powerful set
of tools for univariate data analysis with guaranteed optimality,
efficiency, and reproducibility.

## Readme

# The 'Ckmeans.1d.dp' R package

### Overview

The package provides a powerful set of tools for fast, optimal, and reproducible univariate clustering by dynamic programming. It is practical to cluster millions of sample points into a few clusters in seconds using a single core on a typical desktop computer. It solves four types of problem, including univariate $k$-means, $k$-median, $k$-segments, and multi-channel weighted $k$-means. Dynamic programming is used to minimize the (weighted) sum of within-cluster distances using respective metrics. Its advantage over heuristic clustering in efficiency and accuracy is increasingly pronounced as the number of clusters $k$ increases. Weighted $k$-means can also optimally segment time series to perform peak calling. An auxiliary function generates histograms that are adaptive to patterns in data. The package is used to identify dysregulated genomic zones in human cancers (Song and Zhong, 2020) <10.1093 bioinformatics="" btaa613="">.

### The main method

The Ckmeans.1d.dp algorithm clusters (weighted) univariate data given by a numeric vector $x$ into $k$ groups by dynamic programming (Wang and Song, 2011)

Excluding the time for sorting $x$, the default weighted univariate clustering algorithm takes a runtime of $O(kn)$ (Song and Zhong, 2020) <10.1093 bioinformatics="" btaa613="">, linear in both sample size $n$ and the number of clusters $k$, using a new divide-and-conquer strategy based on a theoretical result on matrix search (Aggarwal et al., 1987)

### When to use the package

This package provides a powerful alternative to heuristic clustering and also new functionality for weighted clustering, segmentation, and peak calling with guaranteed optimality.

### To download and install the package

```
install.packages("Ckmeans.1d.dp")
```

## Functions in Ckmeans.1d.dp

Name | Description | |

print.Cksegs.1d.dp | Print Optimal Univariate Segmentation Results | |

ahist | Adaptive Histograms | |

Univariate Clustering | Optimal (Weighted) Univariate Clustering | |

print.Ckmeans.1d.dp | Print Optimal Univariate Clustering Results | |

plotBIC | Plot Bayesian Information Criterion as a Function of Number of Clusters | |

plot.Ckmeans.1d.dp | Plot Optimal Univariate Clustering Results | |

plot.Cksegs.1d.dp | Plot Optimal Univariate Segmentation Results | |

Ckmeans.1d.dp-package | Optimal, Fast, and Reproducible Univariate Clustering | |

Univariate Segmentation | Optimal Univariate Segmentation | |

MultiChannel.WUC | Optimal Multi-channel Weighted Univariate Clustering | |

No Results! |

## Vignettes of Ckmeans.1d.dp

Name | ||

Ckmeans.1d.dp.Rmd | ||

Weights.Rmd | ||

ahist.Rmd | ||

No Results! |

## Last month downloads

## Details

Type | Package |

Date | 2020-07-21 |

License | LGPL (>= 3) |

Encoding | UTF-8 |

LazyData | true |

LinkingTo | Rcpp |

NeedsCompilation | yes |

RdMacros | Rdpack |

VignetteBuilder | knitr |

Packaged | 2020-07-22 03:03:50 UTC; joesong |

Repository | CRAN |

Date/Publication | 2020-07-22 13:42:15 UTC |

suggests | knitr , rmarkdown , testthat |

imports | Rcpp , Rdpack (>= 0.6-1) |

Contributors | Hua Zhong, Haizhou Wang |

#### Include our badge in your README

```
[![Rdoc](http://www.rdocumentation.org/badges/version/Ckmeans.1d.dp)](http://www.rdocumentation.org/packages/Ckmeans.1d.dp)
```