discretization()
supports several rules for choosing the number of bins. The supported rules are listed below, where n is the number of observations in x and IQR() stands for the interquartile range.
fd
stands for the Freedman-Diaconis rule. The number of bins is given by $$\frac{\mathrm{range}(x) \cdot n^{1/3}}{2 \cdot \mathrm{IQR}(x)}$$ The Freedman-Diaconis rule is known to be less sensitive to outliers than Scott's rule.
doane
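stands for Doane's rule. The number of bins is given by
$$1 + \log_{2}(n) + \log_{2}\left(1 + \frac{|g|}{\sigma_{g}}\right)$$
where g is the estimated skewness of x and $$\sigma_{g} = \sqrt{\frac{6(n-2)}{(n+1)(n+3)}}$$
Doane's rule is a modification of Sturges' formula that attempts to improve its performance with non-normal data.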
sqrt
The number of bins is given by:
$$\sqrt{n}$$
cencov
stands for Cencov's rule. The number of bins is given by:
$$n^{1/3}$$
rice
stands for the Rice rule. The number of bins is given by:
$$2 n^{1/3}$$
terrell-scott
stands for Terrell-Scott's rule. The number of bins is given by: $$(2 n)^{1/3}$$
The Cencov, Rice and Terrell-Scott rules are known to over-estimate the number of bins k compared to the other rules, because of their simplicity: they depend only on n and ignore the spread and shape of the data.
sturges
stands for Sturges's rule. The number of bins is given by: $$1 + \log_2(n)$$
scott
stands for Scott's rule. The number of bins is given by:
$$\frac{\mathrm{range}(x)}{\sigma(x) \cdot n^{-1/3}}$$
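
The formulas listed above can be compared side by side with a short, independent sketch. It is written in Python for illustration only and is not the implementation behind discretization(). The function name `suggested_bins` is hypothetical. The sketch makes several assumptions: quartiles are interpolated linearly between order statistics, g is the moment-based skewness, sigma(x) is the sample standard deviation, and the result is returned unrounded because the text above does not say how the value is rounded to an integer.

```python
import math
import statistics


def suggested_bins(x, rule):
    """Unrounded number of bins for sample x under the named rule (illustrative sketch)."""
    n = len(x)
    data_range = max(x) - min(x)

    # Rules that depend only on the sample size n.
    if rule == "sqrt":
        return math.sqrt(n)
    if rule == "cencov":
        return n ** (1 / 3)
    if rule == "rice":
        return 2 * n ** (1 / 3)
    if rule == "terrell-scott":
        return (2 * n) ** (1 / 3)
    if rule == "sturges":
        return 1 + math.log2(n)

    # Rules that also use the spread or shape of the data.
    if rule == "fd":
        # Assumption: quartiles by linear interpolation between order statistics.
        q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
        return data_range * n ** (1 / 3) / (2 * (q3 - q1))
    if rule == "scott":
        # Implemented exactly as the formula is printed above; no extra constant.
        return data_range / (statistics.stdev(x) * n ** (-1 / 3))
    if rule == "doane":
        # Assumption: g is the moment-based sample skewness m3 / m2^(3/2).
        mean = statistics.fmean(x)
        m2 = sum((v - mean) ** 2 for v in x) / n
        m3 = sum((v - mean) ** 3 for v in x) / n
        g = m3 / m2 ** 1.5
        sigma_g = math.sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))
        return 1 + math.log2(n) + math.log2(1 + abs(g) / sigma_g)

    raise ValueError(f"unknown rule: {rule!r}")


if __name__ == "__main__":
    sample = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 4.0, 4.8, 6.5, 9.7]
    for name in ("fd", "doane", "sqrt", "cencov", "rice",
                 "terrell-scott", "sturges", "scott"):
        print(name, round(suggested_bins(sample, name), 3))
```

Running the loop on a sample of your own shows how far the rules disagree for a given n, which can help when choosing a rule for skewed or heavy-tailed variables.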