fmx_cluster: Naive Estimates of Finite Mixture Distribution via Clustering

Description

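Computes naive estimates of a finite mixture distribution via clustering and returns them as an fmx object. These estimates are intended as starting values for the QLMD algorithm.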

Usage

fmx_cluster(
  x,
  K,
  distname = c("GH", "norm", "sn"),
  constraint = character(),
  ...
)

Value

Function fmx_cluster() returns an fmx object.

Arguments

x: numeric vector, observations
K: integer scalar, number of mixture components
distname: character scalar, name of parametric distribution of the mixture components
constraint: character vector, names of the parameters (\(g\) and/or \(h\) for a Tukey \(g\)-&-\(h\) mixture) to be fixed at 0. See function fmx_constraint for details.
...: additional parameters, currently not in use

Details

If the specified number of components \(K\geq 2\), trimmed \(k\)-means clustering with re-assignment is performed; otherwise, all observations are treated as a single cluster. Standard \(k\)-means clustering is not used, because observations in the heavy tails of a Tukey \(g\)-&-\(h\) distribution could be mistakenly classified as one or more separate clusters.
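The clustering step described above can be sketched in base R. This is a minimal one-dimensional illustration and not the routine used inside fmx_cluster(): the helper name trimmed_kmeans_1d, the trimming fraction alpha, and the quantile-based initial centers are all assumptions made for this example.

```r
# Hypothetical helper (illustration only, not the fmx_cluster() internals):
# one-dimensional trimmed k-means followed by re-assignment.
trimmed_kmeans_1d <- function(x, K, alpha = 0.05, iter_max = 50L) {
  # deterministic start: evenly spaced sample quantiles (an assumption)
  centers <- unname(quantile(x, probs = (seq_len(K) - 0.5) / K))
  n_keep <- ceiling(length(x) * (1 - alpha))
  for (i in seq_len(iter_max)) {
    d <- abs(outer(x, centers, FUN = "-"))          # n-by-K distances
    nearest <- max.col(-d, ties.method = "first")   # closest center per point
    dmin <- d[cbind(seq_along(x), nearest)]
    # trimming: only the n_keep points closest to their center update it
    keep <- rank(dmin, ties.method = "first") <= n_keep
    new_centers <- vapply(seq_len(K), function(k) {
      xk <- x[keep & nearest == k]
      if (length(xk)) mean(xk) else centers[k]
    }, numeric(1))
    if (isTRUE(all.equal(new_centers, centers))) break
    centers <- new_centers
  }
  # re-assignment: every observation, trimmed or not, joins its nearest center
  d <- abs(outer(x, centers, FUN = "-"))
  list(centers = centers, cluster = max.col(-d, ties.method = "first"))
}

set.seed(1)
# two well-separated groups plus two far-out tail observations
x <- c(rnorm(200, mean = 0), rnorm(200, mean = 6), 40, 45)
fit <- trimmed_kmeans_1d(x, K = 2)
fit$centers
table(fit$cluster)
```

In this sketch the two extreme observations are trimmed while the centers are updated, so they do not pull a center away or form a cluster of their own; the final re-assignment then attaches them to the nearest existing cluster.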

Within each cluster (one or more), the following starting values are computed:

letterValue-based estimates of the Tukey \(g\)-&-\(h\) distribution (Hoaglin, 2006) are calculated for any \(K\geq 1\) and serve as the starting values for the QLMD algorithm. These estimates are provided by function fmx_cluster().
the median and mad (median absolute deviation) serve as the starting values for \(\mu\) and \(\sigma\) (or for \(A\) and \(B\) of the Tukey \(g\)-&-\(h\) distribution, with \(g = h = 0\)) for the QLMD algorithm when \(K = 1\).
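As a rough illustration of such per-cluster starting values, the base R sketch below takes the median and mad as location and scale and computes a single letter-value-style estimate of \(g\) from one pair of symmetric quantiles. The helper name start_values and the tail probability p = 0.1 are assumptions for this example; the estimates actually returned by fmx_cluster() may be computed differently. The \(g\) formula relies on the fact that, for a Tukey \(g\)-&-\(h\) variable, the ratio of the upper to the lower half-spread at tail probability \(p\) equals \(\exp(g z_{1-p})\).

```r
# Hypothetical helper (illustration only): starting values for one cluster
start_values <- function(xk, p = 0.1) {
  A <- median(xk)
  q <- unname(quantile(xk, probs = c(p, 1 - p)))
  lhs <- A - q[1]  # lower half-spread: median minus p-th quantile
  uhs <- q[2] - A  # upper half-spread: (1 - p)-th quantile minus median
  c(A = A, B = mad(xk), g = log(uhs / lhs) / qnorm(1 - p))
}

# deterministic pseudo-sample from a Tukey g distribution
# with A = 1, B = 2, g = 0.3 and h = 0 (values chosen for illustration)
z <- qnorm(ppoints(2001))
x1 <- 1 + 2 * (exp(0.3 * z) - 1) / 0.3
sv <- start_values(x1)
sv
```

With \(K\geq 2\), the same helper could be applied to each cluster in turn, for example via lapply(split(x, cluster), start_values), where cluster is an assumed vector of cluster labels.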