Computes Optimally Tuned Robust Improper Maximum Likelihood Clustering (OTRIMLE), see otrimle, together with the density-based cluster quality statistics Q (Hennig and Coretto 2021) for a range of values of the number of clusters.
otrimleg(dataset, G=1:6, multicore=TRUE, ncores=detectCores(logical=FALSE)-1,
erc=20, beta0=0, fixlogicd=NULL, monitor=1, dmaxq=qnorm(0.9995))
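The two computed defaults in this signature can be evaluated on their own to see what a call without explicit arguments would use. The following is a minimal sketch in base R; the variable names and the fallback to one core are assumptions made for illustration and are not part of otrimleg.

```r
library(parallel)  # ships with base R

# Default for ncores: number of physical cores minus one.
# detectCores() can return NA on some platforms; the fallback to 1
# and the lower bound of 1 are assumptions of this sketch.
n_physical <- detectCores(logical = FALSE)
ncores_default <- if (is.na(n_physical)) 1L else max(1L, n_physical - 1L)

# Default for dmaxq: the 0.9995 quantile of the standard normal distribution
dmaxq_default <- qnorm(0.9995)

# mclapply-based parallel computing is not available on Windows,
# so a caller may want to switch multicore off there
use_multicore <- .Platform$OS.type != "windows"

print(ncores_default)
print(dmaxq_default)
print(use_multicore)
```

The resulting values could then be passed explicitly as ncores, dmaxq and multicore.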
dataset: something that can be coerced into an observations times variables matrix. The dataset.
G: vector of integers (normally starting from 1). Numbers of clusters to be considered.
multicore: logical. If TRUE, parallel computing is used through the function mclapply from package parallel; read the warnings there if you intend to use this; it won't work on Windows.
ncores: integer. Number of cores for parallelisation.
erc: a number greater than or equal to one specifying the maximum allowed ratio between within-cluster covariance matrix eigenvalues. See otrimle.
beta0: a non-negative constant, penalty term for noise, to be passed as beta to otrimle, see documentation there.
monitor: 0 or 1. If 1, progress messages are printed on screen.
dmaxq: numeric. Passed as maxq to kerndensmeasure. The interval considered for the one-dimensional density estimator is (-maxq, maxq).
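To make the eigenvalue ratio bounded by erc concrete, here is a minimal base R sketch. It assumes the ratio is taken between the largest and smallest eigenvalue pooled over all within-cluster covariance matrices, as in the eigenratio constraint of Coretto and Hennig (2016). The helper eigen_ratio and the two covariance matrices are hypothetical and not part of the otrimle package.

```r
# Hypothetical helper: largest eigenvalue divided by smallest eigenvalue,
# pooled over a list of covariance matrices (assumption of this sketch)
eigen_ratio <- function(cov_list) {
  ev <- unlist(lapply(cov_list, function(S) {
    eigen(S, symmetric = TRUE, only.values = TRUE)$values
  }))
  max(ev) / min(ev)
}

# Two illustrative diagonal covariance matrices
S1 <- diag(c(4, 1))
S2 <- diag(c(2, 0.5))

ratio <- eigen_ratio(list(S1, S2))
print(ratio)        # 4 / 0.5 = 8
print(ratio <= 20)  # within the default bound erc = 20
```

With erc = 1 all eigenvalues would have to be equal, whereas larger values allow more elongated clusters and clusters of more different scales.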
otrimleg returns a list containing the components solution, iloglik, ibic, criterion, logicd, noiseprob, denscrit, ddpm. All of these are lists or vectors whose component index is the number of clusters.
iloglik: vector of improper likelihood values from otrimle.
ibic: vector of improper BIC values (small is good) computed from iloglik and the numbers of parameters. Note that the behaviour of the improper likelihood is not compatible with the standard use of the BIC, so this is experimental and should not be trusted for choosing the number of clusters.
criterion: vector of values of the OTRIMLE criterion, see otrimle.
noiseprob: vector of estimated noise proportions, exproportion[1] from otrimle.
denscrit: vector of density-based cluster quality statistics Q (Hennig and Coretto 2021) as provided by the measure component of kerndensmeasure.
ddpm: list of the vectors of cluster-wise density-based cluster quality measures as provided by the ddpm component of kerndensmeasure.
For estimating the number of clusters, this function is meant to be called by otrimlesimg. The output of otrimleg is not meant to be used directly for estimating the number of clusters, see Hennig and Coretto (2021).
Coretto, P. and C. Hennig (2016). Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering. Journal of the American Statistical Association, Vol. 111(516), pp. 1648-1659. doi: 10.1080/01621459.2015.1100996
Coretto, P. and C. Hennig (2017). Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering. Journal of Machine Learning Research, Vol. 18(142), pp. 1-39. https://jmlr.org/papers/v18/16-382.html
Hennig, C. and P. Coretto (2021). An adequacy approach for deciding the number of clusters for OTRIMLE robust Gaussian mixture based clustering. To appear in Australian and New Zealand Journal of Statistics. https://arxiv.org/abs/2009.00921
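library(otrimle)  # provides otrimleg and the banknote data
data(banknote)
# Subset of observations; columns 5:7 are used as the variables
selectdata <- c(1:30, 101:110, 117:136, 160:161)
x <- banknote[selectdata, 5:7]
# Fit OTRIMLE for G = 1 and G = 2 clusters without parallel computing
obanknote <- otrimleg(x, G = 1:2, multicore = FALSE)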
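As a follow-up, the returned list can be inspected component by component. The sketch below assumes the otrimle package is installed and skips the fit otherwise. The object names components and fit are chosen for illustration, and the component names are those listed in the value description of this page. It only prints the per-G summaries and does not select a number of clusters, which is the job of otrimlesimg.

```r
# Component names as listed in the value description of this page
components <- c("solution", "iloglik", "ibic", "criterion",
                "logicd", "noiseprob", "denscrit", "ddpm")

# Run the fit only if the otrimle package is available
if (requireNamespace("otrimle", quietly = TRUE)) {
  data(banknote, package = "otrimle")
  x <- banknote[c(1:30, 101:110, 117:136, 160:161), 5:7]
  fit <- otrimle::otrimleg(x, G = 1:2, multicore = FALSE, monitor = 0)

  # Check which documented components are present in the returned list
  print(components %in% names(fit))

  # Per-G summaries: element g belongs to the solution with g clusters
  print(fit$noiseprob)
  print(fit$denscrit)
}
```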