mmpca_clust

an NxV <code><a rd-options="" href="/link/DocumentTermMatrix?package=MoMPCA&version=1.0.1" data-mini-rdoc="MoMPCA::DocumentTermMatrix">DocumentTermMatrix</a></code> with term-frequency
weighting.

The number of topics (latent space dimension)

A model from which to take the controls for the VE-steps of
the greedy procedure. If NULL, a model of class
<code>mmpcaClust</code> is created with default controls (see the
<code>mmpcaClustcontrol</code> class for more details).

model

Parameter for the initialization of Y. It can be either:<ul>
<li>A string or a function specifying the initialization
procedure. A string should be one of ('random', 'kmeans_lda'). See
<code><a rd-options="" href="/link/benchmarks-functions?package=MoMPCA&version=1.0.1" data-mini-rdoc="MoMPCA::benchmarks-functions">benchmarks-functions</a></code> for more details.</li>
<li>A vector of length N with Q modalities, specifying the initial
clustering.</li>
</ul>

Yinit
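As a minimal sketch of the second option, the base R snippet below builds a label vector of length N in which each of the Q cluster labels appears at least once. The values of N and Q are illustrative only, and the commented call at the end assumes the argument names <code>dtm</code>, <code>Q</code> and <code>K</code>, which are not stated on this page.

```r
set.seed(1)

N <- 12L  # number of documents (illustrative value)
Q <- 3L   # number of clusters (illustrative value)

# Repeat the labels 1..Q up to length N, then shuffle them, so that every
# label is guaranteed to appear at least once.
Yinit <- sample(rep_len(seq_len(Q), N))

table(Yinit)

# Assumed call shape (argument names dtm, Q and K are assumptions):
# res <- mmpca_clust(dtm, Q = Q, K = K, Yinit = Yinit)
```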

The clustering algorithm to be used. Only "BBCVEM" is available:
it corresponds to the branch and bound C-VEM of the original article.

method

Parameter for the initialization of the matrix beta. It can
be either:<ul>
<li>a string specifying the initialization
procedure. It should be one of ('random', 'lda'). See
<code><a rd-options="" href="/link/initializeBeta?package=MoMPCA&version=1.0.1" data-mini-rdoc="MoMPCA::initializeBeta">initializeBeta</a></code>() for more details.</li>
<li>A KxV matrix with
each row summing to 1.</li>
</ul>

init.beta
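The matrix option can be illustrated with a short base R sketch that draws positive entries and normalizes each row to sum to 1. The dimensions K and V are illustrative, and the gamma draw is only one convenient way to obtain positive values, not necessarily what <code>initializeBeta</code>() does.

```r
set.seed(1)

K <- 3L  # number of topics (illustrative value)
V <- 8L  # vocabulary size (illustrative value)

# Draw strictly positive entries, then divide each row by its sum so that
# every row is a probability vector over the V terms.
beta0 <- matrix(rgamma(K * V, shape = 1), nrow = K, ncol = V)
beta0 <- beta0 / rowSums(beta0)

rowSums(beta0)
```

A matrix built this way satisfies the KxV shape and row-sum constraints described above and could be passed as <code>init.beta</code>.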

The evolution of the bound is tracked every <code>keep</code> iterations.

keep
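The effect of such a parameter can be sketched with a toy loop that records a quantity only every <code>keep</code> iterations. The loop, the placeholder update and the variable names are hypothetical and do not reproduce the package's internals.

```r
keep <- 5L       # record the bound every 5 iterations (illustrative value)
max_iter <- 20L  # hypothetical number of iterations

bound <- -100              # placeholder starting value, not a real bound
bound_trace <- numeric(0)  # recorded bound values
trace_iter <- integer(0)   # iterations at which the bound was recorded

for (iter in seq_len(max_iter)) {
  bound <- bound + 1 / iter  # placeholder update standing in for one step
  if (iter %% keep == 0L) {
    bound_trace <- c(bound_trace, bound)
    trace_iter <- c(trace_iter, iter)
  }
}

trace_iter
```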

Specifies the maximum number of passes allowed over the whole
dataset.

max.epochs

verbose

Number of runs of the algorithm (defaults to 1). The run
achieving the best evidence lower bound is selected.

nruns

The number of CPUs to use when fitting the different models in
parallel (only for non-Windows platforms). Default is the number of available
cores minus 1.

mc.cores
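How <code>nruns</code> and <code>mc.cores</code> interact can be sketched with the base <code>parallel</code> package: several independent runs are launched and the one with the highest bound is kept. The function <code>fit_once</code> and its made-up bound values are hypothetical stand-ins for one run of the algorithm, so this is an illustration of the selection pattern rather than the package's implementation.

```r
library(parallel)

# Hypothetical stand-in for a single run: returns a made-up "bound" value so
# that the selection logic can be demonstrated without fitting a model.
fit_once <- function(seed) {
  set.seed(seed)
  list(seed = seed, bound = -1000 + rnorm(1, sd = 10))
}

nruns <- 4L

# Default described above: available cores minus 1, never below 1.
n_detected <- detectCores()
mc_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)

# mclapply relies on forking, which is not available on Windows.
if (.Platform$OS.type == "windows") mc_cores <- 1L

runs <- mclapply(seq_len(nruns), fit_once, mc.cores = mc_cores)

# Keep the run with the highest bound.
bounds <- vapply(runs, function(r) r$bound, numeric(1))
best <- runs[[which.max(bounds)]]
```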

Perform clustering of count data using the MMPCA model.

Cluster any count data matrix with a fixed number of variables, such as document/term matrices. The model integrates the dimension reduction aspect of topic models into the mixture model framework. Inference is done by means of a greedy Classification Variational Expectation Maximisation (C-VEM) algorithm. An Integrated Classification Likelihood (ICL) criterion is designed for selecting the latent dimension (number of topics) and the number of clusters. For more details, see the article by Jouvin et al. (2020) <arxiv:1909.00721>.

Nicolas Jouvin

MoMPCA

Inference and Clustering for Mixture of Multinomial Principal
Component Analysis

mmpca_clust: Greedy procedures for joint inference and clustering in MMPCA
