textCluster

Group the documents (columns) of a term-document matrix into k clusters so
that the texts within each cluster are most similar according to their text
distance. Documents with no terms are assigned to the last cluster.

Functions to extract and handle commonly occurring principal phrases
obtained from collections of texts. Major speed improvements: core functions
have been rewritten in C++ for faster phrase-document parsing, clustering,
and text distance computations. Based on Small, E., & Cabrera, J. (2025).
Principal phrase mining, an automated method for extracting meaningful
phrases from text. International Journal of Computers and Applications,
47(1), 84–92.

Ellie Small

Phrase Mining

textCluster function


textCluster: Cluster a Term-Document Matrix

Arguments

<dl>
<dt>M</dt>
<dd>A term-document matrix with terms on the rows and documents on the columns.</dd>
<dt>k</dt>
<dd>A positive integer giving the number of clusters.</dd>
<dt>mx</dt>
<dd>Maximum number of iterations (default 100).</dd>
<dt>md</dt>
<dd>Maximum number of documents to use for the initial setup (default 10*<code>k</code>).</dd>
<dt>silent</dt>
<dd>Set to TRUE to suppress progress messages.</dd>
</dl>
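
The arguments above can be tried on a small toy matrix. The following is a minimal sketch, not an example taken from the package: it assumes the package is installed under the name phm and that a plain numeric matrix is an acceptable value for M. The term names, document names, and counts are invented for illustration, and the structure of the returned object is not described here, so the sketch only inspects it with str().

```r
# Toy term-document matrix: 4 terms (rows) x 6 documents (columns).
# matrix() fills column by column, so each group of four counts is one
# document. All counts are made up for illustration.
M <- matrix(
  c(2, 1, 0, 0,   # doc1
    3, 2, 0, 0,   # doc2
    0, 0, 4, 1,   # doc3
    0, 0, 2, 3,   # doc4
    1, 1, 0, 0,   # doc5
    0, 0, 0, 0),  # doc6: contains no terms
  nrow = 4,
  dimnames = list(c("alpha", "beta", "gamma", "delta"),
                  paste0("doc", 1:6))
)

# Only attempt the clustering when the package is available.
# Assumption: the package name is "phm" and M may be a plain matrix.
if (requireNamespace("phm", quietly = TRUE)) {
  # mx and md are set to their documented defaults (100 and 10 * k).
  cl <- phm::textCluster(M, k = 2, mx = 100, md = 20, silent = TRUE)
  str(cl)  # inspect whatever object is returned
}
```

According to the description, doc6 has no terms and should therefore end up in the last cluster.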

