phm (version 2.1.2)

Phrase Mining

Description

Functions to extract and handle commonly occurring principal phrases obtained from collections of texts. Major speed improvements: core functions have been rewritten in C++ for faster phrase-document parsing, clustering, and text distance computations. Based on Small, E., & Cabrera, J. (2025). Principal phrase mining, an automated method for extracting meaningful phrases from text. International Journal of Computers and Applications, 47(1), 84–92.

Install

install.packages('phm')

Monthly Downloads

262

Version

2.1.2

License

GPL-3

Maintainer

Ellie Small

Last Published

September 30th, 2025

Functions in phm (2.1.2)

Phrases that are not Principal Phrases

Show Cluster Contents

Calculate a Text Distance Matrix

Calculate Text Distance (dense version)

print.phraseDoc

Print a phraseDoc Object

textDist_sparse

Calculate Text Distance (sparse version)

Find Informative Documents in a Corpus

Create a data table from a text file in PubMed format

Display Frequency Matrix for Documents

Display Frequency Matrix for Phrases

Create a DFSource object from a data frame

getElem.DFSource

Obtain the current row of the content of a DFSource

Calculate Canberra Distance

as.matrix.phraseDoc

Convert a phraseDoc Object to a Matrix

Remove Phrases from phraseDoc Object

Display Frequent Principal Phrases

Calculate a Distance Matrix

phraseDoc Creation

Words that Principal Phrases do not End with

Create a PlainTextDocument from a row in a data frame

print.textCluster

Print a textCluster Object

Words that Principal Phrases do not Start with

Cluster a Term-Document Matrix
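
The index above lists an entry for calculating Canberra distance. As a rough illustration of what that metric measures, here is a minimal sketch in base R only. It does not call any phm function and makes no claim about how phm implements or names its own routine; the helper `canberra_by_hand` and the vectors `doc_a` and `doc_b` are hypothetical names and made-up phrase counts used purely for illustration.

```r
# Canberra distance between two numeric vectors, written out by hand.
# NOTE: canberra_by_hand is a hypothetical helper, not part of phm.
# Terms where both entries are zero are skipped, since 0/0 is undefined.
canberra_by_hand <- function(x, y) {
  stopifnot(length(x) == length(y))
  num <- abs(x - y)
  den <- abs(x) + abs(y)
  keep <- den > 0
  sum(num[keep] / den[keep])
}

# Made-up phrase counts for two documents over four phrases.
doc_a <- c(3, 0, 1, 2)
doc_b <- c(1, 2, 1, 0)

# 2/4 + 2/2 + 0/2 + 2/2 = 2.5
canberra_by_hand(doc_a, doc_b)

# Base R's dist() provides the same metric; for these vectors
# (no position where both counts are zero) the two results agree.
as.numeric(dist(rbind(doc_a, doc_b), method = "canberra"))
```

When both vectors are zero at some position, base R's `dist()` drops that term and rescales the sum, so it can differ from the unscaled hand-written version; the example avoids that case on purpose.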