textmineR (version 2.1.3)

DepluralizeDtm: Run the CorrectS function on columns of a document term matrix.

Description

Converts plural tokens in the column names of a document term matrix to their singular form, then aggregates all columns that now share the same token. See the example below.

Usage

DepluralizeDtm(dtm, ...)

Arguments

dtm

A document term matrix of class dgCMatrix whose column names are tokens.

...

Other arguments to pass to TmParallelApply. See note, below.

Value

Returns a document term matrix of class dgCMatrix. The columns index the de-pluralized tokens of the input document term matrix. In other words, the returned matrix will generally have fewer columns than the input matrix.
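
The column aggregation described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `naive_singular` is a hypothetical stand-in for CorrectS that only strips a trailing "s", a dense matrix stands in for the dgCMatrix, and summing the merged columns is an assumption about how the aggregation works.

```r
# Toy counts: rows are documents, columns are tokens (dense stand-in for a dgCMatrix).
m <- matrix(c(1, 0,   # chickens
              0, 1,   # chicken
              2, 0,   # horse
              0, 3),  # horses
            nrow = 2,
            dimnames = list(c("doc_1", "doc_2"),
                            c("chickens", "chicken", "horse", "horses")))

# Hypothetical, deliberately naive singularizer; CorrectS handles more cases.
naive_singular <- function(x) sub("s$", "", x)

singular <- naive_singular(colnames(m))

# rowsum() sums rows by group, so transpose, group by token, and transpose back.
# Assumption: columns that map to the same token are summed.
m_new <- t(rowsum(t(m), group = singular))

ncol(m)      # 4 columns before
ncol(m_new)  # 2 columns after: "chicken" and "horse"
```

Each document keeps its total count for a token; only the split between singular and plural columns is lost.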

Examples

# NOT RUN {
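library(textmineR)

# Four short documents; "chickens"/"chicken" and "horse"/"horses" differ only by plural
myvec <- c("the quick brown fox eats chickens", 
           "the slow gray fox eats the slow chicken", 
           "look at my horse", "my horses are amazing")

# Name each document so the rows of the dtm are labeled
names(myvec) <- paste("doc", seq_along(myvec), sep = "_")

# Build a unigram document term matrix
dtm <- Vec2Dtm(vec = myvec, min.n.gram = 1, max.n.gram = 1)

# Collapse plural columns into their singular forms
dtm_new <- DepluralizeDtm(dtm = dtm)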
# }
