RemixAutoML (version 0.11.0)

AutoWordFreq: Automated Word Frequency and Word Cloud Creation

Description

This function builds a word frequency table and a word cloud from a text column. It prepares the data, cleans the text (optionally removing stop words and applying stemming), and generates the output.
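
To make "word frequency table" concrete, here is a minimal base R sketch of the idea. It is not the package's implementation: `word_freq` is a hypothetical helper written for illustration, and the tokenization rule (lowercase, split on non-letters) is an assumption.

```r
# Hypothetical helper, not part of RemixAutoML: count words in a character vector.
word_freq <- function(text, stop_words = character(0)) {
  # Lowercase the text, then split on any run of non-letter characters
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  # Drop empty tokens and any user-supplied stop words
  tokens <- tokens[nzchar(tokens) & !(tokens %in% tolower(stop_words))]
  # Tabulate and sort from most to least frequent
  counts <- sort(table(tokens), decreasing = TRUE)
  data.frame(word = names(counts),
             freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

freq <- word_freq("Gru, Gru, smug, Gru, bears, smug", stop_words = "bears")
print(freq)
```

A table like this is the usual input to a word cloud, where each word is drawn at a size proportional to its count.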

Usage

AutoWordFreq(data, TextColName = "DESCR",
  GroupColName = "ClusterAllNoTarget", GroupLevel = 0,
  RemoveEnglishStopwords = TRUE, Stemming = TRUE,
  StopWords = c("bla", "bla2"))

Arguments

data

Source data table

TextColName

Name of the column that contains the text to analyze, as a string

GroupColName

Set to NULL to ignore grouping; otherwise set to the name of the cluster column (or a factor column)

GroupLevel

Must be set if GroupColName is defined. Set to the cluster ID (or factor level) to analyze

RemoveEnglishStopwords

Set to TRUE to remove English stop words, FALSE to keep them

Stemming

Set to TRUE to run stemming on your text data

StopWords

Additional stop words to remove, supplied as a character vector
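
As a rough illustration of how GroupColName and GroupLevel fit together, the sketch below selects the text rows for a single group in base R. This is an assumption about the intent of the two arguments, not the function's actual code; the `docs` data frame and the variable names are made up for the example.

```r
# Hypothetical data: a text column plus a cluster column
docs <- data.frame(
  DESCR = c("smug bears", "Gru eats", "bears beats"),
  ClusterAllNoTarget = c(0, 1, 0),
  stringsAsFactors = FALSE
)

group_col <- "ClusterAllNoTarget"  # plays the role of GroupColName
group_level <- 0                   # plays the role of GroupLevel

# Keep only the text belonging to the chosen cluster
subset_text <- docs[docs[[group_col]] == group_level, "DESCR"]
print(subset_text)
```

With grouping disabled (GroupColName = NULL), the equivalent step would simply use every row of the text column.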

See Also

Other EDA: ProblematicFeatures

Examples

# NOT RUN {
# Build a one-row data.table whose DESCR column holds the text to analyze
data <- data.table::data.table(
DESCR = c("Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru, Gru,
           Urkle, Urkle, Urkle, Urkle, Urkle, Urkle, Urkle, Gru, Gru, Gru,
           bears, bears, bears, bears, bears, bears, smug, smug, smug, smug,
           smug, smug, smug, smug, smug, smug, smug, smug, smug, smug, smug,
           eats, eats, eats, eats, eats, eats, beats, beats, beats, beats,
           beats, beats, beats, beats, beats, beats, beats, science, science,
           Dwigt, Dwigt, Dwigt, Dwigt, Dwigt, Dwigt, Dwigt, Dwigt, Dwigt, Dwigt,
           Schrute, Schrute, Schrute, Schrute, Schrute, Schrute, Schrute,
           James, James, James, James, James, James, James, James, James, James,
           Halpert, Halpert, Halpert, Halpert, Halpert, Halpert, Halpert, Halpert"))
# Run on all rows (no grouping), without stop word removal or stemming
data <- AutoWordFreq(data,
                     TextColName = "DESCR",
                     GroupColName = NULL,
                     GroupLevel = NULL,
                     RemoveEnglishStopwords = FALSE,
                     Stemming = FALSE,
                     StopWords = c("Bla"))
# }