extract_sentiment_terms: Extract Sentiment Words

Description

Extract the sentiment words from a text.

Usage

extract_sentiment_terms(
  text.var,
  polarity_dt = lexicon::hash_sentiment_jockers_rinker,
  hyphen = "",
  retention_regex = "\\d:\\d|\\d\\s|[^[:alpha:]',;: ]",
  ...
)

Arguments

text.var

The text variable.

polarity_dt

A data.table of positive/negative words and weights with x and y as column names.

hyphen

The character string to replace hyphens with. Default replaces with nothing so 'sugar-free' becomes 'sugarfree'. Setting hyphen = " " would result in a space between words (e.g., 'sugar free').

retention_regex

A regex of what characters to keep. All other characters will be removed. Note that when this is used all text is lower case format. Only adjust this parameter if you really understand how it is used. Note that swapping the \\{p} for [^[:alpha:];:,\'] may retain more alpha letters but will likely decrease speed.

…

Ignored.

Value

Returns a data.table with columns of positive and negative terms. In addition, the attributes $counts and $elements return an aggregated count of the usage of the words and a detailed sentiment score of each word use. See the examples for more.

Examples

Run this code

# NOT RUN {
library(data.table)
set.seed(10)
x <- get_sentences(sample(hu_liu_cannon_reviews[[2]], 1000, TRUE))
sentiment(x)

pol_words <- extract_sentiment_terms(x)
pol_words
pol_words$sentence
pol_words$neutral
data.table::as.data.table(pol_words)

attributes(extract_sentiment_terms(x))$counts
attributes(extract_sentiment_terms(x))$elements

# }
# NOT RUN {
library(wordcloud)
library(data.table)

set.seed(10)
x <- get_sentences(sample(hu_liu_cannon_reviews[[2]], 1000, TRUE))
sentiment_words <- extract_sentiment_terms(x)

sentiment_counts <- attributes(sentiment_words)$counts
sentiment_counts[polarity > 0,]

par(mfrow = c(1, 3), mar = c(0, 0, 0, 0))
## Positive Words
with(
    sentiment_counts[polarity > 0,],
    wordcloud(words = words, freq = n, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"), scale = c(4.5, .75)
    )
)
mtext("Positive Words", side = 3, padj = 5)

## Negative Words
with(
    sentiment_counts[polarity < 0,],
    wordcloud(words = words, freq = n, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"), scale = c(4.5, 1)
    )
)
mtext("Negative Words", side = 3, padj = 5)

sentiment_counts[, 
    color := ifelse(polarity > 0, 'red', 
        ifelse(polarity < 0, 'blue', 'gray70')
    )]

## Positive & Negative Together
with(
    sentiment_counts[polarity != 0,],
    wordcloud(words = words, freq = n, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = color, ordered.colors = TRUE, scale = c(5, .75)
    )
)
mtext("Positive (red) & Negative (blue) Words", side = 3, padj = 5)
# }

Run the code above in your browser using DataLab