sentimentr (version 2.6.1)

as_key: Create/Manipulate Hash Keys

Description

as_key - Create your own hash keys from a data frame for use in key arguments such as polarity_dt in the sentiment function.

update_key - Add terms to or remove terms from an existing key.

update_polarity_table - Wrapper for update_key specifically for updating polarity tables.

update_valence_shifter_table - Wrapper for update_key specifically for updating valence shifter tables.

is_key - Logical check if an object is a key.

Usage

as_key(x, comparison = lexicon::hash_valence_shifters,
  sentiment = TRUE, ...)

update_key(key, drop = NULL, x = NULL, comparison = lexicon::hash_valence_shifters, sentiment = FALSE, ...)

update_polarity_table(key, drop = NULL, x = NULL, comparison = lexicon::hash_valence_shifters, sentiment = FALSE, ...)

update_valence_shifter_table(key, drop = NULL, x = NULL, comparison = lexicon::hash_sentiment_jockers_rinker, sentiment = FALSE, ...)

is_key(key, sentiment = TRUE)

Arguments

x

A data.frame with the first column containing polarized words and the second containing polarity values.

comparison

A data.frame to compare to x. If an element in column 1 of x matches an element in column 1 of comparison, the accompanying row is removed from x. This is useful to ensure that polarity_dt words are not also found in valence_shifters_dt in sentiment. Use comparison = NULL to skip this comparison.

sentiment

logical. If TRUE, column 2 of the input key/data.frame is expected to be numeric.

key

A sentimentr hash key.

drop

A vector of terms to drop.

...

ignored.
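The effect of the comparison argument described above can be sketched with toy data. The words, scores, and the small `shifters` table here are hypothetical and chosen only for illustration; this is a minimal sketch, not output from a real lexicon.

```r
library(sentimentr)

# Hypothetical toy data: three words with made-up polarity scores
words <- data.frame(
    x = c("good", "bad", "but"),
    y = c(1, -1, 0.5),
    stringsAsFactors = FALSE
)

# Hypothetical comparison table that shares one term ("but") with `words`
shifters <- data.frame(x = "but", y = 4, stringsAsFactors = FALSE)

# Rows of `words` whose first column also appears in `shifters` are removed
filtered <- as_key(words, comparison = shifters)

# comparison = NULL skips the check, so every row is kept
unfiltered <- as_key(words, comparison = NULL)

nrow(filtered)
nrow(unfiltered)
is_key(filtered)
```

Passing an explicit comparison table keeps the sketch independent of whatever terms the default lexicon::hash_valence_shifters happens to contain.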

Value

Returns a data.table object that can be used as a hash key.

Details

When updating keys via update_key, note that polarity_dt and valence_shifters_dt are the primary dictionary keys used in the sentimentr package.

polarity_dt takes a 2 column data.frame (named x and y). The first column is character and contains the words; the second column is numeric and contains positive or negative values.

valence_shifters_dt takes a 2 column data.frame (named x and y). The first column is character and contains the words; the second column is an integer code corresponding to: (1) negators, (2) amplifiers, (3) de-amplifiers, and (4) adversative conjunctions (i.e., 'but', 'however', and 'although').

Also note that if you are updating a valence_shifters_dt you need an appropriate comparison; most likely, comparison = sentimentr::polarity_dt.
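As a rough illustration of the valence shifter layout described above, the following sketch builds a small shifter key with as_key. The four terms are hypothetical choices. The comparison table is an assumption: it uses lexicon::hash_sentiment_jockers_rinker, the default shown in Usage for update_valence_shifter_table, as the polarity table to compare against. Any term that also appears in that table would be dropped, so the number of rows returned is not guaranteed.

```r
library(sentimentr)

# Hypothetical custom shifters; the y codes follow the scheme in Details:
# 1 = negator, 2 = amplifier, 3 = de-amplifier, 4 = adversative conjunction
shifters <- data.frame(
    x = c("nope", "hugely", "barely", "yet"),
    y = c(1, 2, 3, 4),
    stringsAsFactors = FALSE
)

# sentiment = FALSE because column 2 holds category codes, not polarity
# scores; the comparison table is a polarity lexicon (an assumption here)
vs_key <- as_key(
    shifters,
    comparison = lexicon::hash_sentiment_jockers_rinker,
    sentiment = FALSE
)

# Check the result with the same relaxed column-2 expectation
is_key(vs_key, sentiment = FALSE)
```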

Examples

# NOT RUN {
key <- data.frame(
    words = sample(letters),
    polarity = rnorm(26),
    stringsAsFactors = FALSE
)

(mykey <- as_key(key))

## Looking up values
mykey[c("a", "k")][[2]]

## Drop terms from key
update_key(mykey, drop = c("f", "h"))

## Add terms to key
update_key(mykey, x = data.frame(x = c("dog", "cat"), y = c(1, -1)))

## Add and drop terms in a single call
update_key(mykey, drop = c("f", "h"), x = data.frame(x = c("dog", "cat"), y = c(1, -1)))

## Explicit key type (wrapper for `update_key` for a sentiment table).
## See `update_valence_shifter_table` for the corresponding valence shifter updater.
library(lexicon)
updated_hash_sentiment <- sentimentr::update_polarity_table(
    lexicon::hash_sentiment_huliu,
    x = data.frame(
        words = c('frickin', 'hairy'),
        polarity = c(-1, -1),
        stringsAsFactors = FALSE
    )
)

## Checking if you have a key
is_key(mykey)
is_key(key)
is_key(mtcars)
is_key(update_key(mykey, drop = c("f", "h")))

# }
# NOT RUN {
## Using syuzhet's sentiment lexicons
library(syuzhet)
(bing_key <- as_key(syuzhet:::bing))
as_key(syuzhet:::afinn)
as_key(syuzhet:::syuzhet_dict)

sam <- gsub("Sam-I-am", "Sam I am", sam_i_am)
sentiment(sam, polarity_dt = bing_key)

## The nrc dictionary in syuzhet requires a bit of data wrangling before it 
## is in the correct shape to convert to a key.  

library(tidyverse)

nrc_key <- syuzhet:::nrc %>%
    dplyr::filter(
        sentiment %in% c('positive', 'negative'),
        lang == 'english'
    ) %>%
    dplyr::select(-lang) %>%
    dplyr::mutate(value = ifelse(sentiment == 'negative', value * -1, value)) %>%
    dplyr::group_by(word) %>%
    dplyr::summarize(y = mean(value)) %>%
    sentimentr::as_key()

sentiment(sam, polarity_dt = nrc_key)

## The lexicon package contains a preformatted nrc sentiment hash table that 
## can be used instead.
sentiment(sam, polarity_dt = lexicon::hash_sentiment_nrc)
# }
# NOT RUN {
## Using 2 vectors of words
install.packages("tm.lexicon.GeneralInquirer", repos="http://datacube.wu.ac.at", type="source")
require("tm.lexicon.GeneralInquirer")

positive <- terms_in_General_Inquirer_categories("Positiv")
negative <- terms_in_General_Inquirer_categories("Negativ")

geninq <- data.frame(
    x = c(positive, negative),
    y = c(rep(1, length(positive)), rep(-1, length(negative))),
    stringsAsFactors = FALSE
) %>%
    as_key()

geninq_pol <- with(
    presidential_debates_2012,
    sentiment_by(dialogue, person, polarity_dt = geninq)
)

geninq_pol %>% plot()
# }
