sentimentr (version 0.4.0)

as_key: Create/Manipulate Hash Keys

Description

as_key - Create your own hash keys from a data frame for use in key arguments such as polarity_dt in the sentiment function.

update_key - Add terms to or remove terms from an existing key.

update_polarity_table - Wrapper for update_key specifically for updating polarity tables.

update_valence_shifter_table - Wrapper for update_key specifically for updating valence shifter tables.

is_key - Logical check if an object is a key.

Usage

as_key(x, comparison = sentimentr::valence_shifters_table, sentiment = TRUE, ...)
update_key(key, drop = NULL, x = NULL, comparison = sentimentr::valence_shifters_table, sentiment = FALSE, ...)
update_polarity_table(key, drop = NULL, x = NULL, comparison = sentimentr::valence_shifters_table, sentiment = FALSE, ...)
update_valence_shifter_table(key, drop = NULL, x = NULL, comparison = sentimentr::polarity_table, sentiment = FALSE, ...)
is_key(key, sentiment = TRUE)

Arguments

x
A data.frame with the first column containing polarized words and the second containing polarity values.
comparison
A data.frame to compare to x. If elements in x's column 1 match elements in comparison's column 1, the accompanying rows are removed from x. This is useful to ensure that polarity_dt words are not also found in valence_shifters_dt in sentiment. Use comparison = NULL to skip this comparison.
sentiment
logical. If TRUE, column 2 of the input key/data.frame is expected to be numeric.
key
A sentimentr hash key.
drop
A vector of terms to drop.
...
ignored.
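
The filtering described for the comparison argument can be approximated in base R. This is a minimal sketch of the documented behavior, not the package's implementation; the data frames x and comparison below are hypothetical stand-ins, not the package's own tables.

```r
# Candidate polarity terms (hypothetical values, for illustration only)
x <- data.frame(
    words = c("good", "bad", "not", "very"),
    polarity = c(1, -1, -1, 1),
    stringsAsFactors = FALSE
)

# Stand-in for a valence shifter table (hypothetical; not the package data)
comparison <- data.frame(
    x = c("not", "very"),
    y = c(1L, 2L),
    stringsAsFactors = FALSE
)

# Keep only the rows of x whose first column is absent from
# comparison's first column
filtered <- x[!x[[1]] %in% comparison[[1]], , drop = FALSE]
filtered
```

Here "not" and "very" are dropped because they also appear in the comparison table, which is the overlap the comparison argument is meant to prevent.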

Value

Returns a data.table object that can be used as a hash key.

Details

When updating keys via update_key, note that polarity_dt and valence_shifters_dt are the primary dictionary keys used in the sentimentr package. polarity_dt takes a two-column data.frame (columns named x and y): the first column is character and contains the words, and the second is numeric and contains positive or negative polarity values. valence_shifters_dt takes a two-column data.frame (columns named x and y): the first column is character and contains the words, and the second is an integer code corresponding to (1) negators, (2) amplifiers, (3) de-amplifiers, and (4) "but" conjunctions. Also note that updating a valence_shifters_dt requires an appropriate comparison; most likely, comparison = sentimentr::polarity_table, the default shown for update_valence_shifter_table under Usage.
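
The two table layouts described above might look like the following. This is a sketch with made-up words and values; polarity_df and shifter_df are hypothetical names, and the final call assumes the sentimentr package is installed.

```r
# Polarity layout: character words in x, numeric positive/negative values in y
polarity_df <- data.frame(
    x = c("excellent", "terrible"),
    y = c(1, -1),
    stringsAsFactors = FALSE
)

# Valence shifter layout: character words in x, integer codes in y
# 1 = negator, 2 = amplifier, 3 = de-amplifier, 4 = "but" conjunction
shifter_df <- data.frame(
    x = c("not", "very", "barely", "but"),
    y = c(1L, 2L, 3L, 4L),
    stringsAsFactors = FALSE
)

# Sanity checks on the column types described in Details
stopifnot(is.character(polarity_df$x), is.numeric(polarity_df$y))
stopifnot(is.character(shifter_df$x), is.integer(shifter_df$y))

# Only attempted when sentimentr is available
if (requireNamespace("sentimentr", quietly = TRUE)) {
    polarity_key <- sentimentr::as_key(polarity_df)
}
```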

Examples

key <- data.frame(
    words = sample(letters),
    polarity = rnorm(26),
    stringsAsFactors = FALSE
)

(mykey <- as_key(key))

## Looking up values
mykey[c("a", "k")][[2]]

## Drop terms from key
update_key(mykey, drop = c("f", "h"))

## Add terms to key
update_key(mykey, x = data.frame(x = c("dog", "cat"), y = c(1, -1)))

## Add terms & drop to/from a key
update_key(mykey, drop = c("f", "h"), x = data.frame(x = c("dog", "cat"), y = c(1, -1)))

## Checking if you have a key
is_key(mykey)
is_key(key)
is_key(mtcars)
is_key(update_key(mykey, drop = c("f", "h")))

## Using syuzhet's sentiment lexicons
## Not run: 
# library(syuzhet)
# as_key(syuzhet:::bing)
# as_key(syuzhet:::afinn)
# nrc <- data.frame(
#     words = rownames(syuzhet:::nrc),
#     polarity = syuzhet:::nrc[, "positive"] - syuzhet:::nrc[, "negative"],
#     stringsAsFactors = FALSE
# )
# 
# as_key(nrc[nrc[["polarity"]] != 0, ])
# 
# sentiment(gsub("Sam-I-am", "Sam I am", sam_i_am), as_key(syuzhet:::bing))
# ## End(Not run)

## Using 2 vectors of words
## Not run: 
# install.packages("tm.lexicon.GeneralInquirer", repos="http://datacube.wu.ac.at", type="source")
# require("tm.lexicon.GeneralInquirer")
# library(magrittr)  # provides the %>% pipe used below
# 
# positive <- terms_in_General_Inquirer_categories("Positiv")
# negative <- terms_in_General_Inquirer_categories("Negativ")
# 
# geninq <- data.frame(
#     x = c(positive, negative),
#     y = c(rep(1, length(positive)), rep(-1, length(negative))),
#     stringsAsFactors = FALSE
# ) %>%
#     as_key()
# 
# geninq_pol <- with(
#     presidential_debates_2012,
#     sentiment_by(dialogue, person, polarity_dt = geninq)
# )
# 
# geninq_pol %>% plot()
# ## End(Not run)