SentimentAnalysis (version 1.1-0)

analyzeSentiment: Sentiment analysis

Description

Performs sentiment analysis of a given object (vector of strings, document-term matrix, corpus).

Usage

analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

# S3 method for Corpus
analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

# S3 method for character
analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

# S3 method for data.frame
analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

# S3 method for TermDocumentMatrix
analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

# S3 method for DocumentTermMatrix
analyzeSentiment(x, language = "english", aggregate = NULL,
  rules = defaultSentimentRules(), removeStopwords = TRUE, ...)

Arguments

x
A character vector, a data.frame, or an object of type Corpus, TermDocumentMatrix or DocumentTermMatrix
language
Language used for preprocessing operations (default: English)
aggregate
A factor variable by which documents can be grouped. This is helpful when joining, e.g., news from the same day or movie reviews by the same author
rules
A named list containing individual sentiment metrics. Each entry is itself a list consisting of a method, followed by an optional dictionary.
removeStopwords
Flag indicating whether to remove stopwords or not (default: TRUE)
...
Additional parameters passed to the function, e.g. for preprocessing
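
As an illustration of the aggregate argument, the following is a minimal sketch that groups documents by a factor. The documents and the grouping variable day are hypothetical and were chosen only for this example; how the documents within a group are combined is not specified on this page, so the output is not shown.

```r
library(SentimentAnalysis)

# Hypothetical documents and grouping factor (one entry per document)
documents <- c("Good and great results", "Bad and terrible losses",
               "Excellent growth", "Poor outlook")
day <- factor(c("Monday", "Monday", "Tuesday", "Tuesday"))

# Documents that share a factor level are grouped together
sentiment <- analyzeSentiment(documents, aggregate = day)
sentiment
```

The factor needs one entry per document, in the same order as the documents in x.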

Value

Result is a data.frame with sentiment values for each document across all defined rules

Details

This function returns a data.frame with continuous values. Other formats, such as binary response values (positive / negative) or ternary ones (positive, neutral, negative), require a conversion. For this purpose, the functions convertToBinaryResponse and convertToDirection convert a vector of continuous sentiment scores into a factor object.
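
The conversion described above can be sketched as follows. The vector scores is a hypothetical stand-in for one column of the result (for example sentiment$SentimentQDAP under the default rules); it is not taken from a real analysis.

```r
library(SentimentAnalysis)

# Hypothetical continuous scores, standing in for one column of the result
scores <- c(-0.4, 0, 0.25)

# Three classes: negative, neutral, positive
direction <- convertToDirection(scores)

# Two classes: negative, positive
binary <- convertToBinaryResponse(scores)

print(direction)
print(binary)
```

Both functions return a factor, so the result can be passed on to functions such as table or compareToResponse.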

See Also

compareToResponse for evaluating the results, convertToBinaryResponse and convertToDirection for getting binary and ternary results, generateDictionary for dictionary generation, plotSentiment and plotSentimentResponse for visualization

Examples

# via vector of strings
corpus <- c("Positive text", "Neutral but uncertain text", "Negative text")
sentiment <- analyzeSentiment(corpus)
compareToResponse(sentiment, c(+1, 0, -2))

# via Corpus from tm package
library(tm)
reut21578 <- system.file("texts", "crude", package="tm")
reuters <- Corpus(DirSource(reut21578),
                  readerControl=list(reader=readReut21578XML))
sentiment <- analyzeSentiment(reuters)
    
# via DocumentTermMatrix (with stemmed entries)
dtm <- DocumentTermMatrix(Corpus(VectorSource(c("posit posit", "negat neutral")))) 
sentiment <- analyzeSentiment(dtm)
compareToResponse(sentiment, convertToBinaryResponse(c(+1, -1)))

# By adapting the parameter rules, one can incorporate customized dictionaries
# e.g. in order to adapt to arbitrary languages
dictionaryAmplifiers <- SentimentDictionary(c("more", "much"))
sentiment <- analyzeSentiment(corpus,
                              rules=list("Amplifiers"=list(ruleRatio,
                                                           dictionaryAmplifiers)))

# One can also restrict the number of computed methods to the ones of interest
# in order to achieve performance optimizations
sentiment <- analyzeSentiment(corpus,
                              rules=list("SentimentLM"=list(ruleSentiment, 
                                                            loadDictionaryLM())))
sentiment