SentimentAnalysis (version 1.3-4)

analyzeSentiment: Sentiment analysis

Description

Performs sentiment analysis of a given object (vector of strings, data.frame, document-term matrix, corpus).

Usage

analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

# S3 method for Corpus
analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

# S3 method for character
analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

# S3 method for data.frame
analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

# S3 method for TermDocumentMatrix
analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

# S3 method for DocumentTermMatrix
analyzeSentiment(
  x,
  language = "english",
  aggregate = NULL,
  rules = defaultSentimentRules(),
  removeStopwords = TRUE,
  stemming = TRUE,
  ...
)

Value

Result is a data.frame with sentiment values for each document across all defined rules.

Arguments

x

A character vector, a data.frame, or an object of type Corpus, TermDocumentMatrix or DocumentTermMatrix.

language

Language used for preprocessing operations (default: English)

aggregate

A factor variable by which documents can be grouped. This is helpful when joining, e.g., news from the same day or movie reviews by the same author.

rules

A named list containing individual sentiment metrics. Each entry is itself a list whose first element is a method, followed by an optional dictionary.

removeStopwords

Flag indicating whether to remove stopwords or not (default: TRUE)

stemming

Flag indicating whether to perform stemming or not (default: TRUE)

...

Additional parameters passed on to other functions, e.g. for preprocessing
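
The arguments rules and aggregate can be combined as in the following minimal sketch. The sample documents, the author factor and the variable names are hypothetical and only serve as illustration. How grouped documents are combined into a single value is not stated on this page, so the example makes no claim about the resulting numbers.

```r
library(SentimentAnalysis)

# Hypothetical sample documents and grouping factor (illustration only)
docs <- c("This is a good and great product",
          "Terrible service and bad support",
          "An excellent result with strong growth")
author <- factor(c("alice", "bob", "alice"))

# Each entry is a list: first a method, then an optional dictionary
myRules <- list(
  "WordCount"   = list(ruleWordCount),                      # method only
  "SentimentGI" = list(ruleSentiment, loadDictionaryGI())   # method plus dictionary
)

# Sentiment values for each document
perDocument <- analyzeSentiment(docs, rules = myRules)

# Documents that share a level of the factor are grouped
perAuthor <- analyzeSentiment(docs, rules = myRules, aggregate = author)

perDocument
perAuthor
```

Restricting rules to the metrics of interest also avoids computing the full default rule set.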

Details

This function returns a data.frame with continuous values. If other formats are desired, these values need to be converted. Common examples of such formats are binary response values (positive / negative) or ternary ones (positive, neutral, negative). For this purpose, consider using the functions convertToBinaryResponse and convertToDirection, which can convert a vector of continuous sentiment scores into a factor object.
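
As a minimal sketch of this conversion step, the snippet below applies both functions to a plain numeric vector. The scores are hypothetical and stand in for one column of the result returned by analyzeSentiment.

```r
library(SentimentAnalysis)

# Hypothetical continuous scores (illustration only)
scores <- c(-0.5, 0, 0.3)

# Two-class response (positive / negative)
binary <- convertToBinaryResponse(scores)

# Three-class direction (positive, neutral, negative)
direction <- convertToDirection(scores)

# Both results are factor objects with one entry per score
print(binary)
print(direction)
```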

See Also

compareToResponse for evaluating the results, convertToBinaryResponse and convertToDirection for getting binary or directional results, generateDictionary for dictionary generation, plotSentiment and plotSentimentResponse for visualization

Examples

if (FALSE) {
library(SentimentAnalysis)
library(tm)

# via vector of strings
corpus <- c("Positive text", "Neutral but uncertain text", "Negative text")
sentiment <- analyzeSentiment(corpus)
compareToResponse(sentiment, c(+1, 0, -2))

# via Corpus from tm package
data("crude")
sentiment <- analyzeSentiment(crude)
    
# via DocumentTermMatrix (with stemmed entries)
dtm <- DocumentTermMatrix(VCorpus(VectorSource(c("posit posit", "negat neutral")))) 
sentiment <- analyzeSentiment(dtm)
compareToResponse(sentiment, convertToBinaryResponse(c(+1, -1)))

# By adapting the parameter rules, one can incorporate customized dictionaries
# e.g. in order to adapt to arbitrary languages
dictionaryAmplifiers <- SentimentDictionary(c("more", "much"))
sentiment <- analyzeSentiment(corpus,
                              rules=list("Amplifiers"=list(ruleRatio,
                                                           dictionaryAmplifiers)))
                                                           
# One can also restrict the number of computed methods to the ones of interest
# in order to achieve performance optimizations
sentiment <- analyzeSentiment(corpus,
                              rules=list("SentimentLM"=list(ruleSentiment, 
                                                            loadDictionaryLM())))
sentiment
}
