jst_reversed: Run a reversed Joint Sentiment Topic model

Description

Estimates a reversed Joint Sentiment Topic (JST) model using a Gibbs sampler. See Details for the reference that describes the model.

Usage

jst_reversed(
  dfm,
  sentiLexInput = NULL,
  numSentiLabs = 3,
  numTopics = 10,
  numIters = 3,
  updateParaStep = -1,
  alpha = -1,
  beta = -1,
  gamma = -1,
  excludeNeutral = FALSE
)

Arguments

dfm

A quanteda dfm object

sentiLexInput

Optional: A quanteda dictionary object for semi-supervised learning. If a dictionary is used, numSentiLabs will be overridden by the number of categories in the dictionary object. An extra category will by default be added for neutral words. This can be turned off by setting excludeNeutral = TRUE.

numSentiLabs

Integer, the number of sentiment labels (defaults to 3)

numTopics

Integer, the number of topics (defaults to 10)

numIters

Integer, the number of Gibbs sampling iterations (defaults to 3, which is only suitable for test runs; choose a larger value by hand for real estimation)

updateParaStep

Integer, the number of iterations between optimizations of the hyperparameter alpha (defaults to -1)

alpha

Double, hyperparameter (defaults to .05 * (average docsize / number of topics))

beta

Double, hyperparameter (defaults to .01, with multiplier .9/.1 for sentiment dictionary presence)

gamma

Double, hyperparameter (defaults to .05 * (average docsize / number of sentitopics))

excludeNeutral

Boolean. If a dictionary is used, an extra category is added for neutral words. Words in the dictionary receive a low probability of being allocated there. If this is set to TRUE, the neutral sentiment category will be omitted. The variable is irrelevant if no dictionary is used. Defaults to FALSE.
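
The default values quoted above for alpha, beta and gamma can be worked through by hand. The following is a minimal sketch in base R on made-up counts, not the package's internal code. It assumes that "average docsize" means the mean number of tokens per document and that "number of sentitopics" means numSentiLabs * numTopics; the names counts, avg_doc_size and the *_default variables are purely illustrative.

```r
# Toy document-feature counts: 3 documents x 4 features (made-up numbers).
counts <- matrix(c(2, 0, 1, 3,
                   0, 4, 1, 1,
                   5, 1, 0, 2),
                 nrow = 3, byrow = TRUE)

numTopics    <- 10
numSentiLabs <- 3

# Assumption: "average docsize" is the mean number of tokens per document.
avg_doc_size <- mean(rowSums(counts))

# Default for alpha as documented: .05 * (average docsize / number of topics).
alpha_default <- 0.05 * avg_doc_size / numTopics

# Assumption: "number of sentitopics" is numSentiLabs * numTopics.
gamma_default <- 0.05 * avg_doc_size / (numSentiLabs * numTopics)

# Base value for beta as documented; the dictionary multiplier is not modelled here.
beta_default <- 0.01

print(c(alpha = alpha_default, beta = beta_default, gamma = gamma_default))
```

Because both alpha and gamma scale with the average document size, the defaults change with the corpus; passing explicit values avoids that dependence.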

Value

A JST_reversed.result object containing a data.frame for each estimated parameter

Details

Lin, C., He, Y., Everson, R. and Rüger, S., 2012. Weakly supervised joint sentiment-topic detection from text. IEEE Transactions on Knowledge and Data Engineering, 24(6), pp.1134-1145.

Examples

# NOT RUN {
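# Fit the reversed model documented on this page; paradigm() supplies the sentiment dictionary
model <- jst_reversed(quanteda::dfm(quanteda::data_corpus_irishbudget2010),
                      paradigm(),
                      numTopics = 5,
                      numIters = 150)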

# }
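
The sentiLexInput and excludeNeutral arguments can also be used with a hand-built dictionary. The sketch below is an illustration under stated assumptions rather than a tested recipe: it assumes the function is exported by a package named rJST, uses quanteda's bundled data_corpus_inaugural as a stand-in corpus, and the seed words are hypothetical. The model is only fitted when both packages are installed.

```r
# Hypothetical seed words; replace with terms suited to your own corpus.
seed_words <- list(
  positive = c("good", "great", "hope"),
  negative = c("bad", "crisis", "fear")
)

# Assumption: jst_reversed() lives in a package called rJST.
if (requireNamespace("quanteda", quietly = TRUE) &&
    requireNamespace("rJST", quietly = TRUE)) {
  # Two dictionary categories; per the Arguments section, numSentiLabs is
  # overridden by this count, plus a neutral category unless excludeNeutral = TRUE.
  senti_dict <- quanteda::dictionary(seed_words)

  # Tokenize first, then build the document-feature matrix.
  toks   <- quanteda::tokens(quanteda::data_corpus_inaugural, remove_punct = TRUE)
  dfm_in <- quanteda::dfm(toks)

  model <- rJST::jst_reversed(dfm_in,
                              sentiLexInput  = senti_dict,
                              numTopics      = 5,
                              numIters       = 150,
                              excludeNeutral = TRUE)
}
```

Setting excludeNeutral = TRUE here restricts the model to the two dictionary categories; leaving it at the default keeps the additional neutral category described under Arguments.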