This function initializes a Joint Sentiment/Topic (JST) model.
JST(
x,
lexicon = NULL,
S = 3,
K = 5,
gamma = 1,
alpha = 5,
beta = 0.01,
gammaCycle = 0,
alphaCycle = 0
)
An S3 list containing the model parameters and the estimated mixture.
This object corresponds to a Gibbs sampler estimator with zero iterations.
The MCMC can be iterated using the fit() function.
tokens: the tokens object used to create the model
vocabulary: the set of words of the corpus
it: the number of Gibbs sampling iterations performed so far
za: the list of topic assignments, aligned to the tokens object with padding removed
logLikelihood: the measured log-likelihood at each iteration, with a breakdown of the likelihood into hierarchical components stored as an attribute
The topWords() function extracts the most probable words of each
topic/sentiment.
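As a minimal sketch of how the returned object might be inspected, the code below initializes a model, runs a few iterations, and reads the components listed above. It assumes that the sentopics package is installed and that the list components can be accessed with `$`. The iteration count of 10 is an arbitrary choice for illustration.

```r
library(sentopics)

# Initialise a JST model on the tokens object used in this page's examples.
jst <- JST(ECB_press_conferences_tokens)

# A freshly created model has not been iterated yet.
jst$it

# Run a small number of Gibbs sampling iterations (10 is arbitrary).
jst <- fit(jst, 10)
jst$it

# Inspect some of the components described above.
head(jst$vocabulary)
jst$logLikelihood

# Most probable words of each topic/sentiment.
topWords(jst)
```

Printing `jst$it` before and after `fit()` is a quick way to check that the sampler has actually advanced.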
x: tokens object containing the texts. A coercion will be attempted if x is not a tokens object.
lexicon: a quanteda dictionary with positive and negative categories
S: the number of sentiments
K: the number of topics
gamma: the hyperparameter of sentiment-document distribution
alpha: the hyperparameter of topic-document distribution
beta: the hyperparameter of vocabulary distribution
gammaCycle: integer specifying the cycle size between two updates of the hyperparameter gamma
alphaCycle: integer specifying the cycle size between two updates of the hyperparameter alpha
Olivier Delmarcelle
The rJST.LDA method enables the transition from a previously
estimated LDA model to a sentiment-aware rJST model. The function
retains the previously estimated topics and randomly assigns a sentiment to
every word of the corpus. The new model retains the iteration count of
the initial LDA model.
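The transition described above can be sketched as follows. This is an illustration rather than a reference workflow: it assumes that calling rJST() on a fitted LDA object dispatches to the rJST.LDA method and accepts a lexicon argument, and it uses an arbitrary 10 iterations to keep the run short.

```r
library(sentopics)

# Estimate a plain LDA model first (10 iterations is an arbitrary choice).
lda <- LDA(ECB_press_conferences_tokens)
lda <- fit(lda, 10)

# Convert the LDA model into a sentiment-aware rJST model.
# Topics are kept; sentiments are assigned randomly to start with.
rjst <- rJST(lda, lexicon = LoughranMcDonald)

# The iteration count is carried over from the LDA model.
lda$it
rjst$it

# Continue sampling so that the sentiment assignments are estimated.
rjst <- fit(rjst, 10)
```

Because the sentiment layer starts from a random assignment, further calls to fit() are needed before the sentiment estimates become meaningful.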
Lin, C. and He, Y. (2009). Joint sentiment/topic model for sentiment analysis. In Proceedings of the 18th ACM conference on Information and knowledge management, 375--384.
Lin, C., He, Y., Everson, R. and Ruger, S. (2012). Weakly Supervised Joint Sentiment-Topic Detection from Text. IEEE Transactions on Knowledge and Data Engineering, 24(6), 1134--1145.
Fitting a model: fit(), extracting top words: topWords()
Other topic models: LDA(), rJST(), sentopicmodel()
# creating a JST model
JST(ECB_press_conferences_tokens)
# estimating a JST model including a lexicon
jst <- JST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)
jst <- fit(jst, 100)