"frequency"
ranks top words according to their frequency
within a topic. This method also reports the overall frequency of
each word. When returning a plot, the overall frequency is
represented with a grey bar.
"probability"
uses the estimated topic-word mixture \(\phi\) to
rank top words.
"term-score"
implements the re-ranking method from Blei and
Lafferty (2009). This method down-weights terms that have high
probability in all topics using the following score:
$$\text{term-score}_{k,v} = \phi_{k,v}\log\left(
\frac{\phi_{k,v}}{\left(\prod_{j=1}^{K}\phi_{j,v}\right)^{\frac{1}{K}}}\right),$$ for
topic \(k\), vocabulary word \(v\) and number of topics \(K\).
"FREX"
implements the re-ranking method from Bischof and Airoldi
(2012). This method uses the weight \(w\) to balance topic
exclusivity against topic-word probability, combining the two as a
weighted harmonic mean of their empirical ranks:
$$\text{FREX}_{k,v}=\left(\frac{w}{\text{ECDF}\left(
\frac{\phi_{k,v}}{\sum_{j=1}^K\phi_{j,v}}\right)}
+ \frac{1-w}{\text{ECDF}\left(\phi_{k,v}\right)} \right)^{-1},$$ for
topic \(k\), vocabulary word \(v\), number of topics \(K\) and
weight \(w\), where \(\text{ECDF}\) is the empirical cumulative
distribution function.
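
The "probability", "term-score" and "FREX" rankings above can be compared on a toy example. The following is a minimal Python sketch, not the package's implementation. All function names (term_score, ecdf, frex, top_words) and the toy matrix phi are hypothetical. It assumes that every entry of \(\phi\) is strictly positive and that the ECDF is taken over the vocabulary within each topic.

```python
import math
from bisect import bisect_right

# Toy topic-word matrix (assumption): K = 2 topics (rows), V = 3 words (columns).
vocab = ["the", "bank", "river"]
phi = [[0.5, 0.4, 0.1],
       [0.5, 0.1, 0.4]]


def term_score(phi):
    """phi[k][v] * log(phi[k][v] / geometric mean of phi[., v] over topics)."""
    K, V = len(phi), len(phi[0])
    scores = [[0.0] * V for _ in range(K)]
    for v in range(V):
        # Log of the geometric mean across topics is the mean of the logs.
        log_gm = sum(math.log(phi[j][v]) for j in range(K)) / K
        for k in range(K):
            scores[k][v] = phi[k][v] * (math.log(phi[k][v]) - log_gm)
    return scores


def ecdf(values):
    """Return a function x -> share of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def frex(phi, w=0.5):
    """Weighted harmonic mean of the exclusivity rank and the probability rank."""
    K, V = len(phi), len(phi[0])
    col_sums = [sum(phi[j][v] for j in range(K)) for v in range(V)]
    scores = []
    for k in range(K):
        # Exclusivity: share of word v's total probability held by topic k.
        excl = [phi[k][v] / col_sums[v] for v in range(V)]
        # Assumption: both ECDFs are computed over the words of topic k.
        f_excl, f_prob = ecdf(excl), ecdf(phi[k])
        scores.append([
            1.0 / (w / f_excl(excl[v]) + (1 - w) / f_prob(phi[k][v]))
            for v in range(V)
        ])
    return scores


def top_words(scores_k, vocab, n=2):
    """Return the n words with the highest score in one topic."""
    order = sorted(range(len(vocab)), key=lambda v: scores_k[v], reverse=True)
    return [vocab[v] for v in order[:n]]


by_probability = top_words(phi[0], vocab, 1)
by_term_score = top_words(term_score(phi)[0], vocab, 1)
by_frex = top_words(frex(phi, w=0.7)[0], vocab, 1)
```

In this toy matrix, "the" is the most probable word in both topics, so it ranks first under "probability". Its term-score is zero because its probability equals the geometric mean across topics, and "bank" takes the top rank for the first topic under both re-ranking methods.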