perplexity
Methods for Function perplexity
Determine the perplexity of a fitted model.
- Keywords
- methods
Usage
perplexity(object, newdata, ...)
"perplexity"(object, newdata, control, ...)
"perplexity"(object, newdata, control, use_theta = TRUE,
estimate_theta = TRUE, ...)
"perplexity"(object, newdata, control, use_theta = TRUE,
estimate_theta = TRUE, ...)
Arguments
- object
- Object of class "TopicModel" or "Gibbs_list".
- newdata
- If missing, the perplexity for the data to which the model was fitted is determined. For objects fitted using Gibbs sampling, newdata needs to be specified.
- control
- If missing, the control of the fitted model is used with suitable changes of the relevant parameters (see Details).
- use_theta
- Object of class "logical". If TRUE, the estimated topic distributions for the documents are used. Otherwise equal weights are assigned to the topics for each document.
- estimate_theta
- Object of class "logical". If FALSE, the data provided is assumed to be the same as the data used for fitting the model. The topic distributions therefore do not need to be estimated, and the data in newdata is used for weighting the term-document occurrences.
- ...
- Further arguments passed to the different methods.
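To make the effect of use_theta concrete, here is a minimal base R sketch of the usual perplexity definition (the exponential of the negative log-likelihood per token, as in Blei et al. 2003). It is not the package's internal code. The helper `sketch_perplexity` and the toy matrices `counts`, `beta` and `theta` are assumptions made up for illustration.

```r
# Hypothetical helper (not part of topicmodels): perplexity from a
# document-term count matrix, topic-term probabilities (beta, one row per
# topic) and per-document topic weights (theta, one row per document).
sketch_perplexity <- function(counts, beta, theta = NULL) {
  k <- nrow(beta)
  if (is.null(theta)) {
    # Analogue of use_theta = FALSE: equal weight 1/k for every topic
    theta <- matrix(1 / k, nrow = nrow(counts), ncol = k)
  }
  word_prob <- theta %*% beta   # documents x terms
  observed <- counts > 0        # only terms that occur contribute
  log_lik <- sum(counts[observed] * log(word_prob[observed]))
  exp(-log_lik / sum(counts))
}

# Toy data: 2 documents, 3 terms, 2 topics (all values made up)
counts <- matrix(c(4, 1, 0,
                   0, 2, 3), nrow = 2, byrow = TRUE)
beta <- matrix(c(0.7, 0.2, 0.1,
                 0.1, 0.3, 0.6), nrow = 2, byrow = TRUE)
theta <- matrix(c(0.9, 0.1,
                  0.1, 0.9), nrow = 2, byrow = TRUE)

p_theta <- sketch_perplexity(counts, beta, theta)  # estimated topic weights
p_equal <- sketch_perplexity(counts, beta)         # equal topic weights
```

In this toy setting the documents are dominated by one topic each, so the estimated weights give a lower perplexity than equal weights.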
Details
The specified control is modified to ensure that (1) estimate.beta=FALSE and (2) nstart=1.
For "Gibbs_list" objects the control is further modified to have (1) iter=thin and (2) best=TRUE, and the model is fitted to the new data with this control for each available iteration. The perplexity is then determined by averaging over the same number of iterations.
If a list is supplied as object, it is assumed that it consists of several models which were fitted using different starting configurations.
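The control changes described above can be sketched on a plain list. This is only an illustration: in the package the control is a control object rather than a list, and the helper `sketch_adjust_control` and its `gibbs_list` flag are hypothetical names, not part of topicmodels.

```r
# Hypothetical helper (not exported by topicmodels): apply the control
# changes described in Details to a plain list of control parameters.
sketch_adjust_control <- function(control, gibbs_list = FALSE) {
  control$estimate.beta <- FALSE   # (1) estimate.beta = FALSE
  control$nstart <- 1              # (2) nstart = 1
  if (gibbs_list) {
    control$iter <- control$thin   # (1) iter = thin
    control$best <- TRUE           # (2) best = TRUE
  }
  control
}

# Made-up starting values for illustration
ctrl <- list(estimate.beta = TRUE, nstart = 5,
             iter = 2000, thin = 100, best = FALSE)
adj <- sketch_adjust_control(ctrl, gibbs_list = TRUE)
str(adj)
```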
Value
A numeric value.
References
Blei D.M., Ng A.Y., Jordan M.I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993--1022.
Griffiths T.L., Steyvers M. (2004). Finding Scientific Topics. Proceedings of the National Academy of Sciences of the United States of America, 101, Suppl. 1, 5228--5235.
Newman D., Asuncion A., Smyth P., Welling M. (2009). Distributed Algorithms for Topic Models. Journal of Machine Learning Research, 10, 1801--1828.
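Examples
A short usage sketch, assuming the topicmodels package and its bundled AssociatedPress data are installed. The subset sizes, the number of topics and the control values are arbitrary choices for illustration, not recommendations.

```r
library(topicmodels)

data("AssociatedPress", package = "topicmodels")
train <- AssociatedPress[1:20, ]
held_out <- AssociatedPress[21:30, ]

# Fit a small LDA model with the default VEM method
fit_vem <- LDA(train, k = 2, control = list(seed = 1))

# Without newdata: perplexity for the data the model was fitted to
p_train <- perplexity(fit_vem)
# With newdata: perplexity for held-out documents
p_new <- perplexity(fit_vem, newdata = held_out)

# For models fitted by Gibbs sampling, newdata has to be supplied
fit_gibbs <- LDA(train, k = 2, method = "Gibbs",
                 control = list(seed = 1, burnin = 100, iter = 200))
p_gibbs <- perplexity(fit_gibbs, newdata = train)
```

Lower values indicate that the model assigns higher probability to the documents being scored.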