Estimate an LDA model using, for example, the VEM algorithm or Gibbs sampling.
Usage
LDA(x, k, method = "VEM", control = NULL, model = NULL, ...)
Arguments
x
Object of class "DocumentTermMatrix" with
term-frequency weighting or an object coercible to a
"simple_triplet_matrix" with integer entries.
k
Integer; number of topics.
method
The method to be used for fitting; currently
method = "VEM" and method = "Gibbs" are
supported.
control
A named list of the control parameters for estimation
or an object of class "LDAcontrol".
model
Object of class "LDA" for initialization.
…
Optional arguments. For method = "Gibbs" an
additional argument seedwords can be specified as a matrix or
an object of class "simple_triplet_matrix"; the default is
NULL.
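As a rough illustration of how these arguments fit together, the sketch below fits a two-topic model with each supported method. It assumes the AssociatedPress example data shipped with the topicmodels package; the subset size, seed, alpha, burnin and iter values are arbitrary choices for a quick run, not recommended settings.

```r
library(topicmodels)

# Example corpus shipped with topicmodels: a DocumentTermMatrix
# with term-frequency weighting. Only 20 documents are used to keep
# the run short (an arbitrary choice for this sketch).
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]

# VEM fit (the default method); control is passed as a named list.
fit_vem <- LDA(dtm, k = 2, method = "VEM",
               control = list(seed = 1234, alpha = 0.1))

# Gibbs sampling fit with a short, purely illustrative chain.
fit_gibbs <- LDA(dtm, k = 2, method = "Gibbs",
                 control = list(seed = 1234, burnin = 100, iter = 200))
```

Setting seed in the control list makes repeated runs comparable; without it, results can differ between runs.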
Value
LDA() returns an object of class "LDA".
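The fitted object is usually inspected through accessor functions rather than directly. The following is a minimal sketch, again assuming the AssociatedPress example data from topicmodels and an arbitrary seed; it uses the package's terms(), topics() and posterior() functions.

```r
library(topicmodels)

data("AssociatedPress", package = "topicmodels")
fit <- LDA(AssociatedPress[1:20, ], k = 2, control = list(seed = 1234))

class(fit)        # concrete class of the fitted model
terms(fit, 5)     # five most probable terms for each topic
topics(fit)       # most likely topic for each document

# posterior() returns a list with the term distribution of each topic
# and the topic distribution of each document.
post <- posterior(fit)
dim(post$terms)   # topics x terms
dim(post$topics)  # documents x topics
```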
Details
The C code for LDA from David M. Blei and co-authors is used to
estimate and fit a latent Dirichlet allocation model with the VEM
algorithm. For Gibbs sampling, the C++ code from Xuan-Hieu Phan and
co-authors is used.
When Gibbs sampling is used to fit the model, seed words and their
additional weights for the prior parameters can be specified in
order to fit seeded topic models.
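A seeded fit might look like the sketch below. Several things here are assumptions rather than documented facts from this page: that seedwords has one row per topic and one column per term of x, that its entries are the additional prior weights, and that a weight of 1 is a sensible value. The seed terms are simply the four most frequent terms in the subset, picked so the example runs without guessing the vocabulary.

```r
library(topicmodels)
library(slam)

data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]
k <- 2

# Illustrative seed terms: the four most frequent terms in the subset.
# The first two are assigned to topic 1, the other two to topic 2.
top <- order(col_sums(dtm), decreasing = TRUE)[1:4]

# Assumed layout: k rows (topics) by ncol(dtm) columns (terms), with the
# additional prior weight as the entry. The weight of 1 is hypothetical.
seedwords <- simple_triplet_matrix(
  i = c(1L, 1L, 2L, 2L),
  j = top,
  v = rep(1, 4),
  nrow = k,
  ncol = ncol(dtm)
)

# Seed words are only available with method = "Gibbs".
fit_seeded <- LDA(dtm, k = k, method = "Gibbs",
                  seedwords = seedwords,
                  control = list(seed = 1234, burnin = 100, iter = 200))

terms(fit_seeded, 5)
```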
References
Blei D.M., Ng A.Y., Jordan M.I. (2003).
Latent Dirichlet Allocation.
Journal of Machine Learning Research, 3, 993-1022.
Phan X.H., Nguyen L.M., Horiguchi S. (2008).
Learning to Classify Short and Sparse Text & Web with Hidden Topics
from Large-scale Data Collections.
In Proceedings of the 17th International World Wide Web Conference
(WWW 2008), pages 91-100, Beijing, China.
Lu B., Ott M., Cardie C., Tsou B.K. (2011).
Multi-aspect Sentiment Analysis with Topic Models.
In Proceedings of the 2011 IEEE 11th International Conference on Data
Mining Workshops, pages 81-88.