seededlda (version 0.8.4)

divergence: [Experimental] Compute the divergence of topics

Description

Compute the divergence of topics. This can be used to search for the optimal number of topics for LDA.

Usage

divergence(x, weighted = TRUE, min_size = 0.01, select = NULL)

Arguments

x

an LDA model fitted by textmodel_seededlda() or textmodel_lda().

weighted

if TRUE, weight the divergence scores by the sizes of topics.

min_size

the minimum size of topics that can increase the average divergence. Ignored when weighted = FALSE.

select

names of topics for which the divergence is computed.

Details

divergence() computes the average Jensen-Shannon divergence between all pairs of topic vectors in x$phi. The divergence score is maximized when the chosen number of topics k is optimal (Deveaud et al., 2014).
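The pairwise computation described above can be sketched in base R. This is an illustration only, not the package's implementation: jsd() and mean_pairwise_jsd() are hypothetical helper names, the toy phi matrix is made up, and the sketch assumes that each row of phi is one topic's word distribution, that the logarithm is base 2, and that no weighting by topic size or min_size filtering is applied.

```r
# Jensen-Shannon divergence between two probability vectors (log base 2,
# so the result lies in [0, 1]). Hypothetical helper, not part of seededlda.
jsd <- function(p, q) {
  m <- (p + q) / 2
  # Kullback-Leibler divergence, skipping zero entries of `a` (0 * log 0 = 0)
  kl <- function(a, b) {
    nz <- a > 0
    sum(a[nz] * log2(a[nz] / b[nz]))
  }
  (kl(p, m) + kl(q, m)) / 2
}

# Unweighted mean of the divergence over all pairs of rows (topics).
mean_pairwise_jsd <- function(phi) {
  pairs <- combn(nrow(phi), 2)  # one column per pair of topic indices
  mean(apply(pairs, 2, function(ij) jsd(phi[ij[1], ], phi[ij[2], ])))
}

# Toy topic-word matrix: 3 topics (rows) over 4 words (columns)
phi <- rbind(
  c(0.70, 0.10, 0.10, 0.10),
  c(0.10, 0.70, 0.10, 0.10),
  c(0.25, 0.25, 0.25, 0.25)
)
mean_pairwise_jsd(phi)
```

Identical topics give a divergence of 0 and topics with no words in common give 1, so a higher average indicates topics that are more distinct from one another.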

References

Deveaud, Romain et al. (2014). "Accurate and Effective Latent Concept Modeling for Ad Hoc Information Retrieval". Document Numérique. doi:10.3166/DN.17.1.61-84.

See Also

sizes
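
Examples

A minimal sketch of searching for the number of topics, as suggested in the Description. The candidate values of k, the preprocessing steps, the reduced max_iter and the helper best_k() are illustrative assumptions, not recommendations from the package authors. The sketch also assumes that divergence() returns a single number for each fitted model. The model-fitting part runs only if quanteda and seededlda are installed.

```r
# Hypothetical helper: return the candidate k with the highest score.
# `scores` is a numeric vector named by the candidate values of k.
best_k <- function(scores) as.integer(names(scores)[which.max(scores)])

if (requireNamespace("quanteda", quietly = TRUE) &&
    requireNamespace("seededlda", quietly = TRUE)) {
  library(quanteda)
  library(seededlda)

  # Build a document-feature matrix from a corpus shipped with quanteda
  toks <- tokens(data_corpus_inaugural, remove_punct = TRUE,
                 remove_numbers = TRUE)
  toks <- tokens_remove(toks, stopwords("en"))
  dfmat <- dfm_trim(dfm(toks), min_termfreq = 5)

  # Fit one LDA model per candidate k and record its divergence score
  ks <- c(5, 10, 15)  # illustrative candidates
  scores <- sapply(ks, function(k) {
    lda <- textmodel_lda(dfmat, k = k, max_iter = 200)
    divergence(lda)
  })
  names(scores) <- ks

  print(scores)
  print(best_k(scores))  # candidate k with the highest divergence
}
```

LDA fitting is stochastic, so the scores vary between runs. Comparing several fits per k gives a steadier picture than a single run.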