Plot perplexity score of various LDA models.
plot_perplexity(
  df,
  start = 2,
  end = 5,
  stopwords = stopwords_miretrieve,
  method = "gibbs",
  control = NULL,
  col.abstract = Abstract,
  col.pmid = PMID,
  title = NULL
)
df: Data frame containing abstracts and PubMed-IDs.
start: Integer. Minimum number of topics k for the LDA model to fit. Must be >= 2.
end: Integer. Maximum number of topics k for the LDA model to fit.
stopwords: Data frame containing stop words.
method: String. Either "gibbs" or "VEM".
control: Control parameters for LDA modeling. For more information, see the documentation of the LDAcontrol class in the topicmodels package.
col.abstract: Column containing abstracts.
col.pmid: Column containing PubMed-IDs.
title: String. Plot title.
Elbow plot displaying perplexity scores of different LDA models.
Plot perplexity score of various LDA models. plot_perplexity() fits different LDA models for k topics in the range between start and end. For each LDA model, the perplexity score is plotted against the corresponding value of k. Plotting the perplexity score of various LDA models can help identify the optimal number of topics to fit an LDA model for. plot_perplexity() is based on LDA() from the package topicmodels.
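To make the plotted quantity concrete, the following base R sketch computes a perplexity score using the common definition exp(-log-likelihood / number of tokens), where lower values indicate a better fit. It is an illustration only, not the code plot_perplexity() runs: the helper perplexity_score() and its inputs log_lik and n_tokens are hypothetical names introduced here.

```r
# Minimal sketch of a perplexity score, assuming the common definition
# exp(-log-likelihood / total number of tokens).
# `perplexity_score`, `log_lik` and `n_tokens` are hypothetical names and
# are not part of miRetrieve or topicmodels.
perplexity_score <- function(log_lik, n_tokens) {
  stopifnot(is.numeric(log_lik), is.numeric(n_tokens), n_tokens > 0)
  exp(-log_lik / n_tokens)
}

# Toy input: a model that assigns probability 1/4 to each of 10 tokens.
toy_log_lik <- 10 * log(0.25)

# Evaluates to 4, up to floating-point error: the model is as uncertain
# as a uniform choice among 4 words.
perplexity_score(toy_log_lik, n_tokens = 10)
```

In an elbow plot, the number of topics k is often chosen near the point where the perplexity score stops dropping noticeably as k increases.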
Other LDA functions: assign_topic_lda(), fit_lda(), plot_lda_term()