This function is primarily intended to be called internally by screen_topics, but is made available so that users can generate their own topic models with the same properties as those in revtools. It takes the words in the title, keywords and abstract of each supplied reference and uses them to construct a document-term matrix (DTM).
The function applies standard text-processing steps: stemming, conversion to lower case, and removal of numbers and punctuation. It also replaces each stemmed word with the most common full word for that stem, which doesn't affect the calculations but makes the resulting analyses easier to interpret. It doesn't use part-of-speech tagging.
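The steps described above can be illustrated with a minimal base R sketch. It is not the revtools implementation: the input strings are made up, and toy_stem is a hypothetical stand-in that only drops a trailing "s", where a real stemmer would do considerably more.

```r
# Toy input: three short "abstracts" (illustrative data, not from revtools)
docs <- c(
  "Forests store carbon; 25 forests were sampled.",
  "Forest carbon storage varies.",
  "Sampling forests in 2010."
)

# 1. Convert to lower case, then replace numbers and punctuation with spaces
clean <- tolower(docs)
clean <- gsub("[[:digit:][:punct:]]+", " ", clean)

# 2. Split into individual words and drop empty strings
words <- unlist(strsplit(clean, "[[:space:]]+"))
words <- words[nzchar(words)]

# 3. Crude stand-in stemmer (hypothetical): drop a single trailing "s"
toy_stem <- function(w) sub("s$", "", w)
stems <- toy_stem(words)

# 4. For each stem, find the most common full word that maps to it
most_common <- tapply(words, stems, function(w) names(which.max(table(w))))

# 5. Relabel every word with the most common full word for its stem
relabelled <- as.vector(most_common[stems])

# Compare the original words, their stems, and the relabelled terms
print(data.frame(word = words, stem = stems, term = relabelled))
```

Here "forest" and "forests" share a stem, and both are relabelled with whichever full word is more frequent in the input. Because the relabelling is one-to-one with the stems, term counts are unchanged; only the labels differ.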
Words that occur in 2 entries or fewer are always removed by make_dtm, so values of min_freq that result in a threshold below this will not affect the result. Values of max_freq are applied as given.
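The interaction between the fixed floor and min_freq can be sketched in base R. This is a hedged illustration only: keep_terms is a hypothetical helper rather than a revtools function, the counts are invented, and it assumes that min_freq and max_freq are proportions of entries with inclusive bounds and the default values shown.

```r
# Hypothetical helper (not part of revtools): decide which terms to retain
keep_terms <- function(doc_counts, n_docs, min_freq = 0.01, max_freq = 0.85) {
  prop <- doc_counts / n_docs
  doc_counts > 2 &      # fixed floor: drop terms found in 2 entries or fewer
    prop >= min_freq &  # user-set lower bound (assumed inclusive)
    prop <= max_freq    # user-set upper bound (assumed inclusive)
}

# Number of entries containing each term, out of 100 entries (made-up data)
doc_counts <- c(rare = 2, uncommon = 3, typical = 40, ubiquitous = 95)

# A min_freq below the floor changes nothing: "rare" is dropped either way
low  <- keep_terms(doc_counts, n_docs = 100, min_freq = 0.001)
base <- keep_terms(doc_counts, n_docs = 100, min_freq = 0.01)
print(identical(low, base))

# Raising min_freq above the floor does take effect: "uncommon" is now dropped
high <- keep_terms(doc_counts, n_docs = 100, min_freq = 0.05)
print(rbind(base, high))
```

With 100 entries, a min_freq of 0.001 or 0.01 corresponds to a threshold of at most 1 entry, which is below the fixed floor, so both settings retain the same terms. The upper bound has no such floor, so "ubiquitous" is removed purely because of max_freq.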
This function is synonymous with the earlier function make_DTM, which will be removed from future versions of revtools.