Usage
manyTopics(documents, vocab, K, prevalence, content, data = NULL,
  max.em.its = 100, verbose = TRUE, init.type = "LDA", emtol = 1e-05,
  seed = NULL, runs = 50, frexw = .7, net.max.em.its = 2,
  netverbose = FALSE, M = 10, ...)
Arguments
documents
The documents to be modeled. Object must be a list with each
element corresponding to a document. Each document is represented
as an integer matrix with two rows, and columns equal to the number
of unique vocabulary words in the document.
vocab
Character vector specifying the words in the corpus in the order of
the vocab indices in documents. Each term in the vocabulary index must
appear at least once in the documents. See prepDocuments.
K
A vector of positive integers representing the desired
number of topics for separate runs of selectModel.
prevalence
A formula object with no response variable or a matrix containing
topic prevalence covariates. Use s(), ns() or bs() to specify
smooth terms. See details for more information.
content
A formula containing a single variable, a factor variable or
something which can be coerced to a factor indicating the
category of the content variable for each document.
runs
Total number of STM runs used in the cast net stage. Approximately 15 percent of these runs will be used for running an STM until convergence.
data
Dataset which contains prevalence and content covariates.
init.type
The method of initialization. See stm.
seed
Seed for the random number generator. stm saves the seed
it uses on every run so that any result can be exactly
reproduced. When attempting to reproduce a result with that seed,
it should be specified here.
max.em.its
The maximum number of EM iterations. If convergence has not
been met at this point, a message will be printed.
emtol
Convergence tolerance.
verbose
A logical flag indicating whether information should be
printed to the screen.
frexw
Weight used to calculate exclusivity.
net.max.em.its
Maximum EM iterations used when casting the net.
netverbose
Whether verbose should be used when calculating net models.
M
Number of words used to calculate semantic coherence and
exclusivity. Defaults to 10.
...
Additional options described in details of stm.
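
The documents and vocab arguments above are the easiest to get wrong, so a minimal sketch of their expected shape may help. It assumes the usual stm convention that the first row of each matrix holds 1-based indices into vocab and the second row holds the term counts. The toy words, the helper is_valid_doc and the wrapper fit_many_topics are hypothetical names made up for illustration; they are not part of stm. The wrapper is only defined, not called, because a real call needs the stm package and a real corpus.

```r
# Toy vocabulary (hypothetical words, for illustration only).
vocab <- c("economy", "election", "health", "tax")

# Each document is a two-row integer matrix with one column per unique term.
# Assumption: row 1 = 1-based index into vocab, row 2 = count of that term.
documents <- list(
  rbind(c(1L, 2L), c(2L, 1L)),          # "economy" x2, "election" x1
  rbind(c(2L, 3L, 4L), c(1L, 3L, 1L)),  # "election" x1, "health" x3, "tax" x1
  rbind(c(1L, 4L), c(1L, 2L))           # "economy" x1, "tax" x2
)

# Hypothetical helper: check the structure described under "documents".
is_valid_doc <- function(d) {
  is.matrix(d) && is.integer(d) && nrow(d) == 2 &&
    all(d[1, ] >= 1L) && all(d[1, ] <= length(vocab))
}
stopifnot(all(vapply(documents, is_valid_doc, logical(1))))

# Every vocabulary entry must appear at least once in the documents.
used <- sort(unique(unlist(lapply(documents, function(d) d[1, ]))))
stopifnot(identical(used, seq_along(vocab)))

# Hypothetical wrapper showing how the arguments line up in a call.
# Defined only; calling it requires the stm package and a real corpus.
fit_many_topics <- function(documents, vocab, meta, prevalence) {
  stm::manyTopics(documents, vocab, K = c(3, 4),
                  prevalence = prevalence, data = meta, runs = 10)
}
```

In practice the documents and vocab objects would normally come from prepDocuments rather than being built by hand, and a call such as fit_many_topics(docs, vocab, meta, ~ x), with x a hypothetical covariate in meta, would fit one selectModel search per value of K.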