stm (version 1.1.3)

searchK: Computes diagnostic values for models with different values of K (number of topics).

Description

With user-specified initialization, this function runs selectModel for different user-specified topic numbers and computes diagnostic properties for the returned model. These include exclusivity, semantic coherence, heldout likelihood, bound, lbound, and residual.

Usage

searchK(documents, vocab, K, init.type = "Spectral", N=floor(.1*length(documents)), proportion=.5, heldout.seed=NULL, M=10,...)

Arguments

documents
The documents to be used for the stm model
vocab
The vocabulary to be used for the stm model
K
A vector of different topic numbers
init.type
The method of initialization. See stm for options. Note that the default option here is different from the main function.
N
Number of docs to be partially held out
proportion
Proportion of the words in each partially held-out document to hold out.
heldout.seed
If desired, a seed to use when holding out documents for later heldout likelihood computation
M
M value for exclusivity computation
...
Other diagnostics parameters.
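
The defaults for N and proportion can be sketched in base R as follows. This is an illustration only, not the package's implementation: the toy corpus is invented, the names held_idx, tokens_per_doc and tokens_held are hypothetical, and it assumes that proportion applies to the tokens within each selected document.

```r
# Toy corpus in stm's document format: each document is a 2-row matrix
# (row 1 = vocabulary index, row 2 = count). All values are invented.
documents <- replicate(
  25,
  matrix(c(1L, 2L, 3L, 4L,
           3L, 1L, 2L, 4L), nrow = 2, byrow = TRUE),
  simplify = FALSE
)

# Default from the Usage line: 10% of the documents, rounded down.
N <- floor(.1 * length(documents))

# Assumption: `proportion` is the share of tokens removed from each
# partially held-out document.
proportion <- .5

set.seed(2138)
held_idx <- sort(sample(seq_along(documents), N))

# Tokens per selected document, and how many of them would be held out.
tokens_per_doc <- vapply(documents[held_idx], function(d) sum(d[2, ]), numeric(1))
tokens_held <- floor(proportion * tokens_per_doc)
```

With 25 documents the default gives N = 2, so only two documents lose part of their text. On small corpora it may be worth setting N explicitly.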

Value

exclus
Exclusivity of each model.
semcoh
Semantic coherence of each model.
heldout
Heldout likelihood for each model.
residual
Residual for each model.
bound
Bound for each model.
lbound
lbound for each model.
em.its
Total number of EM iterations used in fitting the model.
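
One way to compare candidate values of K is to tabulate these diagnostics side by side. The sketch below uses a hypothetical data frame whose columns mirror the names above. The numbers are invented for illustration, and it assumes the diagnostics can be arranged one row per K, which this page does not state.

```r
# Hypothetical diagnostics table: the numbers are made up and are not
# output from any fitted model.
results <- data.frame(
  K        = c(5, 10, 15),
  exclus   = c(8.9, 9.3, 9.5),
  semcoh   = c(-60.1, -72.4, -85.0),
  heldout  = c(-5.12, -5.05, -5.08),
  residual = c(1.40, 1.22, 1.18)
)

# A higher heldout likelihood indicates better out-of-sample fit,
# so list the candidates from best to worst on that measure.
ranked <- results[order(results$heldout, decreasing = TRUE), ]

# K with the highest heldout likelihood in this toy table.
best_heldout <- results$K[which.max(results$heldout)]
```

No single measure settles the choice. Exclusivity and semantic coherence typically pull in opposite directions as K grows, so the table is best read as a whole rather than by one column.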

Details

See the vignette for interpretation of each of these measures.

See Also

plot.searchK, make.heldout

Examples


## Not run: 
library(stm)

# Preprocess the gadarian example data that ships with stm
temp <- textProcessor(documents = gadarian$open.ended.response, metadata = gadarian)
out <- prepDocuments(temp$documents, temp$vocab, temp$meta)
documents <- out$documents
vocab <- out$vocab
meta <- out$meta

set.seed(02138)
K <- c(5, 10, 15)  # candidate numbers of topics
kresult <- searchK(documents, vocab, K, prevalence = ~treatment + s(pid_rep), data = meta)
plot(kresult)
## End(Not run)
 