topicmodels (version 0.1-4)

ldaformat2dtm: Transform data from and for use with the lda package

Description

ldaformat2dtm() transforms data in the format used by package lda into a document-term matrix. This format can be used to fit topic models with package topicmodels.

dtm2ldaformat() transforms a document-term matrix into the LDA format used by package lda.

Usage

ldaformat2dtm(documents, vocab)
dtm2ldaformat(x)

Arguments

documents
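A list where each entry corresponds to a document; for each document the terms occurring in the document are stored in a matrix with two rows, such that in each column the first entry corresponds to the vocabulary index of the term (0-based, as used by package lda) and the second entry to the number of times the term occurs in the document.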
vocab
A "character" vector of the terms in the vocabulary.
x
An object of class "DocumentTermMatrix" as defined in package tm.
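
The layout of the documents argument can be illustrated with a small hand-built example. This is only a sketch in base R: the toy vocabulary and counts are made up for illustration, and the loop builds a dense count matrix rather than the sparse "DocumentTermMatrix" that ldaformat2dtm() returns.

```r
# Hypothetical toy data, not taken from package lda.
vocab <- c("topic", "model", "word")

# One 2-row integer matrix per document:
# row 1 = 0-based index into vocab, row 2 = count of that term.
documents <- list(
  matrix(c(0L, 2L,    # "topic" occurs twice
           2L, 1L),   # "word" occurs once
         nrow = 2),
  matrix(c(1L, 3L),   # "model" occurs three times
         nrow = 2)
)

# Expand into a dense documents-by-terms count matrix (base R only).
counts <- matrix(0L, nrow = length(documents), ncol = length(vocab),
                 dimnames = list(NULL, vocab))
for (d in seq_along(documents)) {
  doc <- documents[[d]]
  # Shift the 0-based term ids to R's 1-based column positions.
  counts[d, doc[1, ] + 1L] <- doc[2, ]
}
counts
```

Each row of counts corresponds to one document and each column to one vocabulary term, which is the same orientation a document-term matrix uses.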

Value

ldaformat2dtm() returns an object of class "DocumentTermMatrix"; dtm2ldaformat() returns a list with components "documents" and "vocab".
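
To make the shape of the list returned in the reverse direction concrete, the following base R sketch converts a small dense count matrix into a list with components "documents" and "vocab". The helper to_lda_format() is hypothetical and written only for illustration; it is not the implementation of dtm2ldaformat(), which operates on a "DocumentTermMatrix".

```r
# Hypothetical dense count matrix: 2 documents, 3 terms.
counts <- matrix(c(2L, 0L, 0L, 3L, 1L, 0L), nrow = 2,
                 dimnames = list(NULL, c("topic", "model", "word")))

# Hypothetical helper, for illustration only.
to_lda_format <- function(m) {
  documents <- lapply(seq_len(nrow(m)), function(d) {
    idx <- which(m[d, ] > 0L)          # 1-based columns with nonzero counts
    rbind(as.integer(idx - 1L),        # row 1: 0-based vocabulary index
          as.integer(m[d, idx]))       # row 2: term count
  })
  list(documents = documents, vocab = colnames(m))
}

res <- to_lda_format(counts)
str(res)
```

Terms with a zero count are left out of each document's matrix, so the list stores only the terms that actually occur.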

Examples

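# Run only if package lda (which provides the example data) is available
if (require("lda")) {
  data("cora.documents", package = "lda")
  data("cora.vocab", package = "lda")
  # lda format -> document-term matrix
  dtm <- ldaformat2dtm(cora.documents, cora.vocab)
  # document-term matrix -> lda format
  cora <- dtm2ldaformat(dtm)
  # Compare the round trip with the original data
  all.equal(cora, list(documents = cora.documents,
                       vocab = cora.vocab))
}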