polmineR (version 0.6.1)

as.TermDocumentMatrix: as.TermDocumentMatrix / as.DocumentTermMatrix

Description

Methods for type conversion that generate objects of the classes "TermDocumentMatrix" or "DocumentTermMatrix" defined in the "tm" package. Both classes inherit from the "simple_triplet_matrix" class defined in the "slam" package. The "topicmodels" package, for instance, requires a "DocumentTermMatrix" as input.

Usage

as.TermDocumentMatrix(x, ...)
"as.TermDocumentMatrix"(x, pAttribute, sAttribute, from = NULL, to = NULL, strucs = NULL, rmBlank = TRUE, verbose = TRUE, robust = FALSE, mc = FALSE)
"as.TermDocumentMatrix"(x, col, pAttribute = NULL, verbose = TRUE)
"as.DocumentTermMatrix"(x, col)
"as.TermDocumentMatrix"(x, pAttribute = NULL, col = NULL, verbose = TRUE)
"as.DocumentTermMatrix"(x, pAttribute = NULL, col = NULL, verbose = TRUE)

Arguments

x
some object
...
further arguments passed to methods (included to satisfy R CMD check)
pAttribute
the p-attribute
sAttribute
the s-attribute
from
bla
to
bla
strucs
bla
rmBlank
bla
verbose
bla
robust
bla
mc
logical
col
the column to use for assembling the matrix

Value

A "TermDocumentMatrix" (from as.TermDocumentMatrix) or a "DocumentTermMatrix" (from as.DocumentTermMatrix).

Details

The type conversion method can be applied to objects of the class "bundle", or of classes inheriting from the "bundle" class. If counts or some other measure are present in the "stat" slots of the objects in the bundle, the values in the column indicated by "col" become the values of the sparse matrix that is generated. A special case is generating the sparse matrix from a "partitionBundle" that does not yet include counts. In this case, a "pAttribute" needs to be provided, and counting is performed as part of the conversion.
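To illustrate the idea of turning one column of per-document tables into a sparse matrix, here is a minimal sketch that uses only the "slam" package. It is not polmineR's implementation: the list `stats` and its tables are hypothetical stand-ins for the "stat" slots of the objects in a bundle, and `col` simply mirrors the argument of the same name.

```r
library(slam)

# Hypothetical per-document count tables (stand-ins for the "stat" slots)
stats <- list(
  doc_a = data.frame(word = c("budget", "debate"), count = c(3L, 1L),
                     stringsAsFactors = FALSE),
  doc_b = data.frame(word = c("debate", "vote"), count = c(2L, 5L),
                     stringsAsFactors = FALSE)
)
col <- "count"  # the column whose values fill the matrix

# Vocabulary: every term that occurs in any document
terms <- unique(unlist(lapply(stats, function(df) df$word), use.names = FALSE))

# Triplet representation: row index (term), column index (document), value
i <- unlist(lapply(stats, function(df) match(df$word, terms)), use.names = FALSE)
j <- rep(seq_along(stats), times = vapply(stats, nrow, integer(1)))
v <- unlist(lapply(stats, function(df) df[[col]]), use.names = FALSE)

stm <- simple_triplet_matrix(
  i = i, j = j, v = v,
  nrow = length(terms), ncol = length(stats),
  dimnames = list(Terms = terms, Docs = names(stats))
)

# Dense view for inspection; terms are rows, documents are columns
m <- as.matrix(stm)
```

If the "tm" package is installed, a triplet matrix like `stm` can presumably be coerced with `tm::as.TermDocumentMatrix(stm, weighting = tm::weightTf)`; the sketch stops at the "slam" object so that it does not depend on "tm" or on a corpus being available.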

Examples

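# Run only if the sample corpus and the rcqp backend are available
if (require(polmineR.sampleCorpus) && require(rcqp)){
   use("polmineR.sampleCorpus")
   # Partition matching every date, then a bundle of partitions defined by text_date
   p <- partition("PLPRBTTXT", text_date = ".*", regex = TRUE)
   pB <- partitionBundle(p, def = list(text_date = NULL))

   # Variant 1: add counts first, then build the matrix from the "count" column
   pB <- enrich(pB, pAttribute = "word")
   tdm <- as.TermDocumentMatrix(pB, col = "count")

   # Variant 2: bundle without counts; supplying pAttribute triggers the counting
   pB2 <- partitionBundle(p, def = list(text_date = NULL))
   tdm <- as.TermDocumentMatrix(pB2, pAttribute = "word")
}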
