tm (version 0.5-10)

makeChunks: Split a Corpus into Chunks

Description

Split a corpus into equally sized chunks while conserving document boundaries, so that no chunk spans more than one document.

Usage

makeChunks(corpus, chunksize)

Arguments

corpus
The corpus to be split into chunks.
chunksize
The chunk size, given in the same units that length reports for a document (see Examples).

Value

A corpus consisting of the chunks. Note that corpus metadata is not passed on to the newly created chunk corpus.

Examples

txt <- system.file("texts", "txt", package = "tm")
ovid <- Corpus(DirSource(txt))
sapply(ovid, length)
ovidChunks <- makeChunks(ovid, 5)
sapply(ovidChunks, length)
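
To illustrate what chunking while conserving document boundaries means, the following is a minimal base R sketch that does not use tm. The helper chunk_lines and the toy documents are hypothetical and are not part of the package; the sketch assumes each document is a character vector of lines and that the last chunk of a document may be shorter than chunksize.

```r
# Hypothetical helper (not part of tm): split each document separately so
# that no chunk ever mixes lines from two different documents.
chunk_lines <- function(docs, chunksize) {
  chunks <- list()
  for (doc in docs) {
    # Group index 1, 1, ..., 2, 2, ... with at most `chunksize` lines per group
    groups <- ceiling(seq_along(doc) / chunksize)
    chunks <- c(chunks, unname(split(doc, groups)))
  }
  chunks
}

# Toy input (assumed for illustration): two documents of 7 and 3 lines
docs <- list(
  a = paste("a line", 1:7),
  b = paste("b line", 1:3)
)

chunks <- chunk_lines(docs, 5)
sapply(chunks, length)  # 5 2 3
```

The first document yields chunks of 5 and 2 lines and the second a single chunk of 3 lines. The chunk of 2 lines is not topped up with lines from the second document, which is the sense in which document boundaries are conserved.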
