This function can be used to chunk a corpus in order to control sample sizes.
chunk_texts(corpus, size)
A quanteda corpus object where each text is a chunk of the size requested.
corpus: A quanteda corpus.
size: The size of the chunks in number of tokens.
corpus <- quanteda::corpus(c("The cat sat on the mat", "The dog sat on the chair"))
quanteda::docvars(corpus, "author") <- c("A", "B")
chunk_texts(corpus, size = 2)
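To make the idea of fixed-size chunks concrete, here is a minimal base R sketch of splitting a text into consecutive groups of `size` tokens. It is not the implementation of `chunk_texts()`: the helper name `chunk_tokens_sketch` is hypothetical, tokens are assumed to be whitespace-separated words, and a shorter final chunk is assumed to be kept, which may differ from how the real function tokenizes text or handles leftovers.

```r
# Hypothetical helper (not part of any package): split one text into
# consecutive chunks of `size` whitespace-separated tokens.
chunk_tokens_sketch <- function(text, size) {
  # Assumption: a token is any run of non-whitespace characters
  tokens <- strsplit(text, "\\s+")[[1]]
  # Give each token a chunk index: 1, 1, 2, 2, ... when size = 2
  chunk_id <- ceiling(seq_along(tokens) / size)
  # Paste each group of tokens back into a single string
  unname(vapply(split(tokens, chunk_id), paste, character(1), collapse = " "))
}

# Same two texts as in the example, named by author
texts <- c(A = "The cat sat on the mat", B = "The dog sat on the chair")
chunks <- lapply(texts, chunk_tokens_sketch, size = 2)
chunks$A
```

With `size = 2`, each six-word text yields three two-token chunks under these assumptions. Keeping the chunks grouped by the original text name mirrors the idea of carrying metadata such as `author` along with each chunk.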