
idiolect (version 1.0.1)

chunk_texts: Chunk a corpus

Description

This function splits the texts of a corpus into chunks of a fixed number of tokens, which makes it possible to control sample sizes.

Usage

chunk_texts(corpus, size)

Value

A quanteda corpus object where each text is a chunk of the size requested.

Arguments

corpus

A quanteda corpus.

size

The size of each chunk, in number of tokens.

Examples

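library(idiolect)
# Two short texts, each labelled with its author
corpus <- quanteda::corpus(c("The cat sat on the mat", "The dog sat on the chair"))
quanteda::docvars(corpus, "author") <- c("A", "B")
# Split every text into chunks of 2 tokens
chunk_texts(corpus, size = 2)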
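
To illustrate what chunking by token count means, here is a minimal base-R sketch. It is not the implementation used by chunk_texts: the helper chunk_tokens is hypothetical, tokenisation is a naive split on spaces, and keeping a shorter final chunk when the text does not divide evenly is an assumption, since the documentation above does not say how chunk_texts treats leftover tokens.

```r
# Hypothetical helper (not part of idiolect): split a vector of tokens
# into consecutive chunks of `size` tokens each.
chunk_tokens <- function(tokens, size) {
  stopifnot(is.numeric(size), length(size) == 1, size >= 1)
  # Map each token position to a chunk index, e.g. 1, 1, 2, 2, 3, 3 for size = 2
  group <- ceiling(seq_along(tokens) / size)
  # split() groups tokens by chunk index; unname() drops the index labels
  unname(split(tokens, group))
}

# Naive whitespace tokenisation, for illustration only
tokens <- strsplit("The cat sat on the mat", " ", fixed = TRUE)[[1]]

# Six tokens with size = 2 give three chunks of two tokens each
chunks <- chunk_tokens(tokens, size = 2)

# Five tokens with size = 2: in this sketch the last chunk is shorter (an assumption)
uneven <- chunk_tokens(tokens[1:5], size = 2)
```

In this sketch every chunk except possibly the last has exactly `size` tokens, which is the property that makes chunked samples comparable in size.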

