quanteda (version 2.1.2)

tokens_sample: Randomly sample documents from a tokens object

Description

Randomly sample tokenized documents from a tokens object, with or without replacement. This works like sample(), except that the units sampled are documents, and each sampled document keeps its associated document-level variables.

Usage

tokens_sample(x, size = ndoc(x), replace = FALSE, prob = NULL)

Arguments

x

the tokens object whose documents will be sampled

size

a positive number, the number of documents to select; defaults to ndoc(x), the number of documents in x

replace

logical; should sampling be with replacement?

prob

a vector of probability weights, one per document in x, giving the relative chance that each document is selected

Value

A tokens object containing size documents drawn from the tokens object x.

See Also

sample

Examples

# NOT RUN {
set.seed(10)
toks <- tokens(data_corpus_inaugural[1:10])
head(toks)
head(tokens_sample(toks))
head(tokens_sample(toks, replace = TRUE))
# }
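
The example above uses the default size, which returns every document in shuffled order. As a further sketch, the size and prob arguments could be used as follows. The weights taken from ntoken() are only an illustrative assumption (longer documents are made more likely to be drawn); they are not a recommendation from the package documentation.

```r
library(quanteda)

set.seed(10)
toks <- tokens(data_corpus_inaugural[1:10])

# Draw 3 of the 10 documents without replacement
samp <- tokens_sample(toks, size = 3)
ndoc(samp)
docnames(samp)

# Weighted sampling with replacement: prob needs one weight per document.
# Token counts are used here purely as illustrative weights.
w <- ntoken(toks)
samp_w <- tokens_sample(toks, size = 5, replace = TRUE, prob = w)
ndoc(samp_w)
```

As with sample(), the weights in prob do not need to sum to one.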
