quanteda (version 0.9.2-0)

texts: get corpus texts

Description

Get the texts in a quanteda corpus object, optionally concatenated by group. Also works on plain character vectors, provided groups is a factor.

Usage

texts(x, groups = NULL, ...)

## S3 method for class 'corpus'
texts(x, groups = NULL, ...)

## S3 method for class 'character'
texts(x, groups = NULL, ...)

## S3 method for class 'corpusSource'
texts(x, groups = NULL, ...)

Arguments

x
A quanteda corpus object, or a character vector of texts
groups
character vector containing the names of document variables in a corpus, or a factor equal in length to the number of documents, used for aggregating the texts through concatenation. If x is of type character, then groups must be a factor.
...
unused
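
To make "aggregating the texts through concatenation" concrete, the following is a minimal base-R sketch of the idea, not quanteda's actual implementation. The object names (txts, grp, grouped) are hypothetical, and the single-space separator is an assumption; the separator quanteda uses may differ.

```r
# Three toy documents (hypothetical data, not from inaugTexts)
txts <- c(d1 = "First text.", d2 = "Second text.", d3 = "Third text.")

# One group label per document; must be the same length as txts
grp <- factor(c("A", "B", "A"))

# Split the texts by group, then paste each group's texts together.
# ASSUMPTION: a single space is used as the separator.
grouped <- vapply(split(txts, grp), paste, character(1), collapse = " ")

# One element per factor level, named by level
print(grouped)
print(nchar(grouped))
```

The result has one element per level of the factor rather than one per document, which is why the nchar() values in the grouped examples are sums over several documents (plus any separators).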

Value

  • For texts, a character vector of the texts in the corpus. For texts <-, the corpus with the updated texts.

Examples

nchar(texts(subset(inaugCorpus, Year < 1806)))

# grouping on a document variable
nchar(texts(subset(inaugCorpus, Year < 1806), groups = "President"))

# grouping a character vector using a factor
nchar(inaugTexts[1:5])
nchar(texts(inaugTexts[1:5], groups = as.factor(inaugCorpus[1:5, "President"])))