Get or replace the texts in a corpus, with grouping options.
Works for plain character vectors too, if groups is a factor.
texts(x, groups = NULL, spacer = " ")
texts(x) <- value
# S3 method for corpus
as.character(x, ...)
x: a corpus or character object
groups: either a character vector containing the names of document variables to be used for grouping, or a factor (or an object that can be coerced into a factor) equal in length or rows to the number of documents. See groups for details.
spacer: when concatenating texts by using groups, this will be the spacing added between texts. (Default is two spaces.)
value: character vector of the new texts
...: unused
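The combined effect of groups and spacer can be illustrated without the package. The following is a minimal base-R sketch, assuming that grouping simply concatenates the texts that share a factor level, separated by spacer. The helper name group_texts is hypothetical and this is not the package's own implementation.

```r
# Hypothetical helper (not part of quanteda): concatenate texts by group.
group_texts <- function(txt, groups, spacer = " ") {
  groups <- as.factor(groups)
  stopifnot(length(txt) == length(groups))
  # split() keeps the original document order within each factor level
  pieces <- split(txt, groups)
  # collapse each group's texts into one string, named by factor level
  vapply(pieces, paste, character(1), collapse = spacer)
}

txt <- c(d1 = "First text.", d2 = "Second text.", d3 = "Third text.")
grouped <- group_texts(txt, groups = c("A", "B", "A"))
grouped
```

The result has one element per factor level, so documents d1 and d3 end up in a single string named "A".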
For texts, a character vector of the texts in the corpus.
For texts <-, the corpus with its texts replaced by value.
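The texts(x) <- value form relies on R's replacement-function mechanism, which is also what makes an indexed assignment such as texts(x)[2] <- "..." work. The sketch below shows that mechanism with toy stand-ins. The names toy_corpus and toy_texts are hypothetical, and the length check is an assumption of this sketch rather than documented behaviour of the package.

```r
# Toy stand-ins (hypothetical, not quanteda's classes or functions).
toy_corpus <- function(txt) structure(list(txt = txt), class = "toy_corpus")

# Getter: return the stored character vector.
toy_texts <- function(x) x$txt

# A replacement function is named `name<-` and takes the new data as `value`.
`toy_texts<-` <- function(x, value) {
  # Assumption of this sketch: one replacement text per existing document.
  stopifnot(is.character(value), length(value) == length(x$txt))
  x$txt <- value
  x
}

tc <- toy_corpus(c("one", "two"))
# R expands this to: get the texts, modify element 2, then call the setter.
toy_texts(tc)[2] <- "New text number 2."
toy_texts(tc)
```

Because the setter returns the modified object, the assignment updates tc in the calling environment while leaving its class intact.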
as.character(x), where x is a corpus, is equivalent to calling texts(x).
# NOT RUN {
nchar(texts(corpus_subset(data_corpus_inaugural, Year < 1806)))
# grouping on a document variable
nchar(texts(corpus_subset(data_corpus_inaugural, Year < 1806), groups = "President"))
# grouping a character vector using a factor
nchar(data_char_ukimmig2010[1:5])
nchar(texts(data_corpus_inaugural[1:5],
            groups = as.factor(data_corpus_inaugural[1:5, "President"])))
BritCorpus <- corpus(c("We must prioritise honour in our neighbourhood.",
                       "Aluminium is a valourous metal."))
texts(BritCorpus) <-
    stringi::stri_replace_all_regex(texts(BritCorpus),
                                    c("ise", "([nlb])our", "nium"),
                                    c("ize", "$1or", "num"),
                                    vectorize_all = FALSE)
texts(BritCorpus)
texts(BritCorpus)[2] <- "New text number 2."
texts(BritCorpus)
# }