quanteda (version 1.3.0)

docvars: Get or set document-level variables

Description

Get or set variables associated with a document in a corpus, tokens or dfm object.

Usage

docvars(x, field = NULL)

docvars(x, field = NULL) <- value

Arguments

x

corpus, tokens, or dfm object whose document-level variables will be read or set

field

string containing the document-level variable name

value

the new values of the document-level variable

Value

docvars returns a data.frame of the document-level variables, dropping the second dimension to form a vector if a single docvar is returned.

docvars<- assigns value to the named field.
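As a rough illustration of the return shapes described above, the following minimal sketch builds a tiny corpus and reads and sets its document variables. The texts and the variable names year and label are hypothetical, made up for this example. It assumes that corpus() accepts a data.frame through its docvars argument and that assigning to a field that does not yet exist creates it.

```r
library(quanteda)

# Hypothetical three-document corpus; texts and docvar names are illustrative only
txts <- c(d1 = "First document.", d2 = "Second document.", d3 = "A third one.")
corp <- corpus(txts, docvars = data.frame(year = c(2001, 2002, 2003)))

# field = NULL (the default): all document variables come back as a data.frame
dv_all <- docvars(corp)

# a single named field: the second dimension is dropped, giving a vector
years <- docvars(corp, "year")

# assignment form: set one value per document under a new field name
# (assumes an unused field name creates a new document variable)
docvars(corp, "label") <- c("a", "b", "c")
labels <- docvars(corp, "label")
```

Supplying one value per document, as here, avoids relying on any recycling behaviour in the assignment.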

Index access to docvars in a corpus

Docvars can also be read and set by indexing the corpus on its j element, such as data_corpus_irishbudget2010[, c("foren", "name")]; or, for a single docvar, data_corpus_irishbudget2010[["name"]]. The latter form also permits assignment, including the easy creation of new document variables, e.g. data_corpus_irishbudget2010[["newvar"]] <- 1:ndoc(data_corpus_irishbudget2010). See [.corpus for details.

Examples

# NOT RUN {
# retrieving docvars from a corpus
head(docvars(data_corpus_inaugural))
tail(docvars(data_corpus_inaugural, "President"), 10)

# assigning document variables to a corpus
corp <- data_corpus_inaugural
docvars(corp, "President") <- paste0("prez", seq_len(ndoc(corp)))
head(docvars(corp))

# alternative using indexing
head(corp[, "Year"])
corp[["President2"]] <- paste0("prezTwo", seq_len(ndoc(corp)))
head(docvars(corp))

# }
