quanteda (version 0.7.2-1)

summary.corpus: summarize a corpus or a vector of texts

Description

Displays information about a corpus or a vector of texts. For a corpus, this includes attributes and metadata such as the number of texts, the date of creation, and the source. For texts, prints to the console a description of the texts, including the number of types, tokens, and sentences.

Usage

## S3 method for class 'corpus':
summary(object, n = 100, verbose = TRUE,
  showmeta = FALSE, ...)

## S3 method for class 'character':
summary(object, verbose = TRUE, ...)

describeTexts(object, verbose = TRUE, ...)

Arguments

object
corpus or texts to be summarized
n
maximum number of texts to describe, default=100
verbose
set to FALSE to turn off printed output, for instance if you simply want to assign the output to a data.frame
showmeta
for a corpus, set to TRUE to include document-level meta-data
...
additional arguments affecting the summary produced

Examples

# summarize corpus information
summary(inaugCorpus, n=10)
mycorpus <- corpus(ukimmigTexts, docvars=data.frame(party=names(ukimmigTexts)), enc="UTF-8")
summary(mycorpus, showmeta=TRUE, n=10)  # show the meta-data
mysummary <- summary(mycorpus, verbose=FALSE)  # (quietly) assign the results
mysummary$Types / mysummary$Tokens             # crude type-token ratio
#
# summarize texts
summary(c("testing this text", "and this one"))
summary(ukimmigTexts)
myTextSummaryDF <- summary(ukimmigTexts, verbose=FALSE)
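
The crude type-token ratio computed above divides the number of distinct word forms (types) by the number of running words (tokens). The following is a minimal base-R sketch of that arithmetic, included only to make the two counts concrete. It does not use quanteda and does not reproduce its tokenizer: the helper `count_types_tokens()` is hypothetical, and its simple lower-casing and splitting rule is an assumption, so its counts may differ from those reported by `summary()`.

```r
# Hypothetical helper, NOT part of quanteda: counts tokens and types
# using a naive split on anything that is not a letter, digit, or apostrophe.
count_types_tokens <- function(texts) {
  # lower-case so that "This" and "this" count as one type
  toks <- strsplit(tolower(texts), "[^[:alnum:]']+")
  # drop empty strings that a leading separator would produce
  toks <- lapply(toks, function(x) x[nzchar(x)])
  data.frame(
    Tokens = vapply(toks, length, integer(1)),
    Types  = vapply(toks, function(x) length(unique(x)), integer(1))
  )
}

counts <- count_types_tokens(c("testing this text", "and this one, this one"))
counts$TTR <- counts$Types / counts$Tokens  # crude type-token ratio
print(counts)
```

The second text repeats "this" and "one", so it has fewer types than tokens and a ratio below 1, while the first text has no repeated words and a ratio of exactly 1.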
