quanteda (version 0.7.2-1)

ntoken: count the number of tokens

Description

Returns the count of tokens in a text or corpus. Here "tokens" means every word occurrence, not unique words, and the texts are not cleaned before counting.
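
To illustrate the distinction between tokens and unique words, here is a minimal base R sketch. It assumes a naive whitespace split, which is not necessarily how ntoken tokenizes internally; the variable names are illustrative only.

```r
# Illustration only: naive whitespace tokenization in base R,
# not quanteda's own tokenizer.
txt <- "the cat sat on the mat"
toks <- strsplit(txt, "\\s+")[[1]]

n_tokens <- length(toks)          # every word occurrence is counted
n_types  <- length(unique(toks))  # distinct words only

print(c(tokens = n_tokens, types = n_types))
```

The repeated word "the" is counted twice as a token but only once as a unique word, so the two counts differ.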

Usage

ntoken(x, block.size = 200, verbose = TRUE)

## S3 method for class 'corpus'
ntoken(x, block.size = 200, verbose = TRUE)

## S3 method for class 'character'
ntoken(x, block.size = 200, verbose = TRUE)

Arguments

x
texts or corpus whose tokens will be counted
block.size
how many texts to process at a time; experimentation indicates that for very large collections of texts, 200 seems fastest
verbose
if TRUE print progress indicator and time elapsed
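
The idea behind block.size can be sketched in base R as follows. This is an assumption-laden illustration, not quanteda's implementation: count_tokens_blocked is a hypothetical helper, and it splits on whitespace rather than using the package's tokenizer.

```r
# Hypothetical helper, not part of quanteda: count whitespace-delimited
# tokens in a character vector, processing block_size texts at a time.
count_tokens_blocked <- function(texts, block_size = 200) {
  if (length(texts) == 0) return(0)
  total <- 0
  starts <- seq(1, length(texts), by = block_size)
  for (s in starts) {
    # Take the next block, stopping at the end of the vector
    block <- texts[s:min(s + block_size - 1, length(texts))]
    # Trim first so leading whitespace does not yield an empty token
    total <- total + sum(lengths(strsplit(trimws(block), "\\s+")))
  }
  total
}

count_tokens_blocked(c("a b c", "d e", "f"), block_size = 2)
```

Processing in blocks bounds how many tokenized texts are held in memory at once, which is one plausible reason a moderate block size performs well on very large collections.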

Value

  • scalar count of the total tokens

Examples

ntoken(inaugTexts, verbose=FALSE)
ntoken(inaugCorpus, verbose=FALSE)