sylcount (version 0.2-1)

doc_counts: doc_counts

Description

Computes some basic document counts (see the 'Value' section below for details).

The function is vectorized by document, and the counts are computed in parallel via OpenMP. The nthreads parameter controls the number of threads used.

Usage

doc_counts(s, nthreads = sylcount.nthreads())

Arguments

s

A character vector (vector of strings).

nthreads

Number of threads to use. By default it will use the total number of cores + hyperthreads.

Value

A dataframe containing:

chars      the total number of characters
wordchars  the number of alphanumeric characters
words      the number of text tokens that are probably English-language words
nonwords   the number of text tokens that are probably not English-language words
sents      the number of sentences recognized in the text
sylls      the total number of syllables (ignores all non-words)
polys      the total number of "polysyllables", i.e. words with 3+ syllables
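
To make the first two columns concrete, here is a rough base-R approximation of chars and wordchars. It is only a sketch and not sylcount's implementation: the names approx_chars and approx_wordchars are made up for illustration, and it is an assumption that sylcount treats whitespace and punctuation the same way nchar() and the [:alnum:] class do.

```r
# Base-R approximation of two of the counts; NOT sylcount's own code.
docs <- c(
  "I am the very model of a modern major general.",
  "I have information vegetable, animal, and mineral."
)

# nchar() counts every character in each string, spaces and punctuation included
approx_chars <- nchar(docs)

# Drop everything that is not a letter or digit, then count what is left
approx_wordchars <- nchar(gsub("[^[:alnum:]]", "", docs))

# One row per document, mirroring the shape of the documented return value
approx <- data.frame(chars = approx_chars, wordchars = approx_wordchars)
print(approx)
```

The remaining columns (words, sents, sylls, polys) depend on the package's own tokenizing and syllable counting, so they have no comparably simple base-R stand-in.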

Details

The function is essentially just readability() without the readability scores.
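
One way to check this relationship on your own installation is to compare the column names of the two results. The sketch below assumes that readability() accepts the same s and nthreads arguments as doc_counts() and returns a data frame; it does not assume any particular column names, it only reports what the two results have in common.

```r
# Guarded so the script still runs where sylcount is not installed.
if (requireNamespace("sylcount", quietly = TRUE)) {
  docs <- c(
    "I am the very model of a modern major general.",
    "I have information vegetable, animal, and mineral."
  )

  counts <- sylcount::doc_counts(docs, nthreads = 1)
  # Assumption: readability() takes the same arguments as doc_counts()
  scores <- sylcount::readability(docs, nthreads = 1)

  # Columns present in both results (expected to be the count columns)
  shared <- intersect(names(counts), names(scores))
  # Columns returned only by readability() (expected to be the scores)
  extra <- setdiff(names(scores), names(counts))

  print(shared)
  print(extra)
}
```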

See Also

readability

Examples

library(sylcount)

# Two short documents, one string per document
a <- "I am the very model of a modern major general."
b <- "I have information vegetable, animal, and mineral."

# One row of counts per document, computed on a single thread
doc_counts(c(a, b), nthreads = 1)
