
quanteda.textstats: textual statistics for quanteda

About

Contains the textstat functions formerly in quanteda. For more details, see https://quanteda.io.

How to Install

Install from CRAN in the usual way, using your R GUI or:

install.packages("quanteda.textstats") 

Or for the latest development version:

# remotes package required to install quanteda.textstats from GitHub
remotes::install_github("quanteda/quanteda.textstats")

Because this compiles some C++ and Fortran source code, you will need to have installed the appropriate compilers.

If you are using a Windows platform, this means you will also need to install the Rtools software available from CRAN.

If you are using macOS, you should install the macOS tools, namely the Clang 6.x compiler and the GNU Fortran compiler (as quanteda.textstats requires gfortran to build). If you are still getting errors related to gfortran, follow the fixes here.
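
As a rough first check before building from source, the sketch below uses only base R to see which build tools are visible on the PATH. The list of tool names is an assumption chosen for illustration, not an official requirement list, and a missing entry does not always mean a build will fail (on Windows, Rtools may only be added to the PATH during installation).

```r
# Look up common build tools on the PATH (base R only).
# The tool names below are illustrative assumptions, not a definitive list.
tools <- Sys.which(c("gcc", "g++", "clang", "gfortran", "make"))

# An empty string means the tool was not found on the PATH.
print(tools)

missing_tools <- names(tools)[tools == ""]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}
```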

How to cite

Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. (2018) “quanteda: An R package for the quantitative analysis of textual data”. Journal of Open Source Software. 3(30), 774. https://doi.org/10.21105/joss.00774.

For a BibTeX entry, use the output from citation(package = "quanteda.textstats").
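
A minimal sketch of getting that BibTeX entry, assuming the package is already installed. `toBibtex()` comes from the utils package that ships with R; the variable name `cit` is only for illustration.

```r
# Retrieve the citation object for the installed package
cit <- citation(package = "quanteda.textstats")

# Print the citation as a BibTeX entry
print(toBibtex(cit))
```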

Monthly Downloads: 4,691
Version: 0.97.2
License: GPL-3
Maintainer: Kenneth Benoit
Last Published: September 3rd, 2024

Functions in quanteda.textstats (0.97.2)

nsyllable.tokens
nsyllable methods for tokens

quanteda.textstats-package
quanteda.textstats: Textual Statistics for the Quantitative Analysis of Textual Data

textstat_collocations
Identify and score multi-word expressions

textstat_readability
Calculate readability

textstat_proxy
[Experimental] Compute document/feature proximity

textstat_summary
Summarize documents as syntactic and lexical feature counts

nscrabble
Count the Scrabble letter values of text

textstat_entropy
Compute entropies of documents or features

textstat_simil
Similarity and distance computation between documents or features

textstat_select
Select rows of textstat objects by glob, regex or fixed patterns

head.textstat_proxy
Return the first or last part of a textstat_proxy object

data_char_wordlists
Word lists for readability statistics

get_docvars
Internal function to extract docvars

check_dots
Check arguments passed to other functions via ...

compute_lexdiv_stats
Compute lexical diversity from a dfm or tokens

as.list.textstat_proxy
textstat_simil/dist coercion methods

compute_msttr
Compute the Mean Segmental Type-Token Ratio (MSTTR)

dfm_split_hyphenated_features
Split a dfm's hyphenated features into constituent parts

as.matrix,textstat_simil_sparse-method
as.matrix method for textstat_simil_sparse

diag2na
Convert same-value pairs to NA in a textstat_proxy object

compute_mattr
Compute the Moving-Average Type-Token Ratio (MATTR)

textstat_frequency
Tabulate feature frequencies

textstat_keyness
Calculate keyness statistics

textstat_lexdiv
Calculate lexical diversity

textstat_proxy-class
textstat_simil/dist classes
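
To give a feel for how a few of the functions listed above fit together, here is a short sketch on three toy documents. The texts and variable names are invented for illustration, and the example assumes that both quanteda and quanteda.textstats are installed. It is one possible workflow, not the only way to call these functions.

```r
library(quanteda)
library(quanteda.textstats)

# Three toy documents, invented for this illustration
txt <- c(d1 = "The cat sat on the mat.",
         d2 = "The dog sat on the log.",
         d3 = "Cats and dogs are common household pets.")

# Tokenize and build a document-feature matrix (features are lowercased)
toks <- tokens(txt, remove_punct = TRUE)
dfmat <- dfm(toks)

# Tabulate the five most frequent features across all documents
freq <- textstat_frequency(dfmat, n = 5)
print(freq)

# Lexical diversity: type-token ratio for each document
ttr <- textstat_lexdiv(dfmat, measure = "TTR")
print(ttr)

# Cosine similarity between each pair of documents
sim <- textstat_simil(dfmat, method = "cosine", margin = "documents")
print(as.matrix(sim))

# Flesch reading ease, computed from the raw text
flesch <- textstat_readability(txt, measure = "Flesch")
print(flesch)
```

The frequency, lexical diversity, and readability results print as data frames with one row per feature or document, while the similarity result can be turned into an ordinary matrix with `as.matrix()`.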