About
An R package for managing and analyzing text, created by Kenneth Benoit. Supported by the European Research Council grant ERC-2011-StG 283794-QUANTESS.
For more details, see https://quanteda.io.
How to Install
Install the released version from CRAN in the usual way, using your R GUI or:
install.packages("quanteda")
Or for the latest development version:
# devtools package required to install quanteda from Github
devtools::install_github("quanteda/quanteda")
Because this compiles some C++ and Fortran source code, you will need to have installed the appropriate compilers.
If you are using a Windows platform, you will also need to install the Rtools software available from CRAN.
If you are using macOS, you should install the macOS tools, namely the Clang 6.x compiler and the GNU Fortran compiler (as quanteda requires gfortran to build). If you are still getting errors related to gfortran, follow the fixes here.
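Before installing the development version, it can help to confirm that R can find a compiler toolchain. The following is a minimal sketch, assuming the separate pkgbuild package is available (it is not part of quanteda). It only reports whether a working build toolchain is found, so gfortran problems on macOS may still need to be diagnosed separately.

```r
# Assumption: pkgbuild is an optional helper package, not a quanteda dependency.
if (requireNamespace("pkgbuild", quietly = TRUE)) {
  # Returns TRUE if a working compiler toolchain is found;
  # debug = TRUE prints diagnostic output about what was tried.
  has_tools <- pkgbuild::has_build_tools(debug = TRUE)
} else {
  has_tools <- NA
  message("pkgbuild is not installed; try install.packages(\"pkgbuild\")")
}
has_tools
```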
quanteda version 3: New major release
quanteda 3.0 is a major release that improves functionality, completes the modularisation of the package begun in v2.0, further improves function consistency by removing previously deprecated functions, and enhances workflow stability and consistency by deprecating some shortcut steps built into some functions.
See https://github.com/quanteda/quanteda/blob/master/NEWS.md#quanteda-30 for a full list of the changes.
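As an illustration of the more explicit workflow, the sketch below builds each object in its own step: a corpus, then tokens, then a document-feature matrix. It assumes that the deprecated shortcuts include calling dfm() directly on a character or corpus object; the NEWS file linked above is the authoritative list. The example texts are made up for illustration.

```r
library(quanteda)

# Two short, made-up documents
txt <- c(doc1 = "The quick brown fox.", doc2 = "The lazy dog sleeps.")

corp  <- corpus(txt)                        # step 1: create a corpus
toks  <- tokens(corp, remove_punct = TRUE)  # step 2: tokenise explicitly
dfmat <- dfm(toks)                          # step 3: build the dfm from tokens

ndoc(dfmat)       # number of documents
featnames(dfmat)  # features (lowercased by default)
```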
The quanteda family of packages
As of v3.0, we have continued our trend of splitting quanteda into modular packages. These are now the following:
- quanteda: contains all of the core natural language processing and textual data management functions
- quanteda.textmodels: contains all of the text models and supporting functions, namely the textmodel_*() functions. This was split from the main package with the v2 release
- quanteda.textstats: statistics for textual data, namely the textstat_*() functions, split with the v3 release
- quanteda.textplots: plots for textual data, namely the textplot_*() functions, split with the v3 release
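Because of this split, functions outside the core package need their own package to be installed and loaded. A minimal sketch, assuming quanteda.textstats has been installed from CRAN and using made-up example texts:

```r
library(quanteda)
library(quanteda.textstats)  # provides the textstat_*() functions

# Made-up documents for illustration
toks  <- tokens(c(d1 = "apple banana apple", d2 = "banana cherry"))
dfmat <- dfm(toks)

# textstat_frequency() lives in quanteda.textstats, not in quanteda itself
freq <- textstat_frequency(dfmat)
head(freq)
```

The same pattern should apply to the textmodel_*() and textplot_*() functions with their respective packages.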
We are working on additional package releases, available in the meantime from our GitHub pages:
- quanteda.sentiment: Functions and lexicons for sentiment analysis using dictionaries
- quanteda.tidy: Extensions for manipulating document variables in core quanteda objects using your favourite tidyverse functions
and more to come.
How to Use
See the quick start guide to learn how to use quanteda.
How to cite
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. (2018) “quanteda: An R package for the quantitative analysis of textual data”. Journal of Open Source Software. 3(30), 774. https://doi.org/10.21105/joss.00774.
For a BibTeX entry, use the output from citation(package = "quanteda").
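One way to turn that output into a BibTeX entry is sketched below, using toBibtex() from the utils package that ships with R. The output file name in the commented line is hypothetical.

```r
# Convert the package citation to BibTeX format
bib <- utils::toBibtex(citation(package = "quanteda"))
print(bib)

# Optionally save it for use in a bibliography (hypothetical file name):
# writeLines(bib, "quanteda.bib")
```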
Leaving Feedback
If you like quanteda, please consider leaving feedback or a testimonial here.
Contributing
Contributions in the form of feedback, comments, code, and bug reports are most welcome. How to contribute:
- Fork the source code, modify, and issue a pull request through the project GitHub page. See our Contributor Code of Conduct and the all-important quanteda Style Guide.
- Issues, bug reports, and wish lists: File a GitHub issue.
- Usage questions: Submit a question on the quanteda channel on StackOverflow.
- Contact the maintainer by email.