A newer version (4.2.0) of this package is available.

quanteda (version 0.9.4)

Quantitative Analysis of Textual Data

Description

A fast, flexible toolset for the management, processing, and quantitative analysis of textual data in R.

Install

install.packages('quanteda')

Monthly Downloads

24,111

Version

0.9.4

License

GPL-3

Maintainer

Kenneth Benoit

Last Published

February 21st, 2016

Functions in quanteda (0.9.4)

corpus

constructor for corpus objects

changeunits

change the document units of a corpus

ntoken

count the number of tokens or types

head.dfm

Return the first or last part of a dfm

summary.corpus

summarize a corpus or a vector of texts

kwic

List key words in context from a text or a corpus of texts.

docnames

get or set document names

segment

segment texts into component elements

dfm-class

Virtual class "dfm" for a document-feature matrix

ie2010Corpus

Irish budget speeches from 2010

corpusSource-class

corpus source classes

metacorpus

get or set corpus metadata

show,dictionary-method

print a dictionary object

textmodel

fit a text model

sample

Randomly sample documents or features

trim

Trim a dfm using threshold-based or random feature selection

print.dfm

print a dfm object

selectFeatures

select features from an object

ngrams

Create ngrams and skipgrams

stopwords

access built-in stopwords

metadoc

get or set document-level meta-data

applyDictionary

apply a dictionary or thesaurus to an object

exampleString

A paragraph of text for testing various text-based functions

toLower

Convert texts to lower case

lexdiv

calculate lexical diversity

subset.corpus

extract a subset of a corpus

quanteda-package

An R package for the quantitative analysis of textual data

sort.dfm

sort a dfm by one or more margins

ndoc

get the number of documents or features

wordstem

stem words

encodedTexts

encoded texts for testing

scrabble

compute the Scrabble letter values of text

predict.textmodel_NB_fitted

prediction method for Naive Bayes classifier objects

textmodel_wordfish

wordfish text model

removeFeatures

remove features from an object

texts

get corpus texts

textmodel_ca

correspondence analysis of a document-feature matrix

tf

compute (weighted) term frequency from a dfm

wordlists

word lists used in some readability indexes

LBGexample

dfm with example data from Table 1 of Laver, Benoit, and Garry (2003)

docvars

get or set document-level variables

features

extract the feature labels from a dfm

nsentence

count the number of sentences

similarity

compute similarities between documents and/or features

plot.kwic

plot a dispersion plot of key word(s)

ukimmigTexts

Immigration-related sections of 2010 UK party manifestos

topfeatures

list the most frequent features

dfm

create a document-feature matrix

plot.dfm

plot features as a wordcloud

readability

calculate readability

encoding

detect the encoding of texts

textmodel_fitted-class

the fitted textmodel classes

collocations

Detect collocations from text

inaugCorpus

A corpus of US presidential inaugural addresses from 1789-2013

convert

convert a dfm to a non-quanteda format

tfidf

compute tf-idf weights from a dfm

syllables

count syllables in a text

print.tokenizedTexts

print a tokenizedTexts object

textmodel_NB

Naive Bayes classifier for texts

tokenize

tokenize a set of texts

docfreq

compute the (weighted) document frequency of a feature

dictionary

create a dictionary

phrasetotoken

convert phrases into single tokens

settings

Get or set the corpus settings

weight

weight the feature frequencies in a dfm by various methods

textfile

read a text corpus source from a file

textmodel_wordscores

Wordscores text model