polmineR (version 0.8.9)

Verbs and Nouns for Corpus Analysis

Description

Package for corpus analysis using the Corpus Workbench ('CWB') as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.

Install

install.packages('polmineR')
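
Once the package is installed, a first session might look like the sketch below. It assumes that the GERMAPARLMINI sample corpus bundled with polmineR is available and that `use("polmineR")` registers it for the session; the query term "Integration" is only an illustration. The calls sit behind a `requireNamespace()` guard so the snippet does nothing where the package is missing.

```r
# Minimal first-session sketch. Assumption: polmineR and its bundled
# GERMAPARLMINI sample corpus are installed on this machine.
hits <- NULL

if (requireNamespace("polmineR", quietly = TRUE)) {
  library(polmineR)

  # Add the corpora shipped inside the package to the session registry
  use("polmineR")

  # Frequency of an illustrative query term in the sample corpus
  hits <- count("GERMAPARLMINI", query = "Integration")
  print(hits)

  # Keyword-in-context object for the same query; printing it in an
  # interactive session should display the concordance lines
  concordances <- kwic("GERMAPARLMINI", query = "Integration")
}
```

The same pattern (corpus ID plus `query`) is expected to carry over to other verbs in the function list, such as `cooccurrences()`; check each help page for the exact arguments.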

Monthly Downloads

342

Version

0.8.9

License

GPL-3

Maintainer

Andreas Blaette

Last Published

October 29th, 2023

Functions in polmineR (0.8.9)

corpus-methods

Corpus class methods
cooccurrences

Get cooccurrence statistics.
capitalize

Capitalize character vector.
chisquare

Perform chi-square test.
corpus-class

Corpus class initialization
context-class

Context class.
cooccurrences-class

Cooccurrences class.
context_bundle-class

S4 context_bundle class
count

Get counts.
cqp

Tools for CQP queries.
enrich

Enrich an object.
dispersion

Dispersion of a query or multiple queries.
decode-method

Decode corpus or subcorpus.
encoding

Get and set encoding.
features-class

Feature selection by comparison.
encodings

Conversion between corpus and native encoding.
context

Analyze context of a node word.
dotplot

dotplot
count_class

Count class.
cpos

Get corpus positions for a query or queries.
is_nested

Check whether s-attributes of corpus are nested
hits_class

S4 class to represent hits for queries.
href-function

Add hypertext reference to html document.
hits

Get hits for query
html

Generate html from object.
kwic-class

S4 kwic class
highlight-method

Highlight tokens in text output.
get_type

Get corpus/partition type.
get_token_stream

Get Token Stream.
features

Get features by comparison.
means

Calculate means.
kwic

Perform keyword-in-context (KWIC) analysis.
ll

Compute Log-likelihood Statistics.
ocpu_exec

Execute code on OpenCPU server
p_attributes

Get p-attributes.
ngrams

Get N-Grams
ngrams_class

Ngrams class.
partition

Initialize a partition.
partition_bundle-class

Bundle of partitions (partition_bundle class).
noise

Detect noise.
pmi

Calculate Pointwise Mutual Information (PMI).
polmineR-defunct

Defunct functionality
phrases-class

Manage and use phrases
partition_to_string

Decode as String.
partition_bundle

Generate bundle of partitions.
partition_class

Partition class and methods.
ranges-class

Ranges of query matches.
ranges

Get ranges for query.
polmineR-package

polmineR-package
size

Get Number of Tokens.
polmineR-generics

Generic methods defined in the polmineR package
reexports

Objects exported from other packages
read

Display full text.
registry_reset

Reset registry directory.
registry_get_name

Evaluate registry file.
registry_move

Get registry and data directories.
regions

Regions of a CWB corpus.
s_attributes

Get s-attributes.
subset-method

Subsetting corpora and subcorpora
renamed

Renamed Functions
terms

Get terms in partition or corpus.
get_template

Get template for formatting full text output.
slice

Virtual class slice.
subcorpus

The S4 subcorpus class.
subcorpus_bundle-class

Bundled subcorpora
tree_structure

Show the structure of s-attributes
trim

Trim an object.
weigh

Apply Weight to Matrix
tooltips-method

Add tooltips to text output.
textstat-class

S4 textstat superclass.
t_test

Perform t-test.
view

Inspect object using View().
use

Add corpora in R data packages to session registry.
as.sparseMatrix

Type conversion - get sparseMatrix.
as.TermDocumentMatrix

Generate TermDocumentMatrix / DocumentTermMatrix.
blapply

Apply a function over a list or bundle.
as.speeches

Split corpus or partition into speeches.
as.VCorpus

Get VCorpus.
bundle-class

Bundle Class
annotations

Annotation functionality
as.markdown

Get markdown-formatted full text of a partition.
Cooccurrences-class

Cooccurrences class for corpus/partition.
Cooccurrences,corpus-method

Get all cooccurrences in corpus/partition.
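
Several entries in the list above (`pmi`, `ll`, `chisquare`, `t_test`) name association measures for co-occurrences. As a rough orientation, the sketch below computes the textbook form of pointwise mutual information from raw counts in base R. The helper `pmi_from_counts()` and all counts are hypothetical illustration values; polmineR's own `pmi()` may define its counts and context windows differently.

```r
# Textbook pointwise mutual information (PMI) from raw counts, base R only.
# pmi_from_counts() is a hypothetical helper, not part of polmineR, and the
# counts used below are made-up illustration values.
pmi_from_counts <- function(joint, count_a, count_b, n) {
  p_joint <- joint / n          # observed co-occurrence probability
  p_a <- count_a / n            # marginal probability of word a
  p_b <- count_b / n            # marginal probability of word b
  log2(p_joint / (p_a * p_b))   # above 0: co-occur more often than chance
}

score <- pmi_from_counts(joint = 8, count_a = 16, count_b = 32, n = 1024)
print(score)  # 4, i.e. the pair occurs 2^4 = 16 times more often than chance
```

A score of 0 would mean the two words co-occur exactly as often as independence predicts; negative scores indicate that they avoid each other.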