cwbtools v0.1.0


Tools to create, modify and manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB') offers a classic and mature approach for working with large, linguistically and structurally annotated corpora. The 'CWB' is memory efficient and its design makes running queries fast (Evert and Hardie 2011). The 'cwbtools' package offers pure R tools to create indexed corpus files as well as high-level wrappers for the original C implementation of CWB as exposed by the 'RcppCWB' package. Additional functionality to add and modify annotations of corpora from within R makes working with CWB-indexed corpora much more flexible and convenient. In combination with the R packages 'RcppCWB' and 'polmineR', the 'cwbtools' package offers a lightweight infrastructure to support the combination of quantitative and qualitative approaches for working with textual data.

Functions in cwbtools

Name                 Description
cwbtools-package     cwbtools-package
s_attribute_encode   Read, process and write data on structural attributes.
registry_file_parse  Parse and create registry files.
p_attribute_encode   Encode Positional Attribute(s).
pkg_utils            Create and manage packages with corpus data.
corpus_install       Install and manage corpora.
cwb_install          Utilities to install Corpus Workbench.
CorpusData           Manage Corpus Data and Encode CWB Corpus.
conll_get_regions    Extract regions from NER annotations (CoNLL format).
get_encoding         Get Encoding of Character Vector.


Type              Package
Date              2019-10-09
VignetteBuilder   knitr
LazyData          yes
License           GPL-3
Language          en-US
Encoding          UTF-8
Collate           'cwbtools.R' 'pkg.R' 'utils.R' 'p_attribute.R' 's_attribute.R' 'registry_file.R' 'CorpusData.R' 'corpus.R' 'cwb.R' 'ner.R'
RoxygenNote       6.1.1
NeedsCompilation  no
Packaged          2019-10-18 11:10:34 UTC; andreasblaette
Repository        CRAN
Date/Publication  2019-10-21 14:10:03 UTC
