cwbtools v0.3.3

Tools to Create, Modify and Manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB', <http://cwb.sourceforge.net/>) offers a classic and mature approach for working with large, linguistically and structurally annotated corpora. The 'CWB' is memory efficient, and its design makes running queries fast (Evert and Hardie 2011, <http://www.stefan-evert.de/PUB/EvertHardie2011.pdf>). The 'cwbtools' package offers pure R tools to create indexed corpus files, as well as high-level wrappers for the original C implementation of 'CWB' as exposed by the 'RcppCWB' package (<https://CRAN.R-project.org/package=RcppCWB>). Additional functionality to add and modify annotations of corpora from within R makes working with 'CWB' indexed corpora more flexible and convenient. In combination with the R packages 'RcppCWB' and 'polmineR' (<https://CRAN.R-project.org/package=polmineR>), 'cwbtools' offers a lightweight infrastructure to support the combination of quantitative and qualitative approaches for working with textual data.

Functions in cwbtools

Name                 Description
p_attribute_encode   Encode Positional Attribute(s).
cwb_install          Utilities to install Corpus Workbench.
conll_get_regions    Extract regions from NER annotations (CoNLL format).
cwb_corpus_dir       Manage directories for indexed corpora.
corpus_install       Install and manage corpora.
get_encoding         Get Encoding of Character Vector.
CorpusData           Manage Corpus Data and Encode CWB Corpus.
cwbtools-package     cwbtools-package
pkg_utils            Create and manage packages with corpus data.
registry_file_parse  Parse and create registry files.
s_attribute_encode   Read, process and write data on structural attributes.

Vignettes of cwbtools

Name
opennlp.Rmd
sentences.Rmd
vignette.Rmd

Details

Type: Package
Date: 2021-02-20
VignetteBuilder: knitr
LazyData: yes
License: GPL-3
Language: en-US
Encoding: UTF-8
URL: https://github.com/PolMine/cwbtools
BugReports: https://github.com/PolMine/cwbtools/issues
Collate: 'CorpusData.R' 'corpus.R' 'cwb.R' 'cwbtools.R' 'directories.R' 'encoding.R' 'ner.R' 'p_attribute.R' 'pkg.R' 'registry_file.R' 's_attribute.R'
RoxygenNote: 7.1.1
NeedsCompilation: no
Packaged: 2021-02-20 01:34:11 UTC; andreasblaette
Repository: CRAN
Date/Publication: 2021-02-23 12:20:37 UTC

Include our badge in your README

[![Rdoc](http://www.rdocumentation.org/badges/version/cwbtools)](http://www.rdocumentation.org/packages/cwbtools)