abbreviations

Abbreviations

Lists of common abbreviations.

Keywords
datasets
Usage
abbreviations_de
abbreviations_en
abbreviations_es
abbreviations_fr
abbreviations_it
abbreviations_pt
abbreviations_ru
Details

The abbreviations_* objects (one per language code: de, en, es, fr, it, pt, ru) are character vectors of abbreviations. These are words or phrases containing full stops (periods), which are ambiguous sentence terminators and therefore require special handling in sentence detection and tokenization.

The original lists were compiled by the Unicode Common Locale Data Repository. We have tailored the English list by adding single-letter abbreviations and making a few other additions.

The built-in abbreviation lists are reasonable defaults, but they may require further tailoring to suit your particular task.
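
As a minimal sketch of such tailoring, the example below extends the English list and passes it to a text filter. It assumes the corpus package is installed and that text_filter accepts the list through its sent_suppress property and that text_split is available in the installed version. The added entry "Fig." and the sample sentence are illustrative only.

```r
library(corpus)

# The built-in lists are plain character vectors, so they can be inspected directly.
head(abbreviations_en)
length(abbreviations_en)

# Extend the defaults with a task-specific abbreviation.
# union() keeps the result free of duplicates if an entry is already present.
# "Fig." is an illustrative addition, not something the package documents.
my_abbrevs <- union(abbreviations_en, c("Fig."))

# Assumption: sentence-break suppressions are set via the sent_suppress property.
f <- text_filter(sent_suppress = my_abbrevs)

# Split a sample text into sentences using the tailored filter.
x <- "Fig. A1 shows the setup. Results follow."
text_split(x, units = "sentences", filter = f)
```

Building on the default with union(), rather than replacing it, keeps the package's built-in suppressions while adding your own.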

Format

A character vector of unique abbreviations.

See Also

text_filter.

Aliases
  • abbreviations
  • abbreviations_de
  • abbreviations_en
  • abbreviations_es
  • abbreviations_fr
  • abbreviations_it
  • abbreviations_pt
  • abbreviations_ru
Documentation reproduced from package corpus, version 0.10.0, License:
