corpus (version 0.9.1)

abbreviations: Abbreviations

Description

Get a list of common abbreviations.

Usage

abbreviations(kind = "english")

Arguments

kind

the name (or names) of the abbreviation list(s), NA, or NULL. Allowed values are "english", "french", "german", "italian", "portuguese", and "russian"; these values retrieve the language-specific sentence break suppression lists compiled by the Unicode Common Locale Data Repository, with a few additions to the "english" list.

Value

A sorted character vector of unique abbreviations of the specified kind (or kinds if the kind argument is a vector), or NULL if kind = NULL or kind = NA.
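The properties described above can be checked directly. This is a minimal sketch that only inspects the return values; it assumes the corpus package is installed, and the variable names (it, pt, both) are illustrative.

```r
library(corpus)

it <- abbreviations("italian")
pt <- abbreviations("portuguese")
both <- abbreviations(c("italian", "portuguese"))

# the combined result is a character vector with no repeated entries
is.character(both)
anyDuplicated(both) == 0

# every entry from each single-language list appears in the combined list
all(c(it, pt) %in% both)

# NULL and NA both give NULL
is.null(abbreviations(NULL))
is.null(abbreviations(NA))
```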

Details

abbreviations returns a character vector of abbreviations. The main use of this function is to get a list of sentence break suppressions: terms ending in full stops (periods, which are ambiguous sentence terminators) that, when followed by an upper-case letter, do not signal the end of a sentence.
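To illustrate what a suppression does, the sketch below splits the same text with and without the English list. It assumes that text_filter accepts a sent_suppress property and that text_split accepts a filter argument; neither is documented on this page, so check the text_filter and text_split help for the installed version. The sample text and the names f_with and f_without are illustrative.

```r
library(corpus)

x <- "I saw Mr. Jones today. He was well."

# suppress breaks after the terms in the English abbreviation list,
# so the period in "Mr." should not end a sentence
f_with <- text_filter(sent_suppress = abbreviations("english"))
with_supp <- text_split(x, "sentences", filter = f_with)

# no suppressions: "Mr." followed by an upper-case letter
# is treated as a possible sentence end
f_without <- text_filter(sent_suppress = NULL)
without_supp <- text_split(x, "sentences", filter = f_without)

# compare the number of sentences found in each case
nrow(with_supp)
nrow(without_supp)
```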

The built-in abbreviation lists returned by this function are reasonable defaults, but they may require further tailoring to suit your particular task.
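One way to tailor a list is to drop entries that cause missed sentence breaks in your data and add domain-specific terms. The sketch below is an assumption-laden example: the extra and unwanted terms are hypothetical (the unwanted ones are not verified to be in the built-in list, and setdiff ignores any that are absent), and passing the result through the sent_suppress property of text_filter is assumed from the corpus API rather than stated on this page.

```r
library(corpus)

defaults <- abbreviations("english")

extra <- c("Fig.", "Eq.", "Sec.")   # hypothetical additions for technical text
unwanted <- c("In.", "On.")         # hypothetical removals

# remove unwanted entries, add the extras, and keep the result sorted and unique
custom <- sort(unique(c(setdiff(defaults, unwanted), extra)))

# use the tailored list when splitting sentences
f <- text_filter(sent_suppress = custom)
text_split("See Fig. A1 for the setup. Results follow.", "sentences",
           filter = f)
```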

See Also

text_filter.

Examples

# NOT RUN {
    head(abbreviations("english"))
    head(abbreviations("french"))
    abbreviations(NULL)

    # multiple kinds:
    head(abbreviations(c("italian", "portuguese")))

    # add words to the default list:
    my_abbrev <- c(abbreviations("english"), "Mon.", "Tues.", "Weds.")
# }