quanteda (version 1.5.2)

char_tolower: Convert the case of character objects

Description

char_tolower and char_toupper are replacements for the base R functions tolower and toupper, built on the stringi package. The stringi case-conversion functions are preferable to the base functions because they handle case conversion correctly for Unicode text. In addition, the *_tolower functions provide an option for preserving acronyms.
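As a rough illustration of the Unicode point, the sketch below calls the stringi functions stri_trans_tolower and stri_trans_toupper directly on a string containing an accented capital. It assumes the stringi package is installed, and it does not show quanteda's own internals. The base tolower result is printed without a stated expectation because it can depend on the platform and locale.

```r
# Minimal sketch, assuming the stringi package is installed.
# "\u00c9COLE" is "ECOLE" with an acute accent on the initial E.
word <- "\u00c9COLE"

# stringi applies Unicode case mappings regardless of the session locale
lower <- stringi::stri_trans_tolower(word)
upper <- stringi::stri_trans_toupper(lower)
print(lower)
print(upper)

# base R, for comparison; the result may vary by platform and locale
print(tolower(word))
```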

Usage

char_tolower(x, keep_acronyms = FALSE)

char_toupper(x)

Arguments

x

the input object whose character/tokens/feature elements will be case-converted

keep_acronyms

logical; if TRUE, do not lowercase any all-uppercase words (applies only to *_tolower functions)
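The effect of keep_acronyms = TRUE can be approximated in base R. The sketch below is not quanteda's implementation: tolower_keep_acronyms is a hypothetical helper, and it assumes that an "all-uppercase word" means a run of two or more ASCII capital letters bounded by word breaks. It records where those runs occur, lowercases the whole string, then writes the original runs back.

```r
# Hypothetical helper, not part of quanteda.
# Assumption: an acronym is two or more ASCII capitals between word boundaries.
tolower_keep_acronyms <- function(x) {
  m <- gregexpr("\\b[A-Z]{2,}\\b", x, perl = TRUE)  # locate acronyms
  acronyms <- regmatches(x, m)                      # remember their original form
  out <- tolower(x)                                 # lowercase everything
  regmatches(out, m) <- acronyms                    # restore the acronyms in place
  out
}

txt <- c("England and France are members of NATO and UNESCO",
         "NASA sent a rocket into space.")
res <- tolower_keep_acronyms(txt)
print(res)
```

Under this assumption a single capital letter such as "A" is still lowercased, and non-ASCII capitals are not treated as part of an acronym.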

Examples

# NOT RUN {
# basic case conversion on a named character vector
txt1 <- c(txt1 = "b A A", txt2 = "C C a b B")
char_tolower(txt1)
char_toupper(txt1)

# with acronym preservation
txt2 <- c(text1 = "England and France are members of NATO and UNESCO",
          text2 = "NASA sent a rocket into space.")
char_tolower(txt2)                        # lowercases everything, acronyms included
char_tolower(txt2, keep_acronyms = TRUE)  # leaves all-uppercase words unchanged
char_toupper(txt2)
# }
