to_any_case: General case conversion

Description

Function to convert strings to any case

Usage

to_any_case(string, case = c("snake", "small_camel", "big_camel",
  "screaming_snake", "parsed", "mixed", "lower_upper", "upper_lower",
  "swap", "all_caps", "lower_camel", "upper_camel", "internal_parsing",
  "none", "flip", "sentence", "random", "title"), abbreviations = NULL,
  sep_in = "[^[:alnum:]]", parsing_option = 1,
  transliterations = NULL, numerals = c("middle", "left", "right",
  "asis", "tight"), sep_out = NULL, unique_sep = NULL,
  empty_fill = NULL, prefix = "", postfix = "")

Value

A character vector according the specified parameters above.

Arguments

string

A string (for example names of a data frame).

case

The desired target case, provided as one of the following:

snake_case: "snake"
lowerCamel: "lower_camel" or "small_camel"
UpperCamel: "upper_camel" or "big_camel"
ALL_CAPS: "all_caps" or "screaming_snake"
lowerUPPER: "lower_upper"
UPPERlower: "upper_lower"
Sentence case: "sentence"
Title Case: "title" - This one is basically the same as sentence case, but in addition it is wrapped into tools::toTitleCase and any abbreviations are always turned into upper case.

There are five "special" cases available:

"parsed": This case is underlying all other cases. Every substring a string consists of becomes surrounded by an underscore (depending on the parsing_option). Underscores at the start and end are trimmed. No lower or upper case pattern from the input string are changed.
"mixed": Almost the same as case = "parsed". Every letter which is not at the start or behind an underscore is turned into lowercase. If a substring is set as an abbreviation, it will be turned into upper case.
"swap": Upper case letters will be turned into lower case and vice versa. Also case = "flip" will work. Doesn't work with any of the other arguments except unique_sep, empty_fill, prefix and postfix.
"random": Each letter will be randomly turned into lower or upper case. Doesn't work with any of the other arguments except unique_sep, empty_fill, prefix and postfix.
"none": Neither parsing nor case conversion occur. This case might be helpful, when one wants to call the function for the quick usage of the other parameters. To suppress replacement of spaces to underscores set sep_in = NULL. Works with sep_in, transliterations, sep_out, prefix, postfix, empty_fill and unique_sep.
"internal_parsing": This case is returning the internal parsing (suppressing the internal protection mechanism), which means that alphanumeric characters will be surrounded by underscores. It should only be used in very rare use cases and is mainly implemented to showcase the internal workings of to_any_case()

abbreviations

character. (Case insensitive) matched abbreviations are surrounded by underscores. In this way, they can get recognized by the parser. This is useful when e.g. parsing_option 1 is needed for the use case, but some abbreviations but some substrings would require parsing_option 2. Furthermore, this argument also specifies the formatting of abbreviations in the output for the cases title, mixed, lower and upper camel. E.g. for upper camel the first letter is always in upper case, but when the abbreviation is supplied in upper case, this will also be visible in the output.

Use this feature with care: One letter abbreviations and abbreviations next to each other are hard to read and also not easy to parse for further processing.

sep_in

(short for separator input) if character, is interpreted as a regular expression (wrapped internally into stringr::regex()). The default value is a regular expression that matches any sequence of non-alphanumeric values. All matches will be replaced by underscores (additionally to "_" and " ", for which this is always true, even if NULL is supplied). These underscores are used internally to split the strings into substrings and specify the word boundaries.

parsing_option

An integer that will determine the parsing_option.

1: "RRRStudio" -> "RRR_Studio"
2: "RRRStudio" -> "RRRS_tudio"
3: "RRRStudio" -> "RRRSStudio". This will become for example "Rrrstudio" when we convert to lower camel case.
-1, -2, -3: These parsing_options's will suppress the conversion after non-alphanumeric values.
0: no parsing

transliterations

A character vector (if not NULL). The entries of this argument need to be elements of stringi::stri_trans_list() (like "Latin-ASCII", which is often useful) or names of lookup tables (currently only "german" is supported). In the order of the entries the letters of the input string will be transliterated via stringi::stri_trans_general() or replaced via the matches of the lookup table. When named character elements are supplied as part of `transliterations`, anything that matches the names is replaced by the corresponding value. You should use this feature with care in case of case = "parsed", case = "internal_parsing" and case = "none", since for upper case letters, which have transliterations/replacements of length 2, the second letter will be transliterated to lowercase, for example Oe, Ae, Ss, which might not always be what is intended. In this case you can make usage of the option to supply named elements and specify the transliterations yourself.

numerals

A character specifying the alignment of numerals ("middle", left, right, asis or tight). I.e. numerals = "left" ensures that no output separator is in front of a digit.

sep_out

(short for separator output) String that will be used as separator. The defaults are "_" and "", regarding the specified case. When length(sep_out) > 1, the last element of sep_out gets recycled and separators are incorporated per string according to their order.

unique_sep

A string. If not NULL, then duplicated names will get a suffix integer in the order of their appearance. The suffix is separated by the supplied string to this argument.

empty_fill

A string. If it is supplied, then each entry that matches "" will be replaced by the supplied string to this argument.

prefix

prefix (string).

postfix

postfix (string).

Author

Malte Grosser, malte.grosser@gmail.com

Examples

Run this code

### abbreviations
to_snake_case(c("HHcity", "newUSElections"), abbreviations = c("HH", "US"))
to_upper_camel_case("succesfullGMBH", abbreviations = "GmbH")
to_title_case("succesfullGMBH", abbreviations = "GmbH")

### sep_in (input separator)
string <- "R.St\u00FCdio: v.1.0.143"
to_any_case(string)
to_any_case(string, sep_in = ":|\\.")
to_any_case(string, sep_in = ":|(?"))

### prefix and postfix
to_upper_camel_case("some_path", sep_out = "//", 
  prefix = "USER://", postfix = ".exe")