qdapRegex (version 0.7.5)

rm_city_state_zip: Remove/Replace/Extract City, State, & Zip

Description

Remove/replace/extract city (single lower case word or multiple consecutive capitalized words before a comma and state) + state (2 consecutive capital letters) + zip code (5 digits or 5 + 4 digits) from a string.
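The description above can be sketched in base R. This is a minimal illustration, not the regular expression stored in the package dictionary: approx_pattern and its pieces (city, state, zip) are hypothetical names built only from the wording above, and the real pattern may differ in edge cases.

```r
# Approximate pattern assembled from the prose description (an assumption,
# NOT the regex that qdapRegex ships in its dictionary).
city  <- "(?:[A-Z][a-z]+(?:\\s+[A-Z][a-z]+)*|[a-z]+)"  # capitalized words or one lower case word
state <- "[A-Z]{2}"                                    # 2 consecutive capital letters
zip   <- "\\d{5}(?:-\\d{4})?"                          # 5 digits or 5 + 4 digits
approx_pattern <- paste0(city, ",\\s*", state, "\\s*", zip)

x <- paste0("I went to Washington Heights, NY 54321 for food! ",
   "It's in West ven,PA 12345, near Bolly Bolly Bolly, CA12345-1234!",
   "hello world")

# Extract every match, then remove every match, using base R only
found   <- regmatches(x, gregexpr(approx_pattern, x, perl = TRUE))[[1]]
removed <- gsub(approx_pattern, "", x, perl = TRUE)

print(found)
print(removed)
```

With this approximate pattern, "West ven,PA 12345" yields only "ven,PA 12345", because "ven" is a single lower case word and "West ven" is not a run of capitalized words.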

Usage

rm_city_state_zip(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_city_state_zip",
  replacement = "",
  extract = FALSE,
  dictionary = getOption("regex.library"),
  ...
)

ex_city_state_zip(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_city_state_zip",
  replacement = "",
  extract = TRUE,
  dictionary = getOption("regex.library"),
  ...
)

Value

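Returns a character string with city, state, & zip removed. If extract = TRUE (the default for ex_city_state_zip), the matched city, state, & zip are returned as a list of vectors instead.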

Arguments

text.var

The text variable.

trim

logical. If TRUE removes leading and trailing white spaces.

clean

logical. If TRUE extra white spaces and escaped characters will be removed.

pattern

A character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. The default, "@rm_city_state_zip", uses the rm_city_state_zip regex from the regular expression dictionary supplied in the dictionary argument.

replacement

Replacement for matched pattern.

extract

logical. If TRUE the city, state, & zip are extracted into a list of vectors.

dictionary

A dictionary of canned regular expressions to search within if pattern begins with "@rm_".

...

Other arguments passed to gsub.
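The effect of trim and clean on text left behind after a removal can be approximated in base R. This is a rough sketch under assumptions: squeeze_ws is a hypothetical helper, not a qdapRegex function, and "escaped characters" is taken here to mean control characters such as tab and newline.

```r
# Hypothetical helper (not part of qdapRegex) approximating clean + trim
squeeze_ws <- function(s) {
  s <- gsub("[\r\n\t]", " ", s)  # assumed meaning of "escaped characters": CR, LF, tab
  s <- gsub("\\s+", " ", s)      # collapse runs of white space to a single space
  trimws(s)                      # drop leading and trailing white space
}

# Text with the double space and trailing newline a removal might leave
leftover <- "I went to  for food! \n"
cleaned  <- squeeze_ws(leftover)
print(cleaned)
```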

See Also

gsub, stri_extract_all_regex

Other rm_ functions: rm_abbreviation(), rm_between(), rm_bracket(), rm_caps_phrase(), rm_caps(), rm_citation_tex(), rm_citation(), rm_city_state(), rm_date(), rm_default(), rm_dollar(), rm_email(), rm_emoticon(), rm_endmark(), rm_hash(), rm_nchar_words(), rm_non_ascii(), rm_non_words(), rm_number(), rm_percent(), rm_phone(), rm_postal_code(), rm_repeated_characters(), rm_repeated_phrases(), rm_repeated_words(), rm_tag(), rm_time(), rm_title_name(), rm_url(), rm_white(), rm_zip()

Examples

x <- paste0("I went to Washington Heights, NY 54321 for food! ", 
   "It's in West ven,PA 12345, near Bolly Bolly Bolly, CA12345-1234!", 
   "hello world")
rm_city_state_zip(x)
ex_city_state_zip(x)
