Return a consistent version of city names using stringr::str_*() functions.
Letters are capitalized, hyphens and underscores are replaced with
whitespace, other punctuation is removed, numbers are removed, and excess
whitespace is trimmed and squished. Optionally, geographic abbreviations
("MT") can be replaced with their long form ("MOUNT"). Invalid city names
can be removed from the vector (for example, by passing invalid_city to na),
as can strings made of a single repeated character ("XXXXXX").
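The steps described above can be sketched roughly as follows. This is an illustration only, not the package's implementation: sketch_normal_city is a hypothetical helper, and it uses base R functions in place of stringr so that it runs without extra packages.

```r
# Hypothetical helper for illustration; NOT the implementation of normal_city().
# Base R is used here instead of stringr::str_*() to avoid dependencies.
sketch_normal_city <- function(city, na = c("", "NA"), na_rep = FALSE) {
  x <- toupper(city)                     # capitalize letters
  x <- gsub("[-_]", " ", x)              # hyphens and underscores become spaces
  x <- gsub("[[:punct:]]", "", x)        # remove remaining punctuation
  x <- gsub("[[:digit:]]", "", x)        # remove numbers
  x <- trimws(gsub("\\s+", " ", x))      # squish, then trim, whitespace
  x[x %in% na] <- NA_character_          # listed values become NA
  if (na_rep) {
    # strings made of one character repeated two or more times become NA
    x[grepl("^(.)\\1+$", x, perl = TRUE)] <- NA_character_
  }
  x
}

sketch_normal_city(
  c("stowe, vt.", "new_york  city", "winston-salem 27101", "XXXXXX"),
  na_rep = TRUE
)
```

The order of the steps matters in this sketch: hyphens and underscores are converted to spaces before the general punctuation pass, otherwise "winston-salem" would collapse into a single word.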
Usage
normal_city(city, abbs = NULL, states = NULL, na = c("", "NA"), na_rep = FALSE)
Value
A vector of normalized city names.
Arguments
city
A vector of city names.
abbs
A named vector or data frame of abbreviations passed to
expand_abbrev; see expand_abbrev for the format of its abb argument, or
use the usps_city tibble.
states
A vector of state abbreviations ("VT") to remove from the
end (and only end) of city names ("STOWE VT").
na
A vector of values to make NA (useful with the invalid_city
vector).
na_rep
logical; if TRUE, replace all strings made of a single repeated
character ("XXXXXX") with NA.
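The effect of the abbs and states arguments can be illustrated with a small sketch. Both helpers below are hypothetical and written in base R; they only approximate the behavior described above and do not reproduce expand_abbrev or the package's own handling of states. The named-vector layout (abbreviation as name, long form as value) is likewise an assumption made for this example.

```r
# Hypothetical helpers for illustration only; not part of the package.
sketch_expand <- function(x, abbs) {
  for (short in names(abbs)) {
    # \\b restricts the match to whole words, so "MT" does not match "MTN"
    x <- gsub(paste0("\\b", short, "\\b"), abbs[[short]], x, perl = TRUE)
  }
  x
}

sketch_drop_state <- function(x, states) {
  # Only an abbreviation at the very end, preceded by whitespace, is removed
  pattern <- paste0("\\s+(", paste(states, collapse = "|"), ")$")
  sub(pattern, "", x, perl = TRUE)
}

cities <- c("MT PLEASANT", "STOWE VT", "VT JUNCTION")
cities <- sketch_expand(cities, c(MT = "MOUNT"))
sketch_drop_state(cities, c("VT", "NH"))
```

Anchoring the pattern with `$` is what keeps "VT JUNCTION" intact: the abbreviation is removed from the end of a name and nowhere else.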