country_replace: country_replace

Description

A wrapper function for cat_replace() that requires only a vector of messy country names as input. country_replace() uses the built-in list of clean country names, country.names, as the reference vector.

Usage

country_replace(messy_countries, threshold = NA, p = 0)

Value

country_replace() returns a cleaned version of messy_countries, with each element replaced by the most similar element of country.names.

Arguments

messy_countries: Vector containing the messy country names that will be replaced by the closest match from country.names
threshold: The maximum distance that will form a match. If this argument is specified, any element in the messy vector that has no match closer than the threshold distance will be replaced with NA. Default: NA
p: The Jaro-Winkler penalty size. Only used with method "jw". Default: 0

Details

Country names are often misspelled or abbreviated in datasets, especially datasets that have been manually digitized or created. country_replace() is a wrapper function for cat_replace() that quickly resolves misspellings and inconsistent formats of country names across datasets. It uses the built-in list of clean country names, country.names, as the reference vector and replaces each name in your messy input vector with its nearest match in country.names.
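
The nearest-match idea, together with the threshold behaviour described under Arguments, can be illustrated with a minimal base-R sketch. This is not the package's implementation: nearest_replace() and clean_countries are hypothetical names introduced here, the short reference vector only stands in for country.names, and utils::adist() (Levenshtein distance) is assumed as the distance measure purely for illustration.

```r
# Hypothetical stand-in for the package's country.names reference vector.
clean_countries <- c("Congo", "Belarus", "Venezuela", "Uruguay", "United Kingdom")

# Sketch only: replace each messy name with its closest clean name.
# Distances come from utils::adist() (Levenshtein), an assumption made for
# this illustration; the package itself may use a different metric.
nearest_replace <- function(messy, clean, threshold = NA) {
  d <- utils::adist(messy, clean)  # rows: messy names, columns: clean names
  # Column index of the closest clean name for each messy name
  best <- vapply(seq_len(nrow(d)), function(i) which.min(d[i, ]), integer(1))
  best_dist <- d[cbind(seq_len(nrow(d)), best)]
  out <- clean[best]
  if (!is.na(threshold)) {
    # Anything with no match within the threshold becomes NA
    out[best_dist > threshold] <- NA
  }
  out
}

messy <- c("Conagoa", "Blearaus", "Venzesual", "Uruagsya", "England")
loose  <- nearest_replace(messy, clean_countries)                 # always returns a match
strict <- nearest_replace(messy, clean_countries, threshold = 4)  # distant names become NA
```

Without a threshold every input is mapped to some reference name, even a poor one; with a threshold, an input such as "England", which has no close counterpart in this small reference vector, is returned as NA instead.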

Examples

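if (interactive()) {
  # EXAMPLE 1: replace misspelled country names with their nearest
  # match in the built-in country.names vector
  lst <- c("Conagoa", "Blearaus", "Venzesual", "Uruagsya", "England")
  fixed <- country_replace(lst)
}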
