StandardizeText (version 1.0)

standardize.countrynames: Standardize Country Names

Description

Takes a dataframe containing a column of country names, or a vector of country names, and returns the same data structure with the names standardized.

Usage

standardize.countrynames(input, input.column = NULL, standard = "default", standard.column = NULL, only.names = FALSE, na.rm = FALSE, suggest = "prompt", print.changes = TRUE, verbose = FALSE)

Arguments

input
A dataframe containing a column of country names, or a vector of country names
input.column
The column containing country names if input is a dataframe, identified by name or number; ignored if input is a vector
standard
The name of an included name set (see Details), or a dataframe containing a column of standard names, or a vector of standard names
standard.column
The column containing standard names if standard is a dataframe, identified by name or number; ignored if standard is a vector or an included name set
only.names
If TRUE, return only a vector of standardized names
na.rm
Remove any countries not contained in the standard set
suggest
Suggestions for inexact matches; "prompt" allows user to select desired suggestions, "auto" applies all, "none" applies none
print.changes
Print which names changed
verbose
Print full output, including names of unidentified countries
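
The distinction between exact matches and the inexact matches controlled by suggest can be illustrated with a small base R sketch. This is not the package's implementation: normalize_name, match_names, and the max.dist threshold are hypothetical names introduced here, and edit distance via adist is only one plausible way to generate suggestions.

```r
# Hypothetical helper (not part of StandardizeText): reduce a name to a
# comparison key by lowercasing and stripping punctuation and extra spaces.
normalize_name <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)  # drop periods, commas, apostrophes
  x <- gsub("\\s+", " ", x)        # collapse runs of whitespace
  trimws(x)
}

# Hypothetical sketch: exact matches on the normalized key, plus a
# nearest-neighbour suggestion (by edit distance) for names with no exact match.
match_names <- function(input, standard, max.dist = 5) {
  key.in  <- normalize_name(input)
  key.std <- normalize_name(standard)
  exact   <- match(key.in, key.std)          # index of exact match, or NA
  d       <- adist(key.in, key.std)          # edit-distance matrix
  nearest <- apply(d, 1, which.min)          # closest standard name per input
  nearest.dist <- d[cbind(seq_along(input), nearest)]
  suggested <- ifelse(is.na(exact) & nearest.dist <= max.dist,
                      nearest, NA_integer_)
  data.frame(input = input,
             exact = standard[exact],
             suggestion = standard[suggested],
             stringsAsFactors = FALSE)
}

res <- match_names(c("Aland Is.", "Brunei Daru.", "Ivory Coast"),
                   c("brunei", "aland is", "cote divoire"))
print(res)
```

In this sketch, "Aland Is." matches exactly once punctuation and case are removed, "Brunei Daru." only receives a suggestion, and "Ivory Coast" gets neither because no standard name is close enough by edit distance. A suggestion step of this kind is where options such as "prompt", "auto", and "none" would apply.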

Value

If input is a dataframe, returns the identical dataframe with the country names column standardized; if input is a vector of country names, returns the standardized vector
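
As a rough illustration of the dataframe case, the base R sketch below replaces a single column, identified by name or by number, and leaves every other column untouched. The helper replace_column is hypothetical and is not taken from the package.

```r
# Hypothetical helper (not part of StandardizeText): swap in new values for
# one column, which may be given by name or by position.
replace_column <- function(df, column, new.values) {
  df[[column]] <- new.values  # [[ ]] accepts a column name or a column number
  df
}

df <- data.frame(foo = 2:4,
                 bar = c("Ivory Coast", "The Gambia", "Brunei Daru."),
                 stringsAsFactors = FALSE)
std <- c("cote divoire", "gambia, the", "brunei")

by.name   <- replace_column(df, "bar", std)
by.number <- replace_column(df, 2, std)
print(by.name)
```

Both calls give the same result, which mirrors how input.column accepts either form.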

Details

Included name sets:
"default": Naming convention based on the ISO
"imf": International Monetary Fund names
"iso": International Organization for Standardization names
"pwt": Penn World Tables names
"wb": World Bank names
"who": World Health Organization names

Examples

library(StandardizeText)
sample.names <- c("Aland Is.", "Brunei Daru.", "Ivory Coast", "The Gambia")
sample.std <- c("brunei", "aland is", "gambia, the", "cote divoire")
sample.df <- data.frame(foo = 2:5, bar = sample.names, baz = 7:4, qux = sample.std)

# Standardize vector using iso names
out.a <- standardize.countrynames(sample.names, standard = "iso", suggest = "auto")
# Standardize vector using provided names
out.b <- standardize.countrynames(sample.names, standard = sample.std, suggest = "auto")
# Standardize dataframe (column 2) using wb names
out.c <- standardize.countrynames(sample.df, 2, standard = "wb", suggest = "auto", verbose = TRUE)
# Standardize dataframe using provided names (column "qux") without suggestions
out.d <- standardize.countrynames(sample.df, "bar", sample.df, "qux", suggest = "none", verbose = TRUE)