tidygeocoder
A tidyverse-style geocoder interface for R. It uses the US Census and Nominatim (OSM) geocoder services and returns the latitude and longitude of addresses in tibble format. I also wrote up a demo on R-Bloggers.
Install
To install the stable version from CRAN (the official R package servers):
install.packages('tidygeocoder')
To install the development version from GitHub:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("jessecambon/tidygeocoder",build_vignettes=TRUE)
Usage
In this brief example, we will use the US Census API to geocode some addresses in the sample_addresses dataset.
library(dplyr)
library(tidygeocoder)
lat_longs <- sample_addresses %>%
  geocode(addr, lat = latitude, long = longitude)
Latitude and longitude columns are appended to our input dataset. Because we are using the US Census geocoder service, international locations and addresses that are not at the street level (such as cities) are not found.
name | addr | latitude | longitude |
---|---|---|---|
White House | 1600 Pennsylvania Ave Washington, DC | 38.89875 | -77.03535 |
Transamerica Pyramid | 600 Montgomery St, San Francisco, CA 94111 | 37.79470 | -122.40314 |
NA | Fake Address | NA | NA |
NA | NA | NA | NA |
| | | NA | NA |
US City | Nashville,TN | NA | NA |
Willis Tower | 233 S Wacker Dr, Chicago, IL 60606 | 41.87851 | -87.63666 |
International City | Nairobi, Kenya | NA | NA |
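Rows that could not be geocoded can be separated from the rest by checking for NA coordinates. The snippet below is a minimal sketch: it rebuilds three rows of the table above by hand so that it runs without calling the geocoder service, and the names `found` and `not_found` are illustrative only.

```r
library(dplyr)

# Three rows copied from the results table above (no network call needed).
lat_longs <- tibble(
  name = c("White House", "US City", "International City"),
  addr = c("1600 Pennsylvania Ave Washington, DC", "Nashville,TN", "Nairobi, Kenya"),
  latitude = c(38.89875, NA, NA),
  longitude = c(-77.03535, NA, NA)
)

# Addresses the Census service could not resolve have NA coordinates.
not_found <- lat_longs %>% filter(is.na(latitude) | is.na(longitude))

# Addresses that were resolved successfully.
found <- lat_longs %>% filter(!is.na(latitude), !is.na(longitude))
```

The `not_found` rows are the ones worth retrying with another service.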
Plot our geolocated points:
library(ggplot2)
library(maps)
library(ggrepel)
ggplot(lat_longs %>% filter(!is.na(longitude)), aes(longitude, latitude), color = "grey98") +
  borders("state") + theme_classic() + geom_point() +
  theme(line = element_blank(), text = element_blank(), title = element_blank()) +
  geom_label_repel(aes(label = name), show.legend = FALSE) +
  scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL)
To find international and non-street addresses, we must use the OSM service. The 'cascade' method tries the US Census service first for each address and falls back to the OSM service only if the Census lookup fails (since OSM has a usage limit).
cascade_points <- sample_addresses %>%
  geocode(addr, method = 'cascade')
name | addr | lat | long | geo_method |
---|---|---|---|---|
White House | 1600 Pennsylvania Ave Washington, DC | 38.898754 | -77.03535 | census |
Transamerica Pyramid | 600 Montgomery St, San Francisco, CA 94111 | 37.794700 | -122.40314 | census |
NA | Fake Address | NA | NA | NA |
NA | NA | NA | NA | NA |
| | | NA | NA | NA |
US City | Nashville,TN | 36.162230 | -86.77435 | osm |
Willis Tower | 233 S Wacker Dr, Chicago, IL 60606 | 41.878513 | -87.63666 | census |
International City | Nairobi, Kenya | -1.283253 | 36.81724 | osm |
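The fallback idea behind the cascade method can be illustrated without any network access. This is only a sketch of the general pattern, not the package's implementation: `lookup_census`, `lookup_osm`, and `cascade_one` are hypothetical stand-ins that return hard-coded coordinates taken from the table above.

```r
# Hypothetical stand-in for the Census service: knows street addresses only.
lookup_census <- function(addr) {
  known <- list("233 S Wacker Dr, Chicago, IL 60606" = c(41.878513, -87.63666))
  if (!is.na(addr) && addr %in% names(known)) known[[addr]] else c(NA_real_, NA_real_)
}

# Hypothetical stand-in for the OSM service: also knows cities.
lookup_osm <- function(addr) {
  known <- list("Nairobi, Kenya" = c(-1.283253, 36.81724))
  if (!is.na(addr) && addr %in% names(known)) known[[addr]] else c(NA_real_, NA_real_)
}

# Try the first service; fall back to the second only when the first fails.
cascade_one <- function(addr) {
  coords <- lookup_census(addr)
  method <- "census"
  if (anyNA(coords)) {
    coords <- lookup_osm(addr)
    method <- "osm"
  }
  # Neither service found the address.
  if (anyNA(coords)) method <- NA_character_
  data.frame(addr = addr, lat = coords[1], long = coords[2],
             geo_method = method, stringsAsFactors = FALSE)
}

addrs <- c("233 S Wacker Dr, Chicago, IL 60606", "Nairobi, Kenya", "Fake Address")
cascade_points <- do.call(rbind, lapply(addrs, cascade_one))
```

Because the second lookup runs only for addresses the first one missed, the rate-limited service receives as few requests as possible.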
References
- US Census Geocoder
- Nominatim Geocoder
- Nominatim Address Check
- tmaptools package (used for OSM geocoding)
- dplyr
- tidyr