tidygeocoder

Introduction

Tidygeocoder makes getting data from geocoder services easy. A unified interface is provided for the supported geocoder services listed below. All results are returned in tibble format.

Batch geocoding (geocoding multiple addresses per query) is used by default, where the service supports it, when multiple addresses are provided. Duplicate, missing (NA), and blank addresses are handled automatically: only unique addresses are passed to geocoder services, while the rows of the original data are preserved by default.
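The handling of duplicate, missing, and blank addresses can be pictured with a short base R sketch. This is not the package's internal code: `fake_geocoder()` is a hypothetical stand-in for a real service call, and its lookup table reuses coordinates from the usage example in this README.

```r
# Input with a duplicate, a missing value, and a blank string
addresses <- c(
  "1600 Pennsylvania Ave NW, Washington, DC",
  "233 S Wacker Dr, Chicago, IL 60606",
  "1600 Pennsylvania Ave NW, Washington, DC",
  NA,
  ""
)

# Hypothetical stand-in for a geocoder service: looks up coordinates in a
# small table (values taken from the usage example in this README).
fake_geocoder <- function(addr) {
  known <- data.frame(
    address = c("1600 Pennsylvania Ave NW, Washington, DC",
                "233 S Wacker Dr, Chicago, IL 60606"),
    lat  = c(38.89875, 41.87851),
    long = c(-77.03535, -87.63666),
    stringsAsFactors = FALSE
  )
  known[match(addr, known$address), ]
}

# Only unique, non-missing, non-blank addresses are sent to the "service"
to_query <- unique(addresses[!is.na(addresses) & nzchar(trimws(addresses))])
unique_results <- fake_geocoder(to_query)

# Map the results back so the output has one row per input row, in order
idx <- match(addresses, unique_results$address)
results <- data.frame(
  address = addresses,
  lat  = unique_results$lat[idx],
  long = unique_results$long[idx],
  stringsAsFactors = FALSE
)
```

In this sketch two queries are made for five input rows, and the missing and blank rows come back with NA coordinates instead of being dropped.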

In addition to the usage example below you can refer to the following references:

Installation

To install the stable version from CRAN (the official R package servers):

install.packages('tidygeocoder')

Alternatively, you can install the latest development version from GitHub:

if(!require(devtools)) install.packages("devtools")
devtools::install_github("jessecambon/tidygeocoder")

Geocoder Services

The supported geocoder services are shown in the table below, along with their geographic coverage, whether they support batch geocoding (geocoding multiple addresses in a single query), whether an API key is required, and their query rate limits. Refer to the website of each geocoder service for the most up-to-date details on costs, capabilities, and usage limitations.

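| Service         | Geography     | Batch Geocoding | API Key Required | Query Rate Limit        |
|-----------------|---------------|-----------------|------------------|-------------------------|
| US Census       | US            | Yes             | No               | N/A                     |
| Nominatim (OSM) | Worldwide     | No              | No               | 1/second                |
| Geocodio        | US and Canada | Yes             | Yes              | 1000/minute (free tier) |
| Location IQ     | Worldwide     | No              | Yes              | 2/second (free tier)    |
| Google          | Worldwide     | No              | Yes              | 50/second               |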

Note that:

  • The US Census service supports street-level addresses only (i.e., “11 Wall St New York, NY” is OK but “New York, NY” is not).
  • The US Census and Geocodio services both support a maximum of 10,000 addresses per batch query.
  • The Census and OSM services are free while Geocodio and Location IQ are commercial services that offer both free and paid usage tiers. The Google service bills per query.
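If a dataset is larger than a service's batch limit, one option is to split it into chunks before geocoding. The sketch below does this in base R, assuming a 10,000-address limit. `split_into_batches()` is a hypothetical helper, not part of tidygeocoder, and the addresses are synthetic placeholders.

```r
# Hypothetical helper (not part of tidygeocoder): split a vector of
# addresses into chunks no longer than batch_limit.
split_into_batches <- function(addresses, batch_limit = 10000) {
  stopifnot(batch_limit >= 1)
  batch_id <- ceiling(seq_along(addresses) / batch_limit)
  unname(split(addresses, batch_id))
}

# Synthetic placeholder addresses, used only to show the chunk sizes
addresses <- paste(seq_len(25000), "Example St")
batches <- split_into_batches(addresses)
batch_sizes <- lengths(batches)
```

Each element of `batches` could then be geocoded separately and the results combined, for example with `dplyr::bind_rows()`.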

Usage

In this example we will geocode a few addresses using the geocode() function and plot them on a map with ggplot.

library(dplyr)
library(tibble)
library(tidygeocoder)

# create a dataframe with addresses
some_addresses <- tribble(
  ~name,                  ~addr,
  "White House",          "1600 Pennsylvania Ave NW, Washington, DC",
  "Transamerica Pyramid", "600 Montgomery St, San Francisco, CA 94111",
  "Willis Tower",         "233 S Wacker Dr, Chicago, IL 60606"
)

# geocode the addresses
lat_longs <- some_addresses %>%
  geocode(addr, method = 'census', lat = latitude, long = longitude)

The geocode() function attaches latitude and longitude columns to our input dataset of addresses. The US Census geocoder is used here, but other services can be specified with the method argument. See the geo() function documentation for details.

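| name                 | addr                                       | latitude | longitude  |
|----------------------|--------------------------------------------|----------|------------|
| White House          | 1600 Pennsylvania Ave NW, Washington, DC   | 38.89875 | -77.03535  |
| Transamerica Pyramid | 600 Montgomery St, San Francisco, CA 94111 | 37.79470 | -122.40314 |
| Willis Tower         | 233 S Wacker Dr, Chicago, IL 60606         | 41.87851 | -87.63666  |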

Now that we have the longitude and latitude coordinates, we can use ggplot to plot our addresses on a map.

library(ggplot2)
library(maps)
library(ggrepel)

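# plot the geocoded points over US state borders
# (the stray color argument to ggplot() had no effect and is dropped)
ggplot(lat_longs, aes(longitude, latitude)) +
  borders("state") +
  geom_point() +
  geom_label_repel(aes(label = name)) +
  theme_void()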

To return the full results from a geocoder service (not just latitude and longitude) you can use full_results = TRUE. Additionally, for the Census geocoder you can use return_type = 'geographies' to return geography columns (state, county, Census tract, and Census block).

full <- some_addresses %>%
  geocode(addr, method = 'census', full_results = TRUE, return_type = 'geographies')

glimpse(full)
#> Rows: 3
#> Columns: 15
#> $ name            <chr> "White House", "Transamerica Pyramid", "Willis Tower"
#> $ addr            <chr> "1600 Pennsylvania Ave NW, Washington, DC", "600 Mont…
#> $ lat             <dbl> 38.89875, 37.79470, 41.87851
#> $ long            <dbl> -77.03535, -122.40314, -87.63666
#> $ id              <int> 1, 2, 3
#> $ input_address   <chr> "1600 Pennsylvania Ave NW, Washington, DC, , , ", "60…
#> $ match_indicator <chr> "Match", "Match", "Match"
#> $ match_type      <chr> "Exact", "Exact", "Exact"
#> $ matched_address <chr> "1600 PENNSYLVANIA AVE NW, WASHINGTON, DC, 20500", "6…
#> $ tiger_line_id   <chr> "76225813", "192281262", "112050003"
#> $ tiger_side      <chr> "L", "R", "L"
#> $ state_fips      <chr> "11", "06", "17"
#> $ county_fips     <chr> "001", "075", "031"
#> $ census_tract    <chr> "980000", "061101", "839100"
#> $ census_block    <chr> "1034", "2014", "2008"
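The geography columns are returned as separate character fields. As one illustration of how they might be combined, the sketch below concatenates them into a 15-digit Census block identifier (state, county, tract, block). The values are copied from the output above. The `full_geo` and `block_geoid` names are assumptions for this example, not objects or columns the package returns.

```r
# Geography columns copied from the glimpse() output above
full_geo <- data.frame(
  name         = c("White House", "Transamerica Pyramid", "Willis Tower"),
  state_fips   = c("11", "06", "17"),
  county_fips  = c("001", "075", "031"),
  census_tract = c("980000", "061101", "839100"),
  census_block = c("1034", "2014", "2008"),
  stringsAsFactors = FALSE
)

# Concatenate state (2) + county (3) + tract (6) + block (4) digits
full_geo$block_geoid <- with(
  full_geo,
  paste0(state_fips, county_fips, census_tract, census_block)
)
```

Keeping these columns as character strings matters here: converting them to numbers would drop leading zeros such as the one in "06".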

For further documentation, refer to the Getting Started Vignette and the function documentation.

Contributing

Contributions to the tidygeocoder package are welcome. File an issue to report a bug or suggest a feature. If you would like to add support for a new geocoder service, refer to this post for instructions.

Version: 1.0.2
License: MIT + file LICENSE
Maintainer: Jesse Cambon
Last Published: January 18th, 2021
Monthly Downloads: 7,365

Functions in tidygeocoder (1.0.2)

  • extract_results: Extract geocoder results
  • api_parameter_reference: Geocoder service API parameter reference
  • query_api: Execute a geocoder API query
  • geo_cascade: Convenience function for calling the geo function with method = 'cascade'
  • geo: Geocode addresses
  • get_api_query: Construct a geocoder API query
  • geocode: Geocode addresses in a dataframe
  • louisville: Louisville, Kentucky street addresses
  • tidygeocoder-package: tidygeocoder makes getting data from geocoder services easy.
  • geo_census: Convenience functions for calling the geo function with a specified method
  • sample_addresses: Some sample addresses for testing