Provides access to the US Census Bureau batch endpoints for locations and
geographies. The function implements iteration and optional parallelization
in order to geocode datasets larger than the API's batch limit of 10,000
addresses, and to do so more efficiently than sending 10,000 addresses in a
single request. It also supports multiple outputs, including (optionally, if
sf is installed) sf class objects.
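The iteration described above can be pictured with a small base R sketch. This is not the package's internal code: the helper name `split_batches` and the batch size of 1,000 rows are assumptions chosen for illustration only.

```r
# Hypothetical helper (not part of censusxy): split a data.frame into
# chunks of at most `size` rows so each chunk can be sent as one request.
split_batches <- function(df, size = 1000) {
  # Rows 1..size get group 1, the next `size` rows get group 2, and so on
  groups <- ceiling(seq_len(nrow(df)) / size)
  split(df, groups)
}

# Made-up input: 2,500 placeholder addresses
addresses <- data.frame(id = 1:2500, street = paste(1:2500, "Main St"))

batches <- split_batches(addresses)
batch_sizes <- vapply(batches, nrow, integer(1))
```

Each element of `batches` could then be geocoded in turn and the results bound back together, which is the general idea behind iterating over a dataset too large for one request.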
cxy_geocode(
  .data,
  id = NULL,
  street,
  city = NULL,
  state = NULL,
  zip = NULL,
  return = "locations",
  benchmark = "Public_AR_Current",
  vintage = NULL,
  timeout = 30,
  parallel = 1,
  class = "dataframe",
  output = "simple"
)
A data.frame or sf object containing geocoded results
.data: data.frame containing columns with structured address data
id: Optional String - Name of column containing unique ID
street: String - Name of column containing street address
city: Optional String - Name of column containing city
state: Optional String - Name of column containing state
zip: Optional String - Name of column containing zip code
return: One of 'locations' or 'geographies', denoting the information returned from the API. To return Census geography data, you must specify a valid vintage for your benchmark.
benchmark: Optional Census benchmark to geocode against. To obtain current valid benchmarks, use the cxy_benchmarks() function.
vintage: Optional Census vintage to geocode against. You may use the cxy_vintages() function to obtain valid vintages.
timeout: Numeric. Number of minutes until a request times out.
parallel: Integer. Number of cores to use; specify a value greater than one if parallel requests are desired. All operating systems now use a SOCK cluster, and the dependencies are no longer suggested packages; they are installed by default. This value may not exceed the number of cores the system reports as available. If it does, the maximum number of available cores is used.
class: One of 'dataframe' or 'sf', denoting the output class. 'sf' returns only matched addresses.
output: One of 'simple' or 'full', denoting the returned columns. 'simple' returns just coordinates.
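As a rough illustration of how the return, benchmark, and vintage arguments fit together, the sketch below requests geography data. It assumes network access to the Census API and assumes that "Current_Current" is a valid vintage for the "Public_AR_Current" benchmark at the time of the call; confirm the value against the output of cxy_vintages() before relying on it.

```r
library(censusxy)

# Look up what the API currently accepts
benchmarks <- cxy_benchmarks()
vintages <- cxy_vintages("Public_AR_Current")

x <- stl_homicides[1:10, ]

# return = "geographies" needs a vintage that is valid for the benchmark.
# "Current_Current" is an assumption here; check it against `vintages`.
geo <- cxy_geocode(x, street = "street_address", city = "city",
                   state = "state", zip = "postal_code",
                   return = "geographies",
                   benchmark = "Public_AR_Current",
                   vintage = "Current_Current",
                   class = "dataframe", output = "full")
```

Inspecting `names(geo)` afterwards is a simple way to see which geography columns came back alongside the original data.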
Parallel requests are supported across platforms. If supported (POSIX platforms), the process is forked; otherwise (Windows), a SOCK cluster is used. You may not specify more cores than the system reports as available.
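One way to stay within the core limit is to check the machine's reported core count before choosing a value for parallel. The helper below is a hypothetical sketch using the base parallel package; `cap_cores` is not a censusxy function.

```r
# Hypothetical helper (not part of censusxy): keep a requested core count
# within what the system reports as available.
cap_cores <- function(requested) {
  available <- parallel::detectCores()
  # detectCores() can return NA on some platforms; fall back to one core
  if (is.na(available)) available <- 1L
  max(1L, min(as.integer(requested), available))
}

cores <- cap_cores(4)
```

The resulting `cores` value could then be supplied as `parallel = cores` in a call to cxy_geocode().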
# load the package and example data
library(censusxy)
x <- stl_homicides[1:10, ]
# geocode
cxy_geocode(x, street = 'street_address', city = 'city', state = 'state', zip = 'postal_code',
  return = 'locations', class = 'dataframe', output = 'simple')
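A variation on the example above requests sf output with the full set of columns. This is a sketch that assumes network access and that the optional sf package is installed, so the call is guarded with requireNamespace().

```r
library(censusxy)

x <- stl_homicides[1:10, ]

# sf is optional, so check for it before asking for an sf object
if (requireNamespace("sf", quietly = TRUE)) {
  pts <- cxy_geocode(x, street = "street_address", city = "city",
                     state = "state", zip = "postal_code",
                     return = "locations", class = "sf", output = "full")
  # Only matched addresses are returned, so pts may have fewer rows than x
  matched_all <- nrow(pts) == nrow(x)
}
```

Because unmatched addresses are dropped from sf output, comparing row counts as above is a quick check on whether any input rows were lost.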