census_helper_new
Links a user-input dataset with Census geographic data.
census_helper_new(
  key = Sys.getenv("CENSUS_API_KEY"),
  voter.file,
  states = "all",
  geo = c("tract", "block", "block_group", "county", "place", "zcta"),
  age = FALSE,
  sex = FALSE,
  year = "2020",
  census.data = NULL,
  retry = 3,
  use.counties = FALSE,
  skip_bad_geos = FALSE
)
Output will be an object of class data.frame. It will consist of the original user-input data with additional columns of Census data.
key: A character string containing a valid Census API key, which can be requested from the U.S. Census API key signup page. By default, the function attempts to find a Census key stored in an environment variable named CENSUS_API_KEY.
voter.file: An object of class data.frame. Must contain field(s) named county, tract, block, and/or place specifying geolocation. These should be character variables that match up with U.S. Census categories. County should be three characters (e.g., "031" not "31"), tract should be six characters, and block should be four characters. Place should be five characters if it is included.
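Identifiers often lose their leading zeros on import. The following is a minimal sketch of restoring the widths described above using base R only. The helper pad_geo and the toy data frame are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): left-pad identifiers with
# zeros until they reach the fixed width expected for that geography.
# Pass character (or integer) input; large doubles may print in
# scientific notation when converted to character.
pad_geo <- function(x, width) {
  x <- as.character(x)
  n_missing <- pmax(0L, width - nchar(x))
  ifelse(is.na(x), NA_character_, paste0(strrep("0", n_missing), x))
}

# Made-up records whose identifiers lost their leading zeros.
toy <- data.frame(
  county = c("31", "1"),
  tract = c("10400", "980000"),
  block = c("12", "3004"),
  stringsAsFactors = FALSE
)
toy$county <- pad_geo(toy$county, 3) # three characters
toy$tract <- pad_geo(toy$tract, 6)   # six characters
toy$block <- pad_geo(toy$block, 4)   # four characters
```

Values that already have the required width are returned unchanged, and missing values stay missing.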
states: A character vector specifying which states to extract Census data for, e.g., c("NJ", "NY"). Default is "all", which extracts Census data for all states contained in the user-input data.
geo: A character object specifying what aggregation level to use. Use "county", "tract", "block", or "place". Default is "tract". Warning: extracting block-level data can take a very long time.
age: A TRUE/FALSE object indicating whether to condition on age or not. If FALSE (default), the function will return Pr(Geolocation | Race). If TRUE, the function will return Pr(Geolocation, Age | Race). If sex is also TRUE, the function will return Pr(Geolocation, Age, Sex | Race).
sex: A TRUE/FALSE object indicating whether to condition on sex or not. If FALSE (default), the function will return Pr(Geolocation | Race). If TRUE, the function will return Pr(Geolocation, Sex | Race). If age is also TRUE, the function will return Pr(Geolocation, Age, Sex | Race).
year: A character object specifying the year of U.S. Census data to be downloaded. Use "2010" or "2020". Default is "2020".
census.data: An optional census object of class list containing pre-saved Census geographic data. Can be created using the get_census_data function. If census.data is provided, the year element must have the same value as the year option specified in this function (i.e., "2010" in both or "2020" in both). If census.data is provided, the age and the sex elements must be FALSE. This corresponds to the defaults of census_geo_api. If census.data is missing, Census geographic data will be obtained via the Census API.
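The consistency requirements on a pre-saved census.data object can be checked before calling the function. The sketch below is a hypothetical pre-flight check, not part of the package. It assumes that the list holds one entry per state and that each entry carries year, age, and sex elements; confirm this layout against your own object before relying on it.

```r
# Hypothetical pre-flight check (not part of the package). Assumes one list
# entry per state, each with `year`, `age`, and `sex` elements.
check_census_data <- function(census.data, year = "2020") {
  for (st in names(census.data)) {
    entry <- census.data[[st]]
    # The stored year must match the year requested in the function call.
    if (!identical(as.character(entry$year), as.character(year))) {
      stop("year mismatch for state ", st)
    }
    # Both age and sex must be FALSE in the pre-saved data.
    if (!isFALSE(entry$age) || !isFALSE(entry$sex)) {
      stop("age and sex must be FALSE for state ", st)
    }
  }
  invisible(TRUE)
}

# Mock object with made-up contents, used only to exercise the check.
mock <- list(NJ = list(state = "NJ", year = "2020", age = FALSE, sex = FALSE))
check_census_data(mock, year = "2020")
```

Calling the check with year = "2010" on the same mock object stops with an error, which mirrors the rule that the two year values must agree.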
retry: The number of retries at the Census website if a network interruption occurs.
use.counties: A logical, defaulting to FALSE. Should Census data be filtered by counties available in census.data?
skip_bad_geos: Logical. Option to have the function skip any geolocations that are not present in the Census data, returning a partial data set. Default is FALSE, in which case the function will stop and provide an error message with a list of offending geolocations.
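To see which geolocations would be affected, the identifiers in the user data can be compared against those in the Census table beforehand. This is a toy illustration with made-up tract identifiers; it is not how the package performs the check internally.

```r
# Made-up tract identifiers; neither vector comes from real Census data.
voter_tracts <- c("000100", "000200", "999999")
census_tracts <- c("000100", "000200", "000300")

# Geolocations in the user data that have no match in the Census table.
bad_geos <- setdiff(voter_tracts, census_tracts)

# Toy analogue of skipping bad geolocations: keep only rows that match.
matched <- voter_tracts[voter_tracts %in% census_tracts]
```

Here bad_geos holds the identifiers that an error message would list, and matched holds what a partial data set would retain.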
This function allows users to link their geocoded dataset (e.g., voter file) with U.S. Census data (2010 or 2020). The function extracts Census Summary File data at the county, tract, block, or place level. Census data calculated are Pr(Geolocation | Race) where geolocation is county, tract, block, or place.
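As a rough illustration of the quantity Pr(Geolocation | Race), the sketch below normalizes a table of made-up population counts within each racial group. It assumes the probability is the share of a group's population that lives in each geolocation; the counts, labels, and object names are hypothetical, and this is not the package's own code.

```r
# Made-up population counts: rows are geolocations, columns are racial groups.
counts <- matrix(
  c(80, 20,
    20, 30),
  nrow = 2, byrow = TRUE,
  dimnames = list(geo = c("tract_A", "tract_B"), race = c("group_1", "group_2"))
)

# Pr(Geolocation | Race): divide each count by its column (race) total,
# so that every column sums to one.
pr_geo_given_race <- prop.table(counts, margin = 2)
```

For group_1, 80 of 100 people live in tract_A, so the corresponding entry is 0.8.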
data(voters)
if (FALSE) census_helper_new(voter.file = voters, states = "nj", geo = "block")
if (FALSE) census_helper_new(voter.file = voters, states = "all", geo = "tract")
if (FALSE) census_helper_new(voter.file = voters, states = "all", geo = "place",
  year = "2020")