geogenr

The American Community Survey (ACS) offers geodatabases that combine geographic information with associated demographic and economic data of interest to researchers. The goal of geogenr is to facilitate access to this information through functions that allow us to select the geodatabases that interest us, download them, access the information they contain, filter it, and export it in various formats so that we can process it with other tools if required.

Installation

You can install the released version of geogenr from CRAN with:

install.packages("geogenr")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("josesamos/geogenr")

Example

Each ACS geodatabase is structured in layers: a geographic layer, a metadata layer, and a set of data layers. Accessing data with this structure is not trivial; the goal of the geogenr package is to make it easier.

First, we select and download the ACS geodatabases that we need. We can use the functions offered by the package, or download and decompress the files by other means. We start by creating an object of class acs_5yr, indicating the working folder.

library(geogenr)

dir <- system.file("extdata/acs_5yr", package = "geogenr")

ac <- acs_5yr(dir)

We can query the available geodatabases by area, subject and year using the methods offered by the object. We then select the areas and years that we need and download the corresponding geodatabases.

ac |>
  get_area_groups()
#> [1] "Legal and Administrative Areas" "Statistical Areas"

ac |>
  get_areas(group = "Legal and Administrative Areas")
#>  [1] "American Indian/Alaska Native/Native Hawaiian Area"
#>  [2] "Alaska Native Regional Corporation"                
#>  [3] "Congressional District (116th Congress)"           
#>  [4] "County"                                            
#>  [5] "Place"                                             
#>  [6] "Elementary School District"                        
#>  [7] "Secondary School District"                         
#>  [8] "Unified School District"                           
#>  [9] "State"                                             
#> [10] "State Legislative Districts Upper Chamber"         
#> [11] "State Legislative Districts Lower Chamber"         
#> [12] "Code Tabulation Area"

ac |>
  get_area_years(area = "Alaska Native Regional Corporation")
#> [1] "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" "2021"

ac <- ac |>
  select_area_files("Alaska Native Regional Corporation", 2020:2021)

files <- ac |>
  download_selected_files(unzip = FALSE)
#> [1] TRUE TRUE

We unzip the files and check that the data is available.

files <- ac |>
  unzip_files()

ac |>
  get_available_areas()
#> [1] "Alaska Native Regional Corporation"

ac |>
  get_available_area_years(area = "Alaska Native Regional Corporation")
#> [1] "2020" "2021"

We consult the topics available in the selected area and select one or more of them by creating an object of class acs_5yr_topic.

ac |>
  get_available_area_topics("Alaska Native Regional Corporation")
#>  [1] "X01 Age And Sex"                     "X02 Race"                           
#>  [3] "X03 Hispanic Or Latino Origin"       "X04 Ancestry"                       
#>  [5] "X05 Foreign Born Citizenship"        "X06 Place Of Birth"                 
#>  [7] "X07 Migration"                       "X08 Commuting"                      
#>  [9] "X09 Children Household Relationship" "X10 Grandparents Grandchildren"     
#> [11] "X11 Household Family Subfamilies"    "X12 Marital Status And History"     
#> [13] "X13 Fertility"                       "X14 School Enrollment"              
#> [15] "X15 Educational Attainment"          "X16 Language Spoken At Home"        
#> [17] "X17 Poverty"                         "X18 Disability"                     
#> [19] "X19 Income"                          "X20 Earnings"                       
#> [21] "X21 Veteran Status"                  "X22 Food Stamps"                    
#> [23] "X23 Employment Status"               "X24 Industry Occupation"            
#> [25] "X25 Housing Characteristics"         "X26 Group Quarters"                 
#> [27] "X27 Health Insurance"                "X28 Computer And Internet Use"      
#> [29] "X99 Imputation"

act <- ac |>
  as_acs_5yr_topic("Alaska Native Regional Corporation",
                   topic = "X01 Age And Sex")

Once a topic has been selected, we can consult the available reports or subreports. We can focus on a single report or subreport, or work with all the reports of the topic.

act |>
  get_report_names()
#> [1] "B01001-Sex By Age"        "B01002-Median Age By Sex"
#> [3] "B01003-Total Population"
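
To focus on a single report, the package lists select_report, select_subreport and get_subreport_names among its functions. The following is only a sketch under stated assumptions: that anrc_2021_x01 is a bundled acs_5yr_topic object, that select_report() takes the topic object followed by a report name as returned by get_report_names(), and that the bundled object contains the report named below. Check the function help pages before relying on it.

```r
library(geogenr)

# Assumption: anrc_2021_x01 is the acs_5yr_topic example object bundled with
# the package ("Alaska Native Regional Corporation", 2021, "X01 Age And Sex").
# Assumption: select_report() takes the topic object and a report name.
act_median_age <- anrc_2021_x01 |>
  select_report("B01002-Median Age By Sex")

# Assumption: get_subreport_names() lists the subreports of the selection.
act_median_age |>
  get_subreport_names()
```

A subreport would be selected in the same way with select_subreport(), passing one of the names returned by get_subreport_names().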

We can export the reports of the selected topic to various formats: GeoPackage, or the flat_table and star_database classes of the rolap package. In this case we are going to obtain a GeoPackage.

geo <- act |>
  as_acs_5yr_geo()

dir <- tempdir()
file <- geo |>
  as_GeoPackage(dir)

sf::st_layers(file)
#> Driver: GPKG 
#> Available layers:
#>   layer_name geometry_type features fields crs_name
#> 1       data Multi Polygon       12   1453    NAD83
#> 2   metadata            NA     1436     12     <NA>
#> 3     origin            NA        2      6     <NA>
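
Once written, the individual layers can be read back with the sf package. The sketch below builds a tiny stand-in GeoPackage so that it runs on its own; the column name V0001 and the point coordinates are made up for illustration. With the real file, the path returned by as_GeoPackage() would take the place of gpkg.

```r
library(sf)

# Build a tiny stand-in GeoPackage so the sketch is self-contained.
# In practice `gpkg` would be the path returned by as_GeoPackage().
gpkg <- tempfile(fileext = ".gpkg")
pts <- st_sf(
  V0001 = c(10, 20),  # made-up values
  geometry = st_sfc(st_point(c(-150, 61)), st_point(c(-160, 64)), crs = 4269)
)
st_write(pts, gpkg, layer = "data", quiet = TRUE)

# List the layers in the file, then read one of them by name.
st_layers(gpkg)
data_layer <- st_read(gpkg, layer = "data", quiet = TRUE)
nrow(data_layer)
```

The same st_read() call with layer = "metadata" or layer = "origin" should return the non-spatial tables shown in the listing above as plain data frames.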

The acs_5yr_geo object also allows us to perform simple queries using the metadata and the geographic layer.

metadata <- geo |>
  get_metadata()

metadata
#> # A tibble: 1,436 × 12
#>    variable year  Short_Name Full_Name   report subreport report_var report_desc
#>    <chr>    <chr> <chr>      <chr>       <chr>  <chr>          <int> <chr>      
#>  1 V0001    2020  B01001Ae1  Sex By Age… B01001 A                  1 Sex By Age…
#>  2 V0002    2020  B01001Ae10 Sex By Age… B01001 A                 10 Sex By Age…
#>  3 V0003    2020  B01001Ae11 Sex By Age… B01001 A                 11 Sex By Age…
#>  4 V0004    2020  B01001Ae12 Sex By Age… B01001 A                 12 Sex By Age…
#>  5 V0005    2020  B01001Ae13 Sex By Age… B01001 A                 13 Sex By Age…
#>  6 V0006    2020  B01001Ae14 Sex By Age… B01001 A                 14 Sex By Age…
#>  7 V0007    2020  B01001Ae15 Sex By Age… B01001 A                 15 Sex By Age…
#>  8 V0008    2020  B01001Ae16 Sex By Age… B01001 A                 16 Sex By Age…
#>  9 V0009    2020  B01001Ae17 Sex By Age… B01001 A                 17 Sex By Age…
#> 10 V0010    2020  B01001Ae18 Sex By Age… B01001 A                 18 Sex By Age…
#> # ℹ 1,426 more rows
#> # ℹ 4 more variables: measure <chr>, item1 <chr>, item2 <chr>, group <chr>

metadata <-
  dplyr::filter(
    metadata,
    item2 == "Female" &
      group == "People Who Are American Indian And Alaska Native Alone" &
      measure == "estimate"
  )

geo2 <- geo |>
  set_metadata(metadata)

geo2 |>
  get_metadata()
#> # A tibble: 2 × 12
#>   variable year  Short_Name Full_Name    report subreport report_var report_desc
#>   <chr>    <chr> <chr>      <chr>        <chr>  <chr>          <int> <chr>      
#> 1 V0671    2020  B01002Ce3  Median Age … B01002 C                  3 Median Age…
#> 2 V1389    2021  B01002Ce3  Median Age … B01002 C                  3 Median Age…
#> # ℹ 4 more variables: measure <chr>, item1 <chr>, item2 <chr>, group <chr>

geo_layer <- geo2 |> 
  get_geo_layer()

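# Percentage change of the selected variable between 2020 (V0671) and 2021 (V1389)
geo_layer$faiana21vs20 <- 100 * (geo_layer$V1389 - geo_layer$V0671) / geo_layer$V0671
# Shift longitudes to the 0-360 range so that areas crossing the antimeridian plot together
plot(sf::st_shift_longitude(geo_layer[, "faiana21vs20"]))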
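
The derived column is an ordinary percentage change between two yearly estimates. A minimal sketch of the same arithmetic on a made-up data frame follows; the values and the helper name pct_change are hypothetical and serve only as illustration.

```r
# Hypothetical helper: percentage change from a base value to a new value.
pct_change <- function(new, base) {
  100 * (new - base) / base
}

# Made-up estimates standing in for the 2020 (V0671) and 2021 (V1389) columns.
toy <- data.frame(
  V0671 = c(30, 40, 25),
  V1389 = c(33, 38, 25)
)

toy$faiana21vs20 <- pct_change(toy$V1389, toy$V0671)
toy$faiana21vs20
```

A positive value means the 2021 estimate is higher than the 2020 one, a negative value means it is lower, and zero means no change. A base value of zero would give Inf or NaN, so such rows may need separate handling.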

In GeoPackage format we can also easily perform queries with other tools such as QGIS.
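
The other export formats mentioned earlier, flat_table and star_database from the rolap package, correspond to the as_flat_table and as_star_database functions listed for geogenr. The sketch below assumes that both accept an acs_5yr_topic object, that they can be called with their default arguments, and that anrc_2021_x01 is a bundled acs_5yr_topic object. The exact signatures should be checked in the package documentation.

```r
library(geogenr)

# Assumption: both exporters accept an acs_5yr_topic object and work with
# their default arguments; anrc_2021_x01 is the bundled example object.
ft <- anrc_2021_x01 |>
  as_flat_table()

st <- anrc_2021_x01 |>
  as_star_database()

class(ft)
class(st)
```

The resulting objects can then be processed further with the rolap package.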

Version: 2.0.1
License: MIT + file LICENSE
Maintainer: Jose Samos
Last Published: January 9th, 2024
Monthly Downloads: 187

Functions in geogenr (2.0.1)

get_available_area_topics: Get available area topics (report groups)
as_geomultistar: As geomultistar::geomultistar object
as_acs_5yr_geo: Get an acs_5yr_geo object
as_acs_5yr_topic: As ACS census topic (report group)
get_available_area_years: Get available area years
get_report_names: Get report names
as_star_database: As rolap::star_database object
get_file_year: Get file year
get_geo_attribute_names: Get geographical attributes
unzip_files: Unzip files
download_selected_files: Download selected files
get_areas: Get area names of a group
get_area_years: Get available area years
transform_metadata_rest: Transform metadata layer
get_gbd_files: Get gbd files
get_file_area: Get file area (with year)
as_GeoPackage: Save as GeoPackage
get_names_of_other_topics: Get names of other topics (report groups)
get_name_from_area_code: Get name from area code
select_report: Select report
select_subreport: Select subreport
transform_metadata_basic: Transform metadata layer
get_available_areas: Get available area names
get_geo_layer.acs_5yr_geo: Get geographic layer
transform_layer: Transform layer according to metadata
anrc_2021_x01: "Alaska Native Regional Corporation", 2021, "X01 Age And Sex"
get_area_file_names: Get area file names
get_topic_name: Get topic name (report groups)
get_geo_layer_from_file: Get geo layer
get_topic_data: Select topic (report group)
select_topic: Select topic (report group)
get_selected_file_names: Get selected file names
name_to_title: Name to title
name_with_nexus: Name with nexus
get_layer_data: Get layer data
get_code_from_area_name: Get code from area name
new_acs_5yr_topic: New acs_5yr_topic object
select_area_files: Select area files
get_area_groups: Get area groups
set_metadata: Set metadata layer
get_metadata: Get the metadata layer
get_subreport_names: Get subreport names
url_file_exists: url_file_exists
validate_names: Validate names
get_too_heavy_file_names: Get too heavy file names
acs_5yr: acs_5yr S3 class
as_flat_table: As rolap::flat_table object
acs_5yr_md: Titles and Years of Selected Demographic and Economic Data