geogenr

The American Community Survey (ACS) offers geodatabases with geographic information and associated data of interest to researchers in the area. The goal of geogenr is to generate geomultistar objects from those geodatabases automatically, once the focus of attention is selected.

Installation

You can install the released version of geogenr from CRAN with:

install.packages("geogenr")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("josesamos/geogenr")

Example

Each ACS geodatabase is structured in layers: a geographic layer, a metadata layer, and a set of data layers. Each data layer has a matrix form: the rows are indexed by instances of the geographic layer, the columns by variables defined in the metadata layer, and the cells hold numeric values. For example:

GEOID           B01001e1  B01001m1  B01001e2  B01001m2  B01001e3  B01001m3  B01001e4  B01001m4  ...
16000US0100100       218       165        92       114        10        16        18        30
16000US0100124      2582        24      1313        98        45        37        14        19
16000US0100460      4374        24      1963       158       144        76       105        68
16000US0100484       641       159       326        89        10        17        16        11
16000US0100676       295       102       143        55         7        11        14        17
16000US0100820     32878        57     16236       453      1159       257      1151       209
...

Some of the defined variables are shown below.

Short Name  Full Name
B01001e1    SEX BY AGE: Total: Total Population -- (Estimate)
B01001m1    SEX BY AGE: Total: Total Population -- (Margin of Error)
B01001e2    SEX BY AGE: Male: Total Population -- (Estimate)
B01001m2    SEX BY AGE: Male: Total Population -- (Margin of Error)
B01001e3    SEX BY AGE: Male: Under 5 years: Total Population -- (Estimate)
B01001m3    SEX BY AGE: Male: Under 5 years: Total Population -- (Margin of Error)
B01001e4    SEX BY AGE: Male: 5 to 9 years: Total Population -- (Estimate)
B01001m4    SEX BY AGE: Male: 5 to 9 years: Total Population -- (Margin of Error)
...
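
The short names listed above appear to follow a regular pattern: a table code, then "e" for an estimate or "m" for a margin of error, then the variable number. The following base R sketch splits a few of them under that assumption. The regular expression and the column names table_code, kind and variable are illustrative only and are not part of geogenr; names carrying an extra suffix letter are not covered.

```r
# Short names copied from the variable list above.
short_names <- c("B01001e1", "B01001m1", "B01001e2", "B01001m2")

# Assumed pattern (not taken from geogenr): table code, "e" or "m", number.
pattern <- "^([A-Z][0-9]+)([em])([0-9]+)$"
parts <- regmatches(short_names, regexec(pattern, short_names))

# Each element of `parts` holds the full match followed by the three groups.
kind_code <- vapply(parts, function(p) p[3], character(1))

parsed <- data.frame(
  short_name = short_names,
  table_code = vapply(parts, function(p) p[2], character(1)),
  kind       = ifelse(kind_code == "e", "estimate", "margin_of_error"),
  variable   = as.integer(vapply(parts, function(p) p[4], character(1)))
)
print(parsed)
```

Pairing each estimate with its margin of error in this way is roughly what allows a wide data layer to be turned into rows with one estimate and one margin column per variable.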

First, we select and download the ACS geodatabases using the functions offered by the package. Once they are in a folder (here we use the examples included in the package), a basic workflow looks like this:

library(geogenr)

# Folder with the example geodatabases shipped with the package
folder <- system.file("extdata", package = "geogenr")
# Append the trailing slash without altering the rest of the path
folder <- paste0(folder, "/")
ua <- uscb_acs_5ye(folder = folder)

(sa <- ua |> get_statistical_areas())
#>  [1] "Combined New England City and Town Area"   
#>  [2] "Combined Statistical Area"                 
#>  [3] "Metropolitan Division"                     
#>  [4] "Metropolitan/Micropolitan Statistical Area"
#>  [5] "New England City and Town Area"            
#>  [6] "New England City and Town Area Division"   
#>  [7] "Public Use Microdata Area"                 
#>  [8] "Tribal Block Group"                        
#>  [9] "Tribal Census Tract"                       
#> [10] "Urban Area"

(y <- ua |> get_available_years_downloaded(geodatabase = sa[6]))
#> [1] 2014 2015

ul <- uscb_layer(uscb_acs_metadata, ua = ua, geodatabase = sa[6], year = 2015)
(layers <- ul |> get_layer_names())
#> [1] "X00_COUNTS"      "X01_AGE_AND_SEX" "X02_RACE"

ul <- ul |> get_layer(layers[2])
(layer_groups <- ul |> get_layer_group_names())
#> [1] "001 - SEX BY AGE"        "002 - MEDIAN AGE BY SEX"
#> [3] "003 - TOTAL POPULATION"

ul <- ul |> get_layer_group(layer_groups[1])

gms <- ul |> get_geomultistar()
#> Warning in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, : GDAL
#> Message 6: driver OpenFileGDB does not support open option METHOD

For a given folder and area, we get the years for which geodatabases have been downloaded (get_available_years_downloaded). We select the geodatabase for a specific year (uscb_layer). From the layers and groups of variables available, we select a layer (get_layer) and one of its groups (get_layer_group). From the selected variables we generate a geomultistar object (get_geomultistar).

The first rows of the dimension and fact tables are shown below.

when_key | year
1 | 2015

where_key | cnectafp | nectafp | nctadvfp | geoid | name | namelsad | lsad | mtfcc | aland | awater | intptlat | intptlon | shape_length | shape_area | geoid_data
1 | 715 | 71650 | 71654 | 7165071654 | Boston-Cambridge-Newton, MA | Boston-Cambridge-Newton, MA NECTA Division | M7 | G3220 | 3.668e+09 | 7.13e+08 | +42.2933266 | -071.0181929 | 7.653 | 0.4783 | 35500US7165071654
2 | 715 | 71650 | 72104 | 7165072104 | Brockton-Bridgewater-Easton, MA | Brockton-Bridgewater-Easton, MA NECTA Division | M7 | G3220 | 352799175 | 8831197 | +42.0216172 | -071.0267170 | 1.077 | 0.03932 | 35500US7165072104
3 | 715 | 71650 | 73104 | 7165073104 | Framingham, MA | Framingham, MA NECTA Division | M7 | G3220 | 532516314 | 24039093 | +42.2761738 | -071.4822008 | 1.738 | 0.06073 | 35500US7165073104
4 | 715 | 71650 | 73604 | 7165073604 | Haverhill-Newburyport-Amesbury Town, MA-NH | Haverhill-Newburyport-Amesbury Town, MA-NH NECTA Division | M7 | G3220 | 702086333 | 40447613 | +42.8671722 | -071.0254982 | 1.416 | 0.08179 | 35500US7165073604
5 | 715 | 71650 | 74204 | 7165074204 | Lawrence-Methuen Town-Salem, MA-NH | Lawrence-Methuen Town-Salem, MA-NH NECTA Division | M7 | G3220 | 207735751 | 9917120 | +42.7282758 | -071.1630701 | 0.9094 | 0.02392 | 35500US7165074204
6 | 715 | 71650 | 74804 | 7165074804 | Lowell-Billerica-Chelmsford, MA-NH | Lowell-Billerica-Chelmsford, MA-NH NECTA Division | M7 | G3220 | 863143106 | 27403003 | +42.6141693 | -071.4837821 | 2.441 | 0.09771 | 35500US7165074804

what_key | short_name | full_name | inf_code | group_code | subgroup_code | spec_code | inf | group | subgroup | demographic_age | demographic_sex | demographic_race | demographic_total_population | demographic_total_population_spec
1 | B01001A_01 | SEX BY AGE (WHITE ALONE): Total: People Who Are White Alone | B01 | 001 | A | 1 | AGE AND SEX | SEX BY AGE | WHITE ALONE |  |  |  | People Who Are White Alone | Total
2 | B01001A_02 | SEX BY AGE (WHITE ALONE): Male: People Who Are White Alone | B01 | 001 | A | 2 | AGE AND SEX | SEX BY AGE | WHITE ALONE |  | Male |  | People Who Are White Alone |
3 | B01001A_03 | SEX BY AGE (WHITE ALONE): Male: Under 5 years: People Who Are White Alone | B01 | 001 | A | 3 | AGE AND SEX | SEX BY AGE | WHITE ALONE | Under 5 years | Male |  | People Who Are White Alone |
4 | B01001A_04 | SEX BY AGE (WHITE ALONE): Male: 5 to 9 years: People Who Are White Alone | B01 | 001 | A | 4 | AGE AND SEX | SEX BY AGE | WHITE ALONE | 5 to 9 years | Male |  | People Who Are White Alone |
5 | B01001A_05 | SEX BY AGE (WHITE ALONE): Male: 10 to 14 years: People Who Are White Alone | B01 | 001 | A | 5 | AGE AND SEX | SEX BY AGE | WHITE ALONE | 10 to 14 years | Male |  | People Who Are White Alone |
6 | B01001A_06 | SEX BY AGE (WHITE ALONE): Male: 15 to 17 years: People Who Are White Alone | B01 | 001 | A | 6 | AGE AND SEX | SEX BY AGE | WHITE ALONE | 15 to 17 years | Male |  | People Who Are White Alone |

when_key  where_key  what_key  estimate  margin_of_error  nrow_agg
       1          1         1   2134196             4429         1
       1          1         2   1033634             3014         1
       1          1         3     51671             1007         1
       1          1         4     54932             1209         1
       1          1         5     57585             1243         1
       1          1         6     36139              771         1
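
To make the relationship between these tables concrete, the following base R sketch joins a fact table to its dimension tables through the *_key columns. It illustrates the star-schema idea only and does not use the geomultistar API; the data frames are reduced, hand-typed copies of the first rows shown above, and the object name flat is an assumption made for the example.

```r
# Reduced, hand-typed copies of the tables above (illustration only).
when  <- data.frame(when_key = 1L, year = 2015L)
where <- data.frame(where_key = 1L, name = "Boston-Cambridge-Newton, MA")
what  <- data.frame(
  what_key   = 1:3,
  short_name = c("B01001A_01", "B01001A_02", "B01001A_03")
)
fact <- data.frame(
  when_key        = 1L,
  where_key       = 1L,
  what_key        = 1:3,
  estimate        = c(2134196L, 1033634L, 51671L),
  margin_of_error = c(4429L, 3014L, 1007L)
)

# Resolve each surrogate key against its dimension table.
flat <- merge(fact, when, by = "when_key")
flat <- merge(flat, where, by = "where_key")
flat <- merge(flat, what, by = "what_key")

# Keep the descriptive attributes and the measures, ordered by variable.
flat <- flat[order(flat$what_key),
             c("year", "name", "short_name", "estimate", "margin_of_error")]
print(flat)
```

Each fact row ends up described by a year, a place and a variable, which is the shape that multidimensional queries over the when, where and what dimensions work on.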

Once we have a geomultistar object, we can use the functionality of starschemar and geomultistar packages to define multidimensional queries with geographic information.

library(starschemar)
library(geomultistar)

gms <- gms |>
  define_geoattribute(
    attribute = c("name"),
    from_attribute = "geoid"
  )

gdqr <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("name")) |>
  select_dimension(
    name = "what",
    attributes = c("short_name")
  ) |>
  select_fact(name = "sex_by_age",
              measures = c("estimate")) |>
  filter_dimension(name = "when", year == "2015") |>
  filter_dimension(name = "what",
                   demographic_age == "Under 5 years") |>
  run_geoquery()

The result is a vector layer that we can save, analyse or query spatially, or display as a map using the functions associated with the sf class.

plot(gdqr[,"estimate"])

Once we have verified that the data for the reference year is what we need, we can extend our database with the other years available in the folder. The only requirement for including a year is that its variable structure is the same as that of the reference year.

uf <- uscb_folder(ul)

cgms <- uf |> get_common_geomultistar()
#> Warning in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, : GDAL
#> Message 6: driver OpenFileGDB does not support open option METHOD

Instead of displaying all the tables, we focus on the table of the when dimension.

when_key | year
1 | 2014
2 | 2015

It includes data for all available years.
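
The requirement that every additional year share the variable structure of the reference year can be pictured as a comparison of column names. The sketch below is a hypothetical check written in base R. The helper has_same_structure and the toy data frames are assumptions made for illustration and do not show how geogenr performs the comparison internally.

```r
# Hypothetical helper (not part of geogenr): TRUE when a year's table has
# exactly the same column names, in the same order, as the reference year.
has_same_structure <- function(reference, candidate) {
  identical(names(reference), names(candidate))
}

# Toy tables standing in for the same layer group in different years.
ref_2015 <- data.frame(GEOID = "a", B01001e1 = 1, B01001m1 = 2)
y_2014   <- data.frame(GEOID = "b", B01001e1 = 3, B01001m1 = 4)
y_other  <- data.frame(GEOID = "c", B01001e1 = 5)

tables <- list(`2014` = y_2014, other = y_other)

# `reference` is matched by name, so each table is passed as `candidate`.
usable <- vapply(tables, has_same_structure, logical(1), reference = ref_2015)
print(usable)
```

Under this sketch only the years flagged TRUE would be combined with the reference year.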

Version: 1.0.1

License: MIT + file LICENSE

Maintainer: Jose Samos

Last Published: October 11th, 2023

Functions in geogenr (1.0.1)

get_geodatabase_url
get_metadata: Get metadata
get_legal_and_administrative_areas: Get Legal and Administrative Area names
interpret_as_demographic_race
interpret_as_demographic_household
interpret_as_economic_health_insurance_coverage
interpret_as_economic_food_stamps_snap
interpret_as_economic_work_status_last_year
get_place: Get place
interpret_as_housing_units
interpret_as_demographic_age
interpret_as_economic_poverty_status
interpret_all
interpret_as_economic_journey_and_place_of_work
interpret_as_housing_computer_and_internet_use
interpret_as_housing_rooms
interpret_as_housing_value_of_home
interpret_as_demographic_group_quarters_population
interpret_as
interpret_as_housing_tenure_owner_renter
interpret_as_housing_year_structure_built
interpret_as_economic_industry_and_occupation
interpret_as_economic_income_and_earnings
interpret_as_social_citizenship_status
interpret_as_housing_plumbing_facilities
interpret_as_social_language_spoken_at_home
interpret_as_housing_occupants_per_room
interpret_as_housing_occupancy_vacancy_status
interpret_as_social_marital_status
get_statistical_areas: Get Statistical Area names
interpret_as_social_veteran_status_military_service
get_year_from_filepath: Get year from filepath
interpret_as_social_fertility
interpret_metadata: Interpret metadata
interpret_as_social_school_enrollment
interpret_as_social_grandparents_as_caregivers
new_uscb_layer: uscb_layer S3 class
interpret_code: interpret code
interpret_as_demographic_total_population
interpret_as_demographic_sex
uscb_acs_5ye: uscb_acs_5ye S3 class
standardize_text2: Standardize text 2
new_uscb_folder: uscb_folder S3 class
new_uscb_acs_5ye: uscb_acs_5ye S3 class
interpret_as_housing_vehicles_available
new_uscb_metadata: uscb_metadata S3 class
show_fields: Show fields
uscb_acs_metadata
interpret_as_housing_year_householder_moved_into_unit
standardize_text: Standardize text
interpret_as_housing_rent
interpret_as_social_place_of_birth
interpret_social_ancestry
interpret_as_social_migration_residence_1_year_ago
uscb_layer: uscb_layer S3 class
url_file_exists
interpret_as_social_educational_attainment
uscb_metadata: uscb_metadata S3 class
interpret_as_social_year_of_entry
scroll_level
interpret_as_social_disability_status
interpret_as_survey
uscb_folder: uscb_folder S3 class
reformat_metadata: Reformat metadata
interpret_values: interpret values
same_layer_group_columns
replace_numbers: Replace numbers
delete_empty_columns: Delete empty columns
get_available_years_in_the_web: Get available years in the web
get_common_flat_table: Get common flat table
get_available_years_downloaded: Get available years downloaded
get_common_geomultistar: Get common geomultistar
download_geodatabases: Download geodatabases
get_geomultistar: Get geomultistar
get_layer: Get layer
get_flat_table: Get flat table
get_geodatabase_file
define_geomultistar
get_field_values
assign_level
get_basic_flat_table: Get basic flat table
add_value
get_layer_group_names: Get layer group names
get_layer_group: Get layer group
get_layer_names: Get layer names