ipumsr (version 0.9.0)

define_extract_nhgis: Define an IPUMS NHGIS extract request

Description

[Deprecated]

Define the parameters of an IPUMS NHGIS extract request to be submitted via the IPUMS API.

This function has been deprecated in favor of define_extract_agg(), which can be used to define extracts for both IPUMS aggregate data collections (IPUMS NHGIS and IPUMS IHGIS). Please use that function instead.

All NHGIS extract request parameters supported by define_extract_nhgis() are supported by define_extract_agg().

Learn more about the IPUMS API in vignette("ipums-api") and NHGIS extract definitions in vignette("ipums-api-agg").

Usage

define_extract_nhgis(
  description = "",
  datasets = NULL,
  time_series_tables = NULL,
  shapefiles = NULL,
  geographic_extents = NULL,
  breakdown_and_data_type_layout = NULL,
  tst_layout = NULL,
  data_format = NULL
)

Value

An object of class nhgis_extract containing the extract definition.

Arguments

description

Description of the extract.

datasets

List of dataset specifications for any datasets to include in the extract request. Use ds_spec() to create a ds_spec object containing a dataset specification. See examples.

time_series_tables

List of time series table specifications for any time series tables to include in the extract request. Use tst_spec() to create a tst_spec object containing a time series table specification. See examples.

shapefiles

Names of any shapefiles to include in the extract request.

geographic_extents

Vector of geographic extents to use for all of the datasets and time_series_tables in the extract definition (for instance, to obtain data within a specified state). By default, selects all available extents.

Use get_metadata() to identify the available extents for a given dataset or time series table, if any.
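As a rough illustration of how geographic_extents applies at the level of the whole extract, the sketch below uses define_extract_agg(), the replacement for this function. The dataset name, table name, and extent codes are assumptions chosen for illustration; use get_metadata() to confirm which extents a given dataset actually supports.

```r
library(ipumsr)

# Assumed values for illustration only: dataset "2018_2022_ACS5a",
# table "B01001", and extent codes "010" and "050".
# Verify them with get_metadata() before submitting a real request.
extent_extract <- define_extract_agg(
  collection = "nhgis",
  description = "Extent selection example",
  datasets = ds_spec(
    "2018_2022_ACS5a",
    data_tables = "B01001",
    geog_levels = "blck_grp"
  ),
  geographic_extents = c("010", "050")
)

# The extents are stored on the extract definition itself,
# not on the individual dataset specification
extent_extract$geographic_extents
```

Defining the extract only builds a local object; nothing is sent to the IPUMS API until the definition is submitted.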

breakdown_and_data_type_layout

The desired layout of any datasets that have multiple data types or breakdown values.

  • "single_file" (default) keeps all data types and breakdown values in one file

  • "separate_files" splits each data type or breakdown value into its own file

Required if any datasets included in the extract definition consist of multiple data types (for instance, estimates and margins of error) or have multiple breakdown values specified. See get_metadata() to determine whether a requested dataset has multiple data types.

tst_layout

The desired layout of all time_series_tables included in the extract definition.

  • "time_by_column_layout" (wide format, default): rows correspond to geographic units, columns correspond to different times in the time series

  • "time_by_row_layout" (long format): rows correspond to a single geographic unit at a single point in time

  • "time_by_file_layout": data for different times are provided in separate files

Required when an extract definition includes any time_series_tables.
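A minimal sketch of a time series table request with an explicit layout follows, again using define_extract_agg(). The table name "CW3" and the years are assumptions for illustration; available tables, geographic levels, and years should be checked with get_metadata().

```r
library(ipumsr)

# Assumed values for illustration only: table "CW3" and the years below
tst_extract <- define_extract_agg(
  collection = "nhgis",
  description = "Time series table example",
  time_series_tables = tst_spec(
    "CW3",
    geog_levels = c("county", "tract"),
    years = c("1990", "2000")
  ),
  # Long format: one row per geographic unit per point in time
  tst_layout = "time_by_row_layout"
)

# Inspect the layout recorded in the definition
tst_extract$tst_layout
```

The long layout tends to be convenient for grouped summaries and plotting over time, while the wide default keeps one row per geographic unit.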

data_format

The desired format of the extract data file.

  • "csv_no_header" (default) includes only a minimal header in the first row

  • "csv_header" includes a second, more descriptive header row

  • "fixed_width" provides data in a fixed width format

Note that by default, read_ipums_agg() removes the additional header row in "csv_header" files.

Required when an extract definition includes any datasets or time_series_tables.

See Also

get_metadata_catalog() to find data to include in an extract definition.

submit_extract() to submit an extract request for processing.

save_extract_as_json() and define_extract_from_json() to share an extract definition.
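Sharing a definition can be sketched as a round trip through a JSON file. The example below reuses the extract from the Examples section and writes to a temporary path, which is an arbitrary choice for illustration; in practice the file would be sent to a collaborator or kept alongside the analysis code.

```r
library(ipumsr)

extract <- define_extract_agg(
  collection = "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)

# Write the definition to a JSON file (the temporary path is arbitrary)
json_path <- tempfile(fileext = ".json")
save_extract_as_json(extract, file = json_path)

# Recreate the definition from the file, for instance on another machine
restored <- define_extract_from_json(json_path)
restored$description
```

The restored object is a fresh, unsubmitted definition, so it can be passed to submit_extract() by anyone with their own API key.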

Examples

# Previously, you could create an NHGIS extract definition like so:
nhgis_extract <- define_extract_nhgis(
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)

# Now, use the following:
nhgis_extract <- define_extract_agg(
  collection = "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)
