nadaverse: Browse Microdata Catalogs Using NADA REST API

nadaverse is an R package that gives researchers, policy analysts, and data enthusiasts streamlined, programmatic access to large collections of global microdata.

Many national and international organizations—including the World Bank, IHSN, FAO, UNHCR, and ILO—use the National Data Archive (NADA) software to manage and disseminate their survey and census data. While these catalogs are rich sources of information, interacting with them often requires tedious manual browsing or complex API construction.

nadaverse cuts through that complexity. It provides a unified, reliable, and user-friendly interface to search, filter, and retrieve crucial metadata and documentation (such as file lists and data dictionaries) directly into your R environment.

Features

  • Search across 8 major NADA catalogs using a unified interface
  • Retrieve study metadata, file inventories, and data dictionaries
  • Filter by year, keywords, country, catalog, and more
  • Built on httr2, ensuring fast and reliable API calls
  • Tidy-friendly: outputs are clean data frames ready for immediate use
  • Includes convenient helper functions for codes, catalogs, and recent entries

Installation

Install the CRAN release:

install.packages("nadaverse")

Or install the development version from GitHub:

devtools::install_github("guturago/nadaverse")

Searching

1. Catalog Discovery

The catalogs() function is the starting point, providing a complete, current list of the supported NADA repositories, along with their unique identifiers required for subsequent queries.

library(nadaverse)
library(tidyverse)
library(knitr)
catalogs()
#> 
#> ── List of Supported Catalogs ──
#> 
#> ℹ name: Link to the catalog
#> • df: Data First (<https://www.datafirst.uct.ac.za>)
#> • erf: Economic Research Forum (<https://erfdataportal.com>)
#> • fao: Food and Agriculture Organization (<https://microdata.fao.org>)
#> • ihsn: International Household Survey Network (<https://catalog.ihsn.org>)
#> • ilo: International Labour Organization (<https://www.ilo.org/surveyLib>)
#> • india: Government of India (<https://microdata.gov.in>)
#> • unhcr: United Nations High Commissioner for Refugees
#> (<https://microdata.unhcr.org>)
#> • wb: The World Bank (<https://microdata.worldbank.org>)

2. Targeted Metadata Search

The search_catalog() function allows for granular control over the search space. Instead of relying on the catalog’s often limited web interface, users can programmatically search by catalog ID, keywords, publication date ranges, and more.

The output is a standardized data frame, simplifying cross-catalog comparisons. Here, we search the IHSN catalog (ihsn) for studies from 2023 to 2025:

search_catalog(
  catalog = "ihsn",
  from = 2023, 
  to = 2025,
  ps = 5
)
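Because every catalog is queried through the same function, the same search can be repeated over several catalog IDs and the results stacked. The following is a minimal sketch rather than part of nadaverse: `search_many` and its `search_fn` argument are hypothetical names introduced here, and the code assumes `search_catalog()` accepts the arguments shown above and returns a data frame.

```r
# Hypothetical helper (not part of nadaverse): run one search against several
# catalogs and stack the results into a single data frame.
# `search_fn` defaults to nadaverse::search_catalog but can be replaced by a
# stub, which makes the helper testable without network access.
search_many <- function(catalog_ids, ..., search_fn = nadaverse::search_catalog) {
  results <- lapply(catalog_ids, function(id) {
    out <- search_fn(catalog = id, ...)
    out$catalog <- rep(id, nrow(out))  # record which catalog each row came from
    out
  })
  # Keep only the columns present in every result so rbind() cannot fail
  shared <- Reduce(intersect, lapply(results, names))
  do.call(rbind, lapply(results, function(x) x[, shared, drop = FALSE]))
}

# Intended usage (requires nadaverse and network access):
# search_many(c("wb", "ihsn"), from = 2023, to = 2025, ps = 5)
```

Restricting the result to shared columns is a defensive choice: if two catalogs return slightly different fields, the combined table still binds cleanly.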

3. Deep Dive: File and Variable Metadata

Once a specific study is identified via its unique ID (e.g., 3110), nadaverse enables the retrieval of documentation critical for data preparation.

File Inventory (data_files): This function retrieves the list of data file assets, their size, and descriptions, allowing users to determine the exact resources needed for download.

c <- "wb"
data_files(c, 3110) |> 
  select(where(~ !all(. == "NULL"))) |> 
  kable(format = "pipe")
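|    | id     | sid  | file_id | file_name     | description              | case_count |
|----|--------|------|---------|---------------|--------------------------|------------|
| B  | 114450 | 3110 | B       | IND2015-B.dat | Birth records            | 1315617    |
| C  | 114451 | 3110 | C       | IND2015-C.dat | Child records            | 259627     |
| H  | 114453 | 3110 | H       | IND2015-H.dat | Household member records | 2869043    |
| M  | 114452 | 3110 | M       | IND2015-M.dat | Man records              | 112122     |
| W  | 114449 | 3110 | W       | IND2015-W.dat | Woman records            | 699686     |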
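Once the inventory is in a data frame, ordinary subsetting is enough to decide which files are worth downloading. The sketch below rebuilds a few columns of the output above by hand so it runs offline. It assumes `case_count` may arrive as character and converts it first; that is an assumption about the returned types, not documented behavior.

```r
# Rows copied from the file inventory shown above (study 3110, "wb" catalog)
files <- data.frame(
  file_id     = c("B", "C", "H", "M", "W"),
  file_name   = c("IND2015-B.dat", "IND2015-C.dat", "IND2015-H.dat",
                  "IND2015-M.dat", "IND2015-W.dat"),
  description = c("Birth records", "Child records", "Household member records",
                  "Man records", "Woman records"),
  case_count  = c("1315617", "259627", "2869043", "112122", "699686")
)

# Assumption: counts may be returned as text, so coerce before comparing
files$case_count <- as.numeric(files$case_count)

# Largest files first, to see where most of the download volume sits
files_by_size <- files[order(files$case_count, decreasing = TRUE), ]

# Keep only the household and woman files and total their records
wanted <- files[files$file_id %in% c("H", "W"), ]
total_cases <- sum(wanted$case_count)
```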

Data Dictionary (data_dictionary): Access to variable-level metadata is paramount for data quality checks and ethical use. This function retrieves the complete data dictionary, including variable names, labels, and value ranges, enabling preparation work before downloading large datasets.

data_dictionary(c, 3110) |>
  head(10) |> 
  select(where(~ !all(. == "NULL"))) |> 
  kable(format = "pipe")
| uid     | sid  | fid | vid          | name         | labl                                              |
|---------|------|-----|--------------|--------------|---------------------------------------------------|
| 2609913 | 3110 | W   | W_SAMPLE     | W_SAMPLE     | IPUMS-DHS sample identifier                       |
| 2609914 | 3110 | W   | W_SAMPLESTR  | W_SAMPLESTR  | IPUMS-DHS sample identifier (string)              |
| 2609915 | 3110 | W   | W_COUNTRY    | W_COUNTRY    | Country                                           |
| 2609916 | 3110 | W   | W_YEAR       | W_YEAR       | Year of sample                                    |
| 2609917 | 3110 | W   | W_IDHSPID    | W_IDHSPID    | Unique cross-sample respondent identifier         |
| 2609918 | 3110 | W   | W_IDHSHID    | W_IDHSHID    | Unique cross-sample household identifier          |
| 2609919 | 3110 | W   | W_DHSID      | W_DHSID      | Key to link DHS clusters to context data (string) |
| 2609920 | 3110 | W   | W_IDHSPSU    | W_IDHSPSU    | Unique sample-case PSU identifier                 |
| 2609921 | 3110 | W   | W_IDHSSTRATA | W_IDHSSTRATA | Unique cross-sample sampling strata               |
| 2609922 | 3110 | W   | W_CASEID     | W_CASEID     | Sample-specific respondent identifier             |
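A common first use of a data dictionary is to look up variables by keyword before downloading anything. The following sketch shows one way to do that in base R. `find_variables` is a hypothetical helper, not a nadaverse function, and the small `dict` data frame is copied by hand from the `name` and `labl` columns of the output above so the example runs offline.

```r
# A few rows copied from the dictionary output above
dict <- data.frame(
  name = c("W_SAMPLE", "W_COUNTRY", "W_YEAR", "W_IDHSPID", "W_CASEID"),
  labl = c("IPUMS-DHS sample identifier", "Country", "Year of sample",
           "Unique cross-sample respondent identifier",
           "Sample-specific respondent identifier")
)

# Hypothetical helper: return variables whose name or label matches a pattern
find_variables <- function(dictionary, pattern) {
  hits <- grepl(pattern, dictionary$labl, ignore.case = TRUE) |
    grepl(pattern, dictionary$name, ignore.case = TRUE)
  dictionary[hits, c("name", "labl"), drop = FALSE]
}

respondent_vars <- find_variables(dict, "respondent")
```

With a real study, the same helper could be applied to the full result of `data_dictionary()`, assuming it keeps the `name` and `labl` columns shown above.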

Advanced Wrangling and Analysis Preparation

The design goal of nadaverse is to ensure its outputs are immediately “tidy” and ready for integration into analytical pipelines. This means the results can be piped directly into dplyr verbs for filtering, reshaping, and analysis preparation, as demonstrated by this example.

This transformation searches the FAO catalog, filters studies by keyword (“Food Insecurity”), and reshapes the resulting metadata into a concise matrix showing which countries conducted the survey in which years—a common preparatory step for cross-country comparative research.

search_catalog("fao", "Food Insecurity", ps = 10000) |>
  # Keep only Food Insecurity Experience Scale studies (case-insensitive match)
  filter(grepl("Food Insecurity Experience Scale", title, ignore.case = TRUE)) |>
  select(nation, year_start) |>
  arrange(nation, year_start) |>
  # Mark each country-year that has a study, then spread years into columns
  mutate(value = "Yes") |>
  pivot_wider(id_cols = nation,
              names_from = year_start,
              values_from = value,
              values_fill = "-") |>
  head(5) |>
  kable(format = "pipe")
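| nation              | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
|---------------------|------|------|------|------|------|------|------|------|------|------|------|
| Afghanistan         | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | -    |
| Albania             | Yes  | Yes  | Yes  | Yes  | -    | Yes  | Yes  | Yes  | Yes  | Yes  | -    |
| Algeria             | Yes  | -    | Yes  | Yes  | Yes  | Yes  | Yes  | Yes  | -    | -    | -    |
| Angola              | Yes  | -    | -    | -    | -    | -    | -    | -    | -    | -    | -    |
| Antigua and Barbuda | -    | -    | -    | -    | -    | -    | -    | Yes  | -    | -    | -    |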
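The reshaping step can also be tried offline on a handful of rows. The sketch below is a base R alternative to the `pivot_wider()` call, not the package's own method: it builds the same kind of country-by-year coverage matrix with `table()`. The four input rows are taken from the Algeria, Angola, and Antigua and Barbuda entries in the output above.

```r
# Country-year pairs copied from the output above
studies <- data.frame(
  nation     = c("Algeria", "Algeria", "Angola", "Antigua and Barbuda"),
  year_start = c(2014, 2016, 2014, 2021)
)

# Count studies per country and year; unclass() leaves a plain integer matrix
counts <- unclass(table(studies$nation, studies$year_start))

# Turn counts into the same "Yes" / "-" markers used in the tidyverse version;
# ifelse() keeps the matrix dimensions and dimnames of its test argument
coverage <- ifelse(counts > 0, "Yes", "-")
```

Unlike the tidyverse pipeline, this version only creates columns for years that appear in the input, so gaps between years are not shown as empty columns.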

Helper Functions for Workflow Efficiency

nadaverse also includes helper functions that return the identifiers used as query parameters in NADA systems: access codes, collection names, and country codes for a given catalog, plus a listing of its most recent entries. These values are needed when building specific, authenticated queries.

access_codes("fao")
collections("wb")
country_codes("wb")
latest_entries("ihsn")

Version

0.1.0

License

MIT + file LICENSE

Maintainer

Gutama Girja Urago

Last Published

December 11th, 2025

Functions in nadaverse (0.1.0)

  • catalogs: Small Helper Functions for Data Catalog Access
  • data_files: Get Study Data Files List and Data Dictionary
  • search_catalog: Search Catalogs