data_files: Get Study Data Files List and Data Dictionary

Description

Retrieves information about the files included in a study, or the detailed data dictionary (variables) for the entire study or a specific data file.

Usage

data_files(catalog, id)
data_dictionary(catalog, id, file_id = NULL)

Value

The return value depends on the function called:

data_files(): A data frame detailing the files associated with the study. Typical columns include file_name, dfile_id, file_type, and file_size.
data_dictionary(): A data frame containing the variable-level metadata (the data dictionary). Typical columns include name, label, and var_id.

If the API returns no files or variables, a warning message is issued.
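Because an empty result is signalled with a warning rather than an error, calling code may want to capture that warning and branch on it. The following is a minimal sketch of one way to do so. It uses a hypothetical stand-in function, fake_data_files(), so that it runs without network access, and it assumes that the empty case yields a zero-row data frame, which this page does not state.

```r
# Hypothetical stand-in for data_files(); it makes no network request.
# Assumption: on an empty response the real function warns and returns a
# zero-row data frame (the return value in that case is not documented).
fake_data_files <- function(catalog, id) {
  warning("No data files found for study: ", id)
  data.frame(file_name = character(0))
}

# Capture the warning text instead of printing it, and keep the result.
caught <- NULL
files <- withCallingHandlers(
  fake_data_files(catalog = "wb", id = "EXAMPLE_STUDY"),
  warning = function(w) {
    caught <<- conditionMessage(w)   # remember what the warning said
    invokeRestart("muffleWarning")   # suppress the default warning output
  }
)

# Branch on the empty result rather than indexing into it.
if (nrow(files) == 0) {
  message("Nothing to process: ", caught)
}
```

With the real functions, the same withCallingHandlers() wrapper could be placed around data_files() or data_dictionary(); the check on nrow() would need adjusting if the package returns something other than a data frame in the empty case.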

Arguments

catalog: A required character string specifying the name of the data catalog (e.g., "wb", "fao"). Valid codes can be found in the documentation for catalogs().
id: A required study identifier. Accepts either the numeric Study ID (integer, e.g., 101) or the character Study ID Number (string, e.g., "ALB_2012_LSMS_v01_M_v01_A_PUF"). These values are typically returned in the search results from search_catalog(), latest_entries() or data_files().
file_id: An optional character identifier, applicable only to data_dictionary(). This is the ID of a specific data file within the study, typically found in the file_id column returned by data_files(). If NULL (default), data_dictionary() attempts to fetch variables for the entire study.

Author

Gutama Girja Urago

Details

data_files() returns the list of files available for a study, along with metadata like file name, size, and ID.

data_dictionary() retrieves the variable-level metadata, including variable names, labels, and definitions. If file_id is provided, it retrieves the dictionary for that specific file; otherwise, it attempts to fetch the dictionary for the entire study. The function automatically detects whether the provided study identifier (id) is numeric or character.
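The numeric-versus-character distinction can be illustrated with a small sketch. The helper below, classify_study_id(), is hypothetical and is not part of the package; how the package performs this check internally is not documented here. The sketch assumes that a digit-only string such as "101" counts as a character Study ID Number, which may differ from the package's behavior.

```r
# Hypothetical helper, not part of the package: report which kind of
# study identifier was supplied, based only on its R type.
classify_study_id <- function(id) {
  if (length(id) != 1 || is.na(id)) {
    stop("`id` must be a single non-missing value.")
  }
  if (is.numeric(id)) {
    "numeric study ID"
  } else if (is.character(id)) {
    # Assumption: digit-only strings such as "101" are treated as idno.
    "study ID number (idno)"
  } else {
    stop("`id` must be numeric or character.")
  }
}

kind_numeric <- classify_study_id(101)
kind_idno <- classify_study_id("ALB_2012_LSMS_v01_M_v01_A_PUF")
```

Passing the identifier with the intended type (101 rather than "101") avoids relying on how ambiguous values are resolved.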

Examples

if (FALSE) {
# Example 1: Get the list of files for a World Bank study (using idno)
study_idno <- "ALB_2012_LSMS_v01_M_v01_A_PUF"
files_wb <- data_files(catalog = "wb", id = study_idno)
print(files_wb)

# Example 2: Get the data dictionary for the entire study (using idno)
dictionary_all <- data_dictionary(catalog = "wb", id = study_idno)
head(dictionary_all)

# Example 3: Get the data dictionary for a specific file
# Reuse files_wb from Example 1 to pick a file ID.
# The ID column may be named file_id or dfile_id; check names(files_wb).
file_id_to_use <- files_wb$file_id[1] # Use the ID of the first file
dictionary_file <- data_dictionary(
  catalog = "wb",
  id = study_idno,
  file_id = file_id_to_use
)
head(dictionary_file)
}
