nadaverse (version 0.1.0)

search_catalog: Search Catalogs

Description

Searches the specified catalog's API endpoint using the available search, filter, and sort parameters.

Usage

search_catalog(
  catalog,
  keyword = NULL,
  from = NULL,
  to = NULL,
  country = NULL,
  inc_iso = NULL,
  collection = NULL,
  created = NULL,
  dtype = NULL,
  sort_by = NULL,
  sort_order = NULL,
  ps = NULL,
  page = NULL,
  rows = TRUE
)

Value

If rows = TRUE (default), returns a data frame where each row is a data entry matching the search criteria. If rows = FALSE, returns a list containing search metadata, including the total number of records found and the search parameters used.

Arguments

catalog

A required character string specifying the name of the data catalog (e.g., "fao", "wb"). Valid codes can be found in the documentation for access_codes().

keyword

A character string used to search data titles, descriptions, and keywords (e.g., "lsms").

from

An integer indicating the start year for the data collection's coverage period (e.g., 2000).

to

An integer indicating the end year for the data collection's coverage period (e.g., 2010).

country

A character vector of one or more country names or ISO3 codes (case-insensitive). For valid codes, see country_codes(). Pass multiple values as a vector, e.g., c("afg", "Indonesia", "bra").

inc_iso

A logical value. If TRUE, the results data frame will include the ISO3 country codes; otherwise, it will contain only country names. Default: NULL.

collection

A character vector. Filters results by the data collection repository ID, which is returned in the repo_id column by collections(). Multiple IDs can be searched by passing a vector.

created

A character string used to filter results by the date of creation or update within the catalog. Write dates as YYYY/MM/DD, as in the examples below; the hyphen separates the two ends of a range.

  • Single date: "2020/04/01" (returns records created on or after this date).

  • Date range: "2020/04/01-2020/04/20" (returns records within the range).
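The two forms above can be produced from ordinary R dates. The following is a minimal sketch, not part of nadaverse: make_created_filter() is a hypothetical helper, and it assumes the slash-separated form shown in the examples above is what the API expects.

```r
# Hypothetical helper (not part of nadaverse): builds a string for the
# `created` argument from one or two dates.
# Assumption: dates are written YYYY/MM/DD and a range is joined by "-".
make_created_filter <- function(start, end = NULL) {
  fmt <- function(d) format(as.Date(d), "%Y/%m/%d")
  if (is.null(end)) {
    return(fmt(start))  # single date: "on or after"
  }
  if (as.Date(end) < as.Date(start)) {
    stop("`end` must not be earlier than `start`")
  }
  paste0(fmt(start), "-", fmt(end))  # date range
}

single_date <- make_created_filter("2020-04-01")
date_range <- make_created_filter("2020-04-01", "2020-04-20")
```

The resulting string can then be passed as created = date_range in a search_catalog() call.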

dtype

A character vector. Filters results by one or more data access types. Valid values include: "open", "direct", "public", "licensed", "enclave", "remote", and "other". See access_codes() for a list of available types by catalog. Example: c("open", "licensed").

sort_by

A character string naming the field by which to sort the results. Valid values are "rank", "title", "nation", or "year". "country" is also accepted and is mapped automatically to the API field "nation".

sort_order

A character string indicating the sort direction. Must be either "asc" (ascending) or "desc" (descending).
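The accepted sort values and the "country" to "nation" mapping can be checked before a request is sent. This is a sketch only: normalise_sort() is a hypothetical helper written for illustration and may not match how the package validates its arguments internally.

```r
# Hypothetical helper (not part of nadaverse): validates sort arguments and
# applies the "country" -> "nation" mapping described above.
normalise_sort <- function(sort_by = NULL, sort_order = NULL) {
  valid_fields <- c("rank", "title", "nation", "year")
  valid_orders <- c("asc", "desc")

  if (!is.null(sort_by)) {
    sort_by <- tolower(sort_by)
    if (identical(sort_by, "country")) sort_by <- "nation"
    if (!sort_by %in% valid_fields) {
      stop("`sort_by` must be one of: ", paste(valid_fields, collapse = ", "))
    }
  }
  if (!is.null(sort_order)) {
    sort_order <- tolower(sort_order)
    if (!sort_order %in% valid_orders) {
      stop("`sort_order` must be \"asc\" or \"desc\"")
    }
  }
  list(sort_by = sort_by, sort_order = sort_order)
}

sort_args <- normalise_sort("country", "desc")
```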

ps

An integer indicating the number of records to display per page of results. Default: 15 records.

page

An integer specifying the page number of the search results to return.

rows

A logical value. If TRUE, the function returns a data frame of the matching studies. If FALSE, it returns a list of search metadata (e.g., total records found, total pages) instead of the data records themselves. Default: TRUE.
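Because ps and page return one page at a time, collecting every match means requesting pages in a loop. The sketch below shows one way to do that. fetch_all_pages() is a hypothetical helper, not part of nadaverse, and it assumes that a page with fewer than ps rows (or no rows) marks the end of the results.

```r
# Hypothetical helper (not part of nadaverse): requests pages until a short
# or empty page is returned, then row-binds the pages into one data frame.
# `fetch_page` is any function that takes a page number and returns a data
# frame. `max_pages` is a safety cap on the number of requests.
fetch_all_pages <- function(fetch_page, ps, max_pages = 100) {
  pages <- list()
  for (p in seq_len(max_pages)) {
    chunk <- fetch_page(p)
    if (is.null(chunk) || nrow(chunk) == 0) break
    pages[[length(pages) + 1]] <- chunk
    if (nrow(chunk) < ps) break  # assumed to be the last page
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

if (FALSE) {
# Requires the nadaverse package and network access.
all_lsms <- fetch_all_pages(
  function(p) search_catalog(catalog = "wb", keyword = "lsms", ps = 50, page = p),
  ps = 50
)
}
```

Passing the page request as a function keeps the loop independent of the search arguments, so the same helper works for any catalog or filter combination.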

Author

Gutama Girja Urago

Details

This function constructs a complex API query based on the provided arguments (such as keywords, temporal range, geography, and access types) and returns the matching data entries. The function automatically handles URL encoding and JSON parsing.
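To illustrate what building and encoding such a query involves, here is a minimal base R sketch. It is not the package's implementation: build_query() is a hypothetical helper, and joining multiple values with a comma is an assumption, since the separator the API actually expects is not stated here.

```r
# Hypothetical helper (not part of nadaverse): turns a named list of search
# arguments into a URL-encoded query string, skipping NULL arguments.
# Assumption: vector values are joined with `sep` before encoding.
build_query <- function(params, sep = ",") {
  params <- Filter(Negate(is.null), params)  # drop unset arguments
  if (length(params) == 0) return("")
  pairs <- vapply(names(params), function(nm) {
    value <- paste(params[[nm]], collapse = sep)
    paste0(nm, "=", utils::URLencode(value, reserved = TRUE))
  }, character(1))
  paste(pairs, collapse = "&")
}

query <- build_query(list(
  keyword = "household survey",
  from = 2010,
  to = NULL,
  country = c("afg", "bra")
))
```

The string in query would be appended after a "?" to the catalog's search endpoint URL.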

All parameters correspond directly to the search options available on the NADA (National Data Archive) platform used by organizations like the World Bank and FAO.

See Also

access_codes, collections, country_codes, latest_entries

Examples

if (FALSE) {
# Example 1: Basic search for a keyword in the World Bank catalog
wb_search <- search_catalog(
  catalog = "wb",
  keyword = "LSMS",
  ps = 5, # 5 records per page
  page = 1
)
head(wb_search)

# Example 2: Search by country and year range
fao_search <- search_catalog(
  catalog = "fao",
  country = c("Kenya", "UGA"),
  from = 2010,
  to = 2020,
  sort_by = "year",
  sort_order = "desc"
)

# Example 3: Filter by access type and get search information
ilo_info <- search_catalog(
  catalog = "ilo",
  keyword = "labor",
  dtype = "public",
  rows = FALSE
)
print(ilo_info$found) # Check total number of records found

# Example 4: Include ISO codes in results
ihsn_results <- search_catalog(
  catalog = "ihsn",
  inc_iso = TRUE
)
head(ihsn_results)
}