read_cpi: Read CPI data from an Excel file

Description

Loads and normalizes a consumer price index (CPI) time series from an Excel worksheet. The function detects the date/year column and the CPI/value column by pattern-matching on lower-cased header names, parses localized numerics with to_num_commas(), averages CPI values that share a year, and returns a data frame sorted by year.

Usage

read_cpi(path_cpi)

Value

A data.frame with two columns:

Year (integer)
CPI (numeric)

Arguments

path_cpi: Character path to the CPI Excel file.

Details

Column detection. Headers are lower-cased and matched with:

Date/year: patterns "date|fecha|year|anio|ano".
CPI/value: patterns "cpi|indice|price".

If either column cannot be identified, the function stops with an error.
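
The matching described above can be sketched in base R. This is only an illustration, not the package's actual code: find_col() is a hypothetical helper, and taking the first matching header when several match is an assumption.

```r
# Hypothetical helper (not part of the package): return the first header
# whose lower-cased name matches the given regular expression.
find_col <- function(headers, pattern) {
  hits <- grep(pattern, tolower(headers))
  if (length(hits) == 0) {
    stop("No column matches pattern '", pattern, "'", call. = FALSE)
  }
  headers[hits[1]]  # assumption: the first match wins
}

headers  <- c("Fecha", "Indice")
date_col <- find_col(headers, "date|fecha|year|anio|ano")
cpi_col  <- find_col(headers, "cpi|indice|price")
```

With the headers from the example below, date_col resolves to "Fecha" and cpi_col to "Indice". A sheet whose headers match neither pattern would reach the stop() call.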

Cleaning.

Year is extracted as the first 4 digits of the date-like column.
CPI is parsed with to_num_commas(), which handles commas and thousands separators.
Rows containing NA are dropped; CPI values that share the same Year are averaged.
Output is sorted by Year ascending.
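
The cleaning steps above could look roughly like the following base R sketch. It is not the function's real implementation: parse_num() is a hypothetical stand-in for to_num_commas() that assumes a comma decimal mark, and the year is taken as the first four characters of an ISO-style date string.

```r
# Illustrative input: one duplicated year and one row with a missing date
raw <- data.frame(
  date = c("2021-06-01", "2020-01-01", "2020-07-01", NA),
  cpi  = c("103,2", "100,0", "102,0", "99,1"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for to_num_commas(): assumes "," is the decimal mark
parse_num <- function(x) as.numeric(gsub(",", ".", x, fixed = TRUE))

clean <- data.frame(
  Year = as.integer(substr(raw$date, 1, 4)),  # first four characters
  CPI  = parse_num(raw$cpi)
)
clean <- clean[stats::complete.cases(clean), ]                   # drop NA rows
clean <- stats::aggregate(CPI ~ Year, data = clean, FUN = mean)  # average duplicates
clean <- clean[order(clean$Year), ]                              # sort ascending
```

Here the two 2020 rows collapse to a single row with their mean, and the row with the missing date is dropped before aggregation.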

Examples

# \donttest{
# Create a temporary Excel file with sample CPI data
temp_file <- tempfile(fileext = ".xlsx")
df_sample <- data.frame(
  Fecha = c("2019-01-01", "2020-01-01", "2021-01-01", "2022-01-01"),
  Indice = c(95.5, 100.0, 103.2, 108.7)
)
openxlsx::write.xlsx(df_sample, temp_file)

# Read the CPI data
df <- read_cpi(temp_file)
print(df)

# Verify structure
stopifnot(
  is.data.frame(df),
  all(c("Year", "CPI") %in% names(df)),
  nrow(df) == 4
)

# Clean up
unlink(temp_file)
# }