BayesianDisaggregation (version 0.1.2)

read_weights_matrix: Read a weights matrix from an Excel file

Description

Loads a sector-by-year weight table, normalizes weights to the simplex per year, and returns a list with the \(T \times K\) prior matrix P, the sector names, and the year vector. The first column is assumed to contain sector names (renamed to Industry); all other columns are treated as years.

Usage

read_weights_matrix(path_weights)

Value

A list with:

P

\(T \times K\) numeric matrix of prior weights (rows sum to 1).

industries

Character vector of sector names (length \(K\)).

years

Integer vector of years (length \(T\)).

Arguments

path_weights

Character path to the weights Excel file.

Details

Expected layout. One sheet with:

  • First column: sector names (any header; renamed to Industry).

  • Remaining columns: years; the function extracts a 4-digit year from each header using stringr::str_extract(Year, "\\d{4}").
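As a rough illustration of the header rule above, the following base R sketch pulls the first 4-digit run out of each column header. It uses `regexpr()` and `regmatches()` as a stand-in for `stringr::str_extract()`, and the sample headers are invented for the example; the package's actual handling of headers without a year is not shown in this topic.

```r
# Invented example headers; only the first three contain a 4-digit year.
headers <- c("2019", "Y2020", "FY 2021 (rev.)", "Notes")

# Locate the first run of four digits in each header (-1 means no match).
m <- regexpr("[0-9]{4}", headers)

# Fill matched positions with the extracted year; unmatched headers stay NA.
years <- rep(NA_integer_, length(headers))
years[m > 0] <- as.integer(regmatches(headers, m))

years
```

Headers that yield `NA` here would be the ones with no usable year.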

Values are parsed with to_num_commas(), rows with missing values are dropped, and the weights are normalized within each year so that they sum to 1. Any (sector, year) entry that is absent becomes 0 when the table is pivoted to wide form. Finally, each row of the resulting matrix is re-normalized with row_norm1() for numerical safety.
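The steps in the paragraph above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: `parse_num()` is a hypothetical stand-in for `to_num_commas()` that treats a comma as a decimal separator (the real helper may behave differently), and the final division stands in for `row_norm1()`.

```r
# Hypothetical stand-in for to_num_commas(); assumes a decimal comma.
parse_num <- function(x) {
  as.numeric(gsub(",", ".", as.character(x), fixed = TRUE))
}

# Invented input: sectors in rows, years in columns, values stored as text.
raw <- data.frame(
  Industry = c("Agriculture", "Manufacturing", "Services"),
  `2019` = c("20", "50", "30"),
  `2020` = c("1,5", "2,5", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Parse every year column to numeric (sectors x years matrix).
W <- sapply(raw[-1], parse_num)
rownames(W) <- raw$Industry

# Absent (sector, year) entries carry zero weight.
W[is.na(W)] <- 0

# Normalize within each year (column) so the weights sum to 1.
W <- sweep(W, 2, colSums(W), "/")

# Transpose to years x sectors (T x K), then re-normalize each row.
P <- t(W)
P <- P / rowSums(P)

P
```

Because each year already sums to 1 after the column normalization, the last step only corrects floating-point drift.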

Safeguards.

  • Rows whose values are all missing or zero after parsing are dropped by the filters.

  • If no valid year columns are found, the function errors.
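The two safeguards might look roughly like the sketch below. `apply_safeguards()` is a hypothetical helper written for illustration; its name, its error message, and the exact filtering rules are assumptions and are not taken from the package.

```r
# Hypothetical helper; name, message, and exact rules are illustrative only.
apply_safeguards <- function(W) {
  # W: numeric matrix, sectors in rows, candidate year headers as column names.
  is_year <- grepl("[0-9]{4}", colnames(W))
  if (!any(is_year)) {
    stop("no valid year columns found", call. = FALSE)
  }
  W <- W[, is_year, drop = FALSE]

  # Drop sectors whose weights are all missing or zero.
  keep <- rowSums(W, na.rm = TRUE) > 0
  W[keep, , drop = FALSE]
}

# Invented data: the "Empty" sector has only zero/missing weights.
W <- matrix(
  c(0.4, 0.6, 0, 0.5, 0.5, NA),
  nrow = 3,
  dimnames = list(c("A", "B", "Empty"), c("2019", "2020"))
)
cleaned <- apply_safeguards(W)
rownames(cleaned)

# A table with no year-like header triggers the error path.
bad <- matrix(1, nrow = 1, dimnames = list("A", "Notes"))
failed <- inherits(try(apply_safeguards(bad), silent = TRUE), "try-error")
```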

See Also

read_cpi, bayesian_disaggregate

Examples

# \donttest{
# Create a temporary Excel file with sample weights
temp_file <- tempfile(fileext = ".xlsx")
df_sample <- data.frame(
  Sector = c("Agriculture", "Manufacturing", "Services", "Construction"),
  "2019" = c(0.20, 0.35, 0.30, 0.15),
  "2020" = c(0.18, 0.37, 0.32, 0.13),
  "2021" = c(0.17, 0.38, 0.33, 0.12),
  "2022" = c(0.16, 0.39, 0.34, 0.11),
  check.names = FALSE
)
openxlsx::write.xlsx(df_sample, temp_file)

# Read the weights matrix
w <- read_weights_matrix(temp_file)

# Inspect structure
str(w)
print(w$P)

# Verify properties
stopifnot(
  is.matrix(w$P),
  nrow(w$P) == 4,  # 4 years
  ncol(w$P) == 4,  # 4 sectors
  all(abs(rowSums(w$P) - 1) < 1e-10),  # rows sum to 1
  length(w$industries) == 4,
  length(w$years) == 4
)

# Clean up
unlink(temp_file)
# }
