diarrhea_do_analysis: Run Full Diarrhea-Climate Analysis Pipeline

Code for calculating diarrhea disease cases attributable to extreme precipitation and extreme temperature

Description

The diarrhea_do_analysis function runs the complete analysis workflow by combining multiple functions to analyze the association between diarrhea cases and climate variables. It processes health, climate, and spatial data, fits models, generates plots, and calculates attributable risk.

Usage

diarrhea_do_analysis(
  health_data_path,
  climate_data_path,
  map_path,
  region_col,
  district_col,
  date_col = NULL,
  year_col,
  month_col,
  case_col,
  tot_pop_col,
  tmin_col,
  tmean_col,
  tmax_col,
  rainfall_col,
  r_humidity_col,
  runoff_col,
  geometry_col,
  spi_col = NULL,
  ndvi_col = NULL,
  max_lag = 2,
  nk = 2,
  basis_matrices_choices,
  inla_param,
  param_term,
  level,
  param_threshold = 1,
  filter_year = NULL,
  family = "nbinomial",
  group_by_year = FALSE,
  config = TRUE,
  save_csv = FALSE,
  save_model = TRUE,
  save_fig = FALSE,
  cumulative = FALSE,
  output_dir = NULL
)
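
A call might be assembled as in the sketch below. Every file path, column name, and the output directory are hypothetical placeholders rather than values shipped with the package, and the optional arguments are left at their defaults. The call is guarded so the snippet only runs the pipeline when the function and the input file are actually available.

```r
# Hypothetical argument values: replace the paths and column names with your own.
args <- list(
  health_data_path       = "data/health.csv",     # hypothetical path
  climate_data_path      = "data/climate.csv",    # hypothetical path
  map_path               = "data/districts.shp",  # hypothetical path
  region_col             = "region",
  district_col           = "district",
  year_col               = "year",
  month_col              = "month",
  case_col               = "diarrhea_cases",
  tot_pop_col            = "population",
  tmin_col               = "tmin",
  tmean_col              = "tmean",
  tmax_col               = "tmax",
  rainfall_col           = "rainfall",
  r_humidity_col         = "r_humidity",
  runoff_col             = "runoff",
  geometry_col           = "geometry",
  max_lag                = 2,
  nk                     = 2,
  basis_matrices_choices = c("tmax", "rainfall", "r_humidity"),
  inla_param             = c("tmax", "rainfall", "r_humidity"),
  param_term             = "tmax",
  level                  = "district",
  family                 = "nbinomial",
  output_dir             = "outputs"              # hypothetical directory
)

# Run only if the function is loaded and the health data file exists.
if (exists("diarrhea_do_analysis", mode = "function") &&
    file.exists(args$health_data_path)) {
  results <- do.call(diarrhea_do_analysis, args)
}
```

Building the arguments as a named list first makes it easy to check them, or to reuse them across several values of param_term, before starting a long model fit.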

Value

A list containing:

Model output from INLA
Monthly random effects plot
Yearly random effects plot
Contour plot
Relative risk map
Relative risk plot
Attributable fraction and number summary
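
This page does not state how the attributable fraction and number are computed. As a rough illustration only, the sketch below uses the common definition AF = (RR - 1) / RR and AN = AF x cases, and assumes that only observations with a relative risk above the threshold contribute. The helper name attributable_summary and the sample values are hypothetical, and the package may compute these quantities differently.

```r
# Sketch: attributable fraction (AF) and attributable number (AN) from relative risk (RR).
# Assumption (not confirmed by the package docs): AF = (RR - 1) / RR,
# counted only where RR exceeds the threshold; otherwise AF is set to 0.
attributable_summary <- function(rr, cases, threshold = 1) {
  stopifnot(length(rr) == length(cases))
  af <- ifelse(rr > threshold, (rr - 1) / rr, 0)
  an <- af * cases
  data.frame(rr = rr, cases = cases, af = af, an = an)
}

# Made-up relative risks and case counts for four observations.
rr    <- c(0.9, 1.0, 1.25, 2.0)
cases <- c(100, 100, 100, 100)
out   <- attributable_summary(rr, cases)
print(out)
```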

Arguments

health_data_path: Character. Path to the processed health data file.
climate_data_path: Character. Path to the processed climate data file.
map_path: Character. Path to the spatial data file (e.g., shapefile).
region_col: Character. Column name for the region variable.
district_col: Character. Column name for the district variable.
date_col: Character (optional). Column name for the date variable. Defaults to NULL.
year_col: Character. Column name for the year variable.
month_col: Character. Column name for the month variable.
case_col: Character. Column name for diarrhea case counts.
tot_pop_col: Character. Column name for total population.
tmin_col: Character. Column name for minimum temperature.
tmean_col: Character. Column name for mean temperature.
tmax_col: Character. Column name for maximum temperature.
rainfall_col: Character. Column name for cumulative monthly rainfall.
r_humidity_col: Character. Column name for relative humidity.
runoff_col: Character. Column name for monthly runoff data.
geometry_col: Character. Column name of the geometry column in the shapefile (usually "geometry").
spi_col: Character (optional). Column name for the Standardized Precipitation Index (SPI). Defaults to NULL.
ndvi_col: Character (optional). Column name for the Normalized Difference Vegetation Index (NDVI). Defaults to NULL.
max_lag: Numeric. Maximum temporal lag to include in the distributed lag model (e.g., 2 to 4). Defaults to 2.
nk: Numeric. Number of internal knots for the natural spline of each predictor, which controls its flexibility. nk = 0 produces a linear effect with one basis column; nk = 1 gives a simple spline with two columns; nk = 2 gives a more flexible spline with three columns. Higher values increase flexibility further but may also increase collinearity among spline terms. Defaults to 2.
basis_matrices_choices: Character vector. Specifies which climate variables to include in the basis matrix (e.g., c("tmax", "rainfall", "r_humidity")).
inla_param: Character vector. Specifies exposure variables included in the INLA model (e.g., c("tmin", "rainfall", "r_humidity")).
param_term: Character or vector. Exposure variable(s) of primary interest for relative risk and attribution (e.g., "tmax", "rainfall").
level: Character. Spatial disaggregation level; must be one of "country", "region", or "district".
param_threshold: Numeric. Threshold above which exposure is considered "attributable." Defaults to 1.
filter_year: Integer or vector (optional). Year(s) to filter the data by. Defaults to NULL.
family: Character. Probability distribution for the outcome variable. Options include "poisson" and "nbinomial" (negative binomial). Defaults to "nbinomial".
group_by_year: Logical. Whether to group attributable metrics by year. Defaults to FALSE.
config: Logical. Whether to enable additional INLA model configurations. Defaults to TRUE.
save_csv: Logical. If TRUE, saves intermediate datasets to CSV. Defaults to FALSE.
save_model: Logical. If TRUE, saves fitted INLA model results. Defaults to TRUE.
save_fig: Logical. If TRUE, saves generated plots. Defaults to FALSE.
cumulative: Logical. If TRUE, plots and saves the cumulative risk across all years for the specified exposure at region and district level. Defaults to FALSE.
output_dir: Character. Directory where output files (plots, datasets, maps) are saved. Defaults to NULL.
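
To make the nk and max_lag arguments more concrete, the sketch below builds a natural spline basis with the splines package that ships with R and confirms that nk internal knots give nk + 1 basis columns. It then builds lagged copies of a series for lags 0 to max_lag. The helpers make_spline_basis and make_lags, the quantile-based knot placement, and the simulated rainfall series are assumptions for illustration; the package's internal construction may differ.

```r
library(splines)  # part of the standard R distribution

# Hypothetical helper: natural spline basis with nk internal knots.
# Assumption: knots are placed at equally spaced quantiles of x.
make_spline_basis <- function(x, nk) {
  knots <- if (nk > 0) {
    quantile(x, probs = seq_len(nk) / (nk + 1), na.rm = TRUE)
  } else {
    NULL  # no internal knots: a single, linear basis column
  }
  ns(x, knots = knots)
}

# Hypothetical helper: matrix of lagged copies of x for lags 0..max_lag.
# The first `l` entries of the lag-l column are NA.
make_lags <- function(x, max_lag) {
  n <- length(x)
  lagged <- sapply(0:max_lag, function(l) c(rep(NA, l), x[seq_len(n - l)]))
  colnames(lagged) <- paste0("lag", 0:max_lag)
  lagged
}

set.seed(1)
rainfall <- runif(60, min = 0, max = 300)  # simulated monthly rainfall

# Number of basis columns for nk = 0, 1, 2, 3.
basis_cols <- sapply(0:3, function(nk) ncol(make_spline_basis(rainfall, nk)))
print(basis_cols)

# Lagged exposure matrix with max_lag = 2 (columns lag0, lag1, lag2).
lags <- make_lags(rainfall, max_lag = 2)
print(head(lags, 4))
```

Each extra knot or lag adds columns to the model matrix, which is why larger values of nk and max_lag raise both flexibility and the risk of collinearity.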