climatehealth (version 1.0.0)

malaria_do_analysis: Run Full Malaria-Climate Analysis Pipeline

Code for calculating malaria disease cases attributable to extreme rainfall and extreme temperature.

Description

The malaria_do_analysis() function executes the complete workflow for analyzing the association between malaria cases and climate variables. It integrates health, climate, and spatial data; fits spatio-temporal models using INLA; and generates a suite of diagnostic and inferential outputs, including plots and attributable risk estimates.

Usage

malaria_do_analysis(
  health_data_path,
  climate_data_path,
  map_path,
  region_col,
  district_col,
  date_col = NULL,
  year_col,
  month_col,
  case_col,
  tot_pop_col,
  tmin_col,
  tmean_col,
  tmax_col,
  rainfall_col,
  r_humidity_col,
  runoff_col,
  geometry_col,
  spi_col = NULL,
  ndvi_col = NULL,
  max_lag = 2,
  nk = 2,
  basis_matrices_choices,
  inla_param,
  param_term,
  level,
  param_threshold = 1,
  filter_year = NULL,
  family = "nbinomial",
  group_by_year = FALSE,
  cumulative = FALSE,
  config = FALSE,
  save_csv = FALSE,
  save_model = FALSE,
  save_fig = FALSE,
  output_dir = NULL
)
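
As a rough illustration of how the arguments fit together, the sketch below collects them in a list and calls the function only if the package and input files are available. All file paths and column names here are hypothetical placeholders, not values shipped with the package; substitute the names used in your own data.

```r
# Hypothetical paths and column names; replace with those in your own data.
args <- list(
  health_data_path = "data/health.csv",
  climate_data_path = "data/climate.csv",
  map_path = "data/districts.shp",
  region_col = "region",
  district_col = "district",
  year_col = "year",
  month_col = "month",
  case_col = "malaria_cases",
  tot_pop_col = "population",
  tmin_col = "tmin",
  tmean_col = "tmean",
  tmax_col = "tmax",
  rainfall_col = "rainfall",
  r_humidity_col = "r_humidity",
  runoff_col = "runoff",
  geometry_col = "geometry",
  max_lag = 2,
  nk = 2,
  basis_matrices_choices = c("tmax", "rainfall"),
  inla_param = c("tmax", "rainfall"),
  param_term = "tmax",
  level = "district",
  family = "nbinomial",
  output_dir = NULL
)

# Run the pipeline only when the package and all input files are present.
input_files <- unlist(args[c("health_data_path", "climate_data_path", "map_path")])
if (requireNamespace("climatehealth", quietly = TRUE) && all(file.exists(input_files))) {
  result <- do.call(climatehealth::malaria_do_analysis, args)
}
```

Arguments left out of the list (for example date_col, spi_col, and the save_* flags) fall back to the defaults shown in the signature above.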

Value

A named list containing:

  • inla_result - Fitted INLA model object and summaries.

  • plot_malaria, plot_tmax, plot_rainfall - Exploratory time-series plots.

  • reff_plot_monthly - Monthly random effects plot.

  • reff_plot_yearly - Yearly spatial random effects plot.

  • contour_plot - Exposure-response contour plot.

  • rr_map_plot - Spatial relative risk map.

  • rr_plot, rr_df - Relative risk plot and associated data.

  • attr_frac_num - Attributable risk summary table.

  • plot_AR_num, plot_AR_frac, plot_AR_per_100k - Plots of attributable number, fraction, and rate.
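
Because the return value is a named list, it can be useful to check which of the documented elements are present before using them. The sketch below uses a stand-in list (mock_result, hypothetical) rather than a real model fit, and the helper missing_outputs() is an illustrative name, not part of the package.

```r
# Element names as documented in the Value section above.
expected_names <- c(
  "inla_result", "plot_malaria", "plot_tmax", "plot_rainfall",
  "reff_plot_monthly", "reff_plot_yearly", "contour_plot", "rr_map_plot",
  "rr_plot", "rr_df", "attr_frac_num",
  "plot_AR_num", "plot_AR_frac", "plot_AR_per_100k"
)

# Illustrative helper: returns the documented names absent from a result list.
missing_outputs <- function(result) setdiff(expected_names, names(result))

# Stand-in for a real return value, used only to demonstrate the helper.
mock_result <- list(inla_result = NA, rr_df = data.frame())
missing_outputs(mock_result)
```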

Arguments

health_data_path

Character. Path to the processed health data file.

climate_data_path

Character. Path to the processed climate data file.

map_path

Character. Path to the spatial data file (e.g., shapefile).

region_col

Character. Column name for the region variable.

district_col

Character. Column name for the district variable.

date_col

Character (optional). Column name for the date variable. Defaults to NULL.

year_col

Character. Column name for the year variable.

month_col

Character. Column name for the month variable.

case_col

Character. Column name for malaria case counts.

tot_pop_col

Character. Column name for total population.

tmin_col

Character. Column name for minimum temperature.

tmean_col

Character. Column name for mean temperature.

tmax_col

Character. Column name for maximum temperature.

rainfall_col

Character. Column name for cumulative monthly rainfall.

r_humidity_col

Character. Column name for relative humidity.

runoff_col

Character. Column name for monthly runoff data.

geometry_col

Character. Column name of the geometry column in the shapefile (usually "geometry").

spi_col

Character (optional). Column name for the Standardized Precipitation Index (SPI). Defaults to NULL.

ndvi_col

Character (optional). Column name for the Normalized Difference Vegetation Index (NDVI). Defaults to NULL.

max_lag

Numeric. Maximum temporal lag, in months, to include in the distributed lag model (e.g., 2 to 4). Defaults to 2.
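
To make the role of max_lag concrete, the base R sketch below builds lagged copies of a monthly series for lags 0 through max_lag. The rainfall values are made up, and this is only an illustration of the lagging idea; the package may construct its lag terms differently.

```r
# Made-up monthly rainfall values for one district.
rainfall <- c(10, 40, 85, 120, 60, 20)
max_lag <- 2

# One column per lag: lag 0 is the series itself, lag l shifts it down by l months.
lagged <- sapply(0:max_lag, function(l) {
  c(rep(NA_real_, l), head(rainfall, length(rainfall) - l))
})
colnames(lagged) <- paste0("rainfall_lag", 0:max_lag)
lagged
```

Each additional lag adds a column and loses one more leading month to missing values, which is one reason to keep max_lag small for short time series.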

nk

Numeric. Number of internal knots for the natural spline of each predictor, controlling its flexibility: nk = 0 produces a linear effect with one basis column, nk = 1 generates a simple spline with two columns, nk = 2 yields a more flexible spline with three columns, and higher values of nk further increase flexibility but may also raise collinearity among spline terms. Defaults to 2.
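
The relationship between nk and the number of basis columns can be checked with splines::ns(), assuming a natural spline without an intercept and knots placed at quantiles. The temperature values are simulated, and the package's own basis construction may differ in detail.

```r
library(splines)

set.seed(42)
# Simulated maximum temperatures (degrees Celsius), for illustration only.
tmax <- runif(200, min = 24, max = 38)

# With no intercept, a natural spline with nk internal knots has nk + 1 columns,
# so requesting df = nk + 1 places nk internal knots at quantiles of tmax.
basis_cols <- sapply(0:2, function(nk) ncol(ns(tmax, df = nk + 1)))
names(basis_cols) <- paste0("nk=", 0:2)
basis_cols
```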

basis_matrices_choices

Character vector. Specifies which climate variables to include in the basis matrix (e.g., c("tmax", "rainfall", "r_humidity")).

inla_param

Character vector. Specifies exposure variables included in the INLA model (e.g., c("tmin", "rainfall", "r_humidity")).

param_term

Character or vector. Exposure variable(s) of primary interest for relative risk and attribution (e.g., "tmax", "rainfall").

level

Character. Spatial disaggregation level; must be one of "country", "region", or "district".

param_threshold

Numeric. Threshold above which exposure is considered "attributable." Defaults to 1.

filter_year

Integer or vector (optional). Year(s) to filter the data by. Defaults to NULL.

family

Character. Probability distribution for the outcome variable. Options include "poisson" and "nbinomial" (negative binomial). Defaults to "nbinomial".
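
A common reason to prefer the negative binomial family for case counts is overdispersion, where the variance exceeds the mean. The sketch below compares the variance-to-mean ratio of simulated Poisson and negative binomial counts; the data and the helper dispersion_ratio() are illustrative and not part of the package.

```r
set.seed(1)
# Simulated monthly case counts with the same mean but different variance.
cases_pois <- rpois(500, lambda = 20)
cases_nb <- rnbinom(500, mu = 20, size = 2)

# Variance-to-mean ratio: close to 1 for Poisson data, larger when overdispersed.
dispersion_ratio <- function(y) var(y) / mean(y)

dispersion_ratio(cases_pois)
dispersion_ratio(cases_nb)
```

A ratio well above 1 in the observed case counts suggests that "nbinomial" is the safer choice.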

group_by_year

Logical. Whether to group attributable metrics by year. Defaults to FALSE.

cumulative

Logical. If TRUE, plots and saves the cumulative risk across all years for the specified exposure at the region and district levels. Defaults to FALSE.

config

Logical. Whether to enable additional INLA model configurations. Defaults to FALSE.

save_csv

Logical. If TRUE, saves intermediate datasets to CSV. Defaults to FALSE.

save_model

Logical. If TRUE, saves fitted INLA model results. Defaults to FALSE.

save_fig

Logical. If TRUE, saves generated plots. Defaults to FALSE.

output_dir

Character. Directory where output files (plots, datasets, maps) are saved. Defaults to NULL.