suicides_heat_do_analysis: Full analysis pipeline for the suicides and extreme heat indicator

Description

Runs the full pipeline to analyse the impact of extreme heat on suicides using a time-stratified case-crossover approach with a distributed lag non-linear model. The function estimates the relative risk of the suicide-temperature association, as well as the numbers, rates and fractions of suicides attributable to temperatures above a specified threshold. Model validation statistics are also provided.

Usage

suicides_heat_do_analysis(
  data_path,
  date_col,
  region_col = NULL,
  temperature_col,
  health_outcome_col,
  population_col,
  country = "National",
  meta_analysis = FALSE,
  var_fun = "bs",
  var_degree = 2,
  var_per = c(25, 50, 75),
  lag_fun = "strata",
  lag_breaks = 1,
  lag_days = 2,
  independent_cols = NULL,
  control_cols = NULL,
  cenper = 50,
  attr_thr = 97.5,
  save_fig = FALSE,
  save_csv = FALSE,
  output_folder_path = NULL,
  seed = NULL
)

Value

qaic_results: A dataframe of QAIC and dispersion metrics for each model combination and geography.
qaic_summary: A dataframe with the mean QAIC and dispersion metrics for each model combination.
vif_results: A dataframe of variance inflation factors for each independent variable by region.
vif_summary: A dataframe with the mean variance inflation factors for each independent variable.
meta_test_res: A dataframe of results from statistical tests on the meta model.
power_list: A list containing power information by area.
rr_results: A dataframe containing cumulative relative risks and confidence intervals from the analysis.
res_attr_tot: A dataframe of total attributable fractions, numbers and rates for each area over the whole time series.
attr_yr_list: A list of dataframes containing yearly estimates of attributable fractions, numbers and rates by area.
attr_mth_list: A list of dataframes containing total attributable fractions, numbers and rates by calendar month and area.
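
As a rough illustration of how attributable fractions, numbers and rates of the kind reported in res_attr_tot relate to a relative risk, the base R sketch below applies the standard relation AF = 1 - 1/RR to days at or above a temperature threshold. The toy data, the fixed relative risk of 1.2 and the per-100,000 scaling of the rate are all assumptions made for illustration. The pipeline derives risks from the fitted model, and its exact calculation may differ.

```r
# Toy daily series (all values are made up for illustration)
set.seed(42)
n_days <- 365
toy <- data.frame(
  tmean = runif(n_days, 5, 30),
  suicides = rpois(n_days, lambda = 2),
  pop = 250000
)

# Threshold: 97.5th percentile of temperature (the documented default for attr_thr)
thr <- quantile(toy$tmean, probs = 0.975, names = FALSE)
hot <- toy$tmean >= thr

# Assumed constant relative risk on hot days; 1 (no excess risk) otherwise
rr <- ifelse(hot, 1.2, 1)

# Attributable fraction per day: AF = 1 - 1/RR
af_day <- 1 - 1 / rr

# Attributable number, overall attributable fraction, and a crude rate
# per 100,000 population (the scaling is an assumption)
an_total <- sum(af_day * toy$suicides)
af_total <- an_total / sum(toy$suicides)
ar_total <- an_total / mean(toy$pop) * 1e5
```

Days below the threshold contribute nothing here because their assumed relative risk is 1, so the totals reflect only the hot days.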

Arguments

data_path: Path to a csv file containing a daily time series of data for a particular health outcome and climate variables, which may be disaggregated by region.
date_col: Character. Name of the column in the dataframe that contains the date.
region_col: Character. Name of the column in the dataframe that contains the region names. Defaults to NULL.
temperature_col: Character. Name of the column in the dataframe that contains the temperature values.
health_outcome_col: Character. Name of the column in the dataframe that contains the health outcome counts (e.g. number of deaths, hospital admissions).
population_col: Character. Name of the column in the dataframe that contains the population estimates.
country: Character. Name of country for national level estimates. Defaults to 'National'.
meta_analysis: Boolean. Whether to perform a meta-analysis. Defaults to FALSE.
var_fun: Character. Exposure function for argvar (see dlnm::crossbasis). Defaults to 'bs'.
var_degree: Integer. Degree of the piecewise polynomial for argvar (see dlnm::crossbasis). Defaults to 2 (quadratic).
var_per: Vector. Internal knot positions for argvar (see dlnm::crossbasis). Defaults to c(25,50,75).
lag_fun: Character. Exposure function for arglag (see dlnm::crossbasis). Defaults to 'strata'.
lag_breaks: Integer. Internal cut-off point defining the strata for arglag (see dlnm::crossbasis). Defaults to 1.
lag_days: Integer. Maximum lag (see dlnm::crossbasis). Defaults to 2.
independent_cols: Additional independent variables to test in model validation. Defaults to NULL.
control_cols: A list of confounders to include in the final model adjustment. Defaults to NULL if none.
cenper: Integer. Percentile (0-100) used to calculate the centering value. Defaults to 50.
attr_thr: Numeric. Percentile at which to define the temperature threshold for calculating attributable risk. Defaults to 97.5.
save_fig: Boolean. Whether to save the plot as an output. Defaults to FALSE.
save_csv: Boolean. Whether to save the results as a CSV. Defaults to FALSE.
output_folder_path: Path to folder where plots and/or CSV should be saved. Defaults to NULL.
seed: Optional integer random seed used when sampling residuals for model validation plots. Defaults to NULL.
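
To show how the var_* and lag_* defaults could translate into a dlnm::crossbasis call, here is a minimal sketch. It assumes that the var_per percentiles are converted to knots with quantile() and that lag_days is passed as the maximum lag. How the pipeline wires these arguments internally is not documented on this page, so treat the mapping as an assumption. The cross-basis step only runs if the dlnm package is installed.

```r
set.seed(1)
tmean <- runif(365, 5, 30)  # made-up daily temperatures

# Internal knots at the 25th, 50th and 75th percentiles (default var_per)
var_per <- c(25, 50, 75)
knots <- quantile(tmean, probs = var_per / 100, names = FALSE)

# Exposure-response basis: quadratic B-spline (var_fun = "bs", var_degree = 2)
argvar <- list(fun = "bs", degree = 2, knots = knots)

# Lag-response basis: strata with one cut-off (lag_fun = "strata", lag_breaks = 1)
arglag <- list(fun = "strata", breaks = 1)

# Maximum lag in days (default lag_days)
lag_days <- 2

if (requireNamespace("dlnm", quietly = TRUE)) {
  cb <- dlnm::crossbasis(tmean, lag = lag_days, argvar = argvar, arglag = arglag)
  print(dim(cb))  # rows = days, columns = basis terms
}
```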

Details

This analysis pipeline requires, as a minimum, a daily time series of temperature and suicide deaths with population values. The data are processed using a conditional Poisson case-crossover analysis with a distributed lag non-linear model and optional meta-analysis. Meta-analysis is recommended if the input data are disaggregated by area.
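
For readers unfamiliar with the design, a time-stratified case-crossover analysis compares each day with the other days in the same stratum. The sketch below builds strata from year, month and day of week, which is a common choice in the literature. This stratum definition is an assumption and may not match the one used inside the pipeline.

```r
# One year of daily dates (same range as the example data on this page)
dates <- seq.Date(as.Date("2020-01-01"), by = "day", length.out = 365)

# Day of week as a number (0 = Sunday), which avoids locale-dependent names
dow <- as.integer(format(dates, "%w"))

# One stratum per year x month x day-of-week combination
stratum <- interaction(
  format(dates, "%Y"), format(dates, "%m"), dow,
  drop = TRUE
)

# Each stratum holds the 4 or 5 days that share a weekday within a month
stratum_sizes <- table(stratum)
```

In a conditional Poisson model, a factor like stratum is conditioned out instead of being estimated, for example through the eliminate argument of gnm::gnm.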

The model parameters have default values based on existing studies, and keeping them is recommended. They can, however, be adjusted for sensitivity analysis.
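
One way to organise such a sensitivity analysis is to build a small grid of settings and run the pipeline once per row. The sketch below is illustrative only: the grid values, the wrapper name run_one and the file path in the commented call are hypothetical, and the column names follow the example data on this page.

```r
# Hypothetical grid of settings to vary (values chosen for illustration)
sens_grid <- expand.grid(
  lag_days = c(2, 3, 5),
  attr_thr = c(95, 97.5, 99)
)

# Hypothetical wrapper: runs the pipeline for one combination of settings,
# leaving all other arguments at their documented defaults
run_one <- function(data_path, lag_days, attr_thr) {
  suicides_heat_do_analysis(
    data_path = data_path,
    date_col = "date",
    region_col = "region",
    temperature_col = "tmean",
    health_outcome_col = "suicides",
    population_col = "pop",
    lag_days = lag_days,
    attr_thr = attr_thr
  )
}

# Not run here: needs the package loaded and a real input CSV
# results <- Map(
#   function(l, a) run_one("path/to/data.csv", l, a),
#   sens_grid$lag_days, sens_grid$attr_thr
# )
```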

Model validation testing is provided as a standard output from the pipeline so a user can assess the quality of the model. If a user has additional independent variables these can be specified as independent_cols and assessed within different model combinations in the outputs of this testing. These can be added in the final model via control_cols.
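
As background on the QAIC and dispersion metrics, the sketch below computes both for a quasi-Poisson GLM in base R. It uses a widely used formulation: the Poisson log-likelihood penalised by the number of parameters scaled by the dispersion. The helper name qaic_sketch and the toy data are hypothetical, and the pipeline's own implementation may differ.

```r
# Toy data: daily counts weakly related to temperature (made-up values)
set.seed(7)
toy <- data.frame(tmean = runif(365, 5, 30))
toy$suicides <- rpois(365, lambda = exp(0.5 + 0.01 * toy$tmean))

# Quasi-Poisson fit, which estimates a dispersion parameter
fit <- glm(suicides ~ tmean, family = quasipoisson(), data = toy)

# Hypothetical helper returning QAIC and the estimated dispersion
qaic_sketch <- function(model) {
  phi <- summary(model)$dispersion
  # Log-likelihood evaluated as if the counts were Poisson
  loglik <- sum(dpois(model$y, model$fitted.values, log = TRUE))
  k <- length(coef(model))
  c(qaic = -2 * loglik + 2 * phi * k, dispersion = phi)
}

metrics <- qaic_sketch(fit)
```

Lower QAIC values indicate a better trade-off between fit and complexity when comparing model combinations fitted to the same data, and a dispersion well above 1 suggests overdispersion relative to a Poisson model.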

For attributable deaths the default is to use extreme heat as a threshold, defined as the 97.5th percentile of temperature over the corresponding time period for each geography. This can be adjusted if desired, following review of the relative risk association between temperature and suicides, using attr_thr.
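
To make the percentile-based definitions concrete, the sketch below computes a heat threshold (attr_thr) and a centering temperature (cenper) separately for each geography with base R. The two regions and their temperature ranges are made up, and the pipeline may compute these values differently internally.

```r
# Two hypothetical regions with different climates (made-up values)
set.seed(3)
toy <- data.frame(
  region = rep(c("North", "South"), each = 365),
  tmean = c(runif(365, 0, 25), runif(365, 10, 35))
)

attr_thr <- 97.5  # documented default threshold percentile
cenper <- 50      # documented default centering percentile

# Region-specific temperature at each percentile
thresholds <- tapply(toy$tmean, toy$region, quantile,
                     probs = attr_thr / 100, names = FALSE)
centres <- tapply(toy$tmean, toy$region, quantile,
                  probs = cenper / 100, names = FALSE)
```

Because the percentile is taken within each geography, the same attr_thr corresponds to a different absolute temperature in a warmer region than in a cooler one.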

Further details on the input data requirements, methodology, quality information and guidance on interpreting outputs can be found in the accompanying published methodology: https://doi.org/10.5281/zenodo.14050224

References

Pearce M, Watkins E, Glickman M, Lewis B, Ingole V. Standards for Official Statistics on Climate-Health Interactions (SOSCHI): Suicides attributed to extreme heat: methodology. Zenodo; 2024. Available from: https://doi.org/10.5281/zenodo.14050224
Gasparrini A, Guo Y, Hashizume M, Lavigne E, Zanobetti A, Schwartz J, et al. Mortality risk attributable to high and low ambient temperature: a multicountry observational study. Lancet. 2015 Jul;386(9991):369-75. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0140673614621140
Kim Y, Kim H, Gasparrini A, Armstrong B, Honda Y, Chung Y, et al. Suicide and Ambient Temperature: A Multi-Country Multi-City Study. Environ Health Perspect. 2019 Nov;127(11):1-10. Available from: https://pubmed.ncbi.nlm.nih.gov/31769300/
Gasparrini A, Armstrong B. Reducing and meta-analysing estimates from distributed lag non-linear models. BMC Med Res Methodol. 2013 Jan 9;13:1. Available from: https://doi.org/10.1186/1471-2288-13-1
Gasparrini A, Armstrong B, Kenward MG. Multivariate meta-analysis for non-linear and other multi-parameter associations. Stat Med. 2012 Dec 20;31(29):3821-39. Available from: https://doi.org/10.1002/sim.5471
Sera F, Armstrong B, Blangiardo M, Gasparrini A. An extended mixed-effects framework for meta-analysis. Stat Med. 2019 Dec 20;38(29):5429-44. Available from: https://doi.org/10.1002/sim.8362
Gasparrini A, Leone M. Attributable risk from distributed lag models. BMC Med Res Methodol. 2014 Dec 23;14(1):55. Available from: https://link.springer.com/article/10.1186/1471-2288-14-55

Examples

# \donttest{
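# Simulate one year of daily data for a single region (illustrative values only)
set.seed(123)  # make the simulated data reproducible
example_data <- data.frame(
  date = seq.Date(as.Date("2020-01-01"), by = "day", length.out = 365),
  region = "Example Region",
  tmean = stats::runif(365, 5, 30),
  suicides = stats::rpois(365, lambda = 2),
  pop = 250000
)
# Write the data to a temporary CSV, since the function reads from a file path
example_path <- tempfile(fileext = ".csv")
utils::write.csv(example_data, example_path, row.names = FALSE)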

suicides_heat_do_analysis(
  data_path = example_path,
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "suicides",
  population_col = "pop",
  country = "Example Region",
  meta_analysis = FALSE,
  var_fun = "bs",
  var_degree = 2,
  var_per = c(25, 50, 75),
  lag_fun = "strata",
  lag_breaks = 1,
  lag_days = 2,
  independent_cols = NULL,
  control_cols = NULL,
  cenper = 50,
  attr_thr = 97.5,
  save_fig = FALSE,
  save_csv = FALSE,
  output_folder_path = tempdir()
)
# }
