climatehealth (version 1.0.0)

temp_mortality_do_analysis: Full analysis for the 'mortality attributable to high and low temperatures' indicator

Description

Runs the full methodology to analyse the impact of high and low temperatures on mortality using a quasi-Poisson time series approach with a distributed lag non-linear model. This function generates the relative risk of the temperature-mortality association, together with the numbers, rates and fractions of deaths attributable to high and low temperatures, as defined by specified temperature thresholds. Model validation statistics are also provided.

Usage

temp_mortality_do_analysis(
  data_path,
  date_col,
  region_col,
  temperature_col,
  dependent_col,
  population_col,
  country = "National",
  independent_cols = NULL,
  control_cols = NULL,
  var_fun = "bs",
  var_degree = 2,
  var_per = c(10, 75, 90),
  lagn = 21,
  lagnk = 3,
  dfseas = 8,
  meta_analysis = FALSE,
  attr_thr_high = 97.5,
  attr_thr_low = 2.5,
  save_fig = FALSE,
  save_csv = FALSE,
  output_folder_path = NULL,
  seed = NULL
)

Value

  • qaic_results Dataframe. QAIC and dispersion metrics for each model combination and geography.

  • qaic_summary Dataframe. Mean QAIC and dispersion metrics for each model combination.

  • vif_results Dataframe. Variance inflation factors for each independent variable by geography.

  • vif_summary Dataframe. Mean variance inflation factors for each independent variable.

  • adf_results Dataframe. Augmented Dickey-Fuller (ADF) test results for each geography.

  • power_list List. Power information by area.

  • rr_results Dataframe. Cumulative relative risk and confidence intervals from the analysis.

  • res_attr_tot Dataframe. Total attributable fractions, numbers and rates for each area over the whole time series.

  • attr_yr_list List. Dataframes containing yearly estimates of attributable fractions, numbers and rates by area.

  • attr_mth_list List. Dataframes containing total attributable fractions, numbers and rates by calendar month and area.
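
stats::glm does not report an AIC for quasi-Poisson fits, so a QAIC has to be derived separately. The following is a minimal sketch of one formulation used in published DLNM example code, where the penalty term is scaled by the estimated dispersion. Whether climatehealth computes qaic_results exactly this way is an assumption; the helper name qaic_sketch and the simulated data are hypothetical.

```r
# Hypothetical helper (not part of climatehealth): QAIC for a quasi-Poisson glm.
# QAIC = -2 * Poisson log-likelihood + 2 * dispersion * number of coefficients
qaic_sketch <- function(model) {
  loglik <- sum(stats::dpois(model$y, model$fitted.values, log = TRUE))
  phi <- summary(model)$dispersion      # estimated overdispersion parameter
  k <- length(stats::coef(model))       # number of model coefficients
  -2 * loglik + 2 * phi * k
}

# Simulated counts, for illustration only
set.seed(1)
x <- stats::rnorm(200)
y <- stats::rpois(200, lambda = exp(1 + 0.3 * x))
fit <- stats::glm(y ~ x, family = stats::quasipoisson())

qaic_value <- qaic_sketch(fit)
dispersion <- summary(fit)$dispersion
```

Lower QAIC values indicate a better trade-off between fit and complexity, and a dispersion well above 1 suggests overdispersion relative to a Poisson model.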

Arguments

data_path

Character. Path to a CSV file containing a daily time series of data for a particular health outcome and climate variables, which may be disaggregated by geography.

date_col

Character. Name of the column in the dataframe containing the date.

region_col

Character. Name of the column in the dataframe that contains the geography name(s).

temperature_col

Character. Name of the column in the dataframe that contains the temperature values.

dependent_col

Character. Name of the column in the dataframe containing the dependent health outcome variable e.g. deaths.

population_col

Character. Name of the column in the dataframe that contains the population estimate per geography.

country

Character. Name of country for national-level estimates. Defaults to 'National'.

independent_cols

List. Additional independent variables to test in model validation as confounders. Defaults to NULL.

control_cols

List. Confounders to include in the final model adjustment. Defaults to NULL.

var_fun

Character. Exposure function for argvar (see dlnm::crossbasis). Defaults to 'bs'.

var_degree

Integer. Degree of the piecewise polynomial for argvar (see dlnm::crossbasis). Defaults to 2 (quadratic).

var_per

Vector. Internal knot positions for argvar (see dlnm::crossbasis). Defaults to c(10, 75, 90).

lagn

Integer. Number of days in the lag period (see dlnm::crossbasis). Defaults to 21.

lagnk

Integer. Number of knots in the lag function (see dlnm::logknots). Defaults to 3.

dfseas

Integer. Degrees of freedom for seasonality. Defaults to 8.

meta_analysis

Boolean. Whether to perform a meta-analysis. Defaults to FALSE.

attr_thr_high

Numeric. Percentile at which to define the high temperature threshold for calculating attributable risk. Defaults to 97.5.

attr_thr_low

Numeric. Percentile at which to define the low temperature threshold for calculating attributable risk. Defaults to 2.5.

save_fig

Boolean. Whether to save the plot as an output. Defaults to FALSE.

save_csv

Boolean. Whether to save the results as a CSV. Defaults to FALSE.

output_folder_path

Character. Path to the folder where plots and/or CSV files should be saved. Defaults to NULL.

seed

Optional integer random seed used when sampling residuals for model validation plots. Defaults to NULL.
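
As an illustration of how var_per relates to the data, the sketch below converts percentile positions into knot values on the temperature scale using base R. It assumes that var_per is interpreted as percentiles of the observed temperature distribution, which is the convention in the studies listed under References; the simulated temperatures are hypothetical.

```r
# Simulated daily mean temperatures, for illustration only
set.seed(42)
temperature <- stats::runif(365, min = -2, max = 32)

# Default percentile positions for the internal knots
var_per <- c(10, 75, 90)

# Convert percentiles to knot locations in temperature units
knots <- stats::quantile(temperature, probs = var_per / 100, na.rm = TRUE)
print(knots)
```

The resulting values are what an exposure basis such as a B-spline would receive as its internal knots.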

Details

This analysis requires a daily time series of temperature and death counts with population values as a minimum. This is then processed using a quasi-Poisson time series regression analysis with a distributed lag non-linear model and optional meta-analysis. Meta-analysis is recommended if the input data is disaggregated by area.
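
The core of a quasi-Poisson regression with a distributed lag non-linear model can be sketched with the dlnm package directly. This is not the internal implementation of temp_mortality_do_analysis; it is a simplified single-region illustration using the documented defaults, and it assumes that dfseas is applied per year of data. The simulated data frame daily is hypothetical.

```r
library(dlnm)
library(splines)

# Simulated two-year daily series, for illustration only
set.seed(1)
n <- 730
daily <- data.frame(
  date = seq.Date(as.Date("2020-01-01"), by = "day", length.out = n),
  tmean = stats::runif(n, -2, 32),
  deaths = stats::rpois(n, lambda = 8)
)
n_years <- length(unique(format(daily$date, "%Y")))

# Cross-basis: quadratic B-spline for temperature, natural spline over 21 lag days
cb <- crossbasis(
  daily$tmean,
  lag = 21,
  argvar = list(
    fun = "bs",
    degree = 2,
    knots = stats::quantile(daily$tmean, c(10, 75, 90) / 100)
  ),
  arglag = list(knots = logknots(21, 3))
)

# Quasi-Poisson regression with a spline of time to control for seasonality
# (assumption: 8 degrees of freedom per year)
fit <- stats::glm(
  deaths ~ cb + ns(as.numeric(date), df = 8 * n_years),
  family = stats::quasipoisson(),
  data = daily,
  na.action = "na.exclude"
)
```

The first 21 rows of the cross-basis are NA because a full lag history is not yet available, which is why na.action = "na.exclude" is set.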

The model parameters have default values based on existing studies, and keeping them is recommended. They can be adjusted where appropriate for the user's context.

Model validation testing is a standard output of the pipeline, so that a user can assess the quality of the model. Additional independent variables can be specified as independent_cols; they are then assessed within different model combinations in the validation outputs. They can be included in the final model via control_cols. Note that variables should be included because they are contextually relevant, not simply because they optimise model fit.
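
To make the variance inflation factor output easier to interpret, the sketch below computes VIFs from first principles in base R: each predictor is regressed on the others, and VIF = 1 / (1 - R^2). The package may compute vif_results differently; the helper name vif_sketch and the simulated covariates (humidity, pm25) are hypothetical.

```r
# Hypothetical helper (not part of climatehealth): VIF for each predictor.
# Requires at least two predictor columns.
vif_sketch <- function(data, predictors) {
  vapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    aux_fit <- stats::lm(stats::reformulate(others, response = p), data = data)
    1 / (1 - summary(aux_fit)$r.squared)
  }, numeric(1))
}

# Simulated covariates; humidity is constructed to correlate with temperature
set.seed(7)
covars <- data.frame(tmean = stats::rnorm(300, mean = 15, sd = 8))
covars$humidity <- 70 - 0.5 * covars$tmean + stats::rnorm(300, sd = 5)
covars$pm25 <- stats::rexp(300, rate = 0.1)

vif_values <- vif_sketch(covars, c("tmean", "humidity", "pm25"))
print(round(vif_values, 2))
```

A VIF close to 1 indicates little collinearity with the other predictors, while larger values indicate that a variable is largely explained by the others.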

For attributable deaths, the default high temperature threshold is the 97.5th percentile of the temperature distribution over the full time period for each geography. The low temperature threshold is similarly defined at the 2.5th percentile. These can be adjusted using attr_thr_high and attr_thr_low if desired, following review of the relative risk association between temperature and mortality.
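
The percentile thresholds described above can be reproduced per geography with base R. This is a minimal sketch using the default percentiles; the simulated data frame temps and the output column names thr_low and thr_high are hypothetical.

```r
# Simulated daily temperatures for two regions, for illustration only
set.seed(3)
temps <- data.frame(
  region = rep(c("North", "South"), each = 365),
  tmean = c(stats::rnorm(365, mean = 9, sd = 6), stats::rnorm(365, mean = 14, sd = 7))
)

attr_thr_low <- 2.5
attr_thr_high <- 97.5

# One row per region with the low and high temperature thresholds
thresholds <- do.call(rbind, lapply(split(temps, temps$region), function(d) {
  data.frame(
    region = d$region[1],
    thr_low = unname(stats::quantile(d$tmean, attr_thr_low / 100, na.rm = TRUE)),
    thr_high = unname(stats::quantile(d$tmean, attr_thr_high / 100, na.rm = TRUE))
  )
}))
print(thresholds)
```

Because the percentiles are taken within each geography, the same percentile maps to different absolute temperatures in regions with different climates.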

Further details on the input data requirements, methodology, quality information and guidance on interpreting outputs can be found in the accompanying published methodology document: https://doi.org/10.5281/zenodo.14865904

References

  1. Watkins E, Hunt C, Lewis B, Ingole V, Glickman M. Standards for Official Statistics on Climate-Health Interactions (SOSCHI): Mortality attributed to high and low temperatures: methodology. Zenodo; 2026. Available from: https://doi.org/10.5281/zenodo.14865904

  2. Gasparrini A, Guo Y, Hashizume M, Lavigne E, Zanobetti A, Schwartz J, et al. Mortality risk attributable to high and low ambient temperature: a multicountry observational study. Lancet. 2015 Jul;386(9991):369-75. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0140673614621140

  3. Gasparrini A, Armstrong B. Reducing and meta-analysing estimates from distributed lag non-linear models. BMC Medical Research Methodology. 2013 Jan 9;13:1. Available from: https://doi.org/10.1186/1471-2288-13-1

  4. Gasparrini A, Armstrong B, Kenward MG. Multivariate meta-analysis for non-linear and other multi-parameter associations. Statistics in Medicine. 2012 Dec 20;31(29):3821-39. Available from: https://doi.org/10.1002/sim.5471

Examples

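# Simulate one year of daily data for a single region.
# set.seed() makes the random temperatures and death counts reproducible.
set.seed(1)
example_data <- data.frame(
  date = seq.Date(as.Date("2020-01-01"), by = "day", length.out = 365),
  region = "Example Region",
  tmean = stats::runif(365, -2, 32),
  deaths = stats::rpois(365, lambda = 8),
  pop = 500000
)

# Write the data to a temporary CSV, since the function reads from a file path
example_path <- tempfile(fileext = ".csv")
utils::write.csv(example_data, example_path, row.names = FALSE)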

# Run the analysis; lagn, lagnk and dfseas are set below their defaults here
temp_mortality_do_analysis(
  data_path = example_path,
  date_col = "date",
  temperature_col = "tmean",
  dependent_col = "deaths",
  population_col = "pop",
  region_col = "region",
  country = "Example Region",
  meta_analysis = FALSE,
  independent_cols = NULL,
  control_cols = NULL,
  var_fun = "bs",
  var_degree = 2,
  var_per = c(10, 75, 90),
  lagn = 7,
  lagnk = 2,
  dfseas = 4,
  attr_thr_high = 97.5,
  attr_thr_low = 2.5,
  save_fig = FALSE,
  save_csv = FALSE,
  output_folder_path = tempdir()
)