Reads a CSV file containing a daily time series of climate, environmental and health data and renames its columns to standardised names. The function also creates year, month, day, and day-of-week columns derived from the date.
load_air_pollution_data(
data_path,
date_col = "date",
region_col = "region",
pm25_col = "pm25",
deaths_col = "deaths",
population_col = "population",
humidity_col = "humidity",
precipitation_col = "precipitation",
tmax_col = "tmax",
wind_speed_col = "wind_speed",
categorical_others = NULL,
continuous_others = NULL
)
Dataframe, formatted and renamed with standardised column names.
data_path: Path to a CSV file containing a daily time series of data.
date_col: Character. Name of date column in the dataframe with format YYYY-MM-DD. Defaults to "date".
region_col: Character. Name of region column in the dataframe. Defaults to "region".
pm25_col: Character. Name of PM2.5 column in the dataframe. Defaults to "pm25".
deaths_col: Character. Name of all-cause mortality column in the dataframe (note that the deaths_col variable has a value of 1 for each recorded death). Defaults to "deaths".
population_col: Character. Name of population column in the dataframe. This is REQUIRED for calculating the Attributable Rate (AR). Defaults to "population".
humidity_col: Character. Name of humidity column in the dataframe. Defaults to "humidity".
precipitation_col: Character. Name of precipitation column in the dataframe. Defaults to "precipitation".
tmax_col: Character. Name of maximum temperature column in the dataframe. Defaults to "tmax".
wind_speed_col: Character. Name of wind speed column in the dataframe. Defaults to "wind_speed".
categorical_others: Optional. Character vector of additional categorical variables (e.g., "sex", "age_group"). Defaults to NULL.
continuous_others: Optional. Character vector of additional continuous variables (e.g., "tmean"). Defaults to NULL.
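The renaming and date-derivation steps described above can be illustrated with a minimal base R sketch. This is not the function's actual implementation. The input column names (`Date`, `district`, `pm2_5`, `n_deaths`), the assumption that the standardised names equal the documented defaults, and the output column name `dow` with its 0 to 6 coding are all hypothetical choices made for illustration.

```r
# Hypothetical input whose column names differ from the documented defaults.
raw <- data.frame(
  Date     = c("2020-01-01", "2020-01-02", "2020-01-03"),
  district = c("A", "A", "A"),
  pm2_5    = c(12.1, 15.4, 9.8),
  n_deaths = c(1, 1, 1),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(raw, csv_path, row.names = FALSE)

# Read the CSV back in, as the loader would from data_path.
df <- read.csv(csv_path, stringsAsFactors = FALSE)

# Map source names to the standardised names (ASSUMED here to equal the
# documented defaults: "date", "region", "pm25", "deaths").
rename_map <- c(Date = "date", district = "region",
                pm2_5 = "pm25", n_deaths = "deaths")
names(df)[match(names(rename_map), names(df))] <- unname(rename_map)

# Parse the YYYY-MM-DD date and derive the calendar columns.
df$date  <- as.Date(df$date, format = "%Y-%m-%d")
df$year  <- as.integer(format(df$date, "%Y"))
df$month <- as.integer(format(df$date, "%m"))
df$day   <- as.integer(format(df$date, "%d"))
# Day of week as 0 (Sunday) to 6 (Saturday); locale-independent.
# The column name "dow" and this coding are assumptions.
df$dow   <- as.POSIXlt(df$date)$wday
```

For a file like this one, the corresponding call would presumably pass the source names explicitly, for example `load_air_pollution_data(csv_path, date_col = "Date", region_col = "district", pm25_col = "pm2_5", deaths_col = "n_deaths")`. How the function handles columns that are absent from the file (here population, humidity, precipitation, tmax and wind speed) is not stated in this documentation.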