run_transfer_entropy: Transfer Entropy for Counts, Rates, and Binary Series

Description

Computes pairwise transfer entropy between I and C for three transformations of the data: raw counts, rates (count/exposure), and binary presence/absence. Each series is first pre-whitened via a GLM, and transfer entropy is then estimated in both directions for each lag in a grid using RTransferEntropy. Results are written to one CSV file per transformation and to a combined summary file.

Usage

run_transfer_entropy(
  DT,
  lags = 1:3,
  shuffles = 1000,
  seed = 123,
  use_progress = TRUE
)

Value

A data.frame with one row per lag and type, and columns:

lag: lag order used in transfer_entropy().
TE_ItoC, p_ItoC: transfer entropy and p-value from I to C.
TE_CtoI, p_CtoI: transfer entropy and p-value from C to I.
type: transformation used ("counts", "rates", or "binary").

Arguments

DT

A data.table or data.frame containing at least the following columns:

I, C: count variables (non-negative integers).
exposure50: exposure used to form rates (must be strictly positive).
log_exposure50: log of the exposure (offset).
t_norm, Regime, EconCycle, PopDensity, Epidemics, Climate, War: covariates used by the pre-whitening GLMs.
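
The column requirements above can be checked before calling the function. The following is a minimal sketch in base R; check_te_input is a hypothetical helper, not part of the package, and the treatment of missing values (ignored here) is an assumption.

```r
# Hypothetical helper (not part of the package): validate the input table
# against the column requirements listed above.
check_te_input <- function(DT) {
  required <- c("I", "C", "exposure50", "log_exposure50", "t_norm", "Regime",
                "EconCycle", "PopDensity", "Epidemics", "Climate", "War")
  missing_cols <- setdiff(required, names(DT))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Counts must be non-negative whole numbers (NA handling is an assumption).
  for (v in c("I", "C")) {
    x <- DT[[v]]
    if (any(x < 0, na.rm = TRUE) || any(x != round(x), na.rm = TRUE)) {
      stop(v, " must contain non-negative integers")
    }
  }
  # Exposure must be strictly positive so that rates are well defined.
  if (any(DT[["exposure50"]] <= 0, na.rm = TRUE)) {
    stop("exposure50 must be strictly positive")
  }
  invisible(TRUE)
}

# Usage on a small illustrative data.frame
set.seed(1)
n <- 10
expo <- runif(n, 100, 200)
toy <- data.frame(
  I = rpois(n, 10), C = rpois(n, 8),
  exposure50 = expo, log_exposure50 = log(expo),
  t_norm = seq(-1, 1, length.out = n),
  Regime = factor(rep(c("A", "B"), length.out = n)),
  EconCycle = rnorm(n), PopDensity = rnorm(n),
  Epidemics = rnorm(n), Climate = rnorm(n), War = rnorm(n)
)
check_te_input(toy)
```

Failing early with a named column or condition is easier to diagnose than an error raised inside a GLM fit.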

lags

Integer vector of lag orders L for which transfer entropy is computed (passed to lx and ly in RTransferEntropy::transfer_entropy()).

shuffles

Integer; number of shuffle replications for the surrogate-distribution-based significance test in transfer_entropy().

seed

Integer; base random seed used for reproducibility of the pre-whitening and transfer entropy computations.

use_progress

Logical; reserved for future use to toggle progress reporting. Currently not used.

Details

The function proceeds in four steps:

Counts: I and C are pre-whitened via prewhiten_count_glm (Negative Binomial with offset and Poisson fallback). Transfer entropy is computed in both directions (I→C and C→I) for each lag in lags. Results are saved to "transfer_entropy_counts.csv".
Rates: after checking that exposure50 > 0 for all observations, I and C are divided by exposure50 and pre-whitened via prewhiten_rate_glm, and transfer entropy is recomputed. Results are saved to "transfer_entropy_rates.csv".
Binary: I and C are recoded as 0/1 presence/absence indicators and pre-whitened via prewhiten_bin_glm. Transfer entropy is computed again and results are saved to "transfer_entropy_binary.csv".
Combined: All tables are stacked into a single data frame with a type column ("counts", "rates", "binary") and written to "transfer_entropy.csv".
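
The three transformations and the pre-whitening idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: "pre-whitening" is taken to mean keeping the Pearson residuals of a GLM, a plain Poisson GLM stands in for the Negative Binomial fit with Poisson fallback, only t_norm is used as a covariate, and prewhiten_sketch is a hypothetical name.

```r
set.seed(42)
n <- 60
expo <- runif(n, 100, 200)
toy <- data.frame(
  I = rpois(n, 1.5), C = rpois(n, 1),   # low means so that zeros occur
  exposure50 = expo, log_exposure50 = log(expo),
  t_norm = seq(-1, 1, length.out = n)
)

# Hypothetical stand-in for the prewhiten_*_glm helpers: fit a GLM and
# return its Pearson residuals (assumed definition of "pre-whitened").
prewhiten_sketch <- function(y, data, family, use_offset = FALSE) {
  data$.y <- y
  form <- if (use_offset) {
    .y ~ t_norm + offset(log_exposure50)
  } else {
    .y ~ t_norm
  }
  fit <- glm(form, family = family, data = data)
  as.numeric(residuals(fit, type = "pearson"))
}

stopifnot(all(toy$exposure50 > 0))  # rates require strictly positive exposure

series <- list(
  # Counts: Poisson GLM with the log-exposure offset
  counts = list(
    I = prewhiten_sketch(toy$I, toy, poisson(), use_offset = TRUE),
    C = prewhiten_sketch(toy$C, toy, poisson(), use_offset = TRUE)
  ),
  # Rates: count / exposure, Gaussian GLM (the family is an assumption)
  rates = list(
    I = prewhiten_sketch(toy$I / toy$exposure50, toy, gaussian()),
    C = prewhiten_sketch(toy$C / toy$exposure50, toy, gaussian())
  ),
  # Binary: presence/absence indicator, logistic GLM
  binary = list(
    I = prewhiten_sketch(as.integer(toy$I > 0), toy, binomial()),
    C = prewhiten_sketch(as.integer(toy$C > 0), toy, binomial())
  )
)
str(series)
```

Each element of series holds the pair of residual series that would then be passed to the transfer entropy estimator for that transformation.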

Internally, the helpers .get_stat and .get_pval are used to extract the transfer entropy statistic and p-value from the objects returned by RTransferEntropy::transfer_entropy(). The function assumes a global dir_csv object (character scalar) indicating the output directory for CSV files.
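
A loop over lags with extraction of the statistic and p-value might look like the sketch below. It assumes that RTransferEntropy is installed (the code is skipped otherwise), that the result's coefficient matrix has rows "X->Y" and "Y->X" and columns "te" and "p-value", and that the raw "te" column is the statistic of interest. get_stat_sketch and get_pval_sketch are hypothetical stand-ins for .get_stat and .get_pval, and the simulated x and y stand in for pre-whitened I and C.

```r
# Hypothetical stand-ins for .get_stat / .get_pval: read one direction's
# entry from the coefficient matrix of a transfer_entropy result.
get_stat_sketch <- function(te, direction) unname(coef(te)[direction, "te"])
get_pval_sketch <- function(te, direction) unname(coef(te)[direction, "p-value"])

te_by_lag <- NULL
if (requireNamespace("RTransferEntropy", quietly = TRUE)) {
  set.seed(123)
  n <- 200
  x <- rnorm(n)                        # stands in for pre-whitened I
  y <- c(0, 0.5 * x[-n]) + rnorm(n)    # stands in for pre-whitened C

  rows <- lapply(1:2, function(L) {
    te <- RTransferEntropy::transfer_entropy(
      x, y, lx = L, ly = L,
      shuffles = 10, nboot = 20,       # small values to keep the sketch fast
      quiet = TRUE, seed = 123
    )
    data.frame(
      lag = L,
      TE_ItoC = get_stat_sketch(te, "X->Y"),
      p_ItoC  = get_pval_sketch(te, "X->Y"),
      TE_CtoI = get_stat_sketch(te, "Y->X"),
      p_CtoI  = get_pval_sketch(te, "Y->X")
    )
  })
  te_by_lag <- do.call(rbind, rows)
  te_by_lag$type <- "counts"           # label the transformation
  print(te_by_lag)
}
```

The resulting table has the same columns as the value described above; stacking one such table per transformation with rbind() would give the combined summary.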

Examples


# \donttest{
library(data.table)

# 1. Create dummy data with ALL covariates required by prewhiten_*_glm()
# The internal GLM formulas likely include:
# I ~ t_norm + Regime + EconCycle + PopDensity + Epidemics + Climate + War
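set.seed(123)  # make the dummy data reproducible
exposure <- runif(30, 100, 200)  # drawn once so the offset matches exposure50
DT <- data.table(
  year = 2000:2029,
  I = rpois(30, lambda = 10),
  C = rpois(30, lambda = 8),
  exposure50 = exposure,
  log_exposure50 = log(exposure),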
  # Covariates
  t_norm = seq(-1, 1, length.out = 30),
  Regime = factor(sample(c("A", "B"), 30, replace = TRUE)),
  EconCycle = rnorm(30),
  PopDensity = rnorm(30),
  Epidemics = rnorm(30),
  Climate = rnorm(30),
  War = rnorm(30)
)

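# 2. Define the global output directory under tempdir()
# (CRAN policy: examples must not write outside the session's temporary directory).
# run_transfer_entropy() writes its CSV files to the global 'dir_csv'.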
tmp_dir <- tempdir()
dir_csv <- file.path(tmp_dir, "csv")
if (!dir.exists(dir_csv)) dir.create(dir_csv, recursive = TRUE)

# 3. Run the function
# Using fewer shuffles for a faster example check
te_tab <- run_transfer_entropy(DT, lags = 1, shuffles = 10, seed = 123)

# Inspect results
if (!is.null(te_tab)) {
  print(subset(te_tab, type == "counts"))
}
# }
