SignalY (version 1.1.1)

signal_analysis: Comprehensive Signal Analysis for Panel Data

Description

Master function that orchestrates the complete signal extraction pipeline. It integrates spectral decomposition (wavelets, EMD, HP-GC), Bayesian variable selection (regularized Horseshoe), dimensionality reduction (PCA, DFM), and stationarity testing into a unified analytical framework.

The function constructs a target signal Y from candidate variables X in panel data and applies multiple complementary methodologies to extract the latent structure from phenomenological dynamics.

Usage

signal_analysis(
  data,
  y_formula,
  time_var = NULL,
  group_var = NULL,
  methods = "all",
  filter_config = list(),
  horseshoe_config = list(),
  pca_config = list(),
  dfm_config = list(),
  unitroot_tests = "all",
  na_action = c("interpolate", "omit", "fail"),
  standardize = TRUE,
  first_difference = FALSE,
  verbose = TRUE,
  seed = NULL
)

Value

An S3 object of class "signal_analysis" containing:

call

The matched function call

data

Processed input data

Y

The constructed target signal

X

The predictor matrix

filters

Results from spectral decomposition methods

horseshoe

Results from Bayesian variable selection

pca

Results from PCA with bootstrap

dfm

Results from Dynamic Factor Model

unitroot

Results from unit root tests

interpretation

Automated technical interpretation

config

Configuration parameters used

Arguments

data

A data.frame or matrix containing the panel data. For data.frames, time should be in rows and variables in columns.

y_formula

Formula specifying how to construct Y from X variables, or a character string naming the pre-constructed Y column in data.

time_var

Character string naming the time variable (optional). If NULL, rows are assumed to be ordered by time.

group_var

Character string naming the group/panel variable for panel data (optional for single time series).

methods

Character vector specifying which methods to apply. Options: "wavelet", "emd", "hpgc", "horseshoe", "pca", "dfm", "unitroot", or "all" (default).

filter_config

List of configuration options for filtering methods:

wavelet_filter

Wavelet filter type (default: "la8")

wavelet_levels

Which detail levels to combine (default: c(3,4))

emd_max_imf

Maximum IMFs for EMD (default: 10)

hpgc_prior

Prior configuration: "weak", "informative", "empirical" (default: "weak")

hpgc_chains

Number of MCMC chains (default: 4)

hpgc_iterations

Total iterations per chain (default: 20000)

horseshoe_config

List of configuration for Horseshoe regression:

p0

Expected number of relevant predictors (default: NULL for auto)

chains

Number of MCMC chains (default: 4)

iter_sampling

Sampling iterations per chain (default: 2000)

iter_warmup

Warmup iterations (default: 1000)

adapt_delta

Target acceptance rate (default: 0.95)

use_qr

Use QR decomposition (default: TRUE)

kappa_threshold

Shrinkage threshold for selection (default: 0.5)

pca_config

List of configuration for PCA:

n_components

Number of components (default: NULL for auto)

rotation

Rotation method: "none", "varimax", "oblimin" (default: "none")

n_boot

Bootstrap replications (default: 1000)

block_length

Block length for bootstrap (default: NULL for auto)

alpha

Alpha for bootstrap tests (default: 0.05)

dfm_config

List of configuration for Dynamic Factor Models:

r

Number of factors (default: NULL for auto via IC)

max_factors

Maximum factors to consider (default: 10)

p

VAR lags for factor dynamics (default: 1)

ic

Information criterion: "IC1", "IC2", "IC3" (default: "bai_ng_2")

unitroot_tests

Character vector of unit root tests to apply. Options: "adf", "ers", "kpss", "pp", or "all" (default).

na_action

How to handle missing values: "interpolate", "omit", "fail" (default: "interpolate").

standardize

Logical, whether to standardize variables before analysis (default: TRUE).

first_difference

Logical, whether to first-difference data (default: FALSE).

verbose

Logical, whether to print progress messages (default: TRUE).

seed

Random seed for reproducibility (default: NULL).
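
The preprocessing arguments (na_action, standardize, first_difference) can be pictured with a small base R sketch. This is not the package's implementation: preprocess_sketch is a hypothetical helper, and both the use of linear interpolation and the order of the steps (interpolate, then difference, then standardize) are assumptions made for illustration.

```r
# Hypothetical helper, not part of SignalY. It shows one plausible reading of
# na_action = "interpolate", first_difference = TRUE and standardize = TRUE.
preprocess_sketch <- function(x, standardize = TRUE, first_difference = FALSE) {
  x <- as.matrix(x)
  idx <- seq_len(nrow(x))

  # Linear interpolation of missing values, column by column.
  # rule = 2 extends the nearest observed value to leading/trailing gaps.
  x <- apply(x, 2, function(col) {
    ok <- !is.na(col)
    approx(idx[ok], col[ok], xout = idx, rule = 2)$y
  })

  # Assumed order: difference first, then center and scale each column.
  if (first_difference) x <- diff(x)
  if (standardize) x <- scale(x)
  x
}

raw <- cbind(a = c(1, NA, 3, 4, 6), b = c(2, 4, NA, 9, 10))
z <- preprocess_sketch(raw, standardize = TRUE, first_difference = TRUE)
print(z)
```

With first_difference = TRUE the result has one row fewer than the input, which matters when aligning the output with a separate time index.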

Details

Methodological Framework

The signal extraction pipeline distinguishes between latent structure (the underlying data-generating process) and phenomenological dynamics (observed variability). This is achieved through:

  1. Spectral Decomposition: Separates signal frequencies

    • Wavelets: Multi-resolution analysis via MODWT

    • EMD: Data-adaptive decomposition into intrinsic modes

    • HP-GC: Bayesian unobserved components (trend + cycle)

  2. Sparse Regression: Identifies relevant predictors

    • Regularized Horseshoe: Adaptive shrinkage with slab regularization

    • Shrinkage factors (kappa) quantify predictor relevance

  3. Dimensionality Reduction: Extracts common factors

    • PCA: Static factor structure with bootstrap significance

    • DFM: Dynamic factors with VAR transition dynamics

  4. Stationarity Testing: Characterizes persistence properties

    • Integrated battery of ADF, ERS, KPSS, PP tests

    • Synthesized conclusion on stationarity type
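
The shrinkage factors mentioned under Sparse Regression can be illustrated with the definition from Piironen & Vehtari (2017), where, for standardized predictors, kappa_j = 1 / (1 + n * tau^2 * lambda_j^2 / sigma^2). The sketch below uses made-up local scales (lambda) in place of posterior draws, and the selection rule "keep predictors with kappa below kappa_threshold" is an assumption about how the threshold is applied, not a statement of the package's internals.

```r
# Illustrative values only; in practice tau, lambda and sigma come from
# posterior draws of a fitted horseshoe model.
n     <- 50    # number of observations
p     <- 10    # number of candidate predictors
p0    <- 3     # prior guess for the number of relevant predictors
sigma <- 0.5   # residual standard deviation

# Prior guess for the global scale (Piironen & Vehtari, 2017).
tau0 <- p0 / (p - p0) * sigma / sqrt(n)

# Hypothetical local scales: large for three predictors, small for the rest.
lambda <- c(30, 25, 40, rep(0.5, 7))

# Shrinkage factor: near 1 means the coefficient is shrunk almost to zero,
# near 0 means it is left essentially unshrunk.
kappa <- 1 / (1 + n * tau0^2 * lambda^2 / sigma^2)

# Assumed selection rule with kappa_threshold = 0.5.
kappa_threshold <- 0.5
selected <- which(kappa < kappa_threshold)
print(round(kappa, 3))
print(selected)
```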

Interpretation Framework

The automated interpretation assesses:

  • Signal Smoothness: Variance of second differences

  • Trend Persistence: Deterministic vs. stochastic via unit roots

  • Information Topology: Entropy of PC1 loadings (concentrated vs. diffuse)

  • Sparsity Ratio: Proportion of predictors shrunk to zero

  • Factor Structure: Number of significant common factors
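
Two of these diagnostics are easy to sketch in base R. The functions below are hypothetical stand-ins, not the package's code: roughness follows the "variance of second differences" description directly, while loading_entropy assumes the entropy is taken over squared PC1 loadings and normalized by log(number of variables), which may differ from what signal_analysis computes.

```r
set.seed(1)
tt     <- 1:100
smooth <- sin(2 * pi * tt / 50)
noisy  <- smooth + rnorm(100, sd = 0.3)

# Signal smoothness: variance of second differences (lower = smoother).
roughness <- function(y) var(diff(y, differences = 2))

# Information topology: entropy of squared PC1 loadings.
# Squared loadings sum to 1, so they can be treated as weights.
# Dividing by log(ncol(X)) is an assumed normalization to [0, 1].
loading_entropy <- function(X) {
  w <- prcomp(X, scale. = TRUE)$rotation[, 1]^2
  w <- w[w > 0]
  -sum(w * log(w)) / log(ncol(X))
}

# Five series driven by one common factor with equal loadings:
# PC1 weights are spread evenly, so the entropy is close to 1 (diffuse).
f <- rnorm(100)
X_diffuse <- f %*% t(rep(1, 5)) + matrix(rnorm(500, sd = 0.1), 100, 5)

r_smooth  <- roughness(smooth)
r_noisy   <- roughness(noisy)
h_diffuse <- loading_entropy(X_diffuse)
print(c(smooth = r_smooth, noisy = r_noisy, entropy = h_diffuse))
```

A value of loading_entropy near 0 would instead indicate that PC1 is dominated by a single variable (concentrated).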

References

Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2), 5018-5051. doi:10.1214/17-EJS1337SI

Bai, J., & Ng, S. (2002). Determining the Number of Factors in Approximate Factor Models. Econometrica, 70(1), 191-221. doi:10.1111/1468-0262.00273

See Also

filter_wavelet, filter_emd, filter_hpgc, fit_horseshoe, pca_bootstrap, estimate_dfm, test_unit_root

Examples

# \donttest{
# Generate example panel data
set.seed(42)
n_time <- 50   
n_vars <- 10   

# Create correlated predictors with common factor structure
factors <- matrix(rnorm(n_time * 2), n_time, 2)
loadings <- matrix(runif(n_vars * 2, -1, 1), n_vars, 2)
X <- factors %*% t(loadings) + matrix(rnorm(n_time * n_vars, 0, 0.5), n_time, n_vars)
colnames(X) <- paste0("X", 1:n_vars)

# True signal depends on only 3 predictors
true_beta <- c(rep(1, 3), rep(0, 7))
Y <- X %*% true_beta + rnorm(n_time, 0, 0.5)

# Combine into data frame
data <- data.frame(Y = Y, X)

# Run comprehensive analysis
# We pass specific configs to make MCMC very fast just for the example
result <- signal_analysis(
  data = data,
  y_formula = "Y",
  methods = "all",
  verbose = TRUE,
  # Configuration for speed (CRAN policy < 5s preferred)
  filter_config = list(
    hpgc_chains = 1,
    hpgc_iterations = 50,
    hpgc_burnin = 10
  ),
  horseshoe_config = list(
    chains = 1,
    iter_sampling = 50,
    iter_warmup = 10
  ),
  pca_config = list(
    n_boot = 50
  )
)

# View interpretation
print(result)

# Plot results
plot(result)
# }
