create_mea_heatmaps_enhanced: Create Enhanced Heatmaps for Multi-Electrode Array (MEA) Data Analysis

Description

This function generates comprehensive heatmap visualizations for MEA data analysis, including individual grouping variable heatmaps, combined interaction heatmaps, and variable correlation matrices. It provides flexible scaling, clustering, and customization options with automatic quality filtering and missing data handling.

Usage

create_mea_heatmaps_enhanced(
  data = NULL,
  processing_result = NULL,
  config = NULL,
  value_column = "Normalized_Value",
  variable_column = "Variable",
  grouping_columns = c("Treatment", "Genotype"),
  sample_id_columns = c("Well"),
  timepoint_column = "Timepoint",
  scale_method = "z_score",
  aggregation_method = "mean",
  missing_value_handling = "remove",
  cluster_method = "euclidean",
  cluster_rows = TRUE,
  cluster_cols = TRUE,
  create_individual_heatmaps = TRUE,
  create_combined_heatmap = TRUE,
  create_variable_correlation = TRUE,
  output_dir = NULL,
  save_plots = FALSE,
  plot_format = "png",
  plot_width = 10,
  plot_height = 8,
  dpi = 300,
  fontsize = 10,
  angle_col = 45,
  show_rownames = TRUE,
  show_colnames = TRUE,
  return_data = TRUE,
  verbose = TRUE,
  quality_threshold = 0.8,
  min_observations = 3
)

Value

A list containing:

individual_heatmaps: Named list of heatmap objects for each grouping variable
combined_heatmap: Heatmap object for grouping variable interactions (if applicable)
variable_correlation: List with correlation heatmap and correlation matrix
metadata: List containing processing information and parameters used

Each heatmap object contains: heatmap (pheatmap object), scaled_data (processed matrix), raw_data (aggregated input data), annotation (row annotations), annotation_colors (color schemes), and scaling_info (scaling parameters).

Arguments

data: A data frame containing MEA measurement data. If NULL, processing_result must be provided.
processing_result: A list object from MEA data processing containing normalized_data or raw_data components. If provided, it takes precedence over the data parameter.
config: Configuration list from MEA processing. If NULL and processing_result is provided, the function attempts to use the config stored in processing_result$config_used.
value_column: Character string specifying the column containing measurement values (default: "Normalized_Value").
variable_column: Character string specifying the column containing variable names (default: "Variable").
grouping_columns: Character vector of column names to use for grouping (default: c("Treatment", "Genotype")). The function auto-detects which of these columns are present in the data.
sample_id_columns: Character vector of columns identifying individual samples (default: c("Well")).
timepoint_column: Character string specifying the timepoint column (default: "Timepoint").
scale_method: Character string specifying scaling method. Options: "z_score" (default), "min_max", "robust", "none".
aggregation_method: Character string specifying how to aggregate multiple measurements. Options: "mean" (default), "median", "sum".
missing_value_handling: Character string specifying how to handle missing values. Options: "remove" (default), "impute_mean", "impute_zero".
cluster_method: Character string specifying clustering distance method. Options: "euclidean" (default), "correlation", "manhattan".
cluster_rows: Logical indicating whether to cluster rows (default: TRUE).
cluster_cols: Logical indicating whether to cluster columns (default: TRUE).
create_individual_heatmaps: Logical indicating whether to create separate heatmaps for each grouping variable (default: TRUE).
create_combined_heatmap: Logical indicating whether to create an interaction heatmap when multiple grouping variables are present (default: TRUE).
create_variable_correlation: Logical indicating whether to create a variable correlation heatmap (default: TRUE).
output_dir: Character string specifying the output directory (default: NULL, in which case no files are saved).
save_plots: Logical indicating whether to save plots to disk (default: FALSE).
plot_format: Character string specifying file format for saved plots (default: "png").
plot_width: Numeric value specifying plot width in inches (default: 10).
plot_height: Numeric value specifying plot height in inches (default: 8).
dpi: Numeric value specifying resolution for saved plots (default: 300).
fontsize: Numeric value specifying font size for heatmap labels (default: 10).
angle_col: Numeric value specifying angle for column labels in degrees (default: 45).
show_rownames: Logical indicating whether to show row names (default: TRUE).
show_colnames: Logical indicating whether to show column names (default: TRUE).
return_data: Logical indicating whether to return processed data matrices (default: TRUE).
verbose: Logical indicating whether to print progress messages (default: TRUE).
quality_threshold: Numeric value between 0 and 1 specifying the minimum fraction of non-missing values required per variable (default: 0.8).
min_observations: Numeric value specifying minimum observations required per group (default: 3).
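
As a rough illustration of the long-format input the arguments above describe, the sketch below builds a synthetic data frame with the default column names. The variable names (MeanFiringRate and so on), the well layout, and the values are all made up for illustration. The call itself is guarded so the snippet still runs when the function is not loaded in the session.

```r
# Synthetic long-format MEA table using the default column names.
# Variable names, well layout, and values are hypothetical.
set.seed(1)
design <- expand.grid(
  Well      = paste0("W", 1:8),
  Variable  = c("MeanFiringRate", "BurstFrequency", "NetworkBurstDuration"),
  Timepoint = c("baseline", "1h"),
  stringsAsFactors = FALSE
)
well_index <- as.integer(sub("W", "", design$Well))
design$Treatment <- ifelse(well_index <= 4, "Vehicle", "Drug")
design$Genotype  <- ifelse(well_index %% 2 == 0, "WT", "KO")
design$Normalized_Value <- rnorm(nrow(design), mean = 1, sd = 0.2)

# Guarded call: only executed when the function is available.
if (exists("create_mea_heatmaps_enhanced")) {
  res <- create_mea_heatmaps_enhanced(
    data         = design,
    scale_method = "z_score",
    save_plots   = FALSE,
    verbose      = FALSE
  )
  print(names(res))
}
```

Each row is one measurement of one variable in one well at one timepoint, which is the shape the value_column, variable_column, grouping_columns, and sample_id_columns arguments refer to.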

Details

The function performs several key operations:

Quality filtering: Removes variables with insufficient data completeness
Missing value handling: Multiple strategies for dealing with NA values
Data aggregation: Combines multiple measurements per group using specified method
Scaling: Applies normalization methods appropriate for heatmap visualization
Clustering: Hierarchical clustering of rows and/or columns using specified distance metrics
Visualization: Creates publication-ready heatmaps with proper color schemes and annotations
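
The first steps listed above can be approximated in base R as follows. This is a minimal sketch on synthetic data, not the function's actual implementation: the toy table, the helper row_distance, and the choice of 1 minus Pearson correlation as the "correlation" distance are assumptions made for illustration.

```r
# Synthetic data: 9 wells, 3 treatments, 5 variables (all names hypothetical).
set.seed(42)
wells <- paste0("W", 1:9)
treatment_by_well <- setNames(rep(c("Vehicle", "LowDose", "HighDose"), each = 3), wells)
toy <- expand.grid(Well = wells,
                   Variable = c("VarA", "VarB", "VarC", "VarD", "VarE"),
                   stringsAsFactors = FALSE)
toy$Treatment <- unname(treatment_by_well[toy$Well])
toy$Normalized_Value <- rnorm(nrow(toy))
# Make VarE mostly missing so it fails the completeness filter.
vare_rows <- which(toy$Variable == "VarE")
toy$Normalized_Value[vare_rows[1:5]] <- NA

quality_threshold <- 0.8
min_observations  <- 3

# Quality filtering: keep variables whose fraction of non-NA values is high enough.
completeness <- tapply(!is.na(toy$Normalized_Value), toy$Variable, mean)
keep_vars <- names(completeness)[completeness >= quality_threshold]
filtered <- toy[toy$Variable %in% keep_vars, ]

# Missing value handling, "remove" strategy: drop remaining NA rows.
filtered <- filtered[!is.na(filtered$Normalized_Value), ]

# Observations per group and variable, to compare against min_observations.
n_obs <- table(filtered$Treatment, filtered$Variable)
enough_obs <- all(n_obs >= min_observations)

# Aggregation: one mean per group (rows) and variable (columns).
agg <- tapply(filtered$Normalized_Value,
              list(filtered$Treatment, filtered$Variable), mean)

# Clustering: distance between rows, then hierarchical clustering.
row_distance <- function(m, method = c("euclidean", "correlation", "manhattan")) {
  method <- match.arg(method)
  if (method == "correlation") as.dist(1 - cor(t(m))) else dist(m, method = method)
}
row_tree <- hclust(row_distance(agg, "euclidean"))
```

Whether groups end up as rows or columns in the function's own matrices is not stated here, so the orientation of agg is an arbitrary choice.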

For scaling methods:

z_score: Centers data on the mean and scales to unit variance (best for comparing relative changes)
min_max: Rescales values linearly to the 0-1 range (best for absolute comparisons)
robust: Uses median and MAD for outlier-resistant scaling
none: No scaling applied
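
The four scaling options can be written out for a single numeric vector as below. This is a sketch under assumptions: the helper scale_values is hypothetical, it scales one variable's values at a time, and it uses R's mad(), which multiplies the raw median absolute deviation by 1.4826 by default. The function itself may define these details differently.

```r
# Hypothetical helper illustrating the four scale_method options on one vector.
# Constant vectors (zero spread) are not handled and would yield NaN or Inf.
scale_values <- function(x, method = c("z_score", "min_max", "robust", "none")) {
  method <- match.arg(method)
  switch(method,
    z_score = (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE),
    min_max = (x - min(x, na.rm = TRUE)) /
              (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)),
    robust  = (x - median(x, na.rm = TRUE)) / mad(x, na.rm = TRUE),
    none    = x
  )
}

x <- c(2, 4, 4, 6, 100)  # one extreme value to contrast z_score and robust
z  <- scale_values(x, "z_score")
mm <- scale_values(x, "min_max")
rb <- scale_values(x, "robust")
```

With an outlier such as the 100 above, the mean and standard deviation used by z_score are pulled toward the extreme value, while the median and MAD used by robust are not, which is the practical difference between the two.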

The function automatically adjusts plot dimensions based on data size and uses optimized color palettes appropriate for the scaling method chosen (diverging palettes for z_score/robust, sequential palettes for min_max).
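
The palette rule described above could look roughly like this. The helpers pick_palette and symmetric_breaks and the specific colors are assumptions for illustration, not the function's actual palettes. Treating "none" like min_max is also an assumption, since the text does not say which palette unscaled data receives.

```r
# Hypothetical palette chooser: diverging for centered scales, sequential otherwise.
pick_palette <- function(scale_method, n = 100) {
  if (scale_method %in% c("z_score", "robust")) {
    grDevices::colorRampPalette(c("navy", "white", "firebrick3"))(n)
  } else {
    grDevices::colorRampPalette(c("white", "orange", "red"))(n)
  }
}

# Breaks symmetric around zero, so the midpoint color of a diverging
# palette corresponds to a scaled value of 0. One more break than colors.
symmetric_breaks <- function(m, n = 100) {
  lim <- max(abs(m), na.rm = TRUE)
  seq(-lim, lim, length.out = n + 1)
}

scaled <- matrix(c(-1, 0.5, 2, -3), nrow = 2)
cols   <- pick_palette("z_score", n = 10)
brks   <- symmetric_breaks(scaled, n = 10)
```

A color vector and a breaks vector of this shape (breaks one element longer than colors) are what pheatmap's color and breaks arguments expect.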