BioGSP (version 1.0.0)

codex_toy_data: Toy CODEX Spatial Cell Type Data

Description

A synthetic dataset mimicking CODEX multiplexed imaging data for demonstrating Spectral Graph Wavelet Transform (SGWT) analysis on spatial cell type distributions. The dataset contains spatial coordinates and cell type annotations for multiple immune cell populations arranged in realistic spatial clusters.

Usage

data(codex_toy_data)

Format

A data frame with 18604 rows and 5 columns:

cellLabel

Character. Unique identifier for each cell

Y_cent

Numeric. Y coordinate of cell centroid (0-115 range)

X_cent

Numeric. X coordinate of cell centroid (0-116 range)

Annotation5

Character. Full descriptive cell type name

ROI_num

Character. Region of interest identifier ("ROI_0" through "ROI_15")

Details

The dataset contains 16 regions of interest (ROI_0 through ROI_15) with different spatial patterns and varying cell counts (945-1497 cells per ROI). Each ROI represents a distinct tissue region with unique spatial arrangements of the same cell types.

ROI Distribution:

  • ROI_0: 952 cells

  • ROI_1: 945 cells

  • ROI_2: 1155 cells

  • ROI_3: 1421 cells

  • ROI_4: 1096 cells

  • ROI_5: 1420 cells

  • ROI_6-ROI_15: 958-1497 cells each
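The per-ROI counts listed above can be recomputed from the ROI_num column. The sketch below is a minimal illustration, not part of the package. It uses a small simulated stand-in (`toy`, a hypothetical name) with the same column names so that it runs without BioGSP installed; with the package loaded, substitute `codex_toy_data`.

```r
# Hypothetical stand-in with the same columns as codex_toy_data
set.seed(1)
n <- 300
toy <- data.frame(
  cellLabel   = paste0("cell_", seq_len(n)),
  Y_cent      = runif(n, 0, 115),
  X_cent      = runif(n, 0, 116),
  Annotation5 = sample(c("CD4 T", "CD8 T", "DC"), n, replace = TRUE),
  ROI_num     = sample(paste0("ROI_", 0:3), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Cells per ROI; table() sorts labels alphabetically, so reorder
# by the numeric suffix (otherwise ROI_10 would sort before ROI_2)
roi_counts <- table(toy$ROI_num)
roi_order  <- order(as.integer(sub("ROI_", "", names(roi_counts))))
roi_counts <- roi_counts[roi_order]
print(roi_counts)

# Smallest and largest ROI sizes
roi_range <- range(as.integer(roi_counts))
print(roi_range)
```

Reordering by the numeric suffix matters for the real data, where ROI_num is a character column with labels up to "ROI_15".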

Cell types across all ROIs include:

  • BCL6- B Cell (~3719 cells): Primary B cell population

  • CD4 T (~4092 cells): Helper T cells - largest population

  • CD8 T (~3346 cells): Cytotoxic T cells

  • DC (~2233 cells): Dendritic cells

  • M1 (~1490 cells): M1 macrophages

  • CD4 Treg (~1490 cells): Regulatory T cells

  • BCL6+ B Cell (~931 cells): Activated B cells

  • Endothelial (~746 cells): Vascular cells

  • M2 (~370 cells): M2 macrophages

  • Myeloid (~186 cells): Other myeloid cells

  • Other (~1 cell): Miscellaneous cell types
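The cell type abundances above span three orders of magnitude, which matters when choosing a signal for analysis: a type with a handful of cells gives an almost empty binned signal. The following sketch shows one way to rank types and flag rare ones. The `annotation` vector is a simulated stand-in for `codex_toy_data$Annotation5`, and the 20-cell threshold is an arbitrary assumption for illustration.

```r
# Hypothetical stand-in for codex_toy_data$Annotation5
set.seed(2)
annotation <- sample(
  c("CD4 T", "BCL6- B Cell", "CD8 T", "DC", "Myeloid"),
  size = 500, replace = TRUE,
  prob = c(0.35, 0.30, 0.20, 0.14, 0.01)
)

# Counts sorted from most to least abundant
type_counts <- sort(table(annotation), decreasing = TRUE)

# Percentage of all cells per type
type_pct <- round(100 * prop.table(type_counts), 1)
print(type_pct)

# Types below an illustrative threshold (20 cells is an arbitrary choice)
rare_types <- names(type_counts)[type_counts < 20]
print(rare_types)
```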

This synthetic data is designed to demonstrate:

  • Spatial clustering patterns of different cell types

  • Multi-scale spatial analysis using SGWT

  • Cross-cell type correlation analysis

  • Graph construction and eigenvalue analysis

  • Wavelet decomposition of spatial signals
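As a rough illustration of the cross-cell type correlation idea listed above, the sketch below bins two cell types onto a shared grid and computes a plain Pearson correlation of the per-bin counts. This is a simplification in base R and is not the SGWT-based analysis the package provides. The data frame `toy`, the helper `bin_counts`, and the 10 x 10 grid are assumptions made for the example.

```r
set.seed(3)
n <- 400
# Hypothetical stand-in: two cell types in one ROI. "CD4 T" cells cluster
# around (30, 30); "CD8 T" cells are spread uniformly.
toy <- data.frame(
  X_cent      = c(rnorm(n, 30, 8), runif(n, 0, 116)),
  Y_cent      = c(rnorm(n, 30, 8), runif(n, 0, 115)),
  Annotation5 = rep(c("CD4 T", "CD8 T"), each = n)
)

# Shared bin edges so both cell types are counted on the same grid
n_bins  <- 10
x_edges <- seq(min(toy$X_cent), max(toy$X_cent), length.out = n_bins + 1)
y_edges <- seq(min(toy$Y_cent), max(toy$Y_cent), length.out = n_bins + 1)
toy$x_bin <- cut(toy$X_cent, breaks = x_edges, labels = FALSE,
                 include.lowest = TRUE)
toy$y_bin <- cut(toy$Y_cent, breaks = y_edges, labels = FALSE,
                 include.lowest = TRUE)

# Count cells per bin for one type; empty bins are kept as zeros
bin_counts <- function(d) {
  as.vector(table(factor(d$x_bin, levels = seq_len(n_bins)),
                  factor(d$y_bin, levels = seq_len(n_bins))))
}
cd4 <- bin_counts(toy[toy$Annotation5 == "CD4 T", ])
cd8 <- bin_counts(toy[toy$Annotation5 == "CD8 T", ])

# Pearson correlation of the two binned count signals
r <- cor(cd4, cd8)
print(r)
```

Using one set of bin edges for both types keeps the two count vectors aligned bin by bin, which the correlation requires.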

Examples

# Load the package and the toy dataset
library(BioGSP)
data(codex_toy_data)

# Examine the structure
str(codex_toy_data)
head(codex_toy_data)

# Summary of cell types
table(codex_toy_data$Annotation5)

# Summary by ROI
table(codex_toy_data$ROI_num)
table(codex_toy_data$ROI_num, codex_toy_data$Annotation5)

# Quick visualization of spatial distribution
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ggplot(codex_toy_data, aes(x = X_cent, y = Y_cent, color = Annotation5)) +
    geom_point(size = 0.8, alpha = 0.7) +
    facet_wrap(~ROI_num, scales = "free") +
    labs(title = "Toy CODEX Spatial Cell Distribution by ROI",
         x = "X Coordinate", y = "Y Coordinate") +
    theme_minimal() +
    scale_y_reverse()
}

# Basic SGWT analysis example
# \donttest{
library(dplyr)

# Focus on BCL6- B cells in ROI_1 for SGWT analysis
bcl6nb_data <- codex_toy_data %>%
  filter(Annotation5 == "BCL6- B Cell", ROI_num == "ROI_1")

# Bin the cell centroids onto a 20 x 20 grid and count cells per bin
binned_data <- bcl6nb_data %>%
  mutate(
    x_bin = cut(X_cent, breaks = 20, labels = FALSE),
    y_bin = cut(Y_cent, breaks = 20, labels = FALSE)
  ) %>%
  group_by(x_bin, y_bin) %>%
  summarise(cell_count = n(), .groups = 'drop')

# Prepare for SGWT: fill empty bins with zero and scale counts to [0, 1]
complete_grid <- expand.grid(x_bin = 1:20, y_bin = 1:20)
sgwt_data <- complete_grid %>%
  left_join(binned_data, by = c("x_bin", "y_bin")) %>%
  mutate(
    cell_count = ifelse(is.na(cell_count), 0, cell_count),
    x = x_bin,
    y = y_bin,
    signal = cell_count / max(cell_count, na.rm = TRUE)
  ) %>%
  select(x, y, signal)

# Apply SGWT: initialize the object, build the spectral graph, run the transform
SG <- initSGWT(sgwt_data, signals = "signal", J = 3, kernel_type = "heat")
SG <- runSpecGraph(SG, k = 8)
SG <- runSGWT(SG)

# View results
print(SG)
# }
