Spatial Gene Set Enrichment Analysis with ChatSpatial

Find biological pathways and processes that are spatially organized in your tissue using ChatSpatial. This tutorial shows how to run spatial gene set enrichment analysis through ChatSpatial's EnrichMap integration.

What You Will Learn

By the end of this tutorial, you will know how to:

  1. Ask ChatSpatial to perform spatial enrichment using simple questions
  2. Load gene sets from MSigDB and other databases conversationally
  3. Identify spatially coherent biological processes in your tissue
  4. Create enrichment visualizations automatically
  5. Interpret spatial enrichment results biologically
  6. Compare enrichment patterns across regions interactively

Before You Start

What You Need

  • Claude Desktop installed and running
  • ChatSpatial configured (see Getting Started)
  • Your spatial data already loaded and preprocessed
  • Basic knowledge of gene sets (pathways, GO terms, etc.)
  • Knowledge of biological processes in your tissue

Understanding Spatial Enrichment

Traditional enrichment analysis asks: “Are these genes overrepresented in my list?”

Spatial enrichment analysis asks: “Are these biological processes organized spatially in my tissue?”

This approach identifies:

  • 🗺️ Pathway territories - where specific processes are active
  • 🔗 Functional neighborhoods - regions sharing biological functions
  • 📈 Spatial gradients - how pathway activity changes across space
  • 🎯 Local hotspots - areas of concentrated pathway activity
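To make the idea concrete, here is a minimal sketch of what "pathway activity in space" can mean: score each spot by the mean expression of a gene set, then average each spot with its spatial neighbors so that coherent territories stand out. This is an illustration under simple assumptions, not ChatSpatial's or EnrichMap's actual scoring method; the function names and toy values are hypothetical.

```python
from statistics import mean

def gene_set_score(expression, gene_set):
    """Mean expression of the gene set's genes in each spot.

    expression: dict mapping spot id -> dict of gene -> normalized expression.
    Genes missing from a spot count as 0.
    """
    return {
        spot: mean(genes.get(g, 0.0) for g in gene_set)
        for spot, genes in expression.items()
    }

def smooth_over_neighbors(scores, neighbors):
    """Average each spot's score with the scores of its spatial neighbors."""
    return {
        spot: mean([scores[spot]] + [scores[n] for n in neighbors.get(spot, [])])
        for spot in scores
    }

# Toy data: three spots in a row, two genes in the set (hypothetical values).
expression = {
    "s1": {"Bdnf": 4.0, "Arc": 2.0},
    "s2": {"Bdnf": 2.0, "Arc": 0.0},
    "s3": {"Bdnf": 0.0, "Arc": 0.0},
}
neighbors = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}

raw = gene_set_score(expression, ["Bdnf", "Arc"])
smoothed = smooth_over_neighbors(raw, neighbors)
```

In this toy example the raw scores fall from 3.0 to 0.0 across the three spots, and smoothing turns that into a gentler gradient, which is the kind of pattern a spatial enrichment map displays.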

Step 1: Understanding Your Data Context

💬 Start with Your Spatial Data

Before enrichment analysis, make sure your data is properly processed:


👤 You: “I have a mouse brain dataset that is already preprocessed. Can you tell me what spatial domains we found and prepare it for enrichment analysis?”

🤖 Claude: “Let me examine your preprocessed mouse brain data and check what spatial information we have available.”

[ChatSpatial analyzes the spatial organization]

🤖 Claude: “Your mouse brain data is ready for spatial enrichment analysis:

Spatial Organization Summary:

  • 2,651 spots with high-quality spatial coordinates
  • 7 distinct spatial domains already identified, including:
    • Cortical layers (outer and deep)
    • Hippocampal formation
    • White matter regions
    • Ventricular zones
    • Boundary regions
  • 18,078 genes available for enrichment analysis
  • Spatial neighborhoods already computed (6-nearest neighbors)

Your data has good spatial structure, so we can perform spatial enrichment analysis. What biological processes are you interested in exploring?”


🎯 What ChatSpatial Validated

Spatial Prerequisites:

  • ✅ Spatial coordinates are available
  • ✅ Spatial domains have been identified
  • ✅ Neighbor graphs are computed
  • ✅ Gene expression is properly normalized
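The "neighbor graph" prerequisite is simply a list of the nearest spots for every spot. The sketch below builds a k-nearest-neighbor graph from 2D coordinates with a brute-force search. It only illustrates the concept; the function name and the toy coordinates are assumptions, and real pipelines typically use an optimized library routine instead.

```python
import math

def knn_graph(coords, k=6):
    """Return {spot: [k nearest other spots]} using Euclidean distance.

    coords: dict mapping spot id -> (x, y).
    Brute force, O(n^2): fine for a sketch, slow for large datasets.
    """
    graph = {}
    for spot, (x, y) in coords.items():
        others = [
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in coords.items()
            if other != spot
        ]
        others.sort()  # nearest first; ties broken by spot id
        graph[spot] = [other for _, other in others[:k]]
    return graph

# Toy layout: two pairs of spots far apart on a line.
coords = {"a": (0, 0), "b": (1, 0), "c": (5, 0), "d": (6, 0)}
graph = knn_graph(coords, k=1)
```

With `k=1` each spot is linked to its single closest spot; the dataset in this tutorial uses 6 neighbors, which corresponds to the default `k=6` here.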

Step 2: Loading Gene Sets for Analysis

💬 Choosing Your Gene Sets

Enrichment analysis depends on choosing relevant gene sets:


👤 You: “I want to understand what brain-related biological processes are active in different regions. Can you load some relevant gene sets for brain analysis?”

🤖 Claude: “For brain tissue analysis, I will load several gene set collections that provide different biological perspectives. Let me get those for you.”

[ChatSpatial loads relevant gene sets]

🤖 Claude: “I’ve loaded gene sets for brain analysis:

Gene Set Collections Loaded:

🧠 MSigDB Hallmark Pathways (50 sets)

  • Core biological processes (e.g., oxidative phosphorylation, inflammatory response, apoptosis)
  • Validated, high-confidence pathways
  • Useful for understanding major biological themes

🔬 GO Biological Processes (2,891 brain-relevant sets)

  • Detailed molecular processes
  • Hierarchical organization from general to specific
  • Includes neuron development, synapse formation, etc.

🛤️ KEGG Pathways (186 mouse pathways)

  • Well-characterized metabolic and signaling pathways
  • Dopaminergic synapse, axon guidance, neurodegeneration pathways
  • Good for understanding pathway crosstalk

🧬 Reactome Pathways (674 sets)

  • Detailed biochemical reactions
  • Neurotransmitter signaling, neural development
  • Detailed pathway maps

Which type would you like to start with, or should I analyze all of them for a complete view?”

👤 You: “Let’s start with hallmark pathways to get the big picture, then we can dive deeper.”


💡 Gene Set Selection Strategy

For Brain/Neural Tissue:

  • 🏆 Start with Hallmarks - broad biological themes
  • 🎯 Add GO Biological Process - detailed molecular functions
  • 🛤️ Include KEGG/Reactome - specific pathway mechanisms
  • 🧬 Consider Cell Type Markers - understand cellular composition

For Cancer Tissue:

  • 🏆 Hallmark pathways - cancer-relevant processes
  • 🎯 Oncogenic signatures (C6) - known cancer drivers
  • 🛡️ Immunologic signatures (C7) - immune responses
  • 📊 Custom gene lists - your specific research genes
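If you want to inspect a gene set collection yourself, MSigDB distributes collections as GMT files: one gene set per line, tab-separated, with the set name first, a description second, and the member genes after that. The parser below is a minimal sketch of reading that format; the example lines are abbreviated for illustration and do not list the full membership of either set.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into {set_name: [genes]}.

    Each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets

# Illustrative, truncated entries (not the complete gene sets).
example = (
    "HALLMARK_HYPOXIA\tdescription or URL\tVEGFA\tSLC2A1\tPGK1\n"
    "HALLMARK_APOPTOSIS\tdescription or URL\tCASP3\tBAX\n"
)
gene_sets = parse_gmt(example)
```

For a real file you would pass `open(path).read()` to `parse_gmt`, where `path` points at a GMT file you downloaded.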

Step 3: Performing Spatial Enrichment Analysis

💬 Running the Analysis

Now for the analysis - let’s discover spatial patterns:


👤 You: “Great! Now can you perform spatial enrichment analysis using those hallmark pathways? I want to see which biological processes are spatially organized in my brain tissue.”

🤖 Claude: “I will perform spatial enrichment analysis using EnrichMap. This will identify which pathways are active and where they are active in your brain tissue.”

[ChatSpatial runs spatial enrichment analysis]

🤖 Claude: “The spatial enrichment analysis shows clear biological organization:

🗺️ Spatial Enrichment Results:

Metabolism & Neuronal Signaling Territories:

  • OXIDATIVE_PHOSPHORYLATION - Strongest in white matter regions (NES: 2.4)
  • MITOCHONDRIAL_BIOGENESIS - Co-localized with metabolically active areas (NES: 2.1)
  • SYNAPTIC_TRANSMISSION - Enriched in cortical layers and hippocampus (NES: 1.9)

Development & Plasticity Zones:

  • NEUROGENESIS - Concentrated in ventricular zones (NES: 2.3)
  • AXON_GUIDANCE - Strong gradient from cortex to deep structures (NES: 1.8)
  • SYNAPTIC_PLASTICITY - Hippocampal formation hotspot (NES: 2.0)

Homeostasis & Support Regions:

  • IMMUNE_RESPONSE - Boundary regions and blood-brain barrier (NES: 1.7)
  • LIPID_METABOLISM - White matter territories (NES: 1.6)

📊 Analysis Quality:

  • 42 pathways spatially enriched (FDR < 0.05)
  • Spatial coherence score: 0.78 (good)
  • Cross-validation: 89% consistency

Would you like me to create visualizations showing these spatial patterns?”

👤 You: “Yes! Can you show me the spatial maps for the top pathways?”


🎯 What the Analysis Discovered

Spatial Patterns Identified:

  • Functional territories - pathways with distinct spatial domains
  • Metabolic gradients - energy pathways following tissue architecture
  • Developmental zones - neurogenesis concentrated in stem cell areas
  • Support networks - glial and vascular processes in specific regions

💡 Understanding the Results

Key Metrics Explained:

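  • NES (Normalized Enrichment Score) - strength of spatial enrichment, scaled so that gene sets of different sizes can be compared
  • FDR (False Discovery Rate) - expected fraction of false positives among the pathways reported as significant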
  • Spatial coherence - how well pathway activity clusters spatially
  • Cross-validation - robustness across spatial neighborhoods
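The tutorial does not state how the spatial coherence score is computed. One widely used statistic for "how strongly do values cluster in space" is Moran's I, sketched below with binary neighbor weights. Treat this as an assumption about what such a score could look like, not as ChatSpatial's definition; values near +1 indicate clustering, values near -1 indicate a checkerboard pattern.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights (w_ij = 1 if j is listed as a neighbor of i).

    values: {spot: score}; neighbors: {spot: [neighboring spots]}.
    """
    spots = list(values)
    n = len(spots)
    mean_value = sum(values.values()) / n
    dev = {s: values[s] - mean_value for s in spots}
    denom = sum(d * d for d in dev.values())
    pairs = [(i, j) for i in spots for j in neighbors.get(i, [])]
    if denom == 0 or not pairs:
        return 0.0  # constant scores or no neighbors: no spatial signal
    num = sum(dev[i] * dev[j] for i, j in pairs)
    return (n / len(pairs)) * num / denom

# Four spots on a line: a - b - c - d
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
clustered = morans_i({"a": 1, "b": 1, "c": 0, "d": 0}, line)
alternating = morans_i({"a": 1, "b": 0, "c": 1, "d": 0}, line)
```

The clustered pattern gives a positive value and the alternating pattern gives -1, which matches the intuition behind a coherence score.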

Step 4: Creating Spatial Enrichment Visualizations

💬 Visualizing Spatial Patterns

Let’s see these patterns visually:


🤖 Claude: “Here are spatial visualizations of your top enriched pathways:”

[Creates multi-panel spatial enrichment maps]

🤖 Claude: “These spatial enrichment maps show the biological organization of your brain tissue:

🗺️ Spatial Pattern Insights:

Panel A - OXIDATIVE_PHOSPHORYLATION:

  • Bright hotspots in white matter regions
  • Follows myelinated fiber tracts closely
  • Supports high-energy axonal transport

Panel B - NEUROGENESIS:

  • Concentrated in ventricular zones (red regions)
  • Sharp boundaries with differentiated areas
  • Matches known stem cell niches

Panel C - SYNAPTIC_TRANSMISSION:

  • Layered pattern in cortex
  • Strong signal in hippocampal CA fields
  • Reflects synaptic density gradients

Panel D - IMMUNE_RESPONSE:

  • Concentrated at tissue boundaries
  • Blood vessel associated regions
  • Protective barrier functions

The color scale shows spatial enrichment scores - warmer colors indicate stronger pathway activity in those tissue regions.”

👤 You: “This is interesting! Can you also show me how these pathways relate to the spatial domains we found earlier?”


💬 Domain-Specific Enrichment Analysis

🤖 Claude: “I will create a domain-by-pathway heatmap to show which biological processes define each spatial region.”

[Creates enrichment heatmap by spatial domain]

🤖 Claude: “Here’s your spatial domain enrichment heatmap:

🔥 Domain-Pathway Associations:

Cortical Layer Domain:

  • 🟥 SYNAPTIC_TRANSMISSION (3.2) - Primary function
  • 🟧 NEUROPLASTICITY (2.1) - Learning processes
  • 🟨 CALCIUM_SIGNALING (1.8) - Neural communication

White Matter Domain:

  • 🟥 OXIDATIVE_PHOSPHORYLATION (3.5) - Energy for axons
  • 🟧 LIPID_METABOLISM (2.4) - Myelination processes
  • 🟨 MITOCHONDRIAL_BIOGENESIS (2.0) - Metabolic support

Hippocampal Domain:

  • 🟥 MEMORY_CONSOLIDATION (2.9) - Core function
  • 🟧 SYNAPTIC_PLASTICITY (2.3) - Learning mechanisms
  • 🟨 NEUROGENESIS (1.9) - Adult stem cells

Each spatial domain has a unique biological signature - this shows the value of spatial enrichment analysis.”
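Conceptually, a domain-by-pathway heatmap is a table of pathway scores aggregated per spatial domain. The sketch below averages per-spot scores within each domain. The data layout, the function name, and the toy numbers are hypothetical, and the values shown in the dialogue above may be computed differently (for example as enrichment scores rather than plain means).

```python
from collections import defaultdict

def domain_pathway_means(scores, domains):
    """Average per-spot pathway scores within each spatial domain.

    scores: {pathway: {spot: score}}; domains: {spot: domain label}.
    Returns {domain: {pathway: mean score}}.
    """
    table = defaultdict(dict)
    for pathway, per_spot in scores.items():
        grouped = defaultdict(list)
        for spot, value in per_spot.items():
            grouped[domains[spot]].append(value)
        for domain, values in grouped.items():
            table[domain][pathway] = sum(values) / len(values)
    return dict(table)

# Toy inputs: two white matter spots and one cortical spot.
scores = {
    "OXIDATIVE_PHOSPHORYLATION": {"s1": 3.0, "s2": 4.0, "s3": 1.0},
    "SYNAPTIC_TRANSMISSION": {"s1": 0.5, "s2": 0.5, "s3": 2.5},
}
domains = {"s1": "white_matter", "s2": "white_matter", "s3": "cortex"}
heatmap = domain_pathway_means(scores, domains)
```

Each row of `heatmap` is one domain's "signature"; plotting the rows side by side gives the heatmap described above.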


🎨 Visualization Options Available

Spatial Maps:

  • 🗺️ Individual pathway maps - single pathway spatial distribution
  • 🎭 Multi-panel comparisons - compare multiple pathways side-by-side
  • 🌈 Composite overlays - show pathway interactions spatially

Summary Visualizations:

  • 🔥 Domain heatmaps - pathways × spatial domains
  • 📊 Enrichment bar plots - rank pathways by spatial strength
  • 🎯 Network diagrams - pathway relationships and spatial overlap

Interactive Options:

  • 🔍 Zoom into regions - explore specific tissue areas
  • 📈 Pathway profiles - expression patterns across space
  • 🔗 Cross-correlations - how pathways co-vary spatially

Step 5: Advanced Spatial Enrichment Analysis

💬 Local Enrichment Maps

Let’s explore more sophisticated analyses:


👤 You: “This is really insightful! Can we go deeper and look at local enrichment patterns? I’m curious about pathway interactions in specific regions.”

🤖 Claude: “Local enrichment analysis will show pathway interactions and co-regulation patterns within specific tissue regions. Let me compute local enrichment maps for you.”

[ChatSpatial performs local enrichment analysis]

🤖 Claude: “Local enrichment analysis results:

🔍 Local Enrichment Analysis Results:

Cortical Layer Hotspots:

  • Synaptic hub (Layer 2/3): SYNAPTIC_TRANSMISSION + CALCIUM_SIGNALING
  • Metabolic center (Layer 5): OXIDATIVE_PHOSPHORYLATION + MITOCHONDRIAL_BIOGENESIS
  • Plasticity zone (Layer 6): NEUROPLASTICITY + AXON_GUIDANCE

Hippocampal Subregions:

  • CA1 field: MEMORY_CONSOLIDATION + SYNAPTIC_PLASTICITY (r=0.73)
  • CA3 field: NEUROGENESIS + STEM_CELL_MAINTENANCE (r=0.65)
  • Dentate gyrus: ADULT_NEUROGENESIS + WNT_SIGNALING (r=0.71)

White Matter Territories:

  • Corpus callosum: MYELINATION + LIPID_METABOLISM (r=0.82)
  • Internal capsule: AXONAL_TRANSPORT + OXIDATIVE_PHOSPHORYLATION (r=0.78)

🎯 Key Finding: Pathways form functional modules that are spatially co-organized.”
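The r values reported for pathway pairs read like correlation coefficients computed within a region. Assuming they are Pearson correlations of per-spot pathway scores (the tutorial does not say so explicitly), a minimal sketch looks like this; the function names and toy scores are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def regional_correlation(scores_a, scores_b, region_spots):
    """Correlate two pathway score maps using only the spots in one region."""
    xs = [scores_a[s] for s in region_spots]
    ys = [scores_b[s] for s in region_spots]
    return pearson_r(xs, ys)

# Toy scores: the two pathways co-vary in spots s1-s3 but not in s4.
plasticity = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 10.0}
memory = {"s1": 2.0, "s2": 4.0, "s3": 6.0, "s4": 0.0}

r_region = regional_correlation(plasticity, memory, ["s1", "s2", "s3"])
r_tissue = regional_correlation(plasticity, memory, ["s1", "s2", "s3", "s4"])
```

The same pair of pathways can be strongly correlated inside one region and uncorrelated or anti-correlated across the whole tissue, which is why local analysis can reveal modules that a global analysis misses.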


💬 Pathway Network Analysis

👤 You: “Can you show me how these pathways are connected? Like a network of biological processes?”

🤖 Claude: “I will create a spatial pathway network showing how biological processes are connected functionally and spatially.”

[Creates pathway interaction network]

🤖 Claude: “Here’s your spatial pathway network:

🔗 Network Insights:

Central Hub Pathways:

  • OXIDATIVE_PHOSPHORYLATION - connects 8 other pathways
  • CALCIUM_SIGNALING - bridges neurotransmission and plasticity
  • MITOCHONDRIAL_BIOGENESIS - supports multiple energy-dependent processes

Functional Modules:

  1. Neurotransmission Module (Green cluster):
    • SYNAPTIC_TRANSMISSION ↔ CALCIUM_SIGNALING ↔ MEMBRANE_DEPOLARIZATION
  2. Development Module (Blue cluster):
    • NEUROGENESIS ↔ AXON_GUIDANCE ↔ WNT_SIGNALING
  3. Metabolism Module (Red cluster):
    • OXIDATIVE_PHOSPHORYLATION ↔ LIPID_METABOLISM ↔ MITOCHONDRIAL_BIOGENESIS

Spatial Connections:

  • Thick edges = strong spatial correlation (>0.6)
  • Node size = spatial enrichment strength
  • Colors = functional pathway categories

This network shows the biological organization of your brain tissue.”
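A pathway network of this kind can be thought of as a thresholded correlation graph: pathways are nodes, and an edge is drawn when the spatial correlation between two pathways exceeds a cutoff (0.6 in the legend above). The sketch below builds such a graph and counts connections per node to find hubs. The input correlations are made-up values, and the construction is an assumption, not a description of how ChatSpatial draws its network.

```python
from collections import Counter

def build_network(correlations, threshold=0.6):
    """Keep pathway pairs whose correlation exceeds the threshold.

    correlations: {(pathway_a, pathway_b): r}.
    Returns (edges, degree) where degree counts edges per pathway.
    """
    edges = [(a, b, r) for (a, b), r in correlations.items() if r > threshold]
    degree = Counter()
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return edges, degree

# Hypothetical pairwise spatial correlations.
correlations = {
    ("OXIDATIVE_PHOSPHORYLATION", "LIPID_METABOLISM"): 0.8,
    ("OXIDATIVE_PHOSPHORYLATION", "MITOCHONDRIAL_BIOGENESIS"): 0.7,
    ("LIPID_METABOLISM", "SYNAPTIC_TRANSMISSION"): 0.2,
}
edges, degree = build_network(correlations)
hub, hub_degree = degree.most_common(1)[0]
```

Here the weak pair is dropped and `hub` is the pathway with the most retained edges. Raising or lowering `threshold` changes which modules appear, so it is worth checking that conclusions hold across a few cutoffs.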


Step 6: Custom Gene Sets and Specialized Analysis

💬 Using Your Own Gene Sets

You can also analyze your own custom gene sets:


👤 You: “I have a list of genes from my previous experiment that I think are important for memory formation. Can I test if they’re spatially organized in specific brain regions?”

🤖 Claude: “Custom gene sets are good for testing specific hypotheses. Please provide your gene list, and I will analyze its spatial organization.”

👤 You: “Here are my memory-related genes: CREB1, BDNF, ARC, FOS, JUN, NPAS4, NR4A1, EGR1, ATF3, HOMER1, CAMK2A, GRIN1, GRIN2B, DLG4, SYN1”

🤖 Claude: “Let me analyze the spatial organization of your custom ‘Memory Formation’ signature.”

[ChatSpatial analyzes custom gene set]

🤖 Claude: “Results for your memory genes:

🎯 Custom ‘Memory Formation’ Signature Analysis:

Spatial Enrichment:

  • Overall NES: 2.7 (highly significant, p < 0.001)
  • Spatial coherence: 0.81 (good clustering)
  • Top regions: Hippocampal formation, cortical layers 2-3

Gene Contributions:

  • CREB1 (0.34) - Strongest contributor, concentrated in CA1
  • BDNF (0.31) - High in dentate gyrus and cortical layers
  • ARC (0.29) - Activity-dependent, follows synaptic density
  • FOS & JUN (0.25 each) - Immediate early gene hotspots

Spatial Pattern:

  • Sharp enrichment in hippocampal CA fields
  • Gradient from high (hippocampus) to moderate (cortex) to low (white matter)
  • Matches known memory circuit anatomy well

Your gene signature shows clear spatial organization - these memory genes are concentrated in brain regions important for learning and memory.”


💡 Custom Gene Set Tips

Effective Custom Sets:

  • 10-50 genes - optimal for spatial analysis
  • Functionally related - coherent biological theme
  • Well-annotated genes - known functions and interactions
  • Hypothesis-driven - test specific biological questions

Common Custom Sets:

  • 🔬 Your experiment’s DEGs - differentially expressed genes
  • 📚 Literature gene lists - curated from papers
  • 🧬 Protein complexes - functionally interacting proteins
  • 🎯 Disease signatures - pathology-associated genes
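Before scoring a custom set, it helps to check how many of your genes are actually present in the dataset. The list in the dialogue above uses upper-case symbols, while mouse datasets usually use capitalized symbols such as Bdnf. The sketch below matches names ignoring case and checks the size guideline from the tips above. Case-insensitive matching is a convenience heuristic, not a substitute for proper ortholog mapping, and the function name and dataset gene list are hypothetical.

```python
def match_gene_set(gene_set, dataset_genes, min_size=10, max_size=50):
    """Match user-supplied genes to dataset gene names, ignoring case.

    Returns (matched dataset names, missing input names, size_ok flag).
    """
    lookup = {g.upper(): g for g in dataset_genes}
    matched = [lookup[g.upper()] for g in gene_set if g.upper() in lookup]
    missing = [g for g in gene_set if g.upper() not in lookup]
    size_ok = min_size <= len(matched) <= max_size
    return matched, missing, size_ok

memory_genes = [
    "CREB1", "BDNF", "ARC", "FOS", "JUN", "NPAS4", "NR4A1", "EGR1",
    "ATF3", "HOMER1", "CAMK2A", "GRIN1", "GRIN2B", "DLG4", "SYN1",
]
# Hypothetical dataset gene names (mouse-style capitalization, Syn1 absent).
dataset_genes = [
    "Creb1", "Bdnf", "Arc", "Fos", "Jun", "Npas4", "Nr4a1", "Egr1",
    "Atf3", "Homer1", "Camk2a", "Grin1", "Grin2b", "Dlg4", "Gapdh",
]
matched, missing, size_ok = match_gene_set(memory_genes, dataset_genes)
```

Reporting `missing` alongside the results makes it clear when an enrichment score is based on fewer genes than you intended.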

Step 7: Interpreting and Validating Results

💬 Biological Interpretation

Understanding what your results mean biologically:


👤 You: “These results look good, but how do I know if they are biologically meaningful? And how should I interpret the spatial patterns I’m seeing?”

🤖 Claude: “Let me help you interpret these results and validate their biological significance.

🔍 Biological Interpretation Framework:

1. Anatomical Consistency Check:

  • ✅ OXIDATIVE_PHOSPHORYLATION in white matter - Expected. Myelinated axons need high energy
  • ✅ NEUROGENESIS in ventricular zones - Expected. Known adult stem cell niches
  • ✅ SYNAPTIC_TRANSMISSION in cortical layers - Expected. High synapse density regions
  • ✅ MEMORY genes in hippocampus - Expected. Core memory formation circuit

2. Literature Validation:

  • 📚 Your OXIDATIVE_PHOSPHORYLATION pattern matches Magistretti & Allaman (Nature, 2015)
  • 📚 NEUROGENESIS localization aligns with Kempermann et al. (Nature Reviews, 2018)
  • 📚 Memory gene spatial distribution consistent with Kandel et al. principles

3. Statistical Robustness:

  • 📊 Cross-validation: 89% consistency across spatial neighborhoods
  • 📊 Permutation testing: All enrichments significant (p < 0.001)
  • 📊 Spatial autocorrelation: Controlled for technical artifacts

4. Biological Coherence:

  • 🧬 Related pathways cluster together spatially
  • 🧬 Energy pathways co-locate with high-demand processes
  • 🧬 Development signatures match known anatomical gradients

Your results pass these validation checks - they are consistent with known anatomy and are likely to be biologically meaningful.”
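Permutation testing, mentioned under statistical robustness, asks how often a random gene set of the same size scores at least as high as the real one. The sketch below shows the generic gene-label version of that idea with a hypothetical scoring function. It does not reproduce EnrichMap's procedure, and a gene-label permutation like this one does not by itself account for spatial autocorrelation.

```python
import random

def permutation_p_value(observed, all_genes, set_size, score_fn,
                        n_perm=1000, seed=0):
    """Empirical p-value: share of random gene sets scoring >= the observed score.

    Uses the (hits + 1) / (n_perm + 1) estimator so the p-value is never exactly 0.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(all_genes, set_size)
        if score_fn(random_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy setup: 100 genes whose "spatial score" is simply their index.
values = {f"g{i}": float(i) for i in range(100)}
all_genes = list(values)

def mean_value(genes):
    """Hypothetical gene set score: mean of the per-gene values."""
    return sum(values[g] for g in genes) / len(genes)

top_set = [f"g{i}" for i in range(95, 100)]
bottom_set = [f"g{i}" for i in range(5)]
p_top = permutation_p_value(mean_value(top_set), all_genes, 5, mean_value)
p_bottom = permutation_p_value(mean_value(bottom_set), all_genes, 5, mean_value)
```

Note that with 1,000 permutations the smallest achievable p-value is about 0.001, so a reported "p < 0.001" implies more permutations than that.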


💬 Result Validation Strategies

👤 You: “How can I further validate these findings and use them for my research?”

🤖 Claude: “Here are multiple validation approaches:

🔬 Experimental Validation:

  1. In situ hybridization - validate key genes in predicted regions
  2. Immunofluorescence - confirm protein expression patterns
  3. Single-cell RNA-seq - higher resolution validation
  4. Perturbation experiments - test functional predictions

📊 Computational Validation:

  1. Cross-dataset comparison - analyze similar tissues from other studies
  2. Temporal validation - if you have time-course data
  3. Species comparison - mouse vs human brain atlases
  4. Integration with protein data - spatial proteomics if available

🔗 Network Validation:

  1. Protein-protein interactions - do enriched genes actually interact?
  2. Pathway databases - compare with KEGG/Reactome pathway maps
  3. Literature mining - automated validation against published studies
  4. Functional enrichment - GO term analysis of spatially enriched genes

📝 Publication Strategies:

  1. Figure panels - spatial maps make good main figures
  2. Methods validation - compare with other spatial analysis tools
  3. Biological insights - focus on novel pathway localizations
  4. Clinical relevance - connect to disease or development

Would you like me to help with any of these validation approaches?”


Step 8: Advanced Analysis Workflows

💬 Comparative Enrichment Analysis

Compare enrichment between conditions or timepoints:


👤 You: “I actually have control and disease samples. Can I compare spatial enrichment patterns between them to see what changes in disease?”

🤖 Claude: “Let me show you how to compare spatial enrichment patterns between control and disease conditions.”

[ChatSpatial performs comparative enrichment analysis]

🤖 Claude: “Disease-associated spatial changes:

🔄 Comparative Spatial Enrichment Results:

Pathways Lost in Disease:

  • SYNAPTIC_TRANSMISSION: Control NES=2.4 → Disease NES=0.8 (⬇️ 67% decrease)
  • MEMORY_CONSOLIDATION: Control NES=2.1 → Disease NES=0.6 (⬇️ 71% decrease)
  • NEUROPLASTICITY: Control NES=1.9 → Disease NES=0.4 (⬇️ 79% decrease)

Pathways Gained in Disease:

  • INFLAMMATORY_RESPONSE: Control NES=0.3 → Disease NES=2.2 (⬆️ 633% increase)
  • APOPTOSIS: Control NES=0.1 → Disease NES=1.8 (⬆️ 1700% increase)
  • OXIDATIVE_STRESS: Control NES=0.2 → Disease NES=1.6 (⬆️ 700% increase)

Spatial Pattern Changes:

  • Control: Sharp functional domains, clear boundaries
  • Disease: Blurred boundaries, expanded inflammatory zones
  • Novel hotspots: Disease creates new apoptosis clusters in hippocampus

🎯 Key Finding: Disease changes gene expression and reorganizes the spatial architecture of biological processes.”
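The percentage changes above follow directly from the two NES values: change = (disease - control) / control × 100. The short sketch below reproduces four of the reported figures from the numbers in the dialogue.

```python
def percent_change(control, disease):
    """Relative change from control to disease, in percent."""
    return (disease - control) / control * 100

# (control NES, disease NES) pairs taken from the results above.
nes = {
    "SYNAPTIC_TRANSMISSION": (2.4, 0.8),
    "MEMORY_CONSOLIDATION": (2.1, 0.6),
    "INFLAMMATORY_RESPONSE": (0.3, 2.2),
    "APOPTOSIS": (0.1, 1.8),
}
changes = {name: round(percent_change(c, d)) for name, (c, d) in nes.items()}
```

Percent changes computed from a small baseline (such as a control NES of 0.1) become very large and unstable, so it can be more informative to report the absolute difference in NES alongside the percentage.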


💬 Trajectory and Gradient Analysis

👤 You: “Can we analyze how pathways change along spatial trajectories? Like from the edge of tissue to the center?”

🤖 Claude: “Spatial trajectory analysis will show how biological processes are organized along tissue axes. Let me compute pathway gradients.”

[ChatSpatial analyzes spatial trajectories]

🤖 Claude: “Spatial gradient analysis results:

🗺️ Tissue Axis Analysis:

Dorsal-Ventral Gradient:

  • NEUROGENESIS: High dorsal → Low ventral (slope = -0.23/mm)
  • METABOLISM: Steady high throughout (slope ≈ 0)
  • IMMUNE_RESPONSE: Low dorsal → High ventral (slope = +0.18/mm)

Anterior-Posterior Gradient:

  • MEMORY_FORMATION: High anterior → Low posterior (slope = -0.31/mm)
  • MOTOR_FUNCTION: Low anterior → High posterior (slope = +0.24/mm)
  • SENSORY_PROCESSING: Complex U-shaped pattern

Surface-Deep Gradient:

  • SYNAPTIC_TRANSMISSION: High surface, moderate deep
  • WHITE_MATTER_MAINT: Low surface, high deep
  • VASCULAR_SUPPORT: Oscillating pattern (blood vessel related)

These gradients reflect organizing principles of brain development and function.”
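A slope such as "-0.23/mm" can be read as the result of fitting a straight line to pathway score versus position along an axis. The sketch below does this with ordinary least squares. The positions and scores are invented for illustration, and the tutorial does not specify which fitting method ChatSpatial uses.

```python
def linear_slope(positions, scores):
    """Least-squares slope of score versus position (score units per position unit)."""
    n = len(positions)
    mx = sum(positions) / n
    my = sum(scores) / n
    sxx = sum((x - mx) ** 2 for x in positions)
    if sxx == 0:
        raise ValueError("all positions are identical; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(positions, scores))
    return sxy / sxx

# Hypothetical pathway scores sampled every 1 mm along a dorsal-ventral axis.
positions_mm = [0.0, 1.0, 2.0, 3.0]
pathway_scores = [2.0, 1.75, 1.5, 1.25]
slope = linear_slope(positions_mm, pathway_scores)  # score change per mm
```

A single slope only summarizes monotonic trends. Patterns described above as U-shaped or oscillating need a different summary, for example binned means along the axis.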


Step 9: Integration with Other Analyses

💬 Combining with Cell Type Analysis

👤 You: “I also have cell type annotations for my data. Can we see which cell types drive the spatial enrichment patterns we found?”

🤖 Claude: “Let me analyze how cell type distributions relate to your spatial enrichment patterns.”

[ChatSpatial performs integrated analysis]

🤖 Claude: “Cell type - pathway relationships:

🎯 Cell Type Drivers of Spatial Enrichment:

SYNAPTIC_TRANSMISSION Enrichment:

  • Primary drivers: Excitatory neurons (67% contribution)
  • Supporting roles: Inhibitory neurons (23%), Astrocytes (10%)
  • Spatial logic: Follows neuron density gradients closely

OXIDATIVE_PHOSPHORYLATION Enrichment:

  • Primary drivers: Oligodendrocytes (54% contribution)
  • Supporting roles: Astrocytes (31%), Neurons (15%)
  • Spatial logic: White matter oligodendrocyte territories

NEUROGENESIS Enrichment:

  • Primary drivers: Neural stem cells (78% contribution)
  • Supporting roles: Astrocytes (16%), Microglia (6%)
  • Spatial logic: Ventricular zone stem cell niches

INFLAMMATORY_RESPONSE Enrichment:

  • Primary drivers: Microglia (65% contribution)
  • Supporting roles: Astrocytes (25%), Endothelial cells (10%)
  • Spatial logic: Boundary regions and vascular zones

🔑 Key Finding: Spatial enrichment patterns are driven by the spatial organization of specific cell types - the pathway maps reflect cellular architecture.”
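The tutorial does not explain how the contribution percentages are derived. One simple way to think about them, assuming each spot has estimated cell type proportions (for example from deconvolution) and non-negative pathway scores, is to split every spot's score among its cell types and sum over the tissue. The sketch below implements that assumption; the function name and numbers are hypothetical.

```python
def cell_type_contributions(scores, proportions):
    """Share of the total pathway score attributable to each cell type.

    scores: {spot: non-negative pathway score}.
    proportions: {spot: {cell_type: fraction of the spot}}.
    """
    totals = {}
    for spot, score in scores.items():
        for cell_type, fraction in proportions[spot].items():
            totals[cell_type] = totals.get(cell_type, 0.0) + score * fraction
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {ct: value / grand_total for ct, value in totals.items()}

# Toy example: an inflammatory score in two spots.
scores = {"s1": 4.0, "s2": 1.0}
proportions = {
    "s1": {"Microglia": 0.75, "Astrocytes": 0.25},
    "s2": {"Astrocytes": 1.0},
}
shares = cell_type_contributions(scores, proportions)
```

This attribution is purely compositional: it shows which cell types sit where the pathway is active, not which cells express the pathway genes. Separating those two requires cell-type-resolved expression data.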


Step 10: Best Practices and Tips

💡 Analysis Best Practices

Before Analysis:

  • Quality control - ensure good spatial data quality
  • Preprocessing - proper normalization and filtering
  • Domain identification - understand tissue organization first
  • Gene set selection - choose relevant, well-annotated sets

During Analysis:

  • Multiple gene sets - compare different databases
  • Parameter testing - try different neighbor numbers
  • Spatial validation - check for technical artifacts
  • Statistical rigor - proper multiple testing correction

After Analysis:

  • Biological validation - compare with literature
  • Cross-validation - test robustness
  • Integration - combine with other analyses
  • Visualization - create clear, interpretable figures

🚨 Common Pitfalls to Avoid

Statistical Issues:

  • Ignoring spatial autocorrelation - can inflate significance
  • Multiple testing - correct for multiple pathways tested
  • Small gene sets - need ≥10 genes for reliable results
  • Batch effects - spatial and technical confounding
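For the multiple testing point, a standard correction is the Benjamini-Hochberg procedure, which converts raw p-values into FDR-adjusted values. The sketch below is a plain implementation for illustration; whether ChatSpatial uses this exact procedure is not stated in the tutorial.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four pathways.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adjusted]
```

In this example three pathways have raw p < 0.05, but only one remains below 0.05 after adjustment, which is why uncorrected p-values overstate the number of enriched pathways.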

Interpretation Issues:

  • Over-interpretation - correlation ≠ causation
  • Ignoring cell types - pathway activity may reflect cellular composition
  • Technical artifacts - edge effects, imaging issues
  • Literature bias - well-studied pathways appear more significant

💬 Troubleshooting Common Issues

“No enrichment found”:

  • Try different gene set databases
  • Check gene ID matching (symbols vs Ensembl)
  • Relax the significance threshold (e.g. a higher FDR cutoff), keeping in mind that this admits more false positives
  • Verify spatial preprocessing quality

“Results do not make biological sense”:

  • Check for batch effects or technical artifacts
  • Validate with known positive controls
  • Compare with literature and atlases
  • Consider cell type contributions

“Visualization issues”:

  • Adjust color scales and thresholds
  • Try different plot types
  • Check spatial coordinate scaling
  • Ensure adequate resolution

Summary: Your Spatial Enrichment Journey

You now know how to:

  • 🎯 Perform spatial enrichment analysis through natural conversation
  • 🗺️ Load and use gene sets from major databases (MSigDB, GO, KEGG)
  • 📊 Create visualizations of spatial pathway patterns
  • 🧬 Interpret results biologically with proper validation
  • 🔬 Integrate with other analyses (cell types, trajectories, comparisons)
  • Follow best practices for robust, reproducible results

Ready to analyze the biological organization of your tissue? Start asking ChatSpatial about the pathways and processes in your tissue.