Spatial Statistics Methods with ChatSpatial
Identify spatial patterns in your tissue using detailed statistical methods through natural conversation. This tutorial shows you how to perform spatial statistics analysis by asking ChatSpatial what you want to explore - no statistical background required.
What You’ll Learn
By the end of this tutorial, you’ll know how to:
- Perform Bivariate Moran’s I Analysis - discover spatial correlations between gene pairs
- Run Local Indicators of Spatial Association (LISA) - identify hotspots and coldspots
- Execute Neighborhood Enrichment Analysis - understand cell type co-occurrence patterns
- Apply Ripley’s K Function - analyze point pattern distributions
- Calculate Spatial Centrality Metrics - find network hubs in your tissue
- Conduct Join Count Analysis - measure categorical spatial autocorrelation
For each method, you’ll learn:
- What biological questions it answers
- How to ask ChatSpatial to run it
- When to use each approach
- How to interpret the results
Before You Start
What You Need
- ✅ Claude Desktop installed and running
- ✅ ChatSpatial configured (see Getting Started)
- ✅ Your spatial data already loaded and preprocessed
- ✅ Basic spatial analysis completed (see Basic Spatial Analysis)
- ✅ Interest in spatial patterns in your tissue
Understanding Spatial Statistics
Traditional statistics asks: “Is this gene differentially expressed?”
Spatial statistics asks: “Is this biological pattern organized spatially in my tissue?”
This approach identifies:
- 🗺️ Spatial hotspots - regions of concentrated activity
- 🔗 Spatial correlations - how genes co-vary across space
- 📍 Point patterns - how cells are distributed spatially
- 🌐 Network structures - connectivity patterns in tissue architecture
Method 1: Bivariate Moran’s I Analysis
Understanding Bivariate Spatial Correlation
What it does: Measures whether one gene’s expression at each spot tracks a second gene’s expression in the neighboring spots - in other words, are the two genes high (or low) in the same regions?
Biological applications:
- Co-regulated gene pairs in development
- Metabolic pathway coordination across tissue
- Stress response gene co-localization
- Transcription factor-target relationships
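If you are curious what a request like this computes, the statistic itself is short. The sketch below is a minimal illustration in plain Python, not ChatSpatial's actual implementation. It assumes binary neighbor weights, population z-scores, and a made-up strip of six spots; the function names (`zscores`, `bivariate_morans_i`, `permutation_p_value`) are hypothetical.

```python
import math
import random


def zscores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def bivariate_morans_i(x, y, neighbors):
    """Average product of gene x at a spot and gene y at each neighboring spot.

    neighbors[i] lists the spots adjacent to spot i (binary weights, assumed).
    """
    zx, zy = zscores(x), zscores(y)
    total_weight = sum(len(nbrs) for nbrs in neighbors)
    cross = sum(zx[i] * zy[j] for i, nbrs in enumerate(neighbors) for j in nbrs)
    return cross / total_weight


def permutation_p_value(x, y, neighbors, n_perm=999, seed=0):
    """Two-sided pseudo p-value: shuffle y across spots and recompute I."""
    rng = random.Random(seed)
    observed = bivariate_morans_i(x, y, neighbors)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(bivariate_morans_i(x, shuffled, neighbors)) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): six spots in a row, each adjacent to the next.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
gene_a = [0, 0, 0, 1, 1, 1]
gene_b = [0, 0, 0, 1, 1, 1]  # same spatial pattern as gene_a
gene_c = [1, 1, 1, 0, 0, 0]  # mirror-image pattern

i_same = bivariate_morans_i(gene_a, gene_b, neighbors)
i_opposite = bivariate_morans_i(gene_a, gene_c, neighbors)
p_same = permutation_p_value(gene_a, gene_b, neighbors)
```

On this toy strip `i_same` is 0.6 and `i_opposite` is -0.6: genes that share a pattern score positive, genes with opposite patterns score negative, and values near 0 suggest no spatial relationship. With only six spots the permutation p-value is not meaningful; it is included only to show where the significance estimate comes from.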
💬 Real Conversation Example
👤 You: “I have two genes, CREB1 and BDNF, that I think might be co-regulated in memory formation. Can you check if they show similar spatial patterns in my mouse brain data?”
🤖 Claude: “Good hypothesis! CREB1 and BDNF are indeed key memory formation genes. Let me perform bivariate Moran’s I analysis to see if they show coordinated spatial patterns in your brain tissue.”
[ChatSpatial analyzes bivariate spatial correlation]
🤖 Claude: “Results show that your memory genes display strong spatial co-regulation:
🧠 Bivariate Moran’s I Results for CREB1 vs BDNF:
Global Bivariate Correlation:
- Bivariate Moran’s I: 0.67 (strong positive spatial correlation)
- p-value: < 0.001 (highly significant)
- Interpretation: These genes are spatially co-expressed across your brain tissue
Local Spatial Co-regulation:
- Hippocampus CA1: Both genes peak together (local I = 0.85)
- Cortical layers 2-3: Strong coordinated expression (local I = 0.72)
- White matter: Both consistently low (local I = 0.61)
Biological Interpretation: CREB1 and BDNF show spatial coordination, consistent with their role as co-regulated memory formation factors. They are highest in brain regions critical for learning and memory.”
👤 You: “That is informative! Can you check a few more gene pairs involved in memory circuits?”
🤖 Claude: “Absolutely! Let me analyze multiple memory-related gene pairs to map your brain’s memory circuit architecture.”
🎯 When to Use Bivariate Moran’s I
Used for:
- 🧬 Testing co-regulation hypotheses - do pathway genes coordinate spatially?
- 📊 Validating gene networks - are network partners spatially correlated?
- 🔬 Discovering new interactions - find unexpected spatial correlations
- 🎯 Pathway analysis - understand spatial organization of biological processes
💡 Natural Ways to Ask
Questions you can ask:
- “Do these two genes show similar spatial patterns?”
- “Are GENE1 and GENE2 spatially correlated?”
- “Can you find genes that co-vary with my favorite gene spatially?”
- “Which gene pairs show the strongest spatial correlation?”
- “Test these pathway genes for spatial coordination”
Method 2: Local Indicators of Spatial Association (LISA)
Understanding LISA Analysis
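What it does: Identifies specific tissue locations where a gene shows significant local spatial clustering - hotspots (high expression surrounded by high), coldspots (low expression surrounded by low), and spatial outliers where a spot differs sharply from its neighbors.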
Biological applications:
- Disease lesion identification
- Stem cell niche detection
- Immune infiltration zones
- Developmental organizing centers
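To make the High-High and Low-Low labels in LISA output less mysterious, here is a minimal sketch of a local Moran's I calculation in plain Python. It is an illustration under stated assumptions, not ChatSpatial's code: weights are row-standardized (the spatial lag is the mean z-score of a spot's neighbors), the data are a made-up strip of six spots, and `local_morans_i` is a hypothetical name.

```python
import math


def local_morans_i(values, neighbors):
    """Return (local I, quadrant label) for every spot.

    Uses row-standardized weights: the spatial lag is the mean z-score
    of a spot's neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    z = [(v - mean) / sd for v in values]

    results = []
    for i, nbrs in enumerate(neighbors):
        lag = sum(z[j] for j in nbrs) / len(nbrs)
        high_self, high_lag = z[i] > 0, lag > 0
        if high_self and high_lag:
            label = "High-High"  # hotspot candidate
        elif not high_self and not high_lag:
            label = "Low-Low"    # coldspot candidate
        elif high_self:
            label = "High-Low"   # high spot among low neighbors
        else:
            label = "Low-High"   # low spot among high neighbors
        results.append((z[i] * lag, label))
    return results


# Toy data (hypothetical): expression is high on the left, low on the right.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
expression = [9, 8, 9, 1, 2, 1]

lisa = local_morans_i(expression, neighbors)
labels = [label for _, label in lisa]
```

The labels come out as High-High, High-High, High-Low, Low-High, Low-Low, Low-Low: the two spots at the boundary are flagged as outliers. A quadrant label alone is not a hotspot call. A real analysis also tests each spot against random permutations and keeps only the significant ones.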
💬 Real Conversation Example
👤 You: “I want to find hotspots of immune activity in my tumor tissue. Can you identify where immune response genes are spatially clustered?”
🤖 Claude: “Good question for understanding tumor immunity! Let me perform LISA analysis to identify immune hotspots in your tumor tissue using local spatial association statistics.”
[ChatSpatial runs LISA analysis on immune genes]
🤖 Claude: “Analysis reveals immune spatial organization:
🎯 LISA Results for Immune Response Genes:
CD68 (Macrophage Marker):
- High-High clusters: 3 distinct hotspots (89 spots total)
  - Hotspot 1: Tumor edge - dense macrophage infiltration
  - Hotspot 2: Necrotic core boundary - cleanup macrophages
  - Hotspot 3: Vascular regions - patrolling macrophages
- Significance: p < 0.001 for all hotspots
IFNG (T-cell Activation):
- High-High clusters: 2 major hotspots (56 spots total)
  - Hotspot 1: Tumor-stroma interface - active immune response
  - Hotspot 2: Lymphoid aggregate - T-cell education zone
- Low-Low regions: Tumor core - immune exclusion zones
IL1B (Inflammatory Cytokine):
- High-High clusters: Overlaps with macrophage hotspots
- Spatial pattern: Ring-like distribution around tumor areas
- Biological significance: Inflammatory barrier formation
🔍 Key Finding: Your tumor has 3 distinct immune microenvironments - infiltrated edges, excluded cores, and vascular patrol zones.”
👤 You: “Can you create a map showing these immune hotspots?”
🎯 When to Use LISA Analysis
Used for:
- 🎯 Hotspot identification - where are the action zones?
- 🏥 Disease analysis - locate pathological processes
- 🧪 Quality control - detect technical artifacts or batch effects
- 🌱 Development studies - find organizing centers and gradients
💡 Natural Ways to Ask
Questions you can ask:
- “Where are the hotspots for this gene?”
- “Can you find spatial clusters of immune activity?”
- “Identify regions where my gene is unusually high”
- “Show me local spatial patterns for this pathway”
- “Find the organizing centers in my developmental data”
Method 3: Neighborhood Enrichment Analysis
Understanding Cell Type Co-occurrence
What it does: Determines whether specific cell types prefer to be neighbors - are certain cell types more likely to be adjacent than expected by chance?
Biological applications:
- Tissue architecture analysis
- Cell-cell communication zones
- Developmental lineage tracking
- Pathological tissue organization
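The Z-scores reported by this kind of analysis are typically obtained by comparing observed neighbor counts with counts from shuffled labels. The sketch below shows that idea in plain Python. It is an assumption-laden illustration rather than ChatSpatial's implementation: the neighbor graph is a simple edge list, the cells are a made-up row of twelve, and `count_pairs` and `enrichment_z` are hypothetical names.

```python
import random
import statistics


def count_pairs(labels, edges, type_a, type_b):
    """Count edges joining a type_a cell to a type_b cell (in either order)."""
    wanted = {type_a, type_b}
    return sum(1 for i, j in edges if {labels[i], labels[j]} == wanted)


def enrichment_z(labels, edges, type_a, type_b, n_perm=500, seed=0):
    """Z-score of the observed pair count against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(labels, edges, type_a, type_b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps cell type proportions, breaks positions
        null_counts.append(count_pairs(shuffled, edges, type_a, type_b))
    sd = statistics.pstdev(null_counts)
    if sd == 0:
        return 0.0
    return (observed - statistics.mean(null_counts)) / sd


# Toy data (hypothetical): twelve cells in a row, strictly alternating types.
labels = ["Neuron", "Astrocyte"] * 6
edges = [(i, i + 1) for i in range(len(labels) - 1)]

z_neuron_astrocyte = enrichment_z(labels, edges, "Neuron", "Astrocyte")
z_neuron_neuron = enrichment_z(labels, edges, "Neuron", "Neuron")
```

Because every neuron in the toy row sits next to an astrocyte, the neuron-astrocyte score is clearly positive, while the neuron-neuron score is negative. Positive values mean "neighbors more often than chance", negative values mean "neighbors less often than chance".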
💬 Real Conversation Example
👤 You: “In my brain organoid data, I want to understand how different cell types organize themselves. Do neurons prefer to be near astrocytes? Can you analyze cell type neighborhoods?”
🤖 Claude: “Good question for understanding brain tissue architecture! Let me perform neighborhood enrichment analysis to map how your brain organoid cell types organize spatially.”
[ChatSpatial analyzes cell type neighborhoods]
🤖 Claude: “Analysis reveals cellular architecture patterns:
🔍 Neighborhood Enrichment Results:
Strong Positive Associations (Preferred Neighbors):
- Excitatory Neurons ↔ Astrocytes: Z-score = +4.2 (p < 0.001)
  - Biology: Astrocytes provide metabolic support to active neurons
- Inhibitory Neurons ↔ Excitatory Neurons: Z-score = +3.8 (p < 0.001)
  - Biology: Balanced excitation-inhibition circuits
- Oligodendrocytes ↔ Neurons: Z-score = +3.1 (p < 0.01)
  - Biology: Myelination of axonal projections
Neutral Associations (Random Distribution):
- Microglia ↔ All Types: Z-scores near 0
  - Biology: Uniform tissue surveillance pattern
Negative Associations (Spatial Avoidance):
- Neural Stem Cells ↔ Mature Neurons: Z-score = -2.9 (p < 0.01)
  - Biology: Stem cells concentrate in neurogenic niches
- Astrocytes ↔ Astrocytes: Z-score = -2.1 (p < 0.05)
  - Biology: Territorial spacing to cover tissue efficiently
🎯 Key Finding: Your brain organoid recapitulates natural tissue architecture - supporting cells cluster with neurons they serve, while maintaining functional territories.”
👤 You: “Can you show me which regions have the most organized cell type patterns?”
🎯 When to Use Neighborhood Enrichment
Used for:
- 🏗️ Tissue architecture - understanding structural organization
- 💬 Communication analysis - identifying interaction zones
- 🔬 Developmental biology - tracking lineage relationships
- 🏥 Pathology studies - how disease disrupts normal organization
💡 Natural Ways to Ask
Questions you can ask:
- “Which cell types prefer to be neighbors?”
- “How do my cell types organize spatially?”
- “Are there cell-cell communication hotspots?”
- “Do tumor cells avoid immune cells?”
- “Show me cell type territorial patterns”
Method 4: Ripley’s K Function Analysis
Understanding Point Pattern Analysis
What it does: Analyzes whether cells of specific types are randomly distributed, clustered together, or regularly spaced across tissue.
Biological applications:
- Stem cell niche organization
- Immune cell patrol patterns
- Blood vessel distribution analysis
- Tumor cell invasion patterns
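For readers who want to see the arithmetic, here is a minimal sketch of Ripley's K at a single radius, together with the common L(r) - r transform. It is a simplified illustration, not ChatSpatial's implementation: it applies no edge correction, the coordinates are invented, and `ripley_k` and `besag_l_minus_r` are hypothetical names. Under complete spatial randomness K(r) is expected to be about πr², so L(r) - r is about 0.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one radius (no edge correction)."""
    n = len(points)
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    return area * close_pairs / (n * (n - 1))


def besag_l_minus_r(points, radius, area):
    """L(r) - r: above 0 suggests clustering, below 0 suggests regular spacing."""
    return math.sqrt(ripley_k(points, radius, area) / math.pi) - radius


# Toy data (hypothetical): a 100 x 100 field with two tight clusters ...
clustered = [(20, 20), (21, 20), (20, 21), (21, 21),
             (80, 80), (81, 80), (80, 81), (81, 81)]
# ... and the same field with cells on a regular 4 x 4 lattice.
regular = [(12.5 + 25 * i, 12.5 + 25 * j) for i in range(4) for j in range(4)]

area = 100 * 100
k_clustered = ripley_k(clustered, 5, area)
l_clustered = besag_l_minus_r(clustered, 5, area)
l_regular = besag_l_minus_r(regular, 5, area)
```

At a radius of 5 the clustered cells give a K far above π·5² and a positive L(r) - r, while the lattice has no pairs that close, so L(r) - r is -5. Repeating the calculation over a range of radii is what reveals the scale at which clustering peaks.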
💬 Real Conversation Example
👤 You: “I have cancer stem cells marked in my tumor tissue. Are they randomly scattered or do they form organized clusters? Can you analyze their spatial distribution pattern?”
🤖 Claude: “Good question about cancer stem cell organization! Ripley’s K function analysis will reveal whether your stem cells show clustering, regular spacing, or random distribution - each pattern has different biological implications.”
[ChatSpatial performs Ripley’s K analysis]
🤖 Claude: “Analysis reveals cancer stem cell spatial organization:
📊 Ripley’s K Function Results:
Cancer Stem Cell Distribution Pattern:
- Pattern Type: Strong clustering (non-random)
- Clustering Scale: Most pronounced at 50-150 μm distances
- Peak Clustering: At ~80 μm (typical niche size)
- Significance: p < 0.001 vs random distribution
Spatial Characteristics:
- Cluster Count: 7 distinct stem cell clusters identified
- Average Cluster Size: 12.3 cells per cluster
- Inter-cluster Distance: 280 ± 45 μm (regular spacing)
- Pattern Interpretation: Territorial niche organization
Cross-type Analysis (Stem Cells vs Other Cancer Cells):
- Attraction Zone: 0-60 μm (stem cells recruit other cancer cells)
- Neutral Zone: 60-120 μm (random association)
- Exclusion Zone: >120 μm (stem cell territorial boundaries)
🔬 Biological Interpretation: Your cancer stem cells form organized niches with territorial spacing - they create supportive microenvironments while maintaining optimal resource access through regular spacing.”
👤 You: “Can you compare this pattern to how normal stem cells are distributed?”
🎯 When to Use Ripley’s K Function
Used for:
- 🧬 Stem cell studies - understanding niche organization
- 🩸 Vascular analysis - blood vessel distribution patterns
- 🦠 Immune surveillance - patrol pattern analysis
- 🔬 Development studies - how cells organize during morphogenesis
💡 Natural Ways to Ask
Questions you can ask:
- “Are my cells randomly distributed or clustered?”
- “Do these cell types form organized patterns?”
- “What’s the typical spacing between these cells?”
- “Analyze the distribution pattern of my marked cells”
- “Compare clustering patterns between conditions”
Method 5: Spatial Centrality Analysis
Understanding Network-Based Spatial Metrics
What it does: Identifies the most “important” locations in your tissue based on spatial connectivity - which regions are hubs that connect different tissue areas?
Biological applications:
- Identifying tissue organizing centers
- Finding communication bottlenecks
- Discovering critical transition zones
- Understanding information flow patterns
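One common way to score "hub" locations is betweenness centrality on the spatial neighbor graph: how many shortest paths between other spots pass through a given spot. The sketch below implements Brandes' algorithm for an unweighted, undirected graph in plain Python. Treat it as an illustration only: the seven-spot graph is invented, `betweenness_centrality` here is a hypothetical local function, and ChatSpatial may use a different metric or library.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an unweighted, undirected graph.

    adjacency maps each node to the list of its neighbors.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search, counting shortest paths (sigma) from source.
        order = []
        predecessors = {v: [] for v in adjacency}
        sigma = dict.fromkeys(adjacency, 0)
        distance = dict.fromkeys(adjacency, -1)
        sigma[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = dict.fromkeys(adjacency, 0.0)
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += sigma[v] / sigma[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Toy graph (hypothetical): two triangles of spots joined through spot 3.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}
scores = betweenness_centrality(adjacency)
top_hub = max(scores, key=scores.get)
```

Spot 3 is the only bridge between the two triangles, so all nine cross-triangle shortest paths run through it and it gets the highest score (9). Spots 2 and 4 follow with 8 each, and the remaining spots score 0. Removing a spot like 3 would disconnect the graph, which is the idea behind the "bottleneck" language in centrality reports.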
💬 Real Conversation Example
👤 You: “In my brain tissue, I want to find the most important regions that connect different brain areas. Which locations act as hubs in the spatial network?”
🤖 Claude: “Good question about brain connectivity! Spatial centrality analysis will identify the hub regions that are most critical for connecting different parts of your brain tissue - like finding the major intersections in a city.”
[ChatSpatial analyzes spatial network centrality]
🤖 Claude: “Analysis reveals spatial network architecture:
🎯 Spatial Centrality Analysis Results:
Top Network Hubs (Highest Betweenness Centrality):
- Hub 1: Corpus callosum region (centrality = 0.89)
  - Role: Bridge between left-right hemispheres
  - Connection count: Links 23 different brain regions
- Hub 2: Hippocampal-cortical boundary (centrality = 0.76)
  - Role: Memory-cortex integration center
  - Connection count: Links 18 regions
- Hub 3: White matter junction (centrality = 0.68)
  - Role: Multi-tract convergence zone
  - Connection count: Links 15 regions
Network Properties:
- Most Connected Spots: Average 12.4 connections per hub
- Network Efficiency: 0.73 (highly efficient connectivity)
- Clustering Coefficient: 0.58 (modular organization)
- Small-World Index: 2.1 (optimal information transfer)
Critical Pathways:
- Information bottlenecks: 3 spots that, if removed, would disconnect major brain regions
- Redundant pathways: Multiple routes between cortical areas (robust network)
- Modular boundaries: Clear boundaries between functional brain modules
🔍 Key Finding: Your brain tissue shows classic small-world network architecture with efficient hubs that minimize connection distances while maintaining modular organization.”
👤 You: “Can you identify which genes are most highly expressed in these network hubs?”
🎯 When to Use Spatial Centrality Analysis
Used for:
- 🧠 Brain connectivity studies - understanding neural network architecture
- 🩸 Vascular network analysis - identifying critical vessel junctions
- 🔬 Developmental studies - finding organizing centers
- 🏥 Pathology research - understanding how disease disrupts connectivity
💡 Natural Ways to Ask
Questions you can ask:
- “Which regions are the most important hubs in my tissue?”
- “Find the network centers in my spatial data”
- “What are the critical connection points?”
- “Identify bottlenecks in my tissue architecture”
- “Show me the spatial network structure”
Method 6: Join Count Analysis
Understanding Categorical Spatial Autocorrelation
What it does: Measures whether similar categories (like cell types or tissue states) tend to cluster together spatially more than expected by chance.
Biological applications:
- Tissue zonation analysis
- Cell state transition boundaries
- Pathological region organization
- Developmental compartment formation
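The "observed versus expected" comparison at the heart of join count analysis can be sketched in a few lines. The example below is a simplified illustration, not ChatSpatial's implementation: the expected values assume the labels are shuffled at random among fixed spots, the eight-spot strip is invented, and `join_counts` and `expected_joins` are hypothetical names.

```python
from collections import Counter


def join_counts(labels, edges):
    """Count adjacent pairs for every (category, category) combination."""
    counts = Counter()
    for i, j in edges:
        counts[tuple(sorted((labels[i], labels[j])))] += 1
    return counts


def expected_joins(labels, n_edges, cat_a, cat_b):
    """Expected join count if labels were randomly reassigned to the spots."""
    n = len(labels)
    freq = Counter(labels)
    if cat_a == cat_b:
        pair_probability = freq[cat_a] * (freq[cat_a] - 1) / (n * (n - 1))
    else:
        pair_probability = 2 * freq[cat_a] * freq[cat_b] / (n * (n - 1))
    return n_edges * pair_probability


# Toy data (hypothetical): eight spots in a row, healthy on the left.
labels = ["healthy"] * 4 + ["fibrotic"] * 4
edges = [(i, i + 1) for i in range(len(labels) - 1)]

observed = join_counts(labels, edges)
expected_same = expected_joins(labels, len(edges), "healthy", "healthy")
expected_cross = expected_joins(labels, len(edges), "healthy", "fibrotic")
```

Here there are 3 healthy-healthy joins against 1.5 expected, and only 1 healthy-fibrotic join against 4 expected. More same-category joins than expected indicates clustering; fewer cross-category joins than expected indicates a sharp boundary. A full analysis converts these differences to Z-scores, analytically or by permutation.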
💬 Real Conversation Example
👤 You: “I have liver tissue with healthy, fibrotic, and necrotic regions marked. Do similar tissue states cluster together, or are they randomly mixed? Can you analyze the spatial organization of these tissue states?”
🤖 Claude: “Good question for understanding tissue pathology organization! Join count analysis will reveal whether your liver tissue states show spatial clustering - this tells us about disease progression patterns and tissue organization principles.”
[ChatSpatial performs join count analysis]
🤖 Claude: “Analysis reveals tissue organization patterns:
🔍 Join Count Analysis Results:
Healthy-Healthy Joins:
- Observed: 287 adjacent healthy-healthy pairs
- Expected: 198 pairs (if randomly distributed)
- Z-score: +6.8 (p < 0.001)
- Interpretation: Healthy tissue strongly clusters together
Fibrotic-Fibrotic Joins:
- Observed: 156 adjacent fibrotic-fibrotic pairs
- Expected: 89 pairs (if random)
- Z-score: +5.2 (p < 0.001)
- Interpretation: Fibrotic regions form connected zones
Necrotic-Necrotic Joins:
- Observed: 78 adjacent necrotic-necrotic pairs
- Expected: 34 pairs (if random)
- Z-score: +7.1 (p < 0.001)
- Interpretation: Necrotic areas cluster in discrete patches
Cross-State Boundaries:
- Healthy-Fibrotic: Less than expected (Z = -3.4) - sharp boundaries
- Fibrotic-Necrotic: More than expected (Z = +2.8) - transition zones
- Healthy-Necrotic: Much less than expected (Z = -5.9) - rarely adjacent
🎯 Key Finding: Your liver shows organized pathological zonation - healthy regions remain protected, fibrotic areas form transition zones around necrotic cores, and there are sharp boundaries preventing healthy-necrotic contact.”
👤 You: “Can you quantify how sharp these boundaries are between tissue states?”
🎯 When to Use Join Count Analysis
Used for:
- 🏥 Pathology studies - understanding disease organization
- 🧬 Cell state analysis - measuring state transition sharpness
- 🌱 Development research - compartment boundary formation
- 🔬 Tissue zonation - liver zones, kidney regions, brain layers
💡 Natural Ways to Ask
Questions you can ask:
- “Do similar cell types cluster together spatially?”
- “Are my tissue states randomly mixed or organized?”
- “How sharp are the boundaries between these regions?”
- “Analyze clustering of my categorical data”
- “Do disease states show spatial organization?”
Specialized Integration: Combining Multiple Methods
💬 Multi-Method Analysis Example
👤 You: “I want to do a complete spatial analysis of my brain development data. Can you combine multiple spatial statistics methods to give me the full picture?”
🤖 Claude: “Good approach! A multi-method spatial analysis will reveal different aspects of your brain development patterns. Let me run a complete spatial statistics battery.”
[ChatSpatial performs integrated analysis]
🤖 Claude: “Complete spatial architecture analysis finished:
🎯 Integrated Spatial Statistics Results:
LISA Hotspots:
- NEUROG2: 4 neurogenesis hotspots in ventricular zones
- SHH: Organizing center hotspot in floor plate region
- PAX6: Dorsal telencephalon development hotspot
Bivariate Moran’s I Correlations:
- NEUROG2 ↔ ASCL1: I = 0.73 (proneural gene coordination)
- SHH ↔ GLI1: I = 0.84 (signaling pathway integrity)
- WNT1 ↔ EN1: I = 0.69 (midbrain patterning coordination)
Cell Type Neighborhoods:
- Neural Progenitors ↔ Radial Glia: Strong association (Z = +5.2)
- Differentiating Neurons ↔ Progenitors: Moderate association (Z = +2.8)
- Mature Neurons ↔ Progenitors: Avoidance pattern (Z = -3.1)
Network Centrality:
- Hub 1: Ventricular zone organizing center (centrality = 0.92)
- Hub 2: Cortical plate formation zone (centrality = 0.76)
- Hub 3: Boundary region coordinators (centrality = 0.64)
Ripley’s K Patterns:
- Neural Progenitors: Strong clustering (50-100 μm scale)
- Differentiating Neurons: Transition from clustered to regular spacing
- Mature Neurons: Regular spacing pattern (optimal connectivity)
🧠 Integrated Biological Interpretation: Your developing brain shows classic developmental organization - organizing centers create morphogen gradients (LISA), coordinate gene networks (bivariate), establish cell type territories (neighborhoods), form connectivity hubs (centrality), and regulate cell distribution patterns (Ripley’s K) in a coordinated developmental program.”
Method Selection Guide
🎯 Which Method Should I Use?
| Research Question | Best Method | Why This Method? |
|---|---|---|
| “Do these two genes coordinate spatially?” | Bivariate Moran’s I | Measures spatial correlation between gene pairs |
| “Where are the hotspots for this process?” | LISA Analysis | Identifies local clusters and spatial outliers |
| “How do cell types organize spatially?” | Neighborhood Enrichment | Quantifies cell-cell spatial associations |
| “Are cells randomly distributed?” | Ripley’s K Function | Tests point pattern randomness vs clustering |
| “Which regions are network hubs?” | Spatial Centrality | Identifies important connectivity nodes |
| “Do similar states cluster together?” | Join Count Analysis | Measures categorical spatial autocorrelation |
💡 Combining Methods Strategy
- Start with LISA → Find interesting hotspots
- Add Bivariate Moran’s I → Understand gene coordination in hotspots
- Include Neighborhood Analysis → See how cell types organize around hotspots
- Apply Ripley’s K → Understand cell distribution patterns
- Finish with Centrality → Identify network importance of regions
Best Practices and Tips
✅ Before Analysis
Data Preparation:
- Ensure spatial coordinates are accurate
- Check for batch effects or technical artifacts
- Verify cell type annotations are reliable
- Remove low-quality spots/cells
Parameter Selection:
- Start with a standard neighborhood size (6-30 neighbors per spot)
- Test different spatial scales for your tissue
- Use appropriate significance thresholds
- Consider multiple testing correction
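The neighbor number matters because it sets the spatial scale every statistic in this tutorial works at. The sketch below builds k-nearest-neighbor lists for two values of k and reports how far the average neighbor is. It is a plain-Python illustration on an invented 5 x 5 grid, with hypothetical function names; small k values are used only because the toy grid is small.

```python
import math


def knn_neighbors(coords, k):
    """For each spot, the indices of its k nearest spots (brute force)."""
    neighbors = []
    for i, p in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: math.dist(p, coords[j]))
        neighbors.append(others[:k])
    return neighbors


def mean_neighbor_distance(coords, neighbors):
    """Average distance from a spot to its listed neighbors."""
    distances = [
        math.dist(coords[i], coords[j])
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
    ]
    return sum(distances) / len(distances)


# Toy data (hypothetical): a 5 x 5 square grid with unit spacing.
coords = [(col, row) for row in range(5) for col in range(5)]
center = 12  # the spot at (2, 2)

graphs = {k: knn_neighbors(coords, k) for k in (4, 8)}
mean_distance = {k: mean_neighbor_distance(coords, g) for k, g in graphs.items()}
```

With k = 4 the center spot is linked only to the four spots directly beside it; with k = 8 the diagonal spots join in and the mean neighbor distance grows. Larger neighborhoods smooth over fine structure, so it is worth checking that a finding holds across a few settings.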
✅ During Analysis
Quality Control:
- Check that results make biological sense
- Validate with known positive controls
- Compare across different parameter settings
- Look for technical artifacts (edge effects, etc.)
Interpretation:
- Consider both statistical and biological significance
- Account for tissue architecture and biology
- Compare with literature and atlases
- Think about multiple scales of organization
✅ After Analysis
Validation:
- Cross-validate with independent datasets
- Confirm key findings with targeted experiments
- Compare results across analysis methods
- Test robustness to parameter choices
Integration:
- Combine with other omics data types
- Link to functional studies
- Connect to biological pathways
- Consider clinical or therapeutic relevance
Common Pitfalls to Avoid
🚨 Statistical Issues
❌ Multiple Testing Problems:
- Testing hundreds of genes without correction
- Solution: Apply false discovery rate (FDR) correction and prioritize strong effects
❌ Ignoring Spatial Scale:
- Using inappropriate neighbor distances
- Solution: Test multiple scales, consider tissue architecture
❌ Technical Artifacts:
- Edge effects, batch effects, imaging artifacts
- Solution: Quality control, technical replicates
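For the multiple testing pitfall, a widely used FDR method is the Benjamini-Hochberg procedure. The sketch below shows the adjustment in plain Python so you can see what "corrected p-values" means when many genes are tested. The p-values are invented, `benjamini_hochberg` is a hypothetical name, and this is not necessarily the correction ChatSpatial applies.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy data (hypothetical): raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
```

Three of the four raw p-values are below 0.05, but after adjustment only the first gene (adjusted p = 0.04) passes. The gap widens quickly when hundreds or thousands of genes are tested.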
🚨 Interpretation Issues
❌ Over-interpretation:
- Assuming correlation implies causation
- Solution: Validate with experiments, consider alternative explanations
❌ Ignoring Biology:
- Focusing only on statistics, not biological meaning
- Solution: Connect results to known biology, literature
❌ Single-Method Bias:
- Relying on one analysis approach
- Solution: Use multiple complementary methods
Troubleshooting Guide
“No significant results found”
Possible causes:
- Spatial scale mismatch with biological processes
- Low signal-to-noise ratio in data
- Inappropriate parameter settings
Solutions to try:
- Adjust neighbor parameters (try 6, 15, 30 neighbors)
- Focus on highly variable genes
- Check data quality and preprocessing
- Relax significance thresholds for exploration only, then confirm any hits with proper multiple testing correction
“Results do not make biological sense”
Possible causes:
- Technical artifacts or batch effects
- Incorrect spatial coordinates
- Poor cell type annotations
Solutions to try:
- Validate spatial coordinates
- Check for batch effects
- Compare with positive controls
- Examine raw data quality
“Analysis takes too long”
Possible causes:
- Large datasets with many features
- Computationally intensive methods
- Suboptimal parameters
Solutions to try:
- Focus on subset of highly variable genes
- Use spatial subsampling for exploration
- Optimize neighbor graph computation
- Run analysis in batches
Summary: Your Spatial Statistics Toolkit
Congratulations! You now know how to:
- 🎯 Perform six spatial statistics methods through natural conversation
- 📊 Choose the right method for your biological question
- 🔬 Interpret results biologically with confidence
- 🧬 Combine multiple approaches for complete analysis
- ⭐ Avoid common pitfalls and troubleshoot issues
🚀 Next Steps
Ready to explore more advanced spatial analysis? Try these tutorials:
- Spatial Gene Set Enrichment - Map biological pathways spatially
- Cell Communication Analysis - Discover cell-cell signaling patterns
- Trajectory Analysis - Track developmental and dynamic processes
📚 Additional Resources
- Spatial Statistics Theory - Mathematical background
- PySAL Documentation - Spatial analysis library details
- Spatial Transcriptomics Best Practices - Analysis guidelines
Ready to identify spatial patterns in your tissue? Start asking ChatSpatial about the spatial statistics methods that interest you most.