Concepts¶
Understanding spatial transcriptomics analysis.
What is Spatial Transcriptomics?¶
Spatial transcriptomics measures gene expression while preserving the physical location of cells in tissue. Unlike standard single-cell RNA sequencing, it tells you where cells are, not just what they express.
Key insight: Location matters. A tumor cell behaves differently depending on whether it’s surrounded by immune cells or fibroblasts.
Core Analysis Types¶
Spatial Domains¶
What it does: Groups tissue regions based on similar gene expression patterns.
When to use: First step after preprocessing. Identifies tissue architecture like tumor regions, immune infiltrates, or tissue layers.
Method selection:
| Your Data | Recommended Method |
|---|---|
| Visium with H&E image | SpaGCN (uses histology) |
| High-resolution (Xenium, MERFISH) | STAGATE or GraphST |
| Quick exploratory analysis | Leiden clustering |
Cell Type Annotation vs Deconvolution¶
These two concepts are often confused. Here’s the difference:
| | Annotation | Deconvolution |
|---|---|---|
| Output | “This spot is T cells” | “This spot is 60% T cells, 30% macrophages, 10% fibroblasts” |
| Best for | Single-cell resolution data | Spot-based data (Visium) |
| Assumption | One cell type per spot | Multiple cell types per spot |
Rule of thumb:
Xenium, MERFISH, CosMx: Use annotation (single-cell resolution)
Visium, Slide-seq: Use deconvolution (multiple cells per spot)
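To make the distinction concrete, here is a minimal sketch that treats deconvolution output as a dictionary of proportions per spot and shows what is lost when each spot is collapsed to a single label. The spot data and the `dominant_label` helper are hypothetical illustrations, not the output format or API of any particular tool.

```python
# Hypothetical deconvolution output: cell type proportions per spot.
spot_proportions = {
    "spot_1": {"T cell": 0.6, "Macrophage": 0.3, "Fibroblast": 0.1},
    "spot_2": {"T cell": 0.1, "Macrophage": 0.2, "Fibroblast": 0.7},
}


def dominant_label(proportions):
    """Collapse a proportion vector to one label (annotation-style output)."""
    return max(proportions, key=proportions.get)


# Annotation-style view: one label per spot, the minority cell types vanish.
annotation = {spot: dominant_label(p) for spot, p in spot_proportions.items()}

for spot, props in spot_proportions.items():
    print(spot, "| annotation:", annotation[spot], "| deconvolution:", props)
```

For spot-based data, the single label hides the 40% of `spot_1` that is not T cells, which is why deconvolution is the better fit there.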
Cell Communication¶
What it does: Identifies which cell types are “talking” to each other through ligand-receptor interactions.
Key concept: Cell A expresses a ligand (signal molecule), Cell B expresses the receptor. If they’re spatially close, they may be communicating.
Species matters: Use the correct database:
Human: liana_resource="consensus"
Mouse: liana_resource="mouseconsensus"
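The ligand-receptor idea can be sketched as a toy score: multiply ligand expression in a sender by receptor expression in a receiver, but only for pairs that are spatially close. This is a deliberately simplified illustration under assumed data; the `lr_scores` function, the spot values, and the distance cutoff are hypothetical, and real tools such as LIANA use more elaborate scoring and statistics.

```python
import math

# Hypothetical spots: (x, y) position, cell type, and expression of two genes.
spots = [
    {"pos": (0.0, 0.0), "type": "Macrophage", "expr": {"TGFB1": 5.0, "TGFBR2": 0.0}},
    {"pos": (1.0, 0.0), "type": "Fibroblast", "expr": {"TGFB1": 0.0, "TGFBR2": 4.0}},
    {"pos": (50.0, 50.0), "type": "Fibroblast", "expr": {"TGFB1": 0.0, "TGFBR2": 6.0}},
]


def lr_scores(spots, ligand, receptor, max_dist):
    """Sum ligand * receptor products over sender/receiver pairs within max_dist."""
    scores = {}
    for sender in spots:
        for receiver in spots:
            if sender is receiver:
                continue
            # Skip pairs that are too far apart to plausibly interact.
            if math.dist(sender["pos"], receiver["pos"]) > max_dist:
                continue
            product = sender["expr"][ligand] * receiver["expr"][receptor]
            if product > 0:
                key = (sender["type"], receiver["type"])
                scores[key] = scores.get(key, 0.0) + product
    return scores


scores = lr_scores(spots, "TGFB1", "TGFBR2", max_dist=2.0)
print(scores)
```

The distant fibroblast expresses more receptor than the nearby one, yet it contributes nothing to the score because it lies outside the distance cutoff.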
RNA Velocity¶
What it does: Predicts future cell states by comparing spliced vs unspliced RNA.
Key insight: If a gene has more unspliced RNA than expected for its spliced level at steady state, it is being upregulated. If it has less, it is being downregulated. Combined across many genes, this tells you the “direction” cells are moving.
Requirement: Your data must have spliced and unspliced layers (from velocyto, kallisto, or STARsolo).
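As a rough illustration of the spliced vs unspliced comparison, the sketch below uses the steady-state model, in which the velocity of a gene is its unspliced count minus a degradation ratio `gamma` times its spliced count. The `gamma` value and the three example cells are made-up numbers chosen for illustration; in practice `gamma` is fitted per gene from the data.

```python
def velocity(unspliced, spliced, gamma):
    """Steady-state model: ds/dt = u - gamma * s (splicing rate fixed at 1)."""
    return unspliced - gamma * spliced


gamma = 0.5  # hypothetical steady-state ratio of unspliced to spliced RNA

# Hypothetical counts for one gene in three cells.
cells = {
    "inducing": {"u": 4.0, "s": 2.0},    # excess unspliced RNA
    "steady": {"u": 2.0, "s": 4.0},      # exactly on the steady-state line
    "repressing": {"u": 0.5, "s": 4.0},  # excess spliced RNA
}

velocities = {name: velocity(c["u"], c["s"], gamma) for name, c in cells.items()}

for name, v in velocities.items():
    trend = "up" if v > 0 else "down" if v < 0 else "stable"
    print(f"{name}: velocity={v:+.1f} ({trend})")
```

A positive velocity means spliced RNA is expected to rise, a negative one means it is expected to fall.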
Choosing Methods¶
Deconvolution Methods¶
| Method | Speed | Accuracy | When to Use |
|---|---|---|---|
| FlashDeconv | Fast | Good | Default choice, quick exploration |
| Cell2location | Slow | Excellent | Final analysis, publication |
| RCTD | Fast | Good | R users, batch processing |
| CARD | Medium | Good | Need spatial imputation |
Accuracy vs Speed tradeoff: Start with FlashDeconv for exploration, run Cell2location for final figures.
Annotation Methods¶
| Method | Requires | Best For |
|---|---|---|
| Tangram | Reference scRNA-seq | Most accurate when reference matches tissue |
| scANVI | Reference scRNA-seq | Large datasets, GPU available |
| CellAssign | Marker gene list | When you know cell type markers |
| mLLMCelltype | Nothing | Quick automated annotation |
Spatial Statistics¶
| Analysis | Question It Answers |
|---|---|
| Moran’s I | “Is this gene spatially clustered?” (global) |
| Local Moran’s I | “Where are the clusters?” (local hotspots) |
| Getis-Ord Gi* | “Where are the high/low expression hotspots?” |
| Neighborhood enrichment | “Do these cell types co-localize?” |
| Co-occurrence | “How does co-localization change with distance?” |
Understanding Results¶
Interpreting Deconvolution¶
Good deconvolution results show:
Cell type proportions sum to ~1.0 per spot
Known tissue structure is visible (e.g., epithelium vs stroma)
Proportions correlate with histology
Warning signs:
One cell type dominates everywhere (>80%)
Proportions don’t match expected tissue composition
Results change dramatically with different methods
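Two of these checks are easy to automate. The sketch below assumes deconvolution results are available as a dictionary of proportions per spot; the `check_deconvolution` helper, its tolerance, and the example data are hypothetical, and the 80% dominance threshold is taken from the warning sign above.

```python
def check_deconvolution(proportions_by_spot, sum_tol=0.05, dominance=0.8):
    """Return a list of warning strings for simple sanity checks."""
    warnings = []

    # Check 1: proportions should sum to roughly 1.0 in every spot.
    bad_sums = [
        spot
        for spot, props in proportions_by_spot.items()
        if abs(sum(props.values()) - 1.0) > sum_tol
    ]
    if bad_sums:
        warnings.append(f"proportions do not sum to ~1.0 in: {sorted(bad_sums)}")

    # Check 2: no single cell type should exceed the threshold in every spot.
    cell_types = {ct for props in proportions_by_spot.values() for ct in props}
    for ct in sorted(cell_types):
        if all(p.get(ct, 0.0) > dominance for p in proportions_by_spot.values()):
            warnings.append(f"{ct} exceeds {dominance:.0%} in every spot")

    return warnings


# Hypothetical results: a plausible tissue and a suspicious one.
good = {
    "spot_1": {"Epithelial": 0.7, "Stromal": 0.3},
    "spot_2": {"Epithelial": 0.2, "Stromal": 0.8},
}
suspicious = {
    "spot_1": {"Epithelial": 0.95, "Stromal": 0.05},
    "spot_2": {"Epithelial": 0.90, "Stromal": 0.10},
}

print(check_deconvolution(good))
print(check_deconvolution(suspicious))
```

Agreement with histology and consistency across methods still need a visual or side-by-side comparison; they are not covered by this sketch.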
Interpreting Spatial Statistics¶
Moran’s I interpretation:
I > 0: Clustered (similar values near each other)
I ~ 0: Random
I < 0: Dispersed (dissimilar values near each other)
p-value: Tests whether the observed pattern differs significantly from a random spatial arrangement.
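The sign convention can be checked on a toy example. The sketch below computes global Moran’s I on a small grid using binary rook (4-neighbour) weights, which is one common weighting choice and an assumption here; analysis tools typically build weights from a spatial neighbor graph instead. The `morans_i` function and both grids are illustrative, and no p-value is computed.

```python
def morans_i(grid):
    """Global Moran's I on a 2D grid with binary rook (4-neighbour) weights."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0      # sum over neighbour pairs of dev_i * dev_j
    weight_sum = 0   # total weight W (each directed neighbour pair counts 1)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1

    variance_sum = sum(d * d for row in dev for d in row)
    return (n / weight_sum) * (cross / variance_sum)


# Left half "on", right half "off": similar values sit next to each other.
clustered = [[1, 1, 0, 0] for _ in range(4)]
# Alternating values: every neighbour is dissimilar.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

i_clustered = morans_i(clustered)
i_checker = morans_i(checkerboard)
print(f"clustered:    I = {i_clustered:.3f}")
print(f"checkerboard: I = {i_checker:.3f}")
```

The clustered grid gives a positive value and the checkerboard gives the most dispersed value possible, matching the interpretation above.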
Common Pitfalls¶
1. Skipping Preprocessing¶
Most analyses fail because preprocessing wasn’t run. Always preprocess first:
"Preprocess the data"
2. Wrong Species Parameter¶
Cell communication analysis requires correct species:
# Human data
species="human"
# Mouse data
species="mouse", liana_resource="mouseconsensus"
3. Expecting Single-Cell Resolution from Visium¶
Visium spots contain 1-10 cells. Use deconvolution to estimate cell type proportions rather than annotation to assign a single type per spot.
4. Using GPU Methods Without GPU¶
Methods like Cell2location are 10-100x slower without a GPU. Either:
Set use_gpu=False explicitly
Use CPU-friendly alternatives (FlashDeconv, RCTD)
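One way to avoid this pitfall is to check for a GPU before picking a method. The sketch below assumes the GPU-backed methods rely on PyTorch with CUDA, which is an assumption about your environment rather than something stated here; `gpu_available` and `choose_deconvolution` are hypothetical helpers, not part of any tool.

```python
def gpu_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # optional dependency; absence is treated as "no GPU"
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def choose_deconvolution(has_gpu):
    """Pick a method name following the guidance above (hypothetical helper)."""
    # Cell2location is only practical with a GPU; FlashDeconv is CPU-friendly.
    return "Cell2location" if has_gpu else "FlashDeconv"


has_gpu = gpu_available()
method = choose_deconvolution(has_gpu)
print(f"GPU available: {has_gpu} -> suggested method: {method}")
```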
Workflow Patterns¶
Standard Discovery Workflow¶
Load → Preprocess → Domains → Markers → Visualize
Best for: Initial exploration of new dataset.
Reference-Based Workflow¶
Load spatial → Load reference → Preprocess both → Deconvolve → Communicate
Best for: When you have matching single-cell reference data.
Publication Workflow¶
Load → Preprocess → Domains → Deconvolve → Statistics → Communication → Velocity
Best for: Comprehensive analysis for publication.
Next Steps¶
Quick Start — Get running in 5 minutes
Examples — See all analysis types
Methods Reference — Full parameter details