Concepts¶

Understanding spatial transcriptomics analysis.

What is Spatial Transcriptomics?¶

Spatial transcriptomics measures gene expression while preserving the physical location of cells in tissue. Unlike standard single-cell RNA sequencing, it tells you where cells are, not just what they express.

Key insight: Location matters. A tumor cell behaves differently depending on whether it’s surrounded by immune cells or fibroblasts.

Core Analysis Types¶

Spatial Domains¶

What it does: Groups tissue regions based on similar gene expression patterns.

When to use: First step after preprocessing. Identifies tissue architecture like tumor regions, immune infiltrates, or tissue layers.

Method selection:

Your Data	Recommended Method
Visium with H&E image	SpaGCN (uses histology)
High-resolution (Xenium, MERFISH)	STAGATE or GraphST
Quick exploratory analysis	Leiden clustering

Cell Type Annotation vs Deconvolution¶

These two concepts are often confused. Here’s the difference:

	Annotation	Deconvolution
Output	“This spot is T cells”	“This spot is 60% T cells, 30% macrophages, 10% fibroblasts”
Best for	Single-cell resolution data	Spot-based data (Visium)
Assumption	One cell type per spot	Multiple cell types per spot

Rule of thumb:

Xenium, MERFISH, CosMx: Use annotation (single-cell resolution)
Visium, Slide-seq: Use deconvolution (multiple cells per spot)

Cell Communication¶

What it does: Identifies which cell types are “talking” to each other through ligand-receptor interactions.

Key concept: Cell A expresses a ligand (signal molecule), Cell B expresses the receptor. If they’re spatially close, they may be communicating.

Species matters: Use the correct database:

Human: liana_resource="consensus"
Mouse: liana_resource="mouseconsensus"

RNA Velocity¶

What it does: Predicts future cell states by comparing spliced vs unspliced RNA.

Key insight: If a gene has more unspliced RNA, it’s being upregulated. If more spliced, it’s being downregulated. This tells you the “direction” cells are moving.

Requirement: Your data must have spliced and unspliced layers (from velocyto, kallisto, or STARsolo).

Choosing Methods¶

Deconvolution Methods¶

Method	Speed	Accuracy	When to Use
FlashDeconv	Fast	Good	Default choice, quick exploration
Cell2location	Slow	Excellent	Final analysis, publication
RCTD	Fast	Good	R users, batch processing
CARD	Medium	Good	Need spatial imputation

Accuracy vs Speed tradeoff: Start with FlashDeconv for exploration, run Cell2location for final figures.

Annotation Methods¶

Method	Requires	Best For
Tangram	Reference scRNA-seq	Most accurate when reference matches tissue
scANVI	Reference scRNA-seq	Large datasets, GPU available
CellAssign	Marker gene list	When you know cell type markers
mLLMCelltype	Nothing	Quick automated annotation

Spatial Statistics¶

Analysis	Question It Answers
Moran’s I	“Is this gene spatially clustered?” (global)
Local Moran’s I	“Where are the clusters?” (local hotspots)
Getis-Ord Gi*	“Where are the high/low expression hotspots?”
Neighborhood enrichment	“Do these cell types co-localize?”
Co-occurrence	“How does co-localization change with distance?”

Understanding Results¶

Interpreting Deconvolution¶

Good deconvolution results show:

Cell type proportions sum to ~1.0 per spot
Known tissue structure is visible (e.g., epithelium vs stroma)
Proportions correlate with histology

Warning signs:

One cell type dominates everywhere (>80%)
Proportions don’t match expected tissue composition
Results change dramatically with different methods

Interpreting Spatial Statistics¶

Moran’s I interpretation:

I > 0: Clustered (similar values near each other)
I ~ 0: Random
I < 0: Dispersed (dissimilar values near each other)

p-value: Tests if pattern is significant vs random.

Common Pitfalls¶

1. Skipping Preprocessing¶

Most analyses fail because preprocessing wasn’t run. Always preprocess first:

"Preprocess the data"

2. Wrong Species Parameter¶

Cell communication analysis requires correct species:

# Human data
species="human"

# Mouse data
species="mouse", liana_resource="mouseconsensus"

3. Expecting Single-Cell Resolution from Visium¶

Visium spots contain 1-10 cells. Use deconvolution to estimate proportions, not annotation to assign types.

4. Using GPU Methods Without GPU¶

Methods like Cell2location are 10-100x slower without GPU. Either:

Set use_gpu=False explicitly
Use CPU-friendly alternatives (FlashDeconv, RCTD)

Workflow Patterns¶

Standard Discovery Workflow¶

Load → Preprocess → Domains → Markers → Visualize

Best for: Initial exploration of new dataset.

Reference-Based Workflow¶

Load spatial → Load reference → Preprocess both → Deconvolve → Communicate

Best for: When you have matching single-cell reference data.

Publication Workflow¶

Load → Preprocess → Domains → Deconvolve → Statistics → Communication → Velocity

Best for: Comprehensive analysis for publication.

Next Steps¶

Quick Start — Get running in 5 minutes
Examples — See all analysis types
Methods Reference — Full parameter details