Concepts

Understanding spatial transcriptomics analysis.


What is Spatial Transcriptomics?

Spatial transcriptomics measures gene expression while preserving the physical location of cells in tissue. Unlike standard single-cell RNA sequencing, it tells you where cells are, not just what they express.

Key insight: Location matters. A tumor cell behaves differently depending on whether it’s surrounded by immune cells or fibroblasts.


Core Analysis Types

Spatial Domains

What it does: Groups tissue regions based on similar gene expression patterns.

When to use: First step after preprocessing. Identifies tissue architecture like tumor regions, immune infiltrates, or tissue layers.

Method selection:

Your Data

Recommended Method

Visium with H&E image

SpaGCN (uses histology)

High-resolution (Xenium, MERFISH)

STAGATE or GraphST

Quick exploratory analysis

Leiden clustering


Cell Type Annotation vs Deconvolution

These two concepts are often confused. Here’s the difference:

Annotation

Deconvolution

Output

“This spot is T cells”

“This spot is 60% T cells, 30% macrophages, 10% fibroblasts”

Best for

Single-cell resolution data

Spot-based data (Visium)

Assumption

One cell type per spot

Multiple cell types per spot

Rule of thumb:

  • Xenium, MERFISH, CosMx: Use annotation (single-cell resolution)

  • Visium, Slide-seq: Use deconvolution (multiple cells per spot)


Cell Communication

What it does: Identifies which cell types are “talking” to each other through ligand-receptor interactions.

Key concept: Cell A expresses a ligand (signal molecule), Cell B expresses the receptor. If they’re spatially close, they may be communicating.

Species matters: Use the correct database:

  • Human: liana_resource="consensus"

  • Mouse: liana_resource="mouseconsensus"


RNA Velocity

What it does: Predicts future cell states by comparing spliced vs unspliced RNA.

Key insight: If a gene has more unspliced RNA, it’s being upregulated. If more spliced, it’s being downregulated. This tells you the “direction” cells are moving.

Requirement: Your data must have spliced and unspliced layers (from velocyto, kallisto, or STARsolo).


Choosing Methods

Deconvolution Methods

Method

Speed

Accuracy

When to Use

FlashDeconv

Fast

Good

Default choice, quick exploration

Cell2location

Slow

Excellent

Final analysis, publication

RCTD

Fast

Good

R users, batch processing

CARD

Medium

Good

Need spatial imputation

Accuracy vs Speed tradeoff: Start with FlashDeconv for exploration, run Cell2location for final figures.


Annotation Methods

Method

Requires

Best For

Tangram

Reference scRNA-seq

Most accurate when reference matches tissue

scANVI

Reference scRNA-seq

Large datasets, GPU available

CellAssign

Marker gene list

When you know cell type markers

mLLMCelltype

Nothing

Quick automated annotation


Spatial Statistics

Analysis

Question It Answers

Moran’s I

“Is this gene spatially clustered?” (global)

Local Moran’s I

“Where are the clusters?” (local hotspots)

Getis-Ord Gi*

“Where are the high/low expression hotspots?”

Neighborhood enrichment

“Do these cell types co-localize?”

Co-occurrence

“How does co-localization change with distance?”


Understanding Results

Interpreting Deconvolution

Good deconvolution results show:

  • Cell type proportions sum to ~1.0 per spot

  • Known tissue structure is visible (e.g., epithelium vs stroma)

  • Proportions correlate with histology

Warning signs:

  • One cell type dominates everywhere (>80%)

  • Proportions don’t match expected tissue composition

  • Results change dramatically with different methods


Interpreting Spatial Statistics

Moran’s I interpretation:

  • I > 0: Clustered (similar values near each other)

  • I ~ 0: Random

  • I < 0: Dispersed (dissimilar values near each other)

p-value: Tests if pattern is significant vs random.


Common Pitfalls

1. Skipping Preprocessing

Most analyses fail because preprocessing wasn’t run. Always preprocess first:

"Preprocess the data"

2. Wrong Species Parameter

Cell communication analysis requires correct species:

# Human data
species="human"

# Mouse data
species="mouse", liana_resource="mouseconsensus"

3. Expecting Single-Cell Resolution from Visium

Visium spots contain 1-10 cells. Use deconvolution to estimate proportions, not annotation to assign types.

4. Using GPU Methods Without GPU

Methods like Cell2location are 10-100x slower without GPU. Either:

  • Set use_gpu=False explicitly

  • Use CPU-friendly alternatives (FlashDeconv, RCTD)


Workflow Patterns

Standard Discovery Workflow

Load → Preprocess → Domains → Markers → Visualize

Best for: Initial exploration of new dataset.

Reference-Based Workflow

Load spatial → Load reference → Preprocess both → Deconvolve → Communicate

Best for: When you have matching single-cell reference data.

Publication Workflow

Load → Preprocess → Domains → Deconvolve → Statistics → Communication → Velocity

Best for: Comprehensive analysis for publication.


Next Steps