ChatSpatial API Reference

Complete reference for all ChatSpatial MCP tools, parameters, and data models.

Overview

ChatSpatial provides 16 MCP tools for spatial transcriptomics analysis. Each tool follows the Model Context Protocol specification with:

JSON Schema validation for all inputs and outputs
Structured error handling with detailed error messages
Type-safe parameters with automatic validation
Return types that cover images, data, and metadata

Tool Categories

Category	Tools	Description
Data Management	`load_data`, `preprocess_data`	Data loading, QC, and preprocessing
Cell Annotation	`annotate_cell_types`	7 annotation methods with reference data support
Spatial Analysis	`analyze_spatial_data`, `identify_spatial_domains`, `register_spatial_data`	Comprehensive spatial pattern analysis, domain identification, and registration
Gene Analysis	`find_spatial_genes`, `find_markers`, `analyze_enrichment`	Spatial variable genes, differential expression, and enrichment
Cell Communication	`analyze_cell_communication`	Ligand-receptor interaction analysis
Deconvolution	`deconvolve_data`	Cell type proportion estimation
Integration	`integrate_samples`	Multi-modal and batch integration
Trajectory	`analyze_velocity_data`, `analyze_trajectory_data`	RNA velocity analysis and trajectory inference
Visualization	`visualize_data`	20 plot types with MCP image outputs

Quick Reference

Essential Tools

# Data loading and preprocessing
load_data(data_path="data.h5ad", name="dataset")
preprocess_data(data_id="dataset", normalize_total=True, log1p=True)

# Core analysis
identify_spatial_domains(data_id="dataset", method="spagcn")
annotate_cell_types(data_id="dataset", method="tangram")
analyze_cell_communication(data_id="dataset", method="liana")
analyze_enrichment(data_id="dataset", method="spatial_enrichmap")

# Advanced spatial analysis
register_spatial_data(source_id="section1", target_id="section2")
analyze_spatial_data(data_id="dataset", params={"analysis_type": "geary", "genes": ["gene"]})

# Visualization
visualize_data(data_id="dataset", params={"plot_type": "spatial_domains"})

Parameter Patterns

All tools follow consistent parameter patterns:

data_path: Path to data file (load_data only)
data_type: Data format specification (load_data only)
data_id: Required string identifier for loaded datasets
method: Analysis method selection with fallbacks
*_key: Keys for accessing data layers (e.g., spatial_key, batch_key)
use_*: Boolean flags for optional features
n_*: Numeric parameters (neighbors, components, etc.)
*_threshold: Filtering and significance thresholds

Data Management

load_data

Load spatial transcriptomics data from various formats.

Signature:

load_data(
    data_path: str,
    data_type: str = "auto", 
    name: Optional[str] = None,
    context: Context = None
) -> SpatialDataset

Supported Formats:

H5AD: AnnData format with spatial coordinates
CSV: Expression matrix with separate coordinate file
H5/HDF5: Hierarchical data format
10x Visium: Space Ranger outputs
Zarr: Cloud-optimized arrays

Parameters:

Parameter	Type	Default	Description
`data_path`	`str`	-	Path to the data file or directory
`data_type`	`str`	`"auto"`	Type of spatial data (auto, 10x_visium, slide_seq, merfish, seqfish, other, h5ad). With "auto", the type is inferred from the file extension or directory structure
`name`	`Optional[str]`	`None`	Optional name for the dataset
`context`	`Context`	`None`	Optional MCP context for logging

Example:

result = load_data(
    data_path="data/mouse_brain_visium.h5ad",
    data_type="auto",  # or "other" for generic h5ad files
    name="mouse_brain"
)
print(f"Loaded dataset: {result.id}")
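
The parameter table says that `data_type="auto"` infers the type from the file extension or directory structure. The sketch below shows one way such detection could work. The helper name `guess_data_type` and its rules are assumptions made for illustration, not ChatSpatial's actual logic; only the returned strings are taken from the documented `data_type` values.

```python
import tempfile
from pathlib import Path


def guess_data_type(data_path: str) -> str:
    """Hypothetical helper: map a path to one of the documented data_type values."""
    path = Path(data_path)
    if path.is_dir():
        # Assumption: a Space Ranger output directory contains a "spatial" folder.
        if (path / "spatial").is_dir():
            return "10x_visium"
        return "other"
    # Assumption: files are classified by extension only.
    if path.suffix.lower() == ".h5ad":
        return "h5ad"
    return "other"


# Build a throwaway directory that looks like a Visium output folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "spatial").mkdir()
    detected = guess_data_type(tmp)

print(guess_data_type("data/mouse_brain_visium.h5ad"))
print(detected)
```

When detection is ambiguous, passing `data_type` explicitly avoids relying on any such heuristic.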

preprocess_data

Preprocessing pipeline for spatial transcriptomics data.

Signature:

preprocess_data(
    data_id: str,
    min_genes: int = 200,
    min_cells: int = 3,
    normalize_total: bool = True,
    log1p: bool = True,
    highly_variable_genes: bool = True,
    n_top_genes: int = 2000,
    pca: bool = True,
    neighbors: bool = True,
    clustering: bool = True,
    umap: bool = True
) -> PreprocessingResult

Features:

Quality control filtering
Normalization and scaling
Highly variable gene selection
Dimensionality reduction (PCA, UMAP)
Neighbor graph construction
Leiden clustering
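
To make the `normalize_total` and `log1p` flags concrete, here is a minimal pure-Python sketch of what these two steps conventionally do to a count matrix. The target sum of 10,000 and the handling of empty spots are assumptions chosen for illustration; the values ChatSpatial uses are not stated in this reference.

```python
import math


def normalize_total(counts, target_sum=10_000.0):
    """Scale each row (spot) so its counts sum to target_sum.

    Rows with zero total counts are left unchanged (an assumption).
    """
    normalized = []
    for row in counts:
        total = sum(row)
        scale = target_sum / total if total > 0 else 1.0
        normalized.append([value * scale for value in row])
    return normalized


def log1p_transform(matrix):
    """Apply log(1 + x) element-wise to stabilize variance."""
    return [[math.log1p(value) for value in row] for row in matrix]


# Three spots by three genes; the last spot is empty.
counts = [[1, 3, 0], [10, 0, 10], [0, 0, 0]]
normalized = normalize_total(counts)
logged = log1p_transform(normalized)
print(normalized[0])
print(logged[1])
```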

Cell Annotation

annotate_cell_types

Cell type annotation with multiple methods.

Signature:

annotate_cell_types(
    data_id: str,
    method: str = "tangram",
    reference_data_id: Optional[str] = None,
    marker_genes: Optional[Dict] = None,
    confidence_threshold: float = 0.5
) -> AnnotationResult

Available Methods:

Method	Description	Requirements
`tangram`	Spatial mapping with reference	Single-cell reference data
`sctype`	Automated cell type identification	Tissue type specification
`cell2location`	Probabilistic deconvolution	Reference signatures
`scanvi`	Semi-supervised annotation	Reference data with labels
`cellassign`	Probabilistic assignment	Marker gene matrix
`mllmcelltype`	Multi-modal LLM classifier	Pre-trained model

Example:

# Reference-based annotation with Tangram
result = annotate_cell_types(
    data_id="spatial_dataset",
    method="tangram",
    reference_data_id="reference_scRNA_dataset"
)

# CellAssign with custom marker genes
markers = {
    "T_cells": ["CD3D", "CD3E", "CD3G"],
    "B_cells": ["CD19", "MS4A1", "CD79A"],  # MS4A1 is the gene encoding CD20
    "Macrophages": ["CD68", "CD163", "CSF1R"]
}

result = annotate_cell_types(
    data_id="dataset",
    method="cellassign",
    marker_genes=markers
)

Spatial Analysis

identify_spatial_domains

Identify spatial domains and tissue architecture.

Signature:

identify_spatial_domains(
    data_id: str,
    method: str = "spagcn",
    n_clusters: Optional[int] = None,
    resolution: float = 1.0,
    spatial_key: str = "spatial"
) -> SpatialDomainResult

Available Methods:

Method	Description	Use Case
`spagcn`	Graph convolutional networks	General spatial domains
`stagate`	Graph attention auto-encoder	Complex tissue architecture
`leiden`	Community detection	Quick clustering
`louvain`	Modularity optimization	Alternative clustering

analyze_spatial_data

Spatial pattern analysis.

Signature:

analyze_spatial_data(
    data_id: str,
    analysis_type: str = "autocorrelation",
    genes: Optional[List[str]] = None,
    method: str = "moran"
) -> SpatialStatisticsResult

Analysis Types:

autocorrelation: Spatial autocorrelation (Moran’s I, Geary’s C)
hotspots: Hotspot detection (Getis-Ord Gi*)
patterns: Spatial expression patterns
neighborhoods: Neighborhood analysis

register_spatial_data

Register and align spatial transcriptomics data across tissue sections.

Signature:

register_spatial_data(
    source_id: str,
    target_id: str,
    method: str = "paste",
    landmarks: Optional[List[Dict[str, Any]]] = None
) -> Dict[str, Any]

Available Methods:

Method	Description	Use Case
`paste`	PASTE algorithm for spatial alignment	Multi-slice integration

Features:

Cross-section spatial alignment
Transformation matrix computation
Landmark-guided registration
Batch correction integration
Quality metrics for alignment assessment

Example:

# Register consecutive tissue sections
result = register_spatial_data(
    source_id="section_1",
    target_id="section_2", 
    method="paste"
)

print("Registration successful with transformation matrix")
print(f"Alignment quality score: {result['alignment_score']:.3f}")

analyze_spatial_data (Enhanced)

Unified spatial statistics analysis with support for 12 different analysis types.

Signature:

analyze_spatial_data(
    data_id: str,
    params: Dict[str, Any]
) -> SpatialStatisticsResult

Available Analysis Types:

Analysis Type	Description	Key Parameters
`moran`	Global Moran’s I spatial autocorrelation	`genes`, `moran_n_perms`
`local_moran`	Local Moran’s I (LISA) for hotspot detection	`genes`, `n_neighbors`
`geary`	Geary’s C spatial autocorrelation	`genes`, `moran_n_perms`
`getis_ord`	Getis-Ord Gi* hot/cold spot analysis	`genes`, `n_neighbors`
`neighborhood`	Neighborhood enrichment analysis	`cluster_key`, `n_neighbors`
`co_occurrence`	Cell type co-occurrence patterns	`cluster_key`, `n_neighbors`
`ripley`	Ripley’s K/L point pattern analysis	`cluster_key`
`centrality`	Graph centrality measures	`cluster_key`
`bivariate_moran`	Bivariate spatial correlation	`gene_pairs`
`join_count`	Join count for categorical data	`cluster_key`
`network_properties`	Spatial network analysis	`cluster_key`
`spatial_centrality`	Spatial-specific centrality	`cluster_key`
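
For readers unfamiliar with the `moran` and `geary` statistics in the table, the following sketch computes both from their textbook definitions on a tiny example. It is a plain-Python illustration, not the ChatSpatial implementation; the function names and the four-spot chain used as the spatial graph are assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for values under a dense spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * (cross / sum(d * d for d in dev))


def gearys_c(values, weights):
    """Global Geary's C for values under a dense spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    total_weight = sum(sum(row) for row in weights)
    sq_diff = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    denom = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2 * total_weight)) * (sq_diff / denom)


def chain_weights(n):
    """Binary weights linking each spot to its immediate neighbors on a line."""
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        weights[i][i + 1] = 1.0
        weights[i + 1][i] = 1.0
    return weights


w = chain_weights(4)
gradient = [1.0, 2.0, 3.0, 4.0]       # smooth pattern
alternating = [1.0, -1.0, 1.0, -1.0]  # checkerboard-like pattern

print(morans_i(gradient, w), gearys_c(gradient, w))
print(morans_i(alternating, w), gearys_c(alternating, w))
```

Positive Moran's I (and Geary's C below 1) indicates that neighboring spots carry similar values; negative Moran's I (and Geary's C above 1) indicates that neighbors tend to differ.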

New Unified Gene Selection:

The genes parameter now provides unified gene selection across all relevant analysis types:

# Example: Local Moran's I analysis
result = analyze_spatial_data(
    data_id="tissue_dataset",
    params={
        "analysis_type": "local_moran",
        "genes": ["CD8A", "FOXP3"],  # Unified parameter
        "n_neighbors": 6
    }
)

# Example: Geary's C analysis  
result = analyze_spatial_data(
    data_id="tissue_dataset",
    params={
        "analysis_type": "geary", 
        "genes": ["GAPDH", "MKI67"],  # Same unified parameter
        "n_neighbors": 8
    }
)
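
The `n_neighbors` value in the examples above controls how many nearby spots enter each spot's spatial neighborhood. The sketch below builds a binary k-nearest-neighbor weight matrix from coordinates to show the idea. The helper `knn_weights` is hypothetical, and how ChatSpatial constructs its spatial graph internally is not described here.

```python
import math


def knn_weights(coords, n_neighbors):
    """Binary weight matrix linking each spot to its n_neighbors closest spots."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, point in enumerate(coords):
        # Rank every other spot by Euclidean distance to spot i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )
        for j in others[:n_neighbors]:
            weights[i][j] = 1.0
    return weights


# Three spots close together and one isolated spot.
coords = [(0, 0), (1, 0), (2, 0), (10, 0)]
weights = knn_weights(coords, n_neighbors=2)
for row in weights:
    print(row)
```

A k-nearest-neighbor graph is not necessarily symmetric: the isolated spot lists spot 1 as a neighbor, but spot 1 does not list the isolated spot.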

Gene Analysis

find_spatial_genes

Identify spatially variable genes using multiple methods.

Signature:

find_spatial_genes(
    data_id: str,
    method: str = "sparkx",
    n_genes: int = 1000,
    alpha: float = 0.05
) -> SpatialVariableGenesResult

Available Methods:

Method	Description	Strengths
`sparkx`	SPARK-X non-parametric method	Fast, accurate
`spatialde`	Gaussian process models	Variable patterns

find_markers

Find marker genes for cell types or spatial domains.

Signature:

find_markers(
    data_id: str,
    groupby: str = "cell_type",
    method: str = "wilcoxon",
    n_genes: int = 100,
    logfc_threshold: float = 0.25
) -> DifferentialExpressionResult

analyze_enrichment

Perform gene set enrichment analysis on spatial transcriptomics data.

Signature:

analyze_enrichment(
    data_id: str,
    method: str = "spatial_enrichmap",
    gene_sets: Optional[Union[List[str], Dict[str, List[str]]]] = None,
    gene_set_database: str = "GO_Biological_Process",
    spatial_key: str = "spatial",
    n_neighbors: int = 6,
    smoothing: bool = True,
    min_genes: int = 10
) -> EnrichmentResult

Available Methods:

Method	Description	Use Case
`spatial_enrichmap`	Spatially-aware enrichment mapping	Spatial pathway analysis
`pathway_gsea`	Gene Set Enrichment Analysis	Ranked gene lists
`pathway_ora`	Over-representation analysis	Discrete gene sets
`pathway_enrichr`	Enrichr web service integration	Online databases
`pathway_ssgsea`	Single-sample GSEA	Sample-level enrichment

Features:

Spatial awareness for tissue-specific pathways
Multiple database support (GO, KEGG, Reactome)
Custom gene set analysis
Spatial smoothing and covariate correction
Statistical significance testing with FDR correction

Example:

# Custom gene set enrichment
custom_pathways = {
    "Neuronal_Signaling": ["SNAP25", "SYN1", "GRIN1", "GRIA1"],
    "Glial_Function": ["GFAP", "AQP4", "S100B", "ALDH1L1"]
}

result = analyze_enrichment(
    data_id="brain_dataset",
    method="spatial_enrichmap",
    gene_sets=custom_pathways,
    smoothing=True,
    n_neighbors=8
)

print(f"Found {result.n_significant} significant pathways")
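
The feature list mentions FDR correction, and the example reports `n_significant`. This reference does not say which FDR procedure is applied; the sketch below assumes the widely used Benjamini-Hochberg procedure and shows how adjusted p-values and a significant count could be derived from raw p-values. The p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p_values = [0.001, 0.008, 0.039, 0.041, 0.6]  # illustrative values only
q_values = benjamini_hochberg(p_values)
n_significant = sum(q < 0.05 for q in q_values)
print(q_values)
print(n_significant)
```

Note that two pathways with raw p-values just under 0.05 are no longer significant after adjustment.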

Cell Communication

analyze_cell_communication

Analyze cell-cell communication using ligand-receptor interactions.

Signature:

analyze_cell_communication(
    data_id: str,
    method: str = "liana",
    groupby: str = "cell_type",
    spatial_mode: str = "global",
    database: str = "consensus"
) -> CellCommunicationResult

Available Methods:

Method	Description	Features
`liana`	Comprehensive LR analysis	Multiple databases, spatial modes
`cellphonedb`	Statistical interaction testing	Permutation testing
`cellchat_liana`	CellChat via LIANA	Pathway analysis

Spatial Modes:

global: Cell type-level interactions
local: Spatially-aware interactions
bivariate: Pairwise spatial analysis

Deconvolution

deconvolve_data

Estimate cell type proportions in spatial transcriptomics data.

Signature:

deconvolve_data(
    data_id: str,
    method: str = "cell2location",
    reference_data_id: Optional[str] = None,
    n_factors: int = 50
) -> DeconvolutionResult

Available Methods:

Method	Description	Requirements
`cell2location`	Bayesian deconvolution	Reference single-cell data
`stereoscope`	Probabilistic deconvolution	Reference signatures
`rctd`	Robust cell type decomposition	Reference profiles

Full documentation will be added in future versions

Integration

integrate_samples

Integrate multiple spatial transcriptomics datasets.

Signature:

integrate_samples(
    data_ids: List[str],
    method: str = "harmony",
    batch_key: str = "batch"
) -> IntegrationResult

Available Methods:

Method	Description	Use Case
`harmony`	Harmony batch correction	Simple batch effects
`scvi`	Variational integration	Complex batch effects
`combat`	ComBat batch correction	Gene expression normalization

Note: Harmony parameters are hardcoded in the implementation for optimal performance:

sigma=0.1 (width of the soft k-means clusters)
max_iter_harmony=10 (maximum iterations for convergence)
nclust=None (automatic cluster number detection)
verbose=True (progress display enabled)

Full documentation will be added in future versions

Trajectory

analyze_velocity_data

Analyze RNA velocity to understand cellular dynamics.

Signature:

analyze_velocity_data(
    data_id: str,
    method: str = "scvelo",
    mode: str = "dynamical"
) -> VelocityResult

Available Methods:

Method	Description	Features
`scvelo`	Standard RNA velocity analysis	Stochastic, deterministic, and dynamical models
`velovi`	Deep learning RNA velocity	More accurate velocity with uncertainty quantification (requires scvi-tools)
`sirv`	Reference-based velocity	Transfer velocity from reference dataset (not yet implemented)

analyze_trajectory_data

Infer cellular trajectories and pseudotime from spatial data.

Signature:

analyze_trajectory_data(
    data_id: str,
    method: str = "cellrank",
    spatial_weight: float = 0.5
) -> TrajectoryResult

Available Methods:

Method	Description	Features
`dpt`	Diffusion pseudotime	Classic pseudotime inference (no velocity needed)
`palantir`	Probabilistic trajectory inference	Branch probability analysis (no velocity needed)
`cellrank`	RNA velocity-based trajectory inference	Fate mapping and terminal states (requires velocity)

Important Note: VELOVI is a velocity computation method (see analyze_velocity_data above), not a trajectory inference method. After computing velocity with VELOVI, use CellRank, Palantir, or DPT for trajectory inference.

Full documentation will be added in future versions

Visualization

visualize_data

Create visualizations with MCP image outputs.

Signature:

visualize_data(
    data_id: str,
    params: VisualizationParameters
) -> Image

Key Parameters in VisualizationParameters:

plot_type: str = "spatial" (visualization type)
feature: Optional[Union[str, List[str]]] = None (gene/column to visualize)
colormap: str = "viridis" (color scheme)
figure_size: Optional[Tuple[int, int]] = None (width, height)
dpi: int = 100 (resolution)
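
As an illustration of the fields listed above, the sketch below mirrors them in a small dataclass and turns an instance into a plain dictionary. `VisualizationParametersSketch` is a stand-in written for this example, not the real ChatSpatial class, and the real class may define further fields or validation.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class VisualizationParametersSketch:
    """Stand-in mirroring the documented fields and defaults."""

    plot_type: str = "spatial"
    feature: Optional[Union[str, List[str]]] = None
    colormap: str = "viridis"
    figure_size: Optional[Tuple[int, int]] = None
    dpi: int = 100


params = VisualizationParametersSketch(
    plot_type="spatial_domains",
    figure_size=(8, 6),
)

# Drop unset optional fields so only explicit values and defaults remain.
payload = {key: value for key, value in asdict(params).items() if value is not None}
print(payload)
```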

Plot Types (20 Total):

Type	Description	Use Case
`spatial`	Spatial gene expression	Gene visualization
`spatial_domains`	Spatial domain overlay	Domain identification
`umap`	UMAP embedding	Dimensionality reduction
`heatmap`	Expression heatmap	Multi-gene comparison
`violin`	Distribution plots	Expression distributions
`deconvolution`	Cell type proportion maps	Deconvolution results
`cell_communication`	Communication networks	Interaction visualization
`multi_gene`	Multi-gene spatial panels	Gene comparison
`lr_pairs`	Ligand-receptor pairs	LR interaction analysis
`gene_correlation`	Gene correlation analysis	Co-expression patterns
`rna_velocity`	RNA velocity plots	Trajectory inference
`trajectory`	Developmental trajectories	Pseudotime analysis
`spatial_analysis`	Spatial statistics (6 subtypes)	Pattern analysis
`spatial_enrichment`	Spatial enrichment maps	Functional enrichment
`pathway_enrichment`	Pathway enrichment plots	GSEA visualization
`spatial_interaction`	Cell-cell interactions	Spatial communication
`batch_integration`	Batch integration quality	Batch correction QC

MCP Integration:

All visualizations return MCP Image objects for direct display in LLM agents like Claude Desktop.

Error Handling

Error Types

ChatSpatial groups errors into the following types:

Error Type	Description	Common Causes
ValidationError	Invalid parameters or data format	Wrong parameter types, out-of-range values
DataError	Missing data or incompatible datasets	Missing required columns, incompatible data structures
MethodError	Algorithm-specific failures	Method not applicable to data type
ResourceError	Memory or computation limits	Insufficient memory, timeout exceeded
SystemError	File I/O or environment issues	File not found, permission denied

Error Response Format

{
  "error": {
    "code": 1001,
    "message": "Invalid parameter: n_clusters must be positive",
    "type": "ValidationError",
    "details": {"parameter": "n_clusters", "value": -1},
    "suggestions": ["Use n_clusters > 0", "Set n_clusters=None for auto"]
  }
}
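
A client can turn this response format into a regular exception. The sketch below parses the JSON shown above and raises a custom error when an `error` object is present. The class `ChatSpatialToolError` and the function `raise_for_error` are hypothetical names for this example; how a given MCP client surfaces tool errors may differ.

```python
import json


class ChatSpatialToolError(Exception):
    """Hypothetical client-side exception carrying the documented error fields."""

    def __init__(self, code, error_type, message, suggestions):
        super().__init__(f"[{code}] {error_type}: {message}")
        self.code = code
        self.error_type = error_type
        self.suggestions = suggestions


def raise_for_error(response_text):
    """Return the parsed payload, or raise if it contains an 'error' object."""
    payload = json.loads(response_text)
    error = payload.get("error")
    if error is None:
        return payload
    raise ChatSpatialToolError(
        code=error["code"],
        error_type=error["type"],
        message=error["message"],
        suggestions=error.get("suggestions", []),
    )


response_text = """
{
  "error": {
    "code": 1001,
    "message": "Invalid parameter: n_clusters must be positive",
    "type": "ValidationError",
    "details": {"parameter": "n_clusters", "value": -1},
    "suggestions": ["Use n_clusters > 0", "Set n_clusters=None for auto"]
  }
}
"""

try:
    raise_for_error(response_text)
except ChatSpatialToolError as exc:
    caught = exc
    print(exc)
    for hint in exc.suggestions:
        print("suggestion:", hint)
```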

Usage Examples

Chaining Analysis

# Complete workflow  
result = load_data(data_path="data.h5ad", name="sample")
preprocess_data(data_id=result.id)
identify_spatial_domains(data_id=result.id, method="spagcn")
annotate_cell_types(data_id=result.id, method="tangram", reference_data_id="ref")
analyze_cell_communication(data_id=result.id, method="liana")
analyze_enrichment(data_id=result.id, method="spatial_enrichmap")
visualize_data(data_id=result.id, params={"plot_type": "spatial_domains"})

Parameter Optimization

# Test multiple resolutions
for res in [0.5, 1.0, 1.5, 2.0]:
    identify_spatial_domains(
        data_id="sample",
        resolution=res,
        method="spagcn"
    )

Batch Processing

# Process multiple samples
sample_files = ["sample1.h5ad", "sample2.h5ad", "sample3.h5ad"]
for sample_file in sample_files:
    result = load_data(data_path=f"data/{sample_file}", name=sample_file.replace(".h5ad", ""))
    preprocess_data(data_id=result.id)
    identify_spatial_domains(data_id=result.id)

# Register multiple tissue sections
sections = ["section_1", "section_2", "section_3"]
for i in range(len(sections)-1):
    register_spatial_data(
        source_id=sections[i+1], 
        target_id=sections[i],
        method="paste"
    )
