Arboreto
Arboreto is a computational library for inferring gene regulatory networks (GRNs) from gene expression data using paralleli.
Maintainer FreedomIntelligence · Last updated April 1, 2026
Integrate multiple scRNA-seq samples or batches using Harmony, scVI, Seurat anchors, or fastMNN, removing technical variation while preserving biological differences. Use when combining scRNA-seq batches or datasets for joint analysis.
Original source
https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-single-cell-batch-integration
Source Doc
| Tool | Speed | Scalability | Best For |
|---|---|---|---|
| Harmony | Fast | Good | Quick integration, most use cases |
| scVI | Moderate | Excellent | Large datasets, deep learning |
| Seurat CCA/RPCA | Moderate | Good | Conserved biology across batches |
| fastMNN | Fast | Good | MNN-based correction |
Goal: Remove batch effects from merged scRNA-seq datasets using Harmony's iterative correction of PCA embeddings.
Approach: Run PCA on merged data, iteratively adjust embeddings to mix batches while preserving biological variation, and use corrected embeddings for downstream analysis.
"Integrate my batches" → Merge samples, preprocess jointly, correct technical variation in the embedding space, and cluster on corrected coordinates.
library(Seurat)
library(harmony)
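The approach described above can be sketched roughly as follows. This is a minimal illustration and not the skill's own code. It assumes a Seurat object that already holds the merged samples and a metadata column identifying each cell's batch. The function name `run_harmony_workflow`, the column name `batch`, and the parameter values (30 PCs, 2000 variable features, clustering resolution 0.5) are placeholders chosen for illustration and should be tuned per dataset.

```r
library(Seurat)
library(harmony)

# Hypothetical helper: `merged` is an already-merged Seurat object and
# `batch_col` names the metadata column holding the batch label (assumed).
run_harmony_workflow <- function(merged, batch_col = "batch", n_dims = 30) {
  # Joint preprocessing on the merged data
  merged <- NormalizeData(merged)
  merged <- FindVariableFeatures(merged, nfeatures = 2000)
  merged <- ScaleData(merged)
  merged <- RunPCA(merged, npcs = n_dims)

  # Harmony adjusts the PCA embedding; results go to the "harmony" reduction
  merged <- RunHarmony(merged, group.by.vars = batch_col)

  # Downstream steps use the corrected embedding instead of raw PCA
  merged <- FindNeighbors(merged, reduction = "harmony", dims = 1:n_dims)
  merged <- FindClusters(merged, resolution = 0.5)
  merged <- RunUMAP(merged, reduction = "harmony", dims = 1:n_dims)
  merged
}
```

A call such as `obj <- run_harmony_workflow(obj, batch_col = "batch")` would return the object with a `harmony` reduction, cluster assignments, and a UMAP computed on the corrected coordinates. Comparing a UMAP colored by batch before and after correction is a common way to check that batches mix without distinct cell populations collapsing together.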