AnnData
AnnData is a Python package for handling annotated data matrices, storing experimental measurements (X) alongside observation metadata (obs)…
Maintainer FreedomIntelligence · Last updated April 1, 2026
Detect and remove doublets (multiple cells captured in one droplet) from single-cell RNA-seq data. Uses Scrublet (Python), DoubletFinder (R), and scDblFinder (R). Essential QC step before clustering to avoid artificial cell populations. Use when identifying and removing doublets from scRNA-seq data.
Original source
https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-single-cell-doublet-detection
**Goal:** Detect and score doublets in scRNA-seq data using simulated doublet profiles.
**Approach:** Simulate artificial doublets by combining random cell pairs, embed real and simulated cells together, and score each cell's similarity to simulated doublets.
"Remove doublets from my data" → Identify droplets containing multiple cells by comparing each cell's profile to computationally simulated doublets, then filter flagged cells.
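```python
import matplotlib.pyplot as plt

# Assumes `scrub` is a Scrublet object on which doublet scoring has already been run.
scrub.plot_histogram()  # histograms of doublet scores for observed and simulated cells
plt.savefig('doublet_histogram.pdf')
```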
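The simulate, embed, and score approach described in this section can be illustrated with a small NumPy-only sketch. It is not Scrublet's implementation: the function names (`simulate_doublets`, `doublet_scores`), the parameters (`sim_ratio`, `n_pcs`, `k`), and the normalization choices are assumptions made for illustration, and the brute-force neighbor search is only practical for small matrices.

```python
import numpy as np

def simulate_doublets(counts, n_doublets, rng):
    """Sum the raw counts of randomly chosen cell pairs to make artificial doublets."""
    n_cells = counts.shape[0]
    pairs = rng.integers(0, n_cells, size=(n_doublets, 2))
    return counts[pairs[:, 0]] + counts[pairs[:, 1]]

def doublet_scores(counts, sim_ratio=2.0, n_pcs=10, k=15, seed=0):
    """Score each real cell by the fraction of its k nearest neighbors that are simulated.

    counts: dense (cells x genes) array of raw counts. All defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    n_cells = counts.shape[0]
    sim = simulate_doublets(counts, int(sim_ratio * n_cells), rng)
    combined = np.vstack([counts, sim]).astype(float)

    # Library-size normalize and log-transform real and simulated cells together.
    totals = combined.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0
    logged = np.log1p(combined / totals * 1e4)

    # Joint embedding: PCA via SVD of the centered matrix.
    centered = logged - logged.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    n_pcs = min(n_pcs, s.size)
    emb = u[:, :n_pcs] * s[:n_pcs]

    # Brute-force squared distances from every real cell to every cell.
    is_sim = np.arange(emb.shape[0]) >= n_cells
    real = emb[:n_cells]
    d2 = ((real[:, None, :] - emb[None, :, :]) ** 2).sum(axis=2)
    d2[np.arange(n_cells), np.arange(n_cells)] = np.inf  # a cell is not its own neighbor

    # Indices of the k nearest neighbors, then the simulated fraction among them.
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return is_sim[nn].mean(axis=1)
```

Cells whose score is high relative to the rest of the data are the candidates to flag and filter. Where to place the threshold is a separate decision that this sketch does not make.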
## DoubletFinder (R)
**Goal:** Detect doublets in Seurat objects using DoubletFinder's pANN-based classification.
**Approach:** Optimize the pK neighborhood parameter via parameter sweep, compute artificial nearest neighbor proportions, and classify cells as singlets or doublets.
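To make the pANN idea concrete, here is a hedged Python sketch of the two final steps: computing the proportion of artificial nearest neighbors for one fixed pK, and classifying the highest-scoring cells as doublets. DoubletFinder itself is an R package that works on Seurat objects, so this is not its code. The function names (`pann_scores`, `classify_by_expected_count`) and the `n_expected` argument are hypothetical, and the pK parameter sweep is not reproduced here.

```python
import numpy as np

def pann_scores(embedding, is_artificial, pK):
    """Proportion of artificial nearest neighbors (pANN) for each real cell.

    embedding: (cells x dims) array holding real and artificial cells together.
    is_artificial: boolean mask marking the artificial doublets.
    pK: neighborhood size as a proportion of the merged real + artificial data.
    """
    is_artificial = np.asarray(is_artificial, dtype=bool)
    n_total = embedding.shape[0]
    k = min(max(1, int(round(pK * n_total))), n_total - 1)

    real_idx = np.flatnonzero(~is_artificial)
    # Brute-force squared distances from each real cell to every cell.
    d2 = ((embedding[real_idx, None, :] - embedding[None, :, :]) ** 2).sum(axis=2)
    d2[np.arange(real_idx.size), real_idx] = np.inf  # exclude the cell itself

    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return is_artificial[nn].mean(axis=1)

def classify_by_expected_count(pann, n_expected):
    """Label the n_expected real cells with the highest pANN as doublets."""
    labels = np.full(pann.shape, "Singlet", dtype=object)
    top = np.argsort(pann)[::-1][:n_expected]
    labels[top] = "Doublet"
    return labels
```

In this sketch a sweep would amount to calling `pann_scores` over a grid of pK values and comparing the resulting score distributions. The criterion used to choose among them is left out.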