bio-read-qc-umi-processing

Maintainer FreedomIntelligence · Last updated April 1, 2026

Extract, process, and deduplicate reads using Unique Molecular Identifiers (UMIs) with umi_tools. Use when library prep includes UMIs and accurate molecule counting is needed, such as in single-cell RNA-seq, low-input RNA-seq, or targeted sequencing to distinguish PCR from biological duplicates.

Tags: OpenClaw · NanoClaw · Analysis · Reproduction · bio-read-qc-umi-processing · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics - sequencing & read qc · extract

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-read-qc-umi-processing

Maintainer: FreedomIntelligence
License: MIT
Last updated: April 1, 2026

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • CLI: umi_tools extract + umi_tools dedup (UMI-tools).
  • CLI: fgbio GroupReadsByUmi + fgbio CallMolecularConsensusReads.
  • "Deduplicate reads using UMIs" → Extract UMI barcodes, group reads by UMI+position, and collapse PCR duplicates to count unique molecules.
  • UMIs (Unique Molecular Identifiers) are short random sequences added during library preparation to tag individual molecules before PCR amplification. This enables accurate PCR duplicate removal and molecule counting.
  • Example (8 bp UMI pattern applied to R1): `umi_tools extract --stdin=R1.fastq.gz --read2-in=R2.fastq.gz --stdout=R1_extracted.fastq.gz --read2-out=R2_extracted.fastq.gz --bc-pattern=NNNNNNNN`
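
The grouping idea in the notes above can be illustrated on a toy table. This is a minimal sketch, not how umi_tools works internally: the three-column layout (chromosome, start position, UMI) is a hypothetical stand-in for aligned reads, and only exact matches are collapsed, whereas `umi_tools dedup` can also merge UMIs that differ by sequencing errors.

```shell
#!/bin/sh
# Toy input: one line per aligned read as "chrom<TAB>start<TAB>UMI" (hypothetical layout).
reads=$(printf '%s\t%s\t%s\n' \
  chr1 100 ACGTACGT \
  chr1 100 ACGTACGT \
  chr1 100 TTGCAAGC \
  chr2 500 ACGTACGT)

# Reads that share both position and UMI are treated as PCR copies of one molecule.
total_reads=$(printf '%s\n' "$reads" | wc -l | tr -d ' ')
unique_molecules=$(printf '%s\n' "$reads" | sort -u | wc -l | tr -d ' ')

echo "reads=${total_reads} molecules=${unique_molecules}"
```

The two identical `chr1 100 ACGTACGT` lines count as one molecule, while the same UMI at a different position (`chr2 500`) counts separately.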

Source Doc

Excerpt From SKILL.md

UMI at start of R2

umi_tools extract \
    --stdin=R1.fastq.gz \
    --read2-in=R2.fastq.gz \
    --stdout=R1_extracted.fastq.gz \
    --read2-out=R2_extracted.fastq.gz \
    --bc-pattern2=NNNNNNNN

UMI in both reads

umi_tools extract \
    --stdin=R1.fastq.gz \
    --read2-in=R2.fastq.gz \
    --stdout=R1_extracted.fastq.gz \
    --read2-out=R2_extracted.fastq.gz \
    --bc-pattern=NNNNNNNN \
    --bc-pattern2=NNNNNNNN
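
To make the effect of extraction concrete, the sketch below imitates it on a single toy FASTQ record with awk. It is an illustration under assumptions, not a replacement for `umi_tools extract`: the record is made up, the UMI is taken to be the first 8 bases, and the `_UMI` suffix on the read name mirrors the naming umi_tools is understood to use.

```shell
#!/bin/sh
# One toy FASTQ record; the first 8 bases play the role of the UMI (pattern NNNNNNNN).
record=$(printf '%s\n' '@read1' 'ACGTACGTTTGGCCAA' '+' 'IIIIIIIIFFFFFFFF')

# Move the first n bases into the read name and trim sequence and qualities to match.
extracted=$(printf '%s\n' "$record" | awk -v n=8 '
  NR % 4 == 1 { name = $0 }
  NR % 4 == 2 { umi = substr($0, 1, n); seq = substr($0, n + 1) }
  NR % 4 == 0 { print name "_" umi; print seq; print "+"; print substr($0, n + 1) }
')

printf '%s\n' "$extracted"
```

After this step the UMI travels with the read name through alignment, which is what lets deduplication use it later.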


## UMI Pattern Syntax

| Pattern | Meaning |
|---------|---------|
| `N` | UMI base (extracted) |
| `C` | Cell barcode (extracted, kept separate) |
| `X` | Base left on the read (not extracted into the read name) |
| `NNNNNNNN` | 8bp UMI |
| `CCCCCCCCNNNNNNNN` | 8bp cell barcode + 8bp UMI |
| `NNNXXXNNN` | 3bp UMI, 3bp left on the read, 3bp UMI (6bp UMI in total) |
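
As a rough illustration of how a pattern maps onto a read, the following sketch walks a `C`/`N` pattern over a made-up sequence and splits it into cell barcode, UMI, and remaining read. The variable names and the sequence are hypothetical, and the sketch covers only `C` and `N` positions.

```shell
#!/bin/sh
pattern='CCCCCCCCNNNNNNNN'                 # 8bp cell barcode + 8bp UMI
read_seq='AAAACCCCGGGGTTTTACGTACGTAC'      # made-up read for illustration

# Assign each leading base to the cell barcode or the UMI, position by position.
parsed=$(awk -v p="$pattern" -v s="$read_seq" 'BEGIN {
  for (i = 1; i <= length(p); i++) {
    c = substr(p, i, 1); b = substr(s, i, 1)
    if (c == "C") cell = cell b        # cell barcode position
    else if (c == "N") umi = umi b     # UMI position
  }
  # Bases beyond the pattern remain as the read sequence.
  print "cell=" cell, "umi=" umi, "read=" substr(s, length(p) + 1)
}')

echo "$parsed"
```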

Use cases

  • Use when library prep includes UMIs and accurate molecule counting is needed, such as in single-cell RNA-seq, low-input RNA-seq, or targeted sequencing to distinguish PCR from biological duplicates.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Upstream Related Skills

  • fastp-workflow - Simple UMI extraction during preprocessing
  • quality-filtering - QC before UMI extraction
  • alignment-files/sam-bam-basics - BAM sorting/indexing required before dedup
  • single-cell/preprocessing - scRNA-seq workflows use UMI counting
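
The ordering implied by the list above (sort and index the BAM, then deduplicate) might look like the sketch below. It is a dry run by default, so it only prints the commands. The file names are hypothetical, the alignment step is left to whichever aligner the project uses, and `--paired` is included on the assumption that the data are paired-end as in the extraction examples.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

bam=aligned.bam   # hypothetical: extracted reads aligned with any aligner

# Coordinate-sort and index first, then collapse duplicates by UMI and position.
run samtools sort -o aligned.sorted.bam "$bam"
run samtools index aligned.sorted.bam
run umi_tools dedup --paired -I aligned.sorted.bam -S dedup.bam
```

Printing the plan first makes it easy to check paths before committing to a long run.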
