
bio-read-qc-umi-processing

Maintainer: FreedomIntelligence · Last updated: April 1, 2026

Extract, process, and deduplicate reads using Unique Molecular Identifiers (UMIs) with umi_tools. Use when library prep includes UMIs and accurate molecule counting is needed, such as in single-cell RNA-seq, low-input RNA-seq, or targeted sequencing to distinguish PCR from biological duplicates.

Tags: OpenClaw · NanoClaw · analysis and processing · reproducible experiments · bio-read-qc-umi-processing · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics: sequencing & read QC · extract

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-read-qc-umi-processing

Maintainer: FreedomIntelligence
License: MIT
Last updated: April 1, 2026

Skill summary

Key information from SKILL.md

Core notes

  • CLI: umi_tools extract + umi_tools dedup (UMI-tools).
  • CLI: fgbio GroupReadsByUmi + fgbio CallMolecularConsensusReads.
  • "Deduplicate reads using UMIs" → extract UMI barcodes, group reads by UMI + position, and collapse PCR duplicates to count unique molecules.
  • UMIs (Unique Molecular Identifiers) are short random sequences added during library preparation to tag individual molecules before PCR amplification. This enables accurate PCR duplicate removal and molecule counting.
  • Example: umi_tools extract --stdin=R1.fastq.gz --read2-in=R2.fastq.gz --stdout=R1_extracted.fastq.gz --read2-out=R2_extracted.fastq.gz --bc-pattern=NNNNNNNN
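The grouping idea in the notes above (group reads by UMI + position, then collapse duplicates) can be illustrated with a minimal pure-Python sketch. It does not call umi_tools or fgbio, and the read tuples are made-up illustration data. It also matches UMIs exactly, whereas real tools typically tolerate sequencing errors in the UMI, so treat it as a conceptual aid rather than a description of either tool's algorithm.

```python
from collections import defaultdict

# Hypothetical aligned reads: (chromosome, start position, strand, UMI).
reads = [
    ("chr1", 100, "+", "ACGTACGT"),
    ("chr1", 100, "+", "ACGTACGT"),  # same position and UMI: PCR duplicate
    ("chr1", 100, "+", "TTGCAAGC"),  # same position, different UMI: distinct molecule
    ("chr1", 250, "-", "ACGTACGT"),  # same UMI, different position: distinct molecule
]


def group_by_umi_and_position(reads):
    """Return {(chrom, pos, strand, umi): read_count} using exact UMI matching."""
    groups = defaultdict(int)
    for chrom, pos, strand, umi in reads:
        groups[(chrom, pos, strand, umi)] += 1
    return dict(groups)


groups = group_by_umi_and_position(reads)
# Each group stands for one original molecule; extra reads in a group are duplicates.
print(f"{len(reads)} reads -> {len(groups)} unique molecules")
```

Two reads that share a position but carry different UMIs stay separate, which is the case that position-only duplicate marking would wrongly collapse.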

Original document

SKILL.md excerpt

UMI at start of R2

    umi_tools extract \
        --stdin=R1.fastq.gz \
        --read2-in=R2.fastq.gz \
        --stdout=R1_extracted.fastq.gz \
        --read2-out=R2_extracted.fastq.gz \
        --bc-pattern2=NNNNNNNN

UMI in both reads

    umi_tools extract \
        --stdin=R1.fastq.gz \
        --read2-in=R2.fastq.gz \
        --stdout=R1_extracted.fastq.gz \
        --read2-out=R2_extracted.fastq.gz \
        --bc-pattern=NNNNNNNN \
        --bc-pattern2=NNNNNNNN
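The extract variants above differ only in which pattern flags are passed. A small sketch of assembling the argument list from a script follows. The helper name `build_extract_cmd` and its parameters are hypothetical; only the flags already shown in this excerpt are used, and the command is built but not executed.

```python
import shlex


def build_extract_cmd(r1, r2, r1_out, r2_out, pattern_r1=None, pattern_r2=None):
    """Build an argv list for a paired-end `umi_tools extract` call (hypothetical helper)."""
    if not pattern_r1 and not pattern_r2:
        raise ValueError("provide a pattern for read 1, read 2, or both")
    cmd = [
        "umi_tools", "extract",
        f"--stdin={r1}",
        f"--read2-in={r2}",
        f"--stdout={r1_out}",
        f"--read2-out={r2_out}",
    ]
    if pattern_r1:
        cmd.append(f"--bc-pattern={pattern_r1}")    # pattern applied to read 1
    if pattern_r2:
        cmd.append(f"--bc-pattern2={pattern_r2}")   # pattern applied to read 2
    return cmd


# UMI at the start of R2 only.
cmd_r2 = build_extract_cmd(
    "R1.fastq.gz", "R2.fastq.gz",
    "R1_extracted.fastq.gz", "R2_extracted.fastq.gz",
    pattern_r2="NNNNNNNN",
)
print(shlex.join(cmd_r2))
```

The list form can be handed to `subprocess.run(cmd_r2, check=True)` on a machine where umi_tools is installed, which avoids shell quoting problems with unusual file names.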


## UMI Pattern Syntax

| Pattern | Meaning |
|---------|---------|
| `N` | UMI base (extracted) |
| `C` | Cell barcode (extracted, kept separate) |
| `X` | Skipped base (not extracted; stays in the read sequence) |
| `NNNNNNNN` | 8bp UMI |
| `CCCCCCCCNNNNNNNN` | 8bp cell barcode + 8bp UMI |
| `NNNXXXNNN` | 3bp UMI, skip 3bp, 3bp UMI |
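To make the pattern table concrete, here is a pure-Python sketch that applies a pattern string to a read. It is not umi_tools' implementation. The function name `apply_pattern` is hypothetical, and keeping `X` positions with the remaining read sequence is an assumption about how the tool treats them.

```python
def apply_pattern(sequence, pattern):
    """Split a read into (umi, cell_barcode, remaining_sequence) using N/C/X symbols."""
    if len(sequence) < len(pattern):
        raise ValueError("read is shorter than the pattern")
    umi, cell, rest = [], [], []
    for base, symbol in zip(sequence, pattern):
        if symbol == "N":
            umi.append(base)       # UMI base
        elif symbol == "C":
            cell.append(base)      # cell barcode base
        elif symbol == "X":
            rest.append(base)      # assumption: skipped bases stay with the read
        else:
            raise ValueError(f"unknown pattern symbol: {symbol!r}")
    rest.append(sequence[len(pattern):])  # bases beyond the pattern are untouched
    return "".join(umi), "".join(cell), "".join(rest)


# 3 bp UMI, skip 3 bp, 3 bp UMI
print(apply_pattern("AAACCCGGGTTTT", "NNNXXXNNN"))
# 8 bp cell barcode followed by an 8 bp UMI
print(apply_pattern("ACGTACGTTTGGCCAAGATTACA", "CCCCCCCCNNNNNNNN"))
```

In the first call the two 3 bp segments are joined into one 6 bp UMI, so a split pattern still produces a single identifier per read.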

When to use

  • Use when library prep includes UMIs and accurate molecule counting is needed, such as in single-cell RNA-seq, low-input RNA-seq, or targeted sequencing, to distinguish PCR duplicates from biological duplicates.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Upstream related skills

  • fastp-workflow - Simple UMI extraction during preprocessing
  • quality-filtering - QC before UMI extraction
  • alignment-files/sam-bam-basics - BAM sorting/indexing required before dedup
  • single-cell/preprocessing - scRNA-seq workflows use UMI counting
