
bio-sequence-statistics

Maintainer: FreedomIntelligence · Last updated: March 31, 2026

bio-sequence-statistics: Calculate sequence statistics (N50, length distribution, GC content, summary reports) using Biopython. Use it when analyzing sequence datasets, generating QC reports, or comparing assemblies.

OpenClaw · NanoClaw · analysis and processing · reproducible experiments · bio-sequence-statistics · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics - sequencing & read qc · calculate

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-sequence-statistics

Maintainer: FreedomIntelligence
License: MIT
Last updated: March 31, 2026

Skill summary

Key information from SKILL.md


Core notes

"Calculate N50 and other assembly statistics" → Compute sequence count, length distribution, N50/L50, GC content, and nucleotide composition for FASTA datasets. Python: SeqIO.parse(), gc_fraction() (Biopython).
Calculate comprehensive statistics for sequence datasets using Biopython.
Bin lengths into buckets of width 100: bin_size = 100; bins = [(l // bin_size) * bin_size for l in lengths]; histogram = Counter(bins).
Print each bin in ascending order: for length_bin in sorted(histogram.keys()): count = histogram[length_bin]; print(f'{length_bin}-{length_bin + bin_size}: {count}').
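The nucleotide composition listed among the core notes can be tallied with a plain `Counter`. This is a minimal sketch rather than the skill's own code: `nucleotide_composition` is a hypothetical helper, and it assumes sequences are passed as plain strings (for Biopython records, `str(record.seq)`).

```python
from collections import Counter


def nucleotide_composition(sequences):
    """Count each symbol across all sequences (hypothetical helper)."""
    counts = Counter()
    for seq in sequences:
        # Upper-case so soft-masked (lowercase) bases are counted together.
        counts.update(seq.upper())
    total = sum(counts.values())
    # Fractions are left empty when there is no input to avoid dividing by zero.
    fractions = {base: n / total for base, n in counts.items()} if total else {}
    return counts, fractions
```

Ambiguity codes such as `N` are counted as their own symbols here, so they show up in the composition rather than being silently dropped.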

Original documentation

SKILL.md excerpt

## Length Histogram Data

from collections import Counter
from Bio import SeqIO

# Collect the length of every record in the FASTA file.
lengths = [len(r.seq) for r in SeqIO.parse('sequences.fasta', 'fasta')]
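The core notes bin these lengths with `bin_size = 100` and a `Counter`. The sketch below wraps that step in two functions so it can be reused. The names `length_histogram` and `format_histogram` are assumptions for illustration, and the input is any list of integer lengths such as `lengths` above.

```python
from collections import Counter


def length_histogram(lengths, bin_size=100):
    """Map each bin's lower bound to the number of lengths that fall in it."""
    # Integer division floors each length to the start of its bin.
    bins = [(length // bin_size) * bin_size for length in lengths]
    return Counter(bins)


def format_histogram(histogram, bin_size=100):
    """Render one 'start-end: count' line per bin, in ascending order."""
    return [
        f"{start}-{start + bin_size}: {histogram[start]}"
        for start in sorted(histogram)
    ]
```

Each label's upper bound is exclusive: a sequence of length 100 is counted in the `100-200` bin, not in `0-100`.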

## Comprehensive Summary Report

**Goal:** Generate a complete QC summary (counts, lengths, N50, GC) for any FASTA file.

**Approach:** Load all records, compute length and GC arrays, derive N50/L50 from cumulative sorted lengths, and package into a dictionary.

**Reference (BioPython 1.83+):**
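The reference code itself is not included in this excerpt. As a rough illustration of the approach described, the sketch below derives N50/L50 from the cumulative sorted lengths and packages the metrics into a dictionary. The function names, the dictionary keys, and the choice to take precomputed `lengths` and `gc_values` lists are all assumptions, not the skill's actual `sequence_summary`.

```python
from statistics import mean, median


def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    total = sum(lengths)
    cumulative = 0
    # Walk from longest to shortest until half the total length is covered.
    for rank, length in enumerate(sorted(lengths, reverse=True), start=1):
        cumulative += length
        if cumulative * 2 >= total:
            return length, rank
    return 0, 0  # empty input


def summarize(lengths, gc_values):
    """Package basic QC metrics into a dictionary (hypothetical layout)."""
    if not lengths:
        return {"count": 0}
    n50, l50 = n50_l50(lengths)
    return {
        "count": len(lengths),
        "total_length": sum(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "n50": n50,
        "l50": l50,
        "mean_gc": mean(gc_values),
    }
```

With Biopython, the two input lists could be built from `records = list(SeqIO.parse('sequences.fasta', 'fasta'))` as `[len(r.seq) for r in records]` and `[gc_fraction(r.seq) for r in records]`, with `gc_fraction` imported from `Bio.SeqUtils`.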

## Compare Multiple Assemblies

**Goal:** Generate a side-by-side comparison table of key metrics across multiple assembly files.

**Approach:** Run `sequence_summary` on each file and format results into an aligned table.

**Reference (BioPython 1.83+):**
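The reference code for this step is also missing from the excerpt. One way to format the results into an aligned table is sketched below. It assumes the per-file summaries are already available as dictionaries keyed by an assembly label; `comparison_table` and the default metric names are hypothetical.

```python
def comparison_table(summaries, metrics=("count", "total_length", "n50", "l50")):
    """Format {label: summary_dict} as an aligned, side-by-side text table."""
    labels = list(summaries)
    # Build rows of strings first so column widths can be measured.
    rows = [["metric", *labels]]
    for metric in metrics:
        values = [str(summaries[label].get(metric, "NA")) for label in labels]
        rows.append([metric, *values])
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    for row in rows:
        # Left-align the metric name, right-align the values.
        cells = [row[0].ljust(widths[0])]
        cells += [cell.rjust(width) for cell, width in zip(row[1:], widths[1:])]
        lines.append("  ".join(cells))
    return "\n".join(lines)
```

A metric missing from one summary is shown as `NA` rather than raising an error, which keeps the table usable when files were summarized with different options.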

When to use

Use it when analyzing sequence datasets, generating QC reports, or comparing assemblies.

When not to use

Do not rely on this catalog entry alone for installation or maintenance details.

Related upstream skills

read-sequences - Parse sequences for statistics calculation
batch-processing - Calculate stats across multiple files
fastq-quality - Quality score statistics for FASTQ files
sequence-manipulation/sequence-properties - Per-sequence GC content and properties
alignment-files - samtools stats/flagstat for alignment statistics

arxiv-database

arxiv-database: This skill provides Python tools for searching and retrieving preprints from arXiv.org via its public Atom A…

Claude Code · Analysis and processing

K-Dense-AI/claude-scientific-skills

Data and reproducibility · Statistics and data analysis

bayesian-optimizer

bayesian-optimizer: Bayesian optimization for experimental design and hyperparameter tuning in biomedical research.

OpenClaw · NanoClaw · Analysis and processing

FreedomIntelligence/OpenClaw-Medical-Skills

Data and reproducibility · Statistics and data analysis

bio-alignment-files-bam-statistics

bio-alignment-files-bam-statistics: Compute alignment statistics: flagstat, idxstats, coverage depth.

OpenClaw · NanoClaw · Analysis and processing

FreedomIntelligence/OpenClaw-Medical-Skills

Data and reproducibility · Statistics and data analysis

bio-alignment-msa-statistics

bio-alignment-msa-statistics: Calculate alignment statistics, covering sequence identity, conservation scores, substitution matri…

OpenClaw · NanoClaw · Analysis and processing

FreedomIntelligence/OpenClaw-Medical-Skills