
bio-sequence-statistics

Maintainer: FreedomIntelligence · Last updated: April 1, 2026

bio-sequence-statistics: Calculate sequence statistics (N50, length distribution, GC content, summary reports) using Biopython. Use it when analyzing sequence datasets, generating QC reports, or comparing assemblies.

Tags: OpenClaw · NanoClaw · analysis and processing · reproducible experiments · bio-sequence-statistics · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics - sequencing & read qc · calculate

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-sequence-statistics

License: MIT

Skill summary

Key information from SKILL.md

Core notes

  • Python: SeqIO.parse(), gc_fraction() (Biopython).
  • "Calculate N50 and other assembly statistics" → Compute sequence count, length distribution, N50/L50, GC content, and nucleotide composition for FASTA datasets.
  • Calculate comprehensive statistics for sequence datasets using Biopython.
  • Length binning: bin_size = 100; bins = [(l // bin_size) * bin_size for l in lengths]; histogram = Counter(bins).
  • Histogram output: for length_bin in sorted(histogram.keys()): count = histogram[length_bin]; print(f'{length_bin}-{length_bin + bin_size}: {count}').
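The notes above list nucleotide composition and GC content among the computed statistics but show no code for them. A minimal sketch follows, using only the standard library in place of Biopython's `gc_fraction()`. The function name `nucleotide_composition` is hypothetical, and excluding ambiguous bases such as `N` from the GC denominator is an assumption of this sketch, not something the source states.

```python
from collections import Counter


def nucleotide_composition(sequences):
    """Return (per-base counts, overall GC fraction) for an iterable of sequences.

    Assumption: only A, C, G and T count toward the GC denominator, so
    ambiguous bases such as N are tallied but do not dilute the GC value.
    """
    counts = Counter()
    for seq in sequences:
        # str() lets this accept plain strings as well as Biopython Seq objects.
        counts.update(str(seq).upper())
    acgt_total = sum(counts[base] for base in "ACGT")
    gc = (counts["G"] + counts["C"]) / acgt_total if acgt_total else 0.0
    return dict(counts), gc
```

With Biopython installed, the sequences could be supplied as `(r.seq for r in SeqIO.parse('sequences.fasta', 'fasta'))`.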

Original documentation

SKILL.md excerpt

## Length Histogram Data

from collections import Counter
from Bio import SeqIO

lengths = [len(r.seq) for r in SeqIO.parse('sequences.fasta', 'fasta')]
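The excerpt stops once the lengths are collected. The binning and printing steps quoted in the core notes (`bin_size = 100`, a `Counter` over bin starts, sorted output) can be assembled into one runnable sketch. The function names here are hypothetical. Biopython is imported inside the loader so the binning helpers can also be tried on a plain list of integers.

```python
from collections import Counter


def lengths_from_fasta(path):
    """Read sequence lengths from a FASTA file (requires Biopython)."""
    from Bio import SeqIO  # imported lazily; only this helper needs Biopython
    return [len(record.seq) for record in SeqIO.parse(path, "fasta")]


def length_histogram(lengths, bin_size=100):
    """Count sequences per bin; each key is the lower edge of its bin."""
    return Counter((length // bin_size) * bin_size for length in lengths)


def histogram_lines(histogram, bin_size=100):
    """Format the histogram as 'start-end: count' lines in ascending order."""
    return [
        f"{length_bin}-{length_bin + bin_size}: {histogram[length_bin]}"
        for length_bin in sorted(histogram)
    ]
```

A typical call would be `print("\n".join(histogram_lines(length_histogram(lengths_from_fasta("sequences.fasta")))))`. Each range is half-open, so a sequence of exactly 100 bp falls in the `100-200` bin.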

## Comprehensive Summary Report

**Goal:** Generate a complete QC summary (counts, lengths, N50, GC) for any FASTA file.

**Approach:** Load all records, compute length and GC arrays, derive N50/L50 from cumulative sorted lengths, and package into a dictionary.

**Reference (BioPython 1.83+):**
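The reference code is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of the approach described above, not the upstream implementation. The dictionary keys, the helper names `n50_l50` and `summarize`, and the choice to report GC as the mean of per-sequence `gc_fraction()` values are all assumptions. Only `SeqIO.parse()` and `gc_fraction()` come from the source.

```python
from statistics import mean


def n50_l50(lengths):
    """Return (N50, L50) from cumulative lengths sorted longest first.

    N50 is the length of the sequence at which the running total first
    reaches half of the total length; L50 is how many sequences that takes.
    """
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:
            return length, count
    return 0, 0  # empty input


def summarize(lengths, gc_values):
    """Package length and GC statistics into a dictionary (hypothetical keys)."""
    n50, l50 = n50_l50(lengths)
    return {
        "num_sequences": len(lengths),
        "total_length": sum(lengths),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "mean_length": mean(lengths) if lengths else 0,
        "n50": n50,
        "l50": l50,
        "mean_gc": mean(gc_values) if gc_values else 0.0,
    }


def sequence_summary(fasta_path):
    """Load every record of a FASTA file and summarize it (requires Biopython)."""
    from Bio import SeqIO
    from Bio.SeqUtils import gc_fraction

    lengths, gc_values = [], []
    for record in SeqIO.parse(fasta_path, "fasta"):
        lengths.append(len(record.seq))
        gc_values.append(gc_fraction(record.seq))
    return summarize(lengths, gc_values)
```

Keeping `summarize` separate from the file loader means the N50/L50 logic can be checked on a plain list of lengths without a FASTA file.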

## Compare Multiple Assemblies

**Goal:** Generate a side-by-side comparison table of key metrics across multiple assembly files.

**Approach:** Run `sequence_summary` on each file and format results into an aligned table.

**Reference (BioPython 1.83+):**
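The reference code for the comparison is likewise missing from this excerpt. The formatting step can be sketched as below, assuming each assembly has already been reduced to a summary dictionary (for example by a function such as `sequence_summary`). The function name `format_comparison`, the metric keys, and the column layout are hypothetical.

```python
def format_comparison(summaries, metrics=("num_sequences", "total_length", "n50", "l50")):
    """Render an aligned text table with one row per metric and one column per assembly.

    `summaries` maps an assembly label to its summary dictionary; metrics
    missing from a summary are shown as 'NA'.
    """
    labels = list(summaries)
    rows = [["metric", *labels]]
    for metric in metrics:
        rows.append([metric, *(str(summaries[label].get(metric, "NA")) for label in labels)])

    # Column widths are set by the widest cell in each column.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    lines = []
    for row in rows:
        name = row[0].ljust(widths[0])  # metric names left-aligned
        values = [cell.rjust(width) for cell, width in zip(row[1:], widths[1:])]  # numbers right-aligned
        lines.append("  ".join([name, *values]))
    return "\n".join(lines)
```

In use, the `summaries` argument might be built as `{path: sequence_summary(path) for path in assembly_paths}`, where `assembly_paths` is a list of FASTA files.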

When to use

  • Use when analyzing sequence datasets, generating QC reports, or comparing assemblies.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related upstream skills

  • read-sequences - Parse sequences for statistics calculation
  • batch-processing - Calculate stats across multiple files
  • fastq-quality - Quality score statistics for FASTQ files
  • sequence-manipulation/sequence-properties - Per-sequence GC content and properties
  • alignment-files - samtools stats/flagstat for alignment statistics
