arxiv-database
This skill provides Python tools for searching and retrieving preprints from arXiv.org via its public Atom API. It supports keyword search,…
Maintainer FreedomIntelligence · Last updated April 1, 2026
Calculate sequence statistics (N50, length distribution, GC content, summary reports) using Biopython. Use when analyzing sequence datasets, generating QC reports, or comparing assemblies.
Original source
https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-sequence-statistics
Source Doc
from collections import Counter
from Bio import SeqIO

lengths = [len(r.seq) for r in SeqIO.parse('sequences.fasta', 'fasta')]
## Comprehensive Summary Report
**Goal:** Generate a complete QC summary (counts, lengths, N50, GC) for any FASTA file.
**Approach:** Load all records, compute length and GC arrays, derive N50/L50 from cumulative sorted lengths, and package into a dictionary.
**Reference (Biopython 1.83+):**
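A minimal sketch of the approach described above, not the skill's original code. The helper names `n50_l50` and `summarize` and the dictionary keys are hypothetical. Only the name `sequence_summary` comes from this document. GC content is assumed here to be G plus C over total length, with ambiguous bases such as N counted in the denominator. The statistics are computed in plain Python so they can be checked without a FASTA file, and Biopython is only needed for the file-loading wrapper.

```python
from statistics import mean, median


def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    return 0, 0  # empty input


def summarize(seqs):
    """Build a QC summary dictionary from an iterable of sequences."""
    seqs = [str(s).upper() for s in seqs]
    lengths = [len(s) for s in seqs]
    if not lengths:
        return {'count': 0, 'total_bp': 0}
    total = sum(lengths)
    # Assumption: GC is counted over all bases, including N.
    gc = sum(s.count('G') + s.count('C') for s in seqs)
    n50, l50 = n50_l50(lengths)
    return {
        'count': len(lengths),
        'total_bp': total,
        'min_len': min(lengths),
        'max_len': max(lengths),
        'mean_len': mean(lengths),
        'median_len': median(lengths),
        'n50': n50,
        'l50': l50,
        'gc_percent': 100 * gc / total if total else 0.0,
    }


def sequence_summary(path, fmt='fasta'):
    """Load records with Biopython and summarize them."""
    # Imported here so summarize() stays usable without Biopython installed.
    from Bio import SeqIO
    return summarize(record.seq for record in SeqIO.parse(path, fmt))
```

Calling `sequence_summary('sequences.fasta')` would return one dictionary per file. Separating `summarize` from the loader is a design choice for testability, not something the source prescribes.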
## Compare Multiple Assemblies
**Goal:** Generate a side-by-side comparison table of key metrics across multiple assembly files.
**Approach:** Run `sequence_summary` on each file and format results into an aligned table.
**Reference (Biopython 1.83+):**
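A minimal sketch of the table-formatting step, not the skill's original code. The function name `format_comparison` and the metric keys (`count`, `total_bp`, `n50`, `l50`, `gc_percent`) are hypothetical. The function takes a mapping from assembly label to summary dictionary, on the assumption that each dictionary is what `sequence_summary` returns for one file.

```python
def format_comparison(summaries,
                      metrics=('count', 'total_bp', 'n50', 'l50', 'gc_percent')):
    """Render {label: summary_dict} as an aligned text table.

    Each row is one metric and each column is one assembly.
    """
    labels = list(summaries)

    def cell(value):
        # Missing metrics are shown as NA instead of raising KeyError.
        if value is None:
            return 'NA'
        if isinstance(value, float):
            return f'{value:.2f}'
        if isinstance(value, int):
            return f'{value:,}'
        return str(value)

    rows = [['metric'] + labels]
    for metric in metrics:
        rows.append([metric] + [cell(summaries[label].get(metric))
                                for label in labels])

    # Column width is the widest cell in that column, header included.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    for row in rows:
        first = row[0].ljust(widths[0])          # metric names left-aligned
        rest = [text.rjust(width)                # values right-aligned
                for text, width in zip(row[1:], widths[1:])]
        lines.append('  '.join([first] + rest))
    return '\n'.join(lines)
```

In practice the input would be built with something like `{path: sequence_summary(path) for path in assembly_files}`, where `assembly_files` is a hypothetical list of FASTA paths, and the result passed to `print`.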