phylogenetics

Maintainer: Kuan-lin Huang · Last updated: April 1, 2026

Phylogenetic analysis reconstructs the evolutionary history of biological sequences (genes, proteins, genomes) by inferring the branching pattern of descent. This skill covers the standard pipeline:

  1. **MAFFT**: multiple sequence alignment
  2. **IQ-TREE 2**: maximum likelihood tree inference with model selection
  3. **FastTree**: fast approximate maximum likelihood (for large datasets)
  4. **ETE3**: Python library…
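
The tree-inference stage of the pipeline above can be sketched as plain argument lists. This is a minimal illustration, not the skill's own code: the helper name `build_tree_commands`, the executable names `iqtree2` and `FastTree`, the output file name, and the choice of 1000 ultrafast bootstrap replicates are all assumptions. Nothing is executed; the function only builds the commands.

```python
def build_tree_commands(alignment, prefix="tree", nucleotide=True, n_bootstrap=1000):
    """Build (but do not run) commands for IQ-TREE 2 and FastTree.

    Hypothetical helper: executable names and defaults are assumptions.
    """
    iqtree_cmd = [
        "iqtree2",
        "-s", alignment,          # input alignment
        "-m", "MFP",              # ModelFinder Plus: pick a model, then infer the tree
        "-B", str(n_bootstrap),   # ultrafast bootstrap replicates
        "-T", "AUTO",             # let IQ-TREE choose the number of threads
        "--prefix", prefix,       # prefix for all output files
    ]
    # FastTree writes the Newick tree to stdout, so the caller redirects it
    # to the file named under "fasttree_output".
    model_flags = ["-gtr", "-nt"] if nucleotide else ["-lg"]
    fasttree_cmd = ["FastTree"] + model_flags + [alignment]
    return {
        "iqtree": iqtree_cmd,
        "fasttree": fasttree_cmd,
        "fasttree_output": f"{prefix}.fasttree.nwk",
    }


if __name__ == "__main__":
    for name, value in build_tree_commands("aligned.fasta", prefix="run1").items():
        print(name, value)
```

Each list can be passed to `subprocess.run`; for FastTree, open the output file and pass it as `stdout`, in the same way the MAFFT wrapper in this entry does.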

Tags: Claude Code, OpenClaw, NanoClaw, analysis and processing, reproducible experiments, phylogenetics, bioinformatics, analysis, analysis & methodology

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/phylogenetics

License: Unknown

Skill summary

Key information from SKILL.md

Key points

  • Phylogenetic analysis reconstructs the evolutionary history of biological sequences (genes, proteins, genomes) by inferring the branching pattern of descent.
  • The skill covers the standard pipeline: (1) MAFFT for multiple sequence alignment; (2) IQ-TREE 2 for maximum likelihood tree inference with model selection; (3) FastTree for fast approximate maximum likelihood on large datasets; (4) ETE3, a Python library for tree manipulation and visualization.
  • Installation: conda install -c bioconda mafft iqtree fasttree, then pip install ete3.
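
After installing, it can help to confirm that the tools are actually reachable before starting a run. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the executable names `mafft`, `iqtree2`, and `FastTree` may differ depending on the installed package versions.

```python
import importlib.util
import shutil


def find_missing_tools(tools=("mafft", "iqtree2", "FastTree")):
    """Return the executables from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def has_python_package(name="ete3"):
    """Return True if the named top-level package can be located for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line tools found.")
    print("ete3 importable:", has_python_package("ete3"))
```

If a tool is reported missing, check that the conda environment is activated and adjust the names passed to `find_missing_tools` to match the installed binaries.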

Original documentation

SKILL.md excerpt

When to Use This Skill

Use phylogenetics for:

  • Evolutionary relationships: Which organism/gene is most closely related to my sequence?
  • Viral phylodynamics: Trace outbreak spread and estimate transmission dates
  • Protein family analysis: Infer evolutionary relationships within a gene family
  • Horizontal gene transfer detection: Identify genes with discordant species/gene trees
  • Ancestral sequence reconstruction: Infer ancestral protein sequences
  • Molecular clock analysis: Estimate divergence dates using temporal sampling
  • GWAS companion: Place variants in evolutionary context (e.g., SARS-CoV-2 variants)
  • Microbiology: Species phylogeny from 16S rRNA or core genome phylogeny

1. Multiple Sequence Alignment with MAFFT

import subprocess

def run_mafft(input_fasta: str, output_fasta: str, method: str = "auto",
               n_threads: int = 4) -> str:
    """
    Align sequences with MAFFT.

    Args:
        input_fasta: Path to unaligned FASTA file
        output_fasta: Path for aligned output
        method: 'auto' (auto-select), 'einsi' (accurate), 'linsi' (accurate, slow),
                'fftnsi' (medium), 'fftns' (fast), 'retree2' (fast)
        n_threads: Number of CPU threads

    Returns:
        Path to aligned FASTA file
    """
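    # Map each preset name to its MAFFT command-line options.
    methods = {
        "auto": ["mafft", "--auto"],
        "einsi": ["mafft", "--genafpair", "--maxiterate", "1000"],
        "linsi": ["mafft", "--localpair", "--maxiterate", "1000"],
        # FFT-NS-i and FFT-NS-2 are expressed through --retree/--maxiterate
        "fftnsi": ["mafft", "--retree", "2", "--maxiterate", "2"],
        "fftns": ["mafft", "--retree", "2", "--maxiterate", "0"],
        "retree2": ["mafft", "--retree", "2"],
    }

    # Copy the preset so the extra arguments do not modify the table;
    # unknown method names fall back to --auto.
    cmd = list(methods.get(method, methods["auto"]))
    cmd += ["--thread", str(n_threads), "--inputorder", input_fasta]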

    with open(output_fasta, 'w') as out:
        result = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE, text=True)

    if result.returncode != 0:
        raise RuntimeError(f"MAFFT failed:\n{result.stderr}")

    # Count aligned sequences
    with open(output_fasta) as f:
        n_seqs = sum(1 for line in f if line.startswith('>'))
    print(f"MAFFT: aligned {n_seqs} sequences → {output_fasta}")

    return output_fasta
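
A quick sanity check on the aligned output is often worthwhile before tree inference, since every sequence in a valid alignment has the same length once gaps are included. The following is a minimal sketch with hypothetical helper names (`read_fasta`, `check_alignment`); it assumes unique FASTA headers, because records with duplicate headers would overwrite each other.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict of {header: sequence}."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = []
            elif name is not None:
                # Sequences may wrap over several lines; collect the pieces.
                records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


def check_alignment(path):
    """Return (n_sequences, alignment_length); raise ValueError if lengths differ."""
    records = read_fasta(path)
    if not records:
        raise ValueError(f"No sequences found in {path}")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"Sequences have unequal lengths: {sorted(lengths)}")
    return len(records), lengths.pop()
```

Calling `check_alignment` on the path returned by `run_mafft` gives the number of sequences and the alignment width, or raises if the file does not look like an alignment.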

When to use

  • **Evolutionary relationships**: Which organism/gene is most closely related to my sequence?
  • **Viral phylodynamics**: Trace outbreak spread and estimate transmission dates.
  • **Protein family analysis**: Infer evolutionary relationships within a gene family.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

  • alphafold (FreedomIntelligence/OpenClaw-Medical-Skills): Validate protein designs using AlphaFold2 structure prediction. Use this skill when: (1) Validating designed sequences f…
  • AlphaFold DB (K-Dense-AI/claude-scientific-skills): AlphaFold DB is a public repository of AI-predicted 3D protein structures for over 200 million proteins, maintained by D…
  • bindcraft (FreedomIntelligence/OpenClaw-Medical-Skills): End-to-end binder design using BindCraft hallucination. Use this skill when: (1) Designing protein binders with built-in…
  • binder-design (FreedomIntelligence/OpenClaw-Medical-Skills): Guidance for choosing the right protein binder design tool. Use this skill when: (1) Deciding between BoltzGen, BindCraf…