phylogenetics

Maintainer Kuan-lin Huang · Last updated April 1, 2026

Phylogenetic analysis reconstructs the evolutionary history of biological sequences (genes, proteins, genomes) by inferring the branching pattern of descent. This skill covers the standard pipeline: **MAFFT** for multiple sequence alignment, **IQ-TREE 2** for maximum likelihood tree inference with model selection, **FastTree** for fast approximate maximum likelihood on large datasets, and **ETE3**, a Python library for tree manipulation and visualization.

Tags: Claude Code, OpenClaw, NanoClaw, Analysis, Reproduction, phylogenetics, bioinformatics, analysis, analysis & methodology

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/phylogenetics

License: Unknown

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Pipeline order: align with MAFFT, infer a maximum likelihood tree with IQ-TREE 2 (or FastTree for large datasets), then manipulate and visualize the tree with ETE3.
  • Install the command-line tools from Bioconda (conda install -c bioconda mafft iqtree fasttree) and ETE3 from PyPI (pip install ete3).
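As a rough illustration of the notes above, the sketch below checks which pipeline tools are on PATH and builds a tree-inference command without running it. The executable names (`iqtree2`, `FastTree` and their lowercase variants), the helper names `find_tools` and `tree_command`, and the `large_threshold` cutoff are assumptions made for this example; they do not come from SKILL.md, so adjust them to your installation.

```python
import shutil
from typing import Dict, List, Optional

# Assumed executable names; installations differ (e.g. `iqtree` vs `iqtree2`).
CANDIDATE_NAMES: Dict[str, List[str]] = {
    "mafft": ["mafft"],
    "iqtree": ["iqtree2", "iqtree"],
    "fasttree": ["FastTree", "fasttree"],
}


def find_tools(candidates: Dict[str, List[str]] = CANDIDATE_NAMES) -> Dict[str, Optional[str]]:
    """Map each tool to the first matching executable on PATH, or None."""
    found: Dict[str, Optional[str]] = {}
    for tool, names in candidates.items():
        found[tool] = next((p for p in map(shutil.which, names) if p), None)
    return found


def tree_command(alignment: str, n_seqs: int, nucleotide: bool = True,
                 large_threshold: int = 10_000, threads: str = "AUTO") -> List[str]:
    """Build (but do not run) a tree-inference command for an alignment.

    large_threshold is a hypothetical cutoff for switching to FastTree;
    pick a value that suits your data and hardware.
    """
    if n_seqs >= large_threshold:
        cmd = ["FastTree"]
        if nucleotide:
            cmd += ["-gtr", "-nt"]  # GTR model for nucleotide alignments
        return cmd + [alignment]    # FastTree writes the Newick tree to stdout
    # -m MFP: model selection followed by tree inference
    # -B 1000: 1000 ultrafast bootstrap replicates
    return ["iqtree2", "-s", alignment, "-m", "MFP", "-B", "1000", "-T", threads]


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
    print(" ".join(tree_command("aligned.fasta", n_seqs=120)))
```

Building the command as a list keeps it ready for `subprocess.run` without shell quoting issues. For FastTree, redirect stdout to a file to capture the tree.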

Source Doc

Excerpt From SKILL.md

When to Use This Skill

Use phylogenetics for questions and tasks such as:

  • Evolutionary relationships: Which organism/gene is most closely related to my sequence?
  • Viral phylodynamics: Trace outbreak spread and estimate transmission dates
  • Protein family analysis: Infer evolutionary relationships within a gene family
  • Horizontal gene transfer detection: Identify genes with discordant species/gene trees
  • Ancestral sequence reconstruction: Infer ancestral protein sequences
  • Molecular clock analysis: Estimate divergence dates using temporal sampling
  • GWAS companion: Place variants in evolutionary context (e.g., SARS-CoV-2 variants)
  • Microbiology: Species phylogeny from 16S rRNA or core genome phylogeny

1. Multiple Sequence Alignment with MAFFT

import subprocess
import os

def run_mafft(input_fasta: str, output_fasta: str, method: str = "auto",
               n_threads: int = 4) -> str:
    """
    Align sequences with MAFFT.

    Args:
        input_fasta: Path to unaligned FASTA file
        output_fasta: Path for aligned output
        method: 'auto' (auto-select), 'einsi' (accurate), 'linsi' (accurate, slow),
                'fftnsi' (medium), 'fftns' (fast), 'retree2' (fast)
        n_threads: Number of CPU threads

    Returns:
        Path to aligned FASTA file
    """
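    # FFT-NS-i and FFT-NS-2 are expressed with the documented
    # --retree/--maxiterate options rather than dedicated flags.
    methods = {
        "auto": ["mafft", "--auto"],
        "einsi": ["mafft", "--genafpair", "--maxiterate", "1000"],
        "linsi": ["mafft", "--localpair", "--maxiterate", "1000"],
        "fftnsi": ["mafft", "--retree", "2", "--maxiterate", "2"],
        "fftns": ["mafft", "--retree", "2", "--maxiterate", "0"],
        "retree2": ["mafft", "--retree", "2"],
    }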

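    # Copy the template so the list stored in `methods` is not mutated;
    # unrecognized method names fall back to --auto.
    cmd = list(methods.get(method, methods["auto"]))
    cmd += ["--thread", str(n_threads), "--inputorder", input_fasta]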

    with open(output_fasta, 'w') as out:
        result = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE, text=True)

    if result.returncode != 0:
        raise RuntimeError(f"MAFFT failed:\n{result.stderr}")

    # Count aligned sequences
    with open(output_fasta) as f:
        n_seqs = sum(1 for line in f if line.startswith('>'))
    print(f"MAFFT: aligned {n_seqs} sequences → {output_fasta}")

    return output_fasta
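Before passing the alignment to a tree builder, it can help to confirm that every row has the same length, since that is what defines a valid alignment. The following is a minimal sketch using only the standard library. `read_fasta` and `check_alignment` are hypothetical helper names for this example and are not part of SKILL.md.

```python
from typing import Dict


def read_fasta(path: str) -> Dict[str, str]:
    """Parse a FASTA file into {header: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                if header in records:
                    raise ValueError(f"duplicate header: {header}")
                records[header] = ""
            elif header is None:
                raise ValueError("sequence data before the first '>' header")
            else:
                records[header] += line
    return records


def check_alignment(path: str) -> int:
    """Return the alignment length, or raise ValueError if rows differ."""
    records = read_fasta(path)
    if not records:
        raise ValueError(f"no sequences found in {path}")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"rows have unequal lengths: {sorted(lengths)}")
    return lengths.pop()
```

A typical call would be `check_alignment(run_mafft("seqs.fasta", "aligned.fasta"))`, where the file names are placeholders.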

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

alphafold

Validate protein designs using AlphaFold2 structure prediction. Use this skill when: (1) Validating designed sequences fold correctly, (2) P…

Tags: OpenClaw, NanoClaw, Analysis
Source: FreedomIntelligence/OpenClaw-Medical-Skills

AlphaFold DB

AlphaFold DB is a public repository of AI-predicted 3D protein structures for over 200 million proteins, maintained by DeepMind and EMBL-EBI…

Tags: Claude Code, OpenClaw, Analysis
Source: K-Dense-AI/claude-scientific-skills

bindcraft

End-to-end binder design using BindCraft hallucination. Use this skill when: (1) Designing protein binders with built-in AF2 validation, (2)…

Tags: OpenClaw, NanoClaw, Analysis
Source: FreedomIntelligence/OpenClaw-Medical-Skills

binder-design

Guidance for choosing the right protein binder design tool. Use this skill when: (1) Deciding between BoltzGen, BindCraft, or RFdiffusion, (…

Tags: OpenClaw, NanoClaw, Analysis
Source: FreedomIntelligence/OpenClaw-Medical-Skills