claw-semantic-sim

Maintainer FreedomIntelligence · Last updated April 1, 2026

Semantic Similarity Index for disease research literature using PubMedBERT embeddings.

OpenClaw · NanoClaw · Analysis · Reproduction · claw-semantic-sim · clawbio pipelines · structural biology & literature · semantic

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/claw-semantic-sim

Maintainer: FreedomIntelligence
License: MIT
Last updated: April 1, 2026

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Measure how isolated or connected disease research is across the global biomedical literature, using PubMedBERT embeddings on PubMed abstracts spanning 175 GBD diseases.
  • Semantic Isolation Index (SII): average cosine distance to k-nearest disease neighbours; higher = more isolated, less connected research.
  • Knowledge Transfer Potential (KTP): cross-disease centroid similarity; higher = more potential for research spillover.
  • Research Clustering Coefficient (RCC): within-disease embedding variance; higher = more diverse research approaches.
  • Temporal Semantic Drift: cosine distance between yearly centroids; measures how research focus evolves.
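The four metrics above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the skill's implementation: the function names are hypothetical, KTP is taken here as the mean cosine similarity of a disease centroid to all other centroids, RCC as the mean per-dimension variance of a disease's embeddings, and the default k = 5 is arbitrary. A real pipeline would most likely vectorise this with NumPy.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def centroid(vectors):
    """Element-wise mean of a list of embedding vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def semantic_isolation_index(centroids, k=5):
    """SII: mean cosine distance from each disease to its k nearest neighbours.

    `centroids` maps disease name -> centroid vector (hypothetical layout).
    """
    sii = {}
    for name, c in centroids.items():
        dists = sorted(
            cosine_distance(c, other)
            for other_name, other in centroids.items()
            if other_name != name
        )
        nearest = dists[:k]
        sii[name] = sum(nearest) / len(nearest)
    return sii

def knowledge_transfer_potential(centroids):
    """KTP (assumed definition): mean cosine similarity to all other centroids."""
    ktp = {}
    for name, c in centroids.items():
        sims = [
            1.0 - cosine_distance(c, other)
            for other_name, other in centroids.items()
            if other_name != name
        ]
        ktp[name] = sum(sims) / len(sims)
    return ktp

def research_clustering_coefficient(vectors):
    """RCC (assumed definition): mean per-dimension variance within one disease."""
    c = centroid(vectors)
    n = len(vectors)
    per_dim = [sum((v[i] - c[i]) ** 2 for v in vectors) / n for i in range(len(c))]
    return sum(per_dim) / len(per_dim)

def temporal_drift(yearly_centroids):
    """Cosine distance between centroids of consecutive available years.

    `yearly_centroids` maps year -> centroid vector for a single disease.
    """
    years = sorted(yearly_centroids)
    return {
        (a, b): cosine_distance(yearly_centroids[a], yearly_centroids[b])
        for a, b in zip(years, years[1:])
    }
```

Working on disease centroids rather than individual abstracts keeps the SII and KTP computations quadratic in the number of diseases (175 here) instead of in the number of papers.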

Source Doc

Excerpt From SKILL.md

Why this exists

If you ask ChatGPT to "measure research neglect for diseases," it will:

  • Not know which embedding model to use for biomedical text
  • Hallucinate metrics that sound plausible but have no methodological grounding
  • Skip quality filtering (year coverage, abstract coverage, minimum papers)
  • Not handle MPS acceleration or checkpointed batch processing
  • Produce a single scatter plot with no disease classification

This skill encodes the correct methodological decisions:

  • Uses PubMedBERT (the gold-standard biomedical language model)
  • Fetches from PubMed with exponential backoff and NCBI rate limiting
  • Quality filters: year coverage >= 70%, abstract coverage >= 95%, minimum 50 papers
  • Batch embedding with Apple MPS acceleration and CPU fallback
  • Checkpointed processing (resume after interruption)
  • HDF5 storage with gzip compression and SHA-256 checksums
  • Classification against WHO NTD list and Global South priority diseases
  • Statistical significance testing (Welch's t-test, Cohen's d)
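Three of the decisions listed above (the quality thresholds, retry with exponential backoff, and SHA-256 checksums) can be illustrated with the standard library alone. The sketch below is not taken from the skill's code: the function names, the jitter term, and the retry defaults are assumptions, and the coverage values are passed in as precomputed fractions because the excerpt does not say how they are derived.

```python
import hashlib
import random
import time

# Thresholds quoted in the list above.
MIN_YEAR_COVERAGE = 0.70
MIN_ABSTRACT_COVERAGE = 0.95
MIN_PAPERS = 50

def passes_quality_filters(n_papers, year_coverage, abstract_coverage):
    """Return True when a disease corpus meets all three thresholds.

    Coverage arguments are fractions in [0, 1]; how they are computed
    is left to the caller (not specified in the excerpt).
    """
    return (
        n_papers >= MIN_PAPERS
        and year_coverage >= MIN_YEAR_COVERAGE
        and abstract_coverage >= MIN_ABSTRACT_COVERAGE
    )

def with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with exponentially growing delays on failure.

    The delay before retry i is base_delay * 2**i plus random jitter.
    The jitter and the defaults are illustrative choices. Real code would
    catch only network or HTTP errors rather than every Exception.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, e.g. for verifying a stored embedding file."""
    return hashlib.sha256(data).hexdigest()
```

Passing `sleep` as a parameter makes the retry logic testable without real waiting; a caller would typically wrap the PubMed request in a small closure and hand it to `with_backoff`.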

Key Finding

Neglected tropical diseases (NTDs) are significantly more semantically isolated than other conditions (P < 0.001, Cohen's d ≥ 0.8). They sit in knowledge silos with few cross-disciplinary research bridges. The 25 most isolated diseases are disproportionately Global South priority conditions.
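For readers who want to see what the reported statistics measure, here is a minimal sketch of Welch's t statistic (with Welch-Satterthwaite degrees of freedom) and Cohen's d with a pooled standard deviation. The function names are hypothetical and the skill may compute these differently; obtaining the P value additionally needs the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples.

    Unlike Student's t-test, this does not assume equal variances.
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)

# Made-up SII values for illustration only; these are not results from the skill.
ntd_sii = [0.42, 0.47, 0.45, 0.50]
other_sii = [0.35, 0.38, 0.33, 0.40]
t_stat, dof = welch_t(ntd_sii, other_sii)
effect = cohens_d(ntd_sii, other_sii)
```

By Cohen's conventional benchmarks, d values around 0.8 or above are read as a large effect, which is the sense in which the finding above is reported.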

Demo (works out of the box)

The demo uses pre-computed embeddings and metrics for 175 GBD diseases and generates the full 4-panel figure instantly.

Use cases

  • Use claw-semantic-sim in research workflows aligned with this subject area.
  • Follow the upstream documentation for the full working procedure.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills


Adaptyv

Adaptyv is a cloud laboratory platform that provides automated protein testing and validation services. Submit protein sequences via API or…

Claude Code · OpenClaw · Analysis
K-Dense-AI/claude-scientific-skills

alphafold

Validate protein designs using AlphaFold2 structure prediction. Use this skill when: (1) Validating designed sequences fold correctly, (2) P…

OpenClaw · NanoClaw · Analysis
FreedomIntelligence/OpenClaw-Medical-Skills

antibody-design-agent

Antibody design: epitope mapping, CDR engineering, bispecific construction.

OpenClaw · NanoClaw · Analysis
FreedomIntelligence/OpenClaw-Medical-Skills

bindcraft

End-to-end binder design using BindCraft hallucination. Use this skill when: (1) Designing protein binders with built-in AF2 validation, (2)…

OpenClaw · NanoClaw · Analysis
FreedomIntelligence/OpenClaw-Medical-Skills