bio-variant-calling-filtering-best-practices

Maintainer FreedomIntelligence · Last updated April 1, 2026

Comprehensive variant filtering including GATK VQSR, hard filters, bcftools expressions, and quality metric interpretation for SNPs and indels. Use when filtering variants using GATK best practices.

Tags: OpenClaw · NanoClaw · Analysis · Reproduction · bio-variant-calling-filtering-best-practices · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics — clinical databases & variant analysis · comprehensive

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-variant-calling-filtering-best-practices

License
MIT

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • SNP hard filters:

      gatk VariantFiltration \
        -R reference.fa \
        -V raw_snps.vcf \
        -O filtered_snps.vcf \
        --filter-expression "QD < 2.0" --filter-name "QD2" \
        --filter-expression "FS > 60.0" --filter-name "FS60" \
        --filter-expression "MQ < 40.0" --filter-name "MQ40" \
        --filter-expression "MQRankSum < -12.5" --filter-name "MQRankSum-12.5" \
        --filter-expression "ReadPosRankSum < -8.0" --filter-name "ReadPosRankSum-8" \
        --filter-expression "SOR > 3.0" --filter-name "SOR3"

  • Indel hard filters:

      gatk VariantFiltration \
        -R reference.fa \
        -V raw_indels.vcf \
        -O filtered_indels.vcf \
        --filter-expression "QD < 2.0" --filter-name "QD2" \
        --filter-expression "FS > 200.0" --filter-name "FS200" \
        --filter-expression "ReadPosRankSum < -20.0" --filter-name "ReadPosRankSum-20" \
        --filter-expression "SOR > 10.0" --filter-name "SOR10"

Source Doc

Excerpt From SKILL.md

## GATK Hard Filter Thresholds

**Goal:** Apply GATK-recommended annotation thresholds to separate true variants from artifacts.

**Approach:** Use VariantFiltration with per-metric filter expressions for SNPs and indels separately.

"Filter my variants using GATK best practices" → Apply fixed annotation thresholds (QD, FS, MQ, SOR, RankSum) to flag low-quality variants.
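
Filtering SNPs and indels separately implies a split step before filtering and a merge step afterwards, which the excerpt does not show. The following is a minimal sketch of that wrapper, assuming GATK4's `SelectVariants` and `MergeVcfs` tools. The file names `raw.vcf` and `filtered.vcf` and the `run` helper are hypothetical and only for illustration. By default the script prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Sketch: split a call set by variant type, hard-filter each part, merge.
# raw.vcf and filtered.vcf are hypothetical names; adjust to your data.

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# 1. Split SNPs and indels so each type gets its own thresholds.
run gatk SelectVariants -V raw.vcf -select-type SNP -O raw_snps.vcf
run gatk SelectVariants -V raw.vcf -select-type INDEL -O raw_indels.vcf

# 2. Run the per-type VariantFiltration commands here
#    (raw_snps.vcf -> filtered_snps.vcf, raw_indels.vcf -> filtered_indels.vcf).

# 3. Recombine the two filtered files into a single call set.
run gatk MergeVcfs -I filtered_snps.vcf -I filtered_indels.vcf -O filtered.vcf
```

VariantFiltration only annotates the FILTER column, so the merged file still contains the failing records; whether to drop them afterwards is a separate decision.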


## Understanding Quality Metrics

| Metric | Meaning | Passing Value | What It Measures |
|--------|---------|---------------|------------------|
| QD | Quality by Depth | ≥2 | Variant quality normalized by depth |
| FS | Fisher Strand | ≤60 SNP, ≤200 indel | Strand bias |
| MQ | Mapping Quality | ≥40 (SNP) | RMS mapping quality |
| MQRankSum | MQ Rank Sum | ≥-12.5 (SNP) | Ref vs alt mapping quality |
| ReadPosRankSum | Read Position Rank Sum | ≥-8 SNP, ≥-20 indel | Position-in-read bias |
| SOR | Strand Odds Ratio | ≤3 SNP, ≤10 indel | Strand bias |
| DP | Depth | Sample-specific | Avoid extremes |
| GQ | Genotype Quality | >20 | Confidence in genotype |
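
To make the thresholds concrete, the sketch below checks the SNP cut-offs from the hard-filter command against a single INFO string using awk. It is an illustration only and not how GATK evaluates filters. The function name `snp_hard_filters` is hypothetical, annotations missing from the string are simply skipped, and failing filter names are joined with `;` in the order the annotations appear.

```shell
# Print the SNP hard-filter names that an INFO string would trigger,
# or PASS if none apply. Assumption: absent annotations are ignored.
snp_hard_filters() {
  printf '%s\n' "$1" | awk -F';' '
    {
      out = ""
      for (i = 1; i <= NF; i++) {
        n = index($i, "=")
        if (n == 0) continue            # flag-style entry without a value
        key = substr($i, 1, n - 1)
        val = substr($i, n + 1) + 0     # force a numeric comparison
        if (key == "QD" && val < 2.0)               out = out ";QD2"
        if (key == "FS" && val > 60.0)              out = out ";FS60"
        if (key == "MQ" && val < 40.0)              out = out ";MQ40"
        if (key == "MQRankSum" && val < -12.5)      out = out ";MQRankSum-12.5"
        if (key == "ReadPosRankSum" && val < -8.0)  out = out ";ReadPosRankSum-8"
        if (key == "SOR" && val > 3.0)              out = out ";SOR3"
      }
      sub(/^;/, "", out)                # drop the leading separator
      print (out == "" ? "PASS" : out)
    }'
}

# Low QD and high strand bias: two filters fire.
snp_hard_filters 'QD=1.5;FS=75.2;MQ=60.0;SOR=1.1'
# All annotations within range.
snp_hard_filters 'QD=25.0;FS=1.2;MQ=60.0;SOR=0.7'
```

For indels, the same structure would use the indel thresholds (FS > 200, ReadPosRankSum < -20, SOR > 10) and omit the MQ-based checks.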

## bcftools filter

**Goal:** Filter variants using bcftools expression syntax with soft or hard removal.

**Approach:** Use -e (exclude) or -i (include) with expressions on QUAL, INFO, and FORMAT fields; use -s for soft filtering.
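
A hedged sketch of the three modes follows, reusing the SNP thresholds from the hard-filter command in bcftools expression syntax. The input and output file names, the `HardFilter` label, and the `QUAL >= 30` and `GQ < 20` cut-offs are assumptions chosen for illustration. The commands run only if bcftools and the input file are present.

```shell
# SNP thresholds from the GATK hard-filter command, written as a bcftools
# exclusion expression (a record matching any term fails).
SNP_FAIL='INFO/QD < 2.0 || INFO/FS > 60.0 || INFO/MQ < 40.0 || INFO/MQRankSum < -12.5 || INFO/ReadPosRankSum < -8.0 || INFO/SOR > 3.0'
IN=raw_snps.vcf.gz   # hypothetical input file

if command -v bcftools >/dev/null 2>&1 && [ -f "$IN" ]; then
  # Soft filter (-s): keep every record, write HardFilter into FILTER for failures.
  bcftools filter -s HardFilter -e "$SNP_FAIL" -Oz -o snps.soft.vcf.gz "$IN"

  # Hard removal with -e: drop the records that match the expression.
  bcftools filter -e "$SNP_FAIL" -Oz -o snps.hard.vcf.gz "$IN"

  # Include form (-i): keep only the records that match the expression.
  bcftools filter -i 'QUAL >= 30' -Oz -o snps.qual30.vcf.gz "$IN"

  # FORMAT field: set low-confidence genotypes to missing rather than dropping sites.
  bcftools filter -S . -e 'FMT/GQ < 20' -Oz -o snps.gq20.vcf.gz "$IN"
else
  echo "skipping: bcftools or $IN not found" >&2
fi
```

Soft filtering keeps the decision reversible, since failing records can still be removed later with `bcftools view -f PASS`.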

Use cases

  • Use when filtering variants using GATK best practices.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Upstream Related Skills

  • variant-calling/variant-calling - Generate VCF files
  • variant-calling/gatk-variant-calling - GATK VQSR
  • variant-calling/variant-annotation - Annotation after filtering
  • variant-calling/vcf-statistics - Evaluate filter effects
