MarkItDown

Maintainer K-Dense Inc. · Last updated April 1, 2026

MarkItDown is a Python tool developed by Microsoft for converting various file formats to Markdown. It is particularly useful for converting documents into an LLM-friendly text format, because Markdown is token-efficient and well understood by modern language models.
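As a rough illustration of the conversion step described above, the sketch below wraps a single-file conversion. It assumes the `markitdown` package is installed and uses the `MarkItDown().convert(...)` call and `text_content` attribute shown in the upstream README; the helper name `convert_to_markdown` and its `converter` parameter are hypothetical, added only so the function can be exercised with a stand-in object.

```python
from pathlib import Path


def convert_to_markdown(path, converter=None):
    """Return the Markdown text for one file.

    `converter` (hypothetical parameter) is any object with a `.convert(str)`
    method whose result exposes `.text_content`. When omitted, a MarkItDown
    instance is created.
    """
    if converter is None:
        # Imported lazily so the helper can be tested without the package.
        from markitdown import MarkItDown

        converter = MarkItDown()
    result = converter.convert(str(Path(path)))
    return result.text_content
```

Calling `convert_to_markdown("report.docx")` would then return the document as a Markdown string, provided the optional dependencies for that format are installed.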

Claude Code · OpenClaw · NanoClaw · Training · Evaluation · markitdown · document-processing · workflow · document processing & conversion

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/markitdown

Maintainer: K-Dense Inc.
License: MIT license
Last updated: April 1, 2026

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Convert documents to clean, structured Markdown.
  • Token-efficient format for LLM processing.
  • Supports 15+ file formats.
  • Optional AI-enhanced image descriptions.
  • OCR for images and scanned documents.
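The token-efficiency note can be made concrete with a toy comparison. The sketch below measures the same small piece of content written as HTML and as Markdown. Both snippets are invented for illustration, and character count is only a crude stand-in for tokens, since real counts depend on the tokenizer of the model in use.

```python
# Two hand-written renderings of the same content (illustrative data only).
html = (
    "<h1>Results</h1>"
    "<ul><li><strong>Accuracy</strong>: 0.93</li>"
    "<li>Loss: 0.21</li></ul>"
)
markdown = "# Results\n\n- **Accuracy**: 0.93\n- Loss: 0.21\n"


def relative_size(candidate, baseline):
    """Return len(candidate) / len(baseline) as a rough size ratio."""
    return len(candidate) / len(baseline)


ratio = relative_size(markdown, html)
print(f"Markdown is {ratio:.0%} of the HTML length for this snippet")
```

The gap comes from closing tags and element names that Markdown expresses with one or two punctuation characters. The ratio will differ for other documents.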

Source Doc

Excerpt From SKILL.md

Visual Enhancement with Scientific Schematics

When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.

If your document does not already contain schematics or diagrams:

  • Use the scientific-schematics skill to generate AI-powered publication-quality diagrams
  • Simply describe your desired diagram in natural language
  • Nano Banana Pro will automatically generate, review, and refine the schematic

For new documents: Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.

How to generate schematics:

The AI will automatically:

  • Create publication-quality images with proper formatting
  • Review and refine through multiple iterations
  • Ensure accessibility (colorblind-friendly, high contrast)
  • Save outputs in the figures/ directory

When to add schematics:

  • Document conversion workflow diagrams
  • File format architecture illustrations
  • OCR processing pipeline diagrams
  • Integration workflow visualizations
  • System architecture diagrams
  • Data flow diagrams
  • Any complex concept that benefits from visualization

For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.


Supported Formats

| Format  | Description              | Notes                       |
|---------|--------------------------|-----------------------------|
| PDF     | Portable Document Format | Full text extraction        |
| DOCX    | Microsoft Word           | Tables, formatting preserved |
| PPTX    | PowerPoint               | Slides with notes           |
| XLSX    | Excel spreadsheets       | Tables and data             |
| Images  | JPEG, PNG, GIF, WebP     | EXIF metadata + OCR         |
| Audio   | WAV, MP3                 | Metadata + transcription    |
| HTML    | Web pages                | Clean conversion            |
| CSV     | Comma-separated values   | Table format                |
| JSON    | JSON data                | Structured representation   |
| XML     | XML documents            | Structured format           |
| ZIP     | Archive files            | Iterates contents           |
| EPUB    | E-books                  | Full text extraction        |
| YouTube | Video URLs               | Fetch transcriptions        |
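When converting a folder of mixed files, it can help to select candidates by extension first. The sketch below is one possible way to do that. The `SUPPORTED_EXTENSIONS` set is an assumption derived from the format table above, not an official list exported by MarkItDown, and `find_convertible` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed mapping from the table above to file extensions (not an official registry).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".jpg", ".jpeg", ".png", ".gif", ".webp",
    ".wav", ".mp3",
    ".html", ".htm", ".csv", ".json", ".xml",
    ".zip", ".epub",
}


def find_convertible(root):
    """Return a sorted list of files under `root` with an assumed-supported extension."""
    return sorted(
        p
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Each returned path could then be passed to a converter one at a time. YouTube URLs are not files, so they fall outside this filter.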

Or from source

git clone https://github.com/microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'
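After installing, a quick check from Python can confirm that the package is importable in the active environment. This is a minimal sketch using only the standard library. It assumes the importable module and the installed distribution are both named `markitdown`, and the helper names are illustrative.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name="markitdown"):
    """Return True if `module_name` can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name="markitdown"):
    """Return the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("markitdown importable:", is_installed())
print("markitdown version:", installed_version())
```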

Use cases

  • Use MarkItDown in research workflows aligned with this subject area.
  • Follow the upstream documentation for the full working procedure.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

Training & Eval · Machine Learning & Research AI

bio-immunoinformatics-tcr-epitope-binding

Predict TCR-epitope specificity using ERGO-II and deep learning models for T-cell receptor antigen recognition. Match TCRs to their cognate…

OpenClaw · NanoClaw · Training
FreedomIntelligence/OpenClaw-Medical-Skills

Training & Eval · Machine Learning & Research AI

Get Available Resources

Detect available computational resources and generate strategic recommendations for scientific computing tasks. This skill automatically ide…

Claude Code · Training
K-Dense-AI/claude-scientific-skills

Training & Eval · Machine Learning & Research AI

Hypothesis Generation

Hypothesis generation is a systematic process for developing testable explanations. Formulate evidence-based hypotheses from observations, d…

Claude Code · OpenClaw · Training
K-Dense-AI/claude-scientific-skills

Training & Eval · Machine Learning & Research AI

PyMOO

Pymoo is a comprehensive Python framework for optimi.

Claude Code · OpenClaw · Training
K-Dense-AI/claude-scientific-skills