← ClaudeAtlas

markitdownlisted

Convert local documents to Markdown using Microsoft's markitdown CLI. Best for: PDF, Word, Excel, PowerPoint, images (OCR), audio. Can fetch URLs but Jina is faster for web. Triggers on: convert to markdown, read PDF, parse document, extract text from, docx, xlsx, pptx, OCR image, local file.
0xDarkMatter/claude-mods · ★ 22 · Data & Documents · score 74
Install: claude install-skill 0xDarkMatter/claude-mods
# markitdown - Document to Markdown Convert local documents to clean Markdown. One tool for PDF, Word, Excel, PowerPoint, images, and more. ## When to Use markitdown | Use Case | Recommendation | |----------|----------------| | **Local files (PDF, Word, Excel)** | ✅ **Use markitdown** - unique capability | | **Web pages** | ❌ Use Jina (`r.jina.ai/`) - 5x faster | | **Blocked/anti-bot sites** | ❌ Use Firecrawl | | **OCR on images** | ✅ **Use markitdown** | | **Audio transcription** | ✅ **Use markitdown** | ## Basic Usage ```bash # Local files (primary use case) markitdown document.pdf markitdown report.docx markitdown data.xlsx markitdown slides.pptx markitdown screenshot.png # OCR # URLs (works, but Jina is faster) markitdown https://example.com # Save output markitdown document.pdf > document.md ``` ## Supported Formats | Format | Extensions | Notes | |--------|------------|-------| | PDF | `.pdf` | Text extraction, tables | | Word | `.docx` | Formatting preserved | | Excel | `.xlsx` | Tables to markdown | | PowerPoint | `.pptx` | Slides as sections | | Images | `.jpg`, `.png` | OCR text extraction | | HTML | `.html` | Clean conversion | | Audio | `.mp3`, `.wav` | Speech-to-text | | Text | `.txt`, `.csv`, `.json`, `.xml` | Pass-through/structure | | URLs | `https://...` | Works but slower than Jina | ## Benchmarked Performance (URLs) | Tool | Avg Speed | Success Rate | |------|-----------|--------------| | Jina | **0.5s** | 10/10 | | markitdown | 2.5s | 9/10 |