markdrop

Solid

Professional AI skill and usage instructions for the Markdrop package, a Python tool for converting PDFs to Markdown/HTML with AI-powered image/table descriptions.

Data & Documents 204 stars 18 forks Updated 2 months ago GPL-3.0


Quality Score: 80/100

Stars (weight 20%): 77
Recency (weight 20%): 75
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Markdrop Skill

Welcome to the `markdrop` skill. `markdrop` is a powerful Python package and CLI tool used to convert PDF documents into structured Markdown and interactive HTML, while natively leveraging AI vision models to interpret and describe extracted images and tables. If you are an AI agent or a user aiming to process PDFs and augment them with text or image descriptions, this document serves as your complete guide on utilizing `markdrop` efficiently and accurately.

## 1. Capabilities

- **PDF to Markdown/HTML**: Retains formatting, extracts images, and detects tables via Microsoft Table Transformer and Docling. Supports processing both local file paths and direct PDF URLs.
- **AI Vision Descriptions**: Query GEMINI, OPENAI, ANTHROPIC, GROQ, OPENROUTER, or LITELLM to generate rich descriptions of images and tables.
- **Batch Processing**: Describe entire directories of images in single commands using multiple LLM backends simultaneously.
- **Extensible Configuration**: Precise override control over which structural text-models vs vision-models are used, as well as prompts, resolution scales, and output features.

## 2. API Keys Setup

Before using AI features, API keys must be available in the root `.env` file or environment variables. If deploying programmatically, you can run the built-in CLI command, or inject them into `os.environ`:

```bash
markdrop setup gemini     # -> GEMINI_API_KEY
markdrop setup openai     # -> OPENAI_API_KEY
markdrop setup anthropic  # ...
```
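
The `os.environ` route mentioned in the setup section could look roughly like the sketch below. It uses only the Python standard library and does not call `markdrop` itself. The helper names `inject_api_keys` and `missing_keys` are hypothetical and not part of the package; the only variable names used are the two that the text spells out (`GEMINI_API_KEY`, `OPENAI_API_KEY`), since the names for the other providers are cut off in this excerpt.

```python
import os

# Variable names taken from the setup text above; the names for the
# other providers are elided there, so they are not guessed here.
KNOWN_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY")


def inject_api_keys(keys, environ=os.environ):
    """Hypothetical helper: copy non-empty keys into the environment.

    Existing values are never overwritten. Returns the names newly set.
    """
    newly_set = []
    for name, value in keys.items():
        if value and name not in environ:
            environ[name] = value
            newly_set.append(name)
    return newly_set


def missing_keys(names=KNOWN_KEYS, environ=os.environ):
    """Hypothetical helper: list the expected keys that are unset or empty."""
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    # Placeholder value for illustration only; load real keys from a secret store.
    inject_api_keys({"GEMINI_API_KEY": "your-key-here"})
    print("Still missing:", missing_keys())
```

Leaving already-set variables untouched is a deliberate choice in this sketch: a key exported in the shell or written to `.env` takes precedence over one injected from code.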

Details

Author: shoryasethia
Repository: shoryasethia/markdrop
Created: 1 year ago
Last Updated: 2 months ago
Language: Python
License: GPL-3.0


Similar Skills

Semantically similar based on skill content — not just same category

Data & Documents Solid

phd-deepread

Guided workflow for processing academic PDFs into structured literature notes using Text-First decision tree (PyMuPDF + Tesseract OCR) and Claude-assisted analysis. Perfect for literature review and note-taking in Obsidian.

47 Updated today
heleninsights-dot
AI & Automation Solid

ai-data-engineering

Data pipelines, feature stores, and embedding generation for AI/ML systems. Use when building RAG pipelines, ML feature serving, or data transformations. Covers feature stores (Feast, Tecton), embedding pipelines, chunking strategies, orchestration (Dagster, Prefect, Airflow), dbt transformations, data versioning (LakeFS), and experiment tracking (MLflow, W&B).

367 Updated 5 months ago
ancoleman
Data & Documents Solid

pdfkit-generation

Generate professional PDFs with PDFKit in Node.js. Use when creating pitch decks, reports, or styled documents with AGNT branding. Covers large script handling, Unicode-safe characters, and brand design patterns.

270 Updated today
agnt-gg
AI & Automation Solid

ml-paper-writing

Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, conducting literature reviews, finding related work, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, citation verification workflows, and paper discovery/evaluation criteria.

4,008 Updated 1 week ago
Galaxy-Dawn
Data & Documents Featured

pdf

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

14,116 Updated today
eigent-ai