pdf

Comprehensive PDF manipulation, extraction, and generation with support for text extraction, form filling, merging, splitting, annotations, and creation. Use when working with .pdf files for: (1) Extracting text and tables, (2) Filling PDF forms, (3) Merging/splitting PDFs, (4) Creating PDFs programmatically, (5) Adding watermarks/annotations, (6) Managing PDF metadata.
aiskillstore/marketplace · ★ 329 · Data & Documents · score 79

Install: claude install-skill aiskillstore/marketplace

# PDF Manipulation Skill

Comprehensive guide for working with PDF files in Python, covering extraction, manipulation, creation, and advanced operations using progressive disclosure for efficiency.

## Core Capabilities

Extract and manipulate PDF content:

- Extract text with layout preservation
- Extract tables and parse structured data
- Fill PDF forms programmatically
- Merge multiple PDFs into a single document
- Split PDFs by pages or ranges
- Create PDFs from scratch with text, images, and graphics
- Add watermarks and annotations
- Extract and modify metadata (author, title, keywords)
- Add password protection and encryption
- Perform OCR on scanned documents
- Convert images to PDF
- Compress and optimize PDF files
- Extract images from PDFs
- Rotate and reorder pages

## Quick Start

Install required libraries:

```bash
pip install pypdf pdfplumber reportlab PyMuPDF pdf2image pytesseract pillow
```

For detailed installation instructions including system dependencies, see:

- [Library Installation Guide](./references/library-installation.md)

## Python Libraries Overview

**pypdf**: Basic operations (merge, split, rotate, metadata)

**pdfplumber**: Advanced text/table extraction with layout awareness

**reportlab**: Create PDFs from scratch (reports, invoices, documents)

**PyMuPDF (fitz)**: Advanced manipulation, annotations, compression

**pdf2image**: Convert PDF pages to images (requires poppler)

**pytesseract**: OCR for scanned documents (requires tesseract)

## Text