scrapling

Solid

CLI-first web scraping & content extraction with optional MCP server. Use when you have target URLs and need clean, selector-based outputs (html/md/txt).

AI & Automation · 2,202 stars · 164 forks · Updated 1 week ago · Apache-2.0


Quality Score: 91/100

Stars (weight 20%): 100
Recency (weight 20%): 90
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Scrapling Skill (VCO)

Scrapling is a Python-based web scraping / extraction toolkit that exposes:

- a **CLI** (`scrapling ...`) for fetching + extracting content into files
- an **optional MCP server** (`scrapling mcp`) so an agent can call structured scraping tools

This skill is **CLI-first**. Prefer it when you already have URLs and need reliable, repeatable extraction (CSS selector → file).

## When to use

Use `scrapling` when you need to:

- Extract **specific parts** of a web page (CSS selector / XPath) into `.txt` / `.md` / `.html`
- Run **repeatable scraping jobs** (batch URLs with a small wrapper script)
- Reduce token usage by extracting only the relevant DOM region before passing it to the LLM
- Provide a local MCP endpoint for scraping tools (agent → MCP → scrapling)

## Boundaries (vs Playwright / Search)

### vs `playwright`

- `scrapling`: best for “get URL → extract selector → write file” workflows; simpler, faster iteration
- `playwright`: best for interactive UI flows (login, multi-step navigation, downloads, complex JS actions, stateful sessions)

If you must *navigate* or *click through* a UI, use `playwright`. If you can directly fetch the target page and just need extraction, use `scrapling`.

### vs search tools

- Search tools are for discovering sources/URLs (query → result list → choose URLs).
- `scrapling` is for acquisition + extraction once you already know the URL(s).

A common pipeline:

1) Search → find candidate URLs
2) Scrapling → extract focused ...
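
The "batch URLs with a small wrapper script" use case mentioned under "When to use" could be sketched roughly as below. This is a minimal illustration, not the skill's own script: the helper names (`output_path_for`, `build_command`, `run_batch`) are hypothetical, and the `extract get` subcommand, the positional output file, and the `--css-selector` flag are assumptions about the CLI that should be checked against `scrapling --help` before use. The wrapper defaults to a dry run that only prints the commands it would execute.

```python
import shlex
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def output_path_for(url: str, out_dir: Path, ext: str = "md") -> Path:
    """Derive a filesystem-safe output file name from a URL (hypothetical helper)."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}".strip("/")
    # Replace every non-alphanumeric character so the name is safe on any OS.
    slug = "".join(c if c.isalnum() else "-" for c in raw).strip("-") or "page"
    return out_dir / f"{slug}.{ext}"


def build_command(url: str, selector: str, out_file: Path) -> list[str]:
    """Build one CLI invocation as an argument list.

    ASSUMPTION: the `extract get` subcommand and the `--css-selector` flag
    are not confirmed by this document; verify them with `scrapling --help`.
    """
    return [
        "scrapling", "extract", "get",
        url, str(out_file),
        "--css-selector", selector,
    ]


def run_batch(urls, selector: str, out_dir: Path, dry_run: bool = True):
    """Build (and optionally run) one extraction command per URL."""
    commands = []
    if not dry_run:
        out_dir.mkdir(parents=True, exist_ok=True)
    for url in urls:
        cmd = build_command(url, selector, output_path_for(url, out_dir))
        commands.append(cmd)
        if dry_run:
            # Only show what would be executed.
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(cmd, check=True)
    return commands
```

A call such as `run_batch(["https://example.com/docs/intro"], "main", Path("out"))` (placeholder URL and selector) prints one command per URL; passing `dry_run=False` executes them. Passing the arguments as a list rather than a shell string avoids quoting problems with selectors that contain spaces or brackets.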

Details

Author: foryourhealth111-pixel
Repository: foryourhealth111-pixel/Vibe-Skills
Created: 3 months ago
Last Updated: 1 week ago
Language: Python
License: Apache-2.0
