
scrapling (listed)

Use Scrapling for web extraction (HTTP, async, dynamic, stealth fetchers). Prefer Scrapling for scraping pipelines; fall back to `playwright-ext` when blocked.
codingSamss/all-my-ai-needs · ★ 7 · AI & Automation · score 65
Install: claude install-skill codingSamss/all-my-ai-needs
# Scrapling Skill

Use Scrapling as the primary extraction layer in a three-layer stack:

- Scrapling: extraction-first
- PinchTab: low-token browser inspection and lightweight interaction
- `playwright-ext`: reliable browser execution

Keep `playwright-ext` as the final fallback for blocked or unsupported scenarios, and hand off to PinchTab first when a real browser is helpful but full Playwright rigor is not needed.

## When to Use This Skill

Triggered by:

- "scrape this site"
- "extract structured data from pages"
- "anti-bot scraping"
- "dynamic page extraction"
- "batch crawling pipeline"

## Prerequisite Check

```bash
python3 --version
python3 -c "from scrapling.fetchers import Fetcher, AsyncFetcher, DynamicFetcher, StealthyFetcher"
codex mcp get playwright-ext
```

If you need to fetch packages or sources from GitHub/PyPI, set the local proxy environment variables:

```bash
HTTP_PROXY=http://127.0.0.1:7897 HTTPS_PROXY=http://127.0.0.1:7897 <download-command>
```

## Core Workflow

1. Start with `Fetcher` / `AsyncFetcher` for standard HTTP extraction.
2. Escalate to `DynamicFetcher` / `StealthyFetcher` for JS-heavy or anti-bot pages.
3. If the task now needs browser state inspection, text verification, or a small amount of interaction, hand off to PinchTab first when available.
4. If the flow needs reliable ref-based interaction, strict post-action verification, or browser state that PinchTab cannot complete safely, fall back to `playwright-ext`.
5. Report clearly which layer was used for th
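
The escalation order in steps 1 and 2 of the Core Workflow can be sketched as a small loop that tries each layer in turn and records which one succeeded, so the final report can name it. This is a minimal sketch, not the skill's own implementation: `LayerResult`, `looks_blocked`, `fetch_with_escalation`, and `scrapling_layers` are hypothetical names. The sketch also assumes that `Fetcher.get(url)`, `DynamicFetcher.fetch(url)`, and `StealthyFetcher.fetch(url)` return a response with a numeric `status` attribute; verify this against the installed Scrapling version.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple


@dataclass
class LayerResult:
    """Hypothetical record of one escalation run (not part of Scrapling)."""
    layer: Optional[str]              # name of the layer that succeeded, or None
    page: Any                         # whatever the successful fetch callable returned
    attempts: List[Tuple[str, str]]   # (layer name, outcome) pairs for the final report


def looks_blocked(page: Any) -> bool:
    """Heuristic: a missing page or a non-2xx status counts as blocked.

    Assumes the response exposes a numeric `status` attribute.
    """
    status = getattr(page, "status", None)
    return page is None or status is None or not (200 <= status < 300)


def fetch_with_escalation(
    url: str,
    layers: Sequence[Tuple[str, Callable[[str], Any]]],
    is_blocked: Callable[[Any], bool] = looks_blocked,
) -> LayerResult:
    """Try each (name, fetch) layer in order and stop at the first usable page."""
    attempts: List[Tuple[str, str]] = []
    for name, fetch in layers:
        try:
            page = fetch(url)
        except Exception as exc:  # any failure means "escalate to the next layer"
            attempts.append((name, f"error: {exc}"))
            continue
        if is_blocked(page):
            attempts.append((name, "blocked"))
            continue
        attempts.append((name, "ok"))
        return LayerResult(name, page, attempts)
    # Every layer failed: the caller should hand off to PinchTab or playwright-ext.
    return LayerResult(None, None, attempts)


def scrapling_layers() -> List[Tuple[str, Callable[[str], Any]]]:
    """Assumed wiring for the Scrapling fetchers, cheapest layer first."""
    # Imported lazily so the escalation logic can run without Scrapling installed.
    from scrapling.fetchers import DynamicFetcher, Fetcher, StealthyFetcher

    return [
        ("Fetcher", Fetcher.get),
        ("DynamicFetcher", DynamicFetcher.fetch),
        ("StealthyFetcher", StealthyFetcher.fetch),
    ]
```

Calling `fetch_with_escalation(url, scrapling_layers())` returns a result whose `layer` field names the fetcher that produced the page. When `layer` is `None`, all Scrapling layers were exhausted and the browser handoff in steps 3 and 4 applies; that handoff is outside this sketch.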