web-scraping
SolidWeb scraping with anti-bot bypass, content extraction, undocumented APIs and poison pill detection. Use when extracting content from websites, handling paywalls, implementing scraping cascades or processing social media. Covers requests, trafilatura, Playwright with stealth mode, yt-dlp and instaloader patterns.
Install
Quality Score: 89/100
Skill Content
Details
- Author
- jamditis
- Repository
- jamditis/claude-skills-journalism
- Created
- 5 months ago
- Last Updated
- today
- Language
- HTML
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
scrapling
Use Scrapling for web extraction (HTTP, async, dynamic, stealth fetchers). Prefer Scrapling for scraping pipelines; fallback to `playwright-ext` when blocked.
scrapling
Web scraping with Scrapling - HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python.
web-scrape
Intelligent web scraper with content extraction, multiple output formats, and error handling
ghost-scraper
Extracts structured data from websites — static HTML, JavaScript-rendered SPAs, paginated listings, and API-backed pages. Handles anti-bot detection awareness, rate limiting, and robots.txt compliance. Use this skill whenever the user wants to scrape a website, extract data from a URL, pull product listings, harvest structured data, reverse-engineer a site's API, or deal with dynamic JS-rendered content. Also triggers on "get me data from this site," "extract prices from," "crawl these pages," or any request involving web data extraction, even casual ones like "can you pull info from this URL."
web-scraper
Web scraping inteligente multi-estrategia. Extrai dados estruturados de paginas web (tabelas, listas, precos). Paginacao, monitoramento e export CSV/JSON.