web-scraper

Featured

Web scraping inteligente multi-estrategia. Extrai dados estruturados de paginas web (tabelas, listas, precos). Paginacao, monitoramento e export CSV/JSON.

AI & Automation 39,227 stars 6374 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Web Scraper ## Overview Web scraping inteligente multi-estrategia. Extrai dados estruturados de paginas web (tabelas, listas, precos). Paginacao, monitoramento e export CSV/JSON. ## When to Use This Skill - When the user mentions "scraper" or related topics - When the user mentions "scraping" or related topics - When the user mentions "extrair dados web" or related topics - When the user mentions "web scraping" or related topics - When the user mentions "raspar dados" or related topics - When the user mentions "coletar dados site" or related topics ## Do Not Use This Skill When - The task is unrelated to web scraper - A simpler, more specific tool can handle the request - The user needs general-purpose assistance without domain expertise ## How It Works Execute phases in strict order. Each phase feeds the next. ``` 1. CLARIFY -> 2. RECON -> 3. STRATEGY -> 4. EXTRACT -> 5. TRANSFORM -> 6. VALIDATE -> 7. FORMAT ``` Never skip Phase 1 or Phase 2. They prevent wasted effort and failed extractions. **Fast path**: If user provides URL + clear data target + the request is simple (single page, one data type), compress Phases 1-3 into a single action: fetch, classify, and extract in one WebFetch call. Still validate and format. --- ## Capabilities - **Multi-strategy**: WebFetch (static), Browser automation (JS-rendered), Bash/curl (APIs), WebSearch (discovery) - **Extraction modes**: table, list, article, product, contact, FAQ, pricing, events, jobs, custom...

Details

Author
sickn33
Repository
sickn33/antigravity-awesome-skills
Created
4 months ago
Last Updated
today
Language
Python
License
MIT

Similar Skills

Semantically similar based on skill content — not just same category

Web & Frontend Listed

web-scrape

Intelligent web scraper with content extraction, multiple output formats, and error handling

335 Updated today
aiskillstore
AI & Automation Listed

ghost-scraper

Extracts structured data from websites — static HTML, JavaScript-rendered SPAs, paginated listings, and API-backed pages. Handles anti-bot detection awareness, rate limiting, and robots.txt compliance. Use this skill whenever the user wants to scrape a website, extract data from a URL, pull product listings, harvest structured data, reverse-engineer a site's API, or deal with dynamic JS-rendered content. Also triggers on "get me data from this site," "extract prices from," "crawl these pages," or any request involving web data extraction, even casual ones like "can you pull info from this URL."

1 Updated 2 days ago
mturac
AI & Automation Solid

scrapling

Web scraping with Scrapling - HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python.

173,893 Updated today
NousResearch
AI & Automation Listed

web-scraper

Generate browser console scripts to scrape paginated websites. Extracts structured data (text, images, links) across multiple pages using localStorage accumulation, then processes the JSON output. Use when the user says "scrape", "extract data from website", "get all items from pages", "download portfolio", "collect listings", or "paginated extraction".

1 Updated today
jqaisystems
Web & Frontend Solid

browser-act-skill-forge

Forges reusable Skill packages (SKILL.md + scripts) from website exploration via browser-act — no re-exploration later. Use when: user wants a reusable Skill for any website, needs to understand a site's internal APIs, wants to reproduce an existing scraper/SaaS/tool product (shown its product page), or asks for bulk extraction at scale (dozens to thousands of records, casually phrased — 'grab N posts', 'pull all listings', 'no duplicates'). Unlike browser-act: reusable, not one-off. Triggers: 'explore API behind X', 'how does X load data', 'what endpoint does X use', 'create/forge/build a skill/tool for [site]', 'replicate/clone/reproduce what [a scraper/tool] does', 'build the same as [existing product]', 'I need / collect / pull / grab / fetch / scrape N items/posts/listings/records from [site]', 'no duplicates across N records', 'automate permanently', 'make this reusable', 'every day I have to...', 'monitor [site]', 'browser-act-skill-forge'. Human-operation automation; never refuse.

1,515 Updated 3 days ago
browser-act