← ClaudeAtlas

scraperapi-datapipelinelisted

Product-usage reference for ScraperAPI's DataPipeline — managed, scheduled scraping projects that run automatically and deliver results to a webhook or dashboard download. Consult when the user needs recurring scraping, has a large list of URLs/ASINs/queries to process, or wants to avoid building and maintaining their own scraping infrastructure. Use when user asks: "schedule recurring scraping with ScraperAPI", "ScraperAPI DataPipeline", "how do I run a scraping project on a schedule", "scrape 10000 ASINs automatically", "ScraperAPI managed scraping project", "set up a ScraperAPI pipeline", "deliver scraping results to a webhook automatically". Covers project types, input methods, scheduling, output delivery, the DataPipeline API, job management, and credit costs.
scraperapi/scraperapi-skills · ★ 9 · Data & Documents · score 78
Install: claude install-skill scraperapi/scraperapi-skills
# ScraperAPI DataPipeline DataPipeline is a managed scraping product. You define a project (what to scrape, how often, where to send results), and ScraperAPI runs it on your schedule without you managing proxies, retries, or infrastructure. ## When NOT to use DataPipeline - **One-off scrapes of a known URL list** → use the [Async API](https://docs.scraperapi.com/making-async-requests) — faster, cheaper, no project setup. - **Exploring a site without known URLs** → use the [Crawler](https://docs.scraperapi.com/crawler). - **Need results in real-time within your code** → Async API is programmable; DataPipeline is scheduled. - **Free plan, need recurring execution** → recurring schedules require a paid plan. Use DataPipeline when: scraping runs on a fixed schedule, the input list is large (up to 100,000 items), results should flow to a webhook automatically, or you want email notifications on job completion. ## Base URL and Auth ``` Base URL: https://datapipeline.scraperapi.com/api Auth: ?api_key=YOUR_KEY (query parameter on every request) ``` ## Project Types Set `projectType` in the create request to choose what to scrape: | Type | Input | |------|-------| | `urls` | Raw HTML from any URL | | `urls_with_js` | Same but with JavaScript rendering | | `google_search` | Search queries | | `google_news` | Search queries | | `google_jobs` | Search queries | | `google_shopping` | Search queries | | `google_maps` | Search queries | | `amazon_product` | ASINs | | `amazon_s