universal-data-acquisition-pipeline

Trigger when the user wants to collect, structure, evaluate, crawl, extract, refresh, or build reusable data acquisition pipelines from websites, APIs, portals, files, or rendered apps. Use for dataset design, source classification, feasibility, endpoint discovery, authorized/owned-session scraping plans, Patchright warm-session cookie generation, Playwright fallback, source probing, pagination analysis, scraper/pipeline architecture, sample validation, refresh design, and output contracts. Do not trigger for ordinary browsing, exploitative access, credential theft, CAPTCHA solving, auth bypass, rate-limit bypass, or non-data tasks.
Pranjay-kumar/universal-data-acquisition-pipeline-skill · ★ 0 · Data & Documents · score 72

Install: claude install-skill Pranjay-kumar/universal-data-acquisition-pipeline-skill

# Universal Data Acquisition Pipeline

Act as the router for a data acquisition skill tree. Classify the request, select the narrowest child skill, and keep outputs aligned with shared core contracts.

Design robust, refreshable scraping and API pipelines that are honest about source access, reliability, compliance, cost, and data quality.

Do not scrape immediately. First classify the source access, prove that a reliable data path exists, design a reusable pipeline, validate a small sample, and require approval before any full run.

For feasibility requests, do not stop at desk research when public probing is possible. Run a bounded pre-report probe ladder first, then generate the feasibility report from the observed evidence. Treat the probes as due diligence, not execution: 1 to 3 URLs, 20 rows maximum, no account pages, no CAPTCHA solving, no rate-limit bypass, and no broad collection.

## Skill Tree

Use the child skill that best matches the request:

- `data-acquisition-core`: shared contracts, access classes, compliance, scorecards, output schemas, and pipeline quality standards.
- `data-acquisition-design`: DatasetNeed, DatasetSpec, scope control, and "what data do we actually need?"
- `data-acquisition-feasibility`: feasibility scoring, source comparison, Green/Yellow/Red decisions, and approval gates.
- `data-acquisition-discovery`: endpoint discovery, public APIs, GraphQL, XHR/fetch, sitemaps, embedded JSON, and pagination probes.
- `data-acquisition-browser`: Patch