← ClaudeAtlas

data-acquisition-browserlisted

Use for Patchright/Playwright-based public or authorized browser probing: warm-session cookie/storage generation, browser network capture, JSON/API route discovery from page loads, rendered DOM fallback, screenshots, tiny DOM samples, and user-owned storage-state workflows. Do not use for CAPTCHA solving, credential extraction, auth bypass, or rate-limit bypass.
Pranjay-kumar/universal-data-acquisition-pipeline-skill · ★ 0 · Web & Frontend · score 72
Install: claude install-skill Pranjay-kumar/universal-data-acquisition-pipeline-skill
# Data Acquisition Browser Act as the browser acquisition specialist. Use Patchright for warm-session capture when a normal browser must mint cookies or storage state before API endpoints are visible. Use Playwright for ordinary rendered-DOM fallback when structured HTTP probes are insufficient and no warm browser context is needed. When the user asks for page loads only or "no API", use Patchright as a renderer and extract DOM/JSON-LD/meta/visible rows only. Do not harvest, replay, or recommend structured endpoints in that mode. ## Shared Core Read from `../data-acquisition-core/references/`: - `source-access.md` - `playwright-rendered-dom.md` - `probing.md` - `compliance-boundaries.md` - `output-contracts.md` ## Helpers Use `scripts/patchright_cookie_endpoint_probe.mjs` for warm-session cookie/storage generation and endpoint discovery. Use `scripts/patchright_page_dom_probe.mjs` for Patchright page-load-only DOM extraction. Use `scripts/playwright_probe.mjs` only for ordinary public rendered-DOM fallback. From the repo root: ```powershell npm install npx patchright install chrome npm run probe:patchright -- "https://example.com/public-category" "outputs/example-patchright-endpoints.json" ``` The Patchright helper opens a persistent Chrome context, lets the page create ordinary browser-issued cookies/storage state, records JSON/API/XHR-looking requests and responses, saves local storage state under `auth/`, and writes a redacted endpoint report. Page-load-only mod