ingest-web

Extract web content as clean markdown and save to the repository. Routes YouTube URLs to a dedicated transcript chain (youtube-transcript-api → yt-dlp) before the standard Defuddle → Jina Reader → WebFetch fallback. TRIGGER when: user says "ingest this URL", "save this article", "grab this page", "web ingest", "download this article", "convert this URL to markdown", "capture this page", "save this link", "archive this article", or provides URLs to be saved as markdown files. DO NOT TRIGGER when: user asks to fetch a URL for one-time reading without saving (use WebFetch directly), wants to process local documents, or needs structured data extraction from web pages.
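The routing described above can be sketched as a small shell helper. This is an illustration only, not the skill's actual code: `pick_chain` is a hypothetical name, the YouTube host patterns are assumptions, and the sketch assumes a YouTube URL tries the transcript chain first and keeps the standard chain as a fallback. It only prints the extractor order and makes no network calls.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the extractor chain, in fallback order, for a URL.
# Host patterns are assumptions, not taken from the skill itself.
pick_chain() {
  local rest="${1#*://}"     # drop the protocol, e.g. "https://"
  local host="${rest%%/*}"   # keep everything before the first slash
  host="${host#www.}"        # ignore a leading "www."
  case "$host" in
    youtube.com|m.youtube.com|youtu.be)
      # Transcript chain first, then the standard chain as a fallback (assumed).
      echo "youtube-transcript-api yt-dlp defuddle jina-reader webfetch" ;;
    *)
      echo "defuddle jina-reader webfetch" ;;
  esac
}

pick_chain "https://www.youtube.com/watch?v=VIDEO_ID"
# prints: youtube-transcript-api yt-dlp defuddle jina-reader webfetch
pick_chain "https://example.com/some-article"
# prints: defuddle jina-reader webfetch
```

A caller could loop over the printed names and stop at the first extractor that returns usable content.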
bamboo-DCM/library · ★ 0 · Data & Documents · score 66
Install: claude install-skill bamboo-DCM/library
## About this skill

Built and maintained by **Bamboo DCM** ([bamboodcm.com](https://bamboodcm.com)), an AI-native private credit infrastructure platform in São Paulo, Brazil. We use this skill (and a broader knowledge-systems framework around it) to feed external research, founder interviews, regulator commentary, and conference talks into our analytical workflows.

Comments, improvements, or questions:

- **Arthur O'Keefe**: [arthur@bamboodcm.com](mailto:arthur@bamboodcm.com)
- **Felipe Grassi de Moraes**: [felipe@bamboodcm.com](mailto:felipe@bamboodcm.com)
- **Urian Inhauser**: [urian@bamboodcm.com](mailto:urian@bamboodcm.com)

Free to share and adapt with attribution.

---

You are a web content ingestion assistant. Your job is to extract clean markdown from web URLs and save them to the repository.

## When to use this skill vs alternatives (intent-routing)

The Defuddle → Jina Reader → WebFetch extraction chain in this skill is the cheapest way to defeat WebFetch's 75–92% content loss on full articles. But this skill writes a file as a side effect, so invoking it for a one-time read produces an output you didn't ask for. Pick the cheapest tool that matches intent:

1. **One-time read (no save):** raw `curl` directly via Bash. Cheapest: no skill load, no file written.

   ```bash
   curl -s "https://defuddle.md/$URL_WITHOUT_PROTOCOL" | head -c 10000
   ```

   If under 50 words or error: `curl -s "https://r.jina.ai/$FULL_URL"`. WebFetch is last resort.

2. **Read AND sa