api-vs-selfhost-skill
Install: claude install-skill artvandelay/api-vs-selfhost-skill
# API vs Self-Host
Decide whether an LLM workload is cheaper on a hosted API or self-hosted, using whatever context the user gives you.
Fetch live prices, run `scripts/calc.py` for the math, and write a short report.
## Trigger
- "should I self-host" / "API vs self-host" / "cost to self-host"
- "fine-tune cost" / "fine-tuning ROI"
- "what GPU do I need for \<model\>"
- "OpenAI/Anthropic bill too high" / "is open-source cheaper than \<API\>"
- User pastes a billing screenshot, PRD, or break-even question

Out of scope: pretraining from scratch, image/audio models, non-LLM workloads.
## Workflow
1. **Extract** — read the user's message, open files, and attachments. Map signals (volume, model, spend, traffic shape, quality bar) to fields in [`references/INPUTS.md`](references/INPUTS.md).
2. **Fetch live data** — GPU $/hr from <https://www.runpod.io/pricing> (or Lambda/Modal), API per-token prices from <https://models.dev/> or the vendor page, model quality Elo from <https://lmarena.ai/>. Cite URL + timestamp in the report.
3. **Clarify** — if volume, model, or spend are missing, ask. Don't guess silently. Batch related questions.
4. **Calculate** — `echo '<json>' | python3 scripts/calc.py inference` (or `finetune`). Run more scenarios (different traffic patterns, quants, GPU tiers) when they would change the answer.
5. **Report** — verdict + cost table + assumptions with sources + what would flip the answer.
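
The Calculate step can also be driven from a small wrapper when several scenarios need to be compared. The following is a minimal sketch, not part of the skill itself: it assumes `scripts/calc.py` reads one JSON object on stdin and takes `inference` or `finetune` as its only argument, as in the command above. The helper names (`build_scenarios`, `run_calc`) and the payload fields (`model`, `tokens_per_month`, `quant`) are hypothetical; the real field names are the ones defined in `references/INPUTS.md`.

```python
import json
import subprocess
import sys
from pathlib import Path

CALC = Path("scripts/calc.py")  # script path named in the workflow


def build_scenarios(base: dict, overrides: list[dict]) -> list[dict]:
    """Return one payload per scenario: the base inputs merged with each override."""
    return [{**base, **o} for o in overrides]


def run_calc(mode: str, payload: dict) -> str:
    """Pipe one JSON payload to calc.py on stdin and return its raw stdout."""
    if mode not in ("inference", "finetune"):
        raise ValueError(f"unknown mode: {mode}")
    result = subprocess.run(
        [sys.executable, str(CALC), mode],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise if calc.py exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical field names and values, for illustration only.
    base = {"model": "example-70b", "tokens_per_month": 2_000_000_000}
    scenarios = build_scenarios(base, [{"quant": "fp16"}, {"quant": "int4"}])
    for payload in scenarios:
        if CALC.exists():
            print(run_calc("inference", payload))
        else:
            # Outside the skill directory, just show what would be sent.
            print("would send:", json.dumps(payload))
```

The wrapper only builds payloads and forwards them; all VRAM, GPU-hour, and dollar math stays inside `scripts/calc.py`.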
## Rules
- All VRAM, GPU-hour, and dollar math goes through `scripts/calc.py`. Never compute it in-prompt.