llm-cost-optimizerlisted
Install: claude install-skill DROOdotFOO/agent-skills
# LLM Cost Optimizer
Reduce LLM API costs systematically without sacrificing output quality.
## Three Modes
### 1. Cost Audit
Assess current spend and find the 80/20 opportunities.
1. **Instrument** -- Add token counting and cost tracking per request. Log model, input tokens, output tokens, latency, and use case.
2. **Find 80/20** -- Identify which 20% of use cases drive 80% of cost. Sort by total spend, not per-request cost.
3. **Classify** -- Tag each use case by complexity: simple (classification, extraction), medium (summarization, Q&A), complex (reasoning, code generation, multi-step).
### 2. Optimize Existing
Apply techniques to reduce cost on current workloads.
1. **Routing** -- Route simple tasks to cheaper/smaller models. See optimization-techniques.md.
2. **Caching** -- Cache repeated or similar queries. Prompt caching for system prompts.
3. **Compression** -- Reduce prompt size without losing quality. Trim examples, remove redundancy.
### 3. Design Cost-Efficient
Build new systems with cost awareness from day one.
1. **Budget envelopes** -- Set per-feature monthly cost budgets. Alert at 80%.
2. **Routing layer** -- Default to cheapest model that meets quality bar. Escalate on failure.
3. **Observability** -- Track cost per user, per feature, per model. Dashboard with trends.
## Optimization Order
Apply techniques in this order (highest impact first):
1. Model routing (60-80% reduction potential)
2. Prompt caching (40-90% on cached portions)
3. Output