llm-cost-optimizer
SolidUse when you need to reduce LLM API spend, control token usage, route between models by cost/quality, implement prompt caching, or build cost observability for AI features. Triggers: 'my AI costs are too high', 'optimize token usage', 'which model should I use', 'LLM spend is out of control', 'implement prompt caching'. NOT for RAG pipeline design (use rag-architect). NOT for prompt writing quality (use senior-prompt-engineer).
Install
Quality Score: 93/100
Skill Content
Details
- Author
- alirezarezvani
- Repository
- alirezarezvani/claude-skills
- Created
- 7 months ago
- Last Updated
- yesterday
- Language
- Python
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
llm-cost-optimizer
Analyze and reduce LLM API costs through model routing, caching, and prompt optimization. TRIGGER when: user asks about LLM costs, API spend reduction, token optimization, model routing, or prompt caching. DO NOT TRIGGER when: user asks about model quality comparison, fine-tuning, or general prompt engineering.
optimizing-prompts
This skill optimizes prompts for Large Language Models (LLMs) to reduce token usage, lower costs, and improve performance. It analyzes the prompt, identifies areas for simplification and redundancy removal, and rewrites the prompt to be more concise and effective. It is used when the user wants to reduce LLM costs, improve response speed, or enhance the quality of LLM outputs by optimizing the prompt. Trigger terms include "optimize prompt", "reduce LLM cost", "improve prompt performance", "rewrite prompt", "prompt optimization".
langchain-cost-tuning
Optimize LangChain API costs with token tracking, model tiering, caching, prompt compression, and budget enforcement. Trigger: "langchain cost", "langchain tokens", "reduce langchain cost", "langchain billing", "langchain budget", "token optimization".
langfuse-cost-tuning
Monitor and optimize LLM costs using Langfuse analytics and dashboards. Use when tracking LLM spending, identifying cost anomalies, or implementing cost controls for AI applications. Trigger with phrases like "langfuse costs", "LLM spending", "track AI costs", "langfuse token usage", "optimize LLM budget".
cost-aware-pipeline
Cost-aware LLM pipeline patterns for optimal model routing, narrow retry strategies, and prompt caching. Reduces API costs 40-70% through intelligent model selection, targeted retries, and cache-friendly prompt structures. Use when: (1) Building multi-model pipelines, (2) Optimizing API costs, (3) Designing retry strategies for LLM calls, (4) Implementing prompt caching, (5) Choosing between haiku/sonnet/opus for sub-tasks.