cohere-cost-tuning

Featured

Optimize Cohere costs through model selection, token budgets, and usage monitoring. Use when analyzing Cohere billing, reducing API costs, or implementing usage monitoring and budget alerts. Trigger with phrases like "cohere cost", "cohere billing", "reduce cohere costs", "cohere pricing", "cohere expensive", "cohere budget".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Cohere Cost Tuning ## Overview Optimize Cohere costs through model selection, token budgets, embedding compression, and usage monitoring. Cohere pricing is token-based with separate input/output rates. ## Prerequisites - Cohere production key (trial is free but limited) - Access to [dashboard.cohere.com](https://dashboard.cohere.com) billing page ## Cohere Pricing Model **Key principle:** Cohere charges per token. Input tokens and output tokens have different rates. Embed, Rerank, and Classify have separate pricing based on search units. | Tier | Access | Rate Limits | Cost | |------|--------|-------------|------| | Trial | Free | 5-20 calls/min, 1000/month | $0 | | Production | Metered | 1000 calls/min, unlimited | Per-token | ### Model Cost Comparison | Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For | |-------|----------------------|------------------------|----------| | `command-r7b-12-2024` | Lowest | Lowest | High-volume, simple tasks | | `command-r-08-2024` | Low | Low | RAG, cost-effective | | `command-r-plus-08-2024` | Medium | Medium | Complex reasoning | | `command-a-03-2025` | Higher | Higher | Best quality | ### Non-Chat Pricing | Endpoint | Pricing Unit | Notes | |----------|-------------|-------| | Embed | Per input token | Batch 96 texts to minimize calls | | Rerank | Per search unit | 1 query + N docs = 1 search unit | | Classify | Per classification | Charges per input classified | ## Instructions ### Strategy 1: Model Tiering ...

Details

Author
jeremylongshore
Repository
jeremylongshore/claude-code-plugins-plus-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

cohere-performance-tuning

Optimize Cohere API performance with caching, batching, model selection, and streaming. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for Cohere Chat, Embed, and Rerank. Trigger with phrases like "cohere performance", "optimize cohere", "cohere latency", "cohere caching", "cohere slow", "cohere batch".

2,266 Updated today
jeremylongshore
AI & Automation Featured

clade-cost-tuning

Optimize Anthropic API costs — model selection, prompt caching, batches, Use when working with cost-tuning patterns. token reduction, and usage monitoring. Trigger with "anthropic pricing", "claude cost", "reduce anthropic spend", "anthropic billing", "claude cheaper".

2,266 Updated today
jeremylongshore
AI & Automation Featured

anth-cost-tuning

Optimize Anthropic Claude API costs with model routing, prompt caching, batching, and spend monitoring. Use when analyzing Claude API billing, reducing costs, or implementing cost controls and budget alerts. Trigger with phrases like "anthropic cost", "claude billing", "reduce claude spend", "anthropic budget", "claude pricing optimize".

2,266 Updated today
jeremylongshore
AI & Automation Featured

langchain-cost-tuning

Optimize LangChain API costs with token tracking, model tiering, caching, prompt compression, and budget enforcement. Trigger: "langchain cost", "langchain tokens", "reduce langchain cost", "langchain billing", "langchain budget", "token optimization".

2,266 Updated today
jeremylongshore
AI & Automation Solid

together-cost-tuning

Together AI cost tuning for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together cost tuning".

2,266 Updated today
jeremylongshore