cohere-rate-limits

Featured

Implement Cohere rate limiting, backoff, and request queuing patterns. Use when handling 429 errors, implementing retry logic, or optimizing API request throughput for Cohere. Trigger with phrases like "cohere rate limit", "cohere throttling", "cohere 429", "cohere retry", "cohere backoff".

AI & Automation 2,266 stars 315 forks Updated today MIT


Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Cohere Rate Limits

## Overview

Handle Cohere rate limits with exponential backoff, request queuing, and proactive throttling. The limits below are taken from Cohere's documentation.

## Prerequisites

- `cohere-ai` SDK installed
- Understanding of async/await patterns

## Actual Cohere Rate Limits

| Key Type | Endpoint | Rate Limit | Monthly Limit |
|----------|----------|-----------|---------------|
| **Trial** | Chat | 20 calls/min | 1,000 total |
| **Trial** | Embed | 5 calls/min | 1,000 total |
| **Trial** | Rerank | 5 calls/min | 1,000 total |
| **Trial** | Classify | 5 calls/min | 1,000 total |
| **Production** | All endpoints | 1,000 calls/min | Unlimited |

Trial keys are free. Production keys require billing at [dashboard.cohere.com](https://dashboard.cohere.com).

## Instructions

### Step 1: Exponential Backoff with Jitter

```typescript
import { CohereError, CohereTimeoutError } from 'cohere-ai';

interface RetryConfig {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

const DEFAULT_RETRY: RetryConfig = {
  maxRetries: 5,
  baseDelayMs: 1000,
  maxDelayMs: 60_000,
};

async function withBackoff<T>(
  operation: () => Promise<T>,
  config = DEFAULT_RETRY
): Promise<T> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt === config.maxRetries) throw err;

      // Only retry on rate limits (429) and server errors (5xx)
      let shouldRetry = false;
      le...
```
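The Step 1 snippet is cut off in this listing, so the following is only a rough, self-contained sketch of the same idea, not the skill's actual code. It assumes that a failed call throws an object with a numeric `statusCode` property, and it uses "full jitter" (a uniformly random delay between 0 and the capped exponential value), which may differ from the jitter strategy in the original. The helper names `isRetryable` and `backoffDelayMs` and the injectable `sleep` and `random` parameters are hypothetical, introduced here for illustration.

```typescript
interface RetryConfig {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

const DEFAULT_RETRY: RetryConfig = {
  maxRetries: 5,
  baseDelayMs: 1000,
  maxDelayMs: 60_000,
};

// Assumption: errors carry the HTTP status as a numeric `statusCode`.
// Retries only on 429 (rate limited) and 5xx (server errors).
function isRetryable(err: unknown): boolean {
  const status = (err as { statusCode?: unknown } | null)?.statusCode;
  return typeof status === 'number' && (status === 429 || status >= 500);
}

// Full jitter: uniform random delay in [0, min(base * 2^attempt, max)).
// `random` is injectable so the delay can be checked deterministically.
function backoffDelayMs(
  attempt: number,
  config: RetryConfig = DEFAULT_RETRY,
  random: () => number = Math.random
): number {
  const capped = Math.min(config.baseDelayMs * 2 ** attempt, config.maxDelayMs);
  return Math.floor(random() * capped);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withBackoff<T>(
  operation: () => Promise<T>,
  config: RetryConfig = DEFAULT_RETRY,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on the last attempt or on errors that a retry cannot fix.
      if (attempt === config.maxRetries || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt, config));
    }
  }
  // Not reachable: the loop always returns or throws.
  throw new Error('withBackoff: retry loop exited unexpectedly');
}
```

Any promise-returning call can be wrapped as `withBackoff(() => someCall())`. With the defaults above, the upper bound on the delay doubles from 1 s on the first retry and is capped at 60 s.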

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

cohere-sdk-patterns

Apply production-ready Cohere SDK patterns for TypeScript and Python. Use when implementing Cohere integrations, refactoring SDK usage, or establishing team coding standards for Cohere API v2. Trigger with phrases like "cohere SDK patterns", "cohere best practices", "cohere code patterns", "idiomatic cohere", "cohere wrapper".

2,266 Updated today
jeremylongshore
AI & Automation Featured

intercom-rate-limits

Handle Intercom API rate limits with backoff, queuing, and header monitoring. Use when handling 429 errors, implementing retry logic, or optimizing API request throughput for Intercom. Trigger with phrases like "intercom rate limit", "intercom throttling", "intercom 429", "intercom retry", "intercom backoff", "intercom request limit".

2,266 Updated today
jeremylongshore
AI & Automation Featured

hubspot-rate-limits

Implement HubSpot rate limiting, backoff, and request queuing patterns. Use when handling 429 errors, implementing retry logic, or optimizing API throughput against HubSpot rate limits. Trigger with phrases like "hubspot rate limit", "hubspot throttling", "hubspot 429", "hubspot retry", "hubspot backoff", "hubspot quota".

2,266 Updated today
jeremylongshore
AI & Automation Featured

instantly-rate-limits

Implement Instantly.ai rate limiting, backoff, and request throttling patterns. Use when handling 429 errors, implementing retry logic, or building high-throughput Instantly integrations. Trigger with phrases like "instantly rate limit", "instantly 429", "instantly throttle", "instantly backoff", "instantly retry".

2,266 Updated today
jeremylongshore
AI & Automation Featured

elevenlabs-rate-limits

Implement ElevenLabs rate limiting, concurrency queuing, and backoff patterns. Use when handling 429 errors, implementing retry logic, or managing concurrent TTS request throughput. Trigger: "elevenlabs rate limit", "elevenlabs throttling", "elevenlabs 429", "elevenlabs retry", "elevenlabs backoff", "elevenlabs concurrent requests".

2,266 Updated today
jeremylongshore