langchain-rate-limits


Implement LangChain rate limiting, retry strategies, and backoff. Use when handling API rate limits, controlling request throughput, or implementing concurrency-safe batch processing. Trigger: "langchain rate limit", "langchain throttling", "langchain backoff", "langchain retry", "API quota", "429 error".

AI & Automation 2,266 stars 315 forks Updated today MIT


Quality Score: 99/100


Skill Content

# LangChain Rate Limits

## Overview

Handle API rate limits gracefully with built-in retries, exponential backoff, concurrency control, provider fallbacks, and custom rate limiters.

## Provider Rate Limits (2026)

| Provider | Model | RPM | TPM |
|----------|-------|-----|-----|
| OpenAI | gpt-4o | 10,000 | 800,000 |
| OpenAI | gpt-4o-mini | 10,000 | 4,000,000 |
| Anthropic | claude-sonnet | 4,000 | 400,000 |
| Anthropic | claude-haiku | 4,000 | 400,000 |
| Google | gemini-1.5-pro | 360 | 4,000,000 |

RPM = requests/minute, TPM = tokens/minute. Actual limits depend on your tier.

## Strategy 1: Built-in Retry (Simplest)

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Built-in exponential backoff on 429/500/503
const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 5,  // retries with exponential backoff
  timeout: 30000, // 30s timeout per request
});

// This automatically retries on rate limit errors
const response = await model.invoke("Hello");
```

## Strategy 2: Concurrency-Controlled Batch

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const chain = ChatPromptTemplate.fromTemplate("Summarize: {text}")
  .pipe(new ChatOpenAI({ model: "gpt-4o-mini", maxRetries: 3 }))
  .pipe(new StringOutputParser());

const inputs = articles.map((text) => ({ text }));

// batch() with maxConcurrency ...
```
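Strategy 1 leans on the client's built-in exponential backoff. For readers who want to see the general pattern, the following is a minimal, dependency-free sketch of retry with capped exponential backoff and full jitter. It is not LangChain's internal implementation: the names `BackoffOptions`, `backoffDelay`, `isRetryable`, and `withRetry` are hypothetical, and the assumption that a failed call throws an error carrying a numeric `status` field may not hold for every provider SDK.

```typescript
// Hypothetical helpers for illustration; none of these names come from LangChain.
interface BackoffOptions {
  maxRetries: number;  // retries allowed after the first attempt
  baseDelayMs: number; // upper bound of the wait before the first retry
  maxDelayMs: number;  // cap applied to every computed delay
}

// Upper bound of the wait before retry number `attempt` (0-based):
// baseDelayMs * 2^attempt, capped at maxDelayMs.
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
}

// Assumption: the thrown error exposes an HTTP status code as `status`.
// Only 429/500/503 are retried here, mirroring the codes named in Strategy 1.
function isRetryable(err: unknown): boolean {
  const status = (err as { status?: number } | null)?.status;
  return status === 429 || status === 500 || status === 503;
}

// Calls `fn`, retrying retryable failures until `maxRetries` is exhausted.
// `sleep` is injectable so the wait can be stubbed out in tests.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: BackoffOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= opts.maxRetries || !isRetryable(err)) throw err;
      // Full jitter: wait a random duration in [0, backoffDelay).
      await sleep(Math.random() * backoffDelay(attempt, opts));
    }
  }
}
```

With `baseDelayMs: 500` and `maxDelayMs: 8000`, the upper bounds for successive retries are 500, 1000, 2000, 4000, and then 8000 ms for every later attempt. The random jitter spreads concurrent clients out so they do not all retry at the same instant after a shared 429.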

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

anth-rate-limits

Implement Anthropic Claude API rate limiting, backoff, and quota management. Use when handling 429 errors, optimizing request throughput, or managing RPM/TPM limits across usage tiers. Trigger with phrases like "anthropic rate limit", "claude 429", "anthropic throttling", "claude retry", "anthropic backoff".

2,266 Updated today

jeremylongshore