langchain-rate-limits

Featured

Implement LangChain rate limiting, retry strategies, and backoff. Use when handling API rate limits, controlling request throughput, or implementing concurrency-safe batch processing. Trigger: "langchain rate limit", "langchain throttling", "langchain backoff", "langchain retry", "API quota", "429 error".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100

Stars 20%: 100
Recency 20%: 100
Frontmatter 20%: 70
Documentation 15%: 100
Issue Health 10%: 50
License 10%: 100
Description 5%: 100

Skill Content

# LangChain Rate Limits

## Overview

Handle API rate limits gracefully with built-in retries, exponential backoff, concurrency control, provider fallbacks, and custom rate limiters.

## Provider Rate Limits (2026)

| Provider | Model | RPM | TPM |
|----------|-------|-----|-----|
| OpenAI | gpt-4o | 10,000 | 800,000 |
| OpenAI | gpt-4o-mini | 10,000 | 4,000,000 |
| Anthropic | claude-sonnet | 4,000 | 400,000 |
| Anthropic | claude-haiku | 4,000 | 400,000 |
| Google | gemini-1.5-pro | 360 | 4,000,000 |

RPM = requests/minute, TPM = tokens/minute. Actual limits depend on your tier.

## Strategy 1: Built-in Retry (Simplest)

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Built-in exponential backoff on 429/500/503
const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 5, // retries with exponential backoff
  timeout: 30000, // 30s timeout per request
});

// This automatically retries on rate limit errors
const response = await model.invoke("Hello");
```

## Strategy 2: Concurrency-Controlled Batch

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const chain = ChatPromptTemplate.fromTemplate("Summarize: {text}")
  .pipe(new ChatOpenAI({ model: "gpt-4o-mini", maxRetries: 3 }))
  .pipe(new StringOutputParser());

const inputs = articles.map((text) => ({ text }));

// batch() with maxConcurrency ...
```
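For readers who want to see what the exponential backoff mentioned in the Overview and Strategy 1 amounts to, here is a minimal hand-rolled sketch. It is not LangChain's internal implementation. The helper names (`backoffCeilingMs`, `jitteredDelayMs`, `withBackoff`), the 500 ms base, the 30 s cap, and the assumption that a rate-limit error carries `status === 429` are all illustrative choices rather than anything taken from the library.

```typescript
// Hypothetical helpers for illustration; none of these names come from LangChain.

/** Upper bound on the wait before retry number `attempt` (0-based): base * 2^attempt, capped. */
function backoffCeilingMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

/** "Full jitter": a uniform random delay in [0, ceiling) so concurrent clients spread out. */
function jitteredDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 500,
  capMs = 30000,
): number {
  return Math.floor(random() * backoffCeilingMs(attempt, baseMs, capMs));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Assumption: the thrown error exposes the HTTP status code on a `status` property. */
function isRateLimitError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 429
  );
}

/** Calls `fn`, retrying up to `maxRetries` times on rate-limit errors only. */
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-rate-limit errors or once the retry budget is spent.
      if (!isRateLimitError(err) || attempt >= maxRetries) throw err;
      await sleep(jitteredDelayMs(attempt));
    }
  }
}
```

A call such as `withBackoff(() => model.invoke("Hello"))` would wrap a single request. If the model is also configured with `maxRetries`, the two retry layers multiply, so in practice one would likely pick one layer or set the other to a small value.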

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

langchain-performance-tuning

Optimize LangChain application performance: latency, throughput, streaming, caching, batch processing, and connection pooling. Trigger: "langchain performance", "langchain optimization", "langchain latency", "langchain slow", "speed up langchain".

2,266 Updated today
jeremylongshore
AI & Automation Featured

clade-rate-limits

Handle Anthropic rate limits: understand tiers, implement backoff, optimize throughput, and monitor usage. Use when working with rate-limit patterns. Trigger with "anthropic rate limit", "claude 429", "anthropic throttling", "anthropic usage limits", "claude tokens per minute".

2,266 Updated today
jeremylongshore
AI & Automation Featured

langchain-sdk-patterns

Apply production-ready LangChain SDK patterns for structured output, fallbacks, batch processing, streaming, and caching. Trigger: "langchain SDK patterns", "langchain best practices", "idiomatic langchain", "langchain architecture", "withStructuredOutput", "withFallbacks", "abatch".

2,266 Updated today
jeremylongshore
AI & Automation Featured

langfuse-rate-limits

Implement Langfuse rate limiting, batching, and backoff patterns. Use when handling rate limit errors, optimizing trace ingestion, or managing high-volume LLM observability workloads. Trigger with phrases like "langfuse rate limit", "langfuse throttling", "langfuse 429", "langfuse batching", "langfuse high volume".

2,266 Updated today
jeremylongshore
AI & Automation Featured

anth-rate-limits

Implement Anthropic Claude API rate limiting, backoff, and quota management. Use when handling 429 errors, optimizing request throughput, or managing RPM/TPM limits across usage tiers. Trigger with phrases like "anthropic rate limit", "claude 429", "anthropic throttling", "claude retry", "anthropic backoff".

2,266 Updated today
jeremylongshore