openrouter-load-balancing


Distribute OpenRouter requests across multiple keys and models for high throughput. Use when scaling beyond single-key rate limits or building high-availability systems. Triggers: 'openrouter load balance', 'openrouter scaling', 'distribute openrouter requests', 'multiple api keys'.

AI & Automation · 2,266 stars · 315 forks · Updated today · MIT


Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# OpenRouter Load Balancing

## Overview

A single OpenRouter API key has rate limits (requests/minute and tokens/minute). To scale beyond those limits, distribute requests across multiple keys. OpenRouter also provides server-side load balancing via provider routing and the `:nitro` variant for low-latency inference. This skill covers multi-key rotation, health-based routing, circuit breakers, and concurrent request patterns.

## Multi-Key Round Robin

```python
import os, itertools, time, logging
from openai import OpenAI, RateLimitError
from dataclasses import dataclass, field

log = logging.getLogger("openrouter.lb")

@dataclass
class KeyPool:
    """Round-robin API key pool with health tracking."""
    keys: list[str]
    _cycle: itertools.cycle = field(init=False, repr=False)
    _health: dict[str, dict] = field(init=False, default_factory=dict)

    def __post_init__(self):
        self._cycle = itertools.cycle(self.keys)
        self._health = {k: {"errors": 0, "last_error": 0, "healthy": True} for k in self.keys}

    def next_key(self) -> str:
        """Get next healthy key."""
        attempts = 0
        while attempts < len(self.keys):
            key = next(self._cycle)
            h = self._health[key]
            # Recover after 60s cooldown
            if not h["healthy"] and time.time() - h["last_error"] > 60:
                h["healthy"] = True
                h["errors"] = 0
            if h["healthy"]:
                return key
            attempts += 1
        # ...
```
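The excerpt above is cut off before it shows how a key gets marked unhealthy, so the following is only a minimal, self-contained sketch of the per-key circuit-breaker idea the overview mentions, not the skill's own implementation. The class name `CooldownKeyPool`, the methods `report_failure` and `report_success`, the `max_errors` threshold, and the injectable `clock` parameter are all hypothetical names introduced for illustration. The 60-second cooldown mirrors the value in the excerpt.

```python
import itertools
import time


class CooldownKeyPool:
    """Hypothetical round-robin pool that skips keys whose circuit is open."""

    def __init__(self, keys, max_errors=3, cooldown_s=60.0, clock=time.monotonic):
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._cycle = itertools.cycle(self._keys)
        self._max_errors = max_errors      # assumed threshold, not from the source
        self._cooldown_s = cooldown_s      # 60s, as in the excerpt
        self._clock = clock                # injectable so tests can control time
        self._errors = {k: 0 for k in self._keys}
        self._opened_at = {k: None for k in self._keys}

    def _is_available(self, key):
        opened = self._opened_at[key]
        if opened is None:
            return True
        if self._clock() - opened >= self._cooldown_s:
            # Cooldown elapsed: close the circuit and reset the error count.
            self._opened_at[key] = None
            self._errors[key] = 0
            return True
        return False

    def next_key(self):
        """Return the next available key, trying each key at most once."""
        for _ in range(len(self._keys)):
            key = next(self._cycle)
            if self._is_available(key):
                return key
        raise RuntimeError("no healthy keys available")

    def report_failure(self, key):
        """Count an error; open the circuit once the threshold is reached."""
        self._errors[key] += 1
        if self._errors[key] >= self._max_errors:
            self._opened_at[key] = self._clock()

    def report_success(self, key):
        """A successful call clears the key's error count."""
        self._errors[key] = 0
```

A caller would wrap each request: take `next_key()`, send the request with that key, then call `report_success` or, on a rate-limit error, `report_failure`. This sketch is not thread-safe; concurrent callers would need a lock around the pool's methods.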

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT


Similar Skills

Semantically similar based on skill content — not just same category

openrouter-team-setup (AI & Automation, Featured)
Configure OpenRouter for multi-user teams with per-user keys, budget controls, and usage attribution. Triggers: 'openrouter team', 'openrouter multi-user', 'openrouter organization', 'team api keys openrouter'.
2,266 · Updated today · jeremylongshore

openrouter-rate-limits (AI & Automation, Featured)
Understand and handle OpenRouter rate limits. Use when hitting 429 errors, building high-throughput systems, or implementing retry logic. Triggers: 'openrouter rate limit', 'openrouter 429', 'openrouter throttle', 'rate limiting openrouter'.
2,266 · Updated today · jeremylongshore

openrouter-cost-controls (AI & Automation, Featured)
Implement cost controls for OpenRouter API usage. Use when setting budgets, preventing overspend, or managing per-key limits. Triggers: 'openrouter budget', 'openrouter cost limit', 'openrouter spending', 'control openrouter cost'.
2,266 · Updated today · jeremylongshore

openrouter-model-routing (AI & Automation, Featured)
Implement intelligent model routing to optimize cost, quality, and latency on OpenRouter. Use when building multi-model systems or optimizing spend across task types. Triggers: 'openrouter routing', 'model routing', 'route to model', 'model selection openrouter'.
2,266 · Updated today · jeremylongshore

openrouter-performance-tuning (AI & Automation, Featured)
Optimize OpenRouter request latency and throughput. Use when building real-time applications, reducing TTFT, or scaling request volume. Triggers: 'openrouter performance', 'openrouter latency', 'openrouter speed', 'optimize openrouter throughput'.
2,266 · Updated today · jeremylongshore