openrouter-caching-strategy

Featured

Implement caching for OpenRouter API responses to reduce cost and latency. Use when optimizing repeat queries, building RAG systems, or reducing API spend. Triggers: 'openrouter cache', 'cache llm responses', 'openrouter caching', 'reduce openrouter cost'.

AI & Automation · 2,266 stars · 315 forks · Updated today · MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%)
Documentation (15%): 100
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

# OpenRouter Caching Strategy

## Overview

OpenRouter charges per token, so caching identical or similar requests can substantially cut costs. Requests sent with `temperature=0` and the same model and messages are expected to produce the same (or nearly the same) output, so they are safe to cache. This skill covers in-memory caching, persistent caching with TTL, and Anthropic prompt caching via OpenRouter.

## In-Memory Cache

```python
import os, hashlib, json, time
from typing import Optional
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
    default_headers={"HTTP-Referer": "https://my-app.com", "X-Title": "my-app"},
)

class LLMCache:
    def __init__(self, ttl_seconds: int = 3600):
        # Maps request hash -> (response, timestamp when stored)
        self._cache: dict[str, tuple[dict, float]] = {}
        self._ttl = ttl_seconds
        self.hits = 0
        self.misses = 0

    def _key(self, model: str, messages: list, **kwargs) -> str:
        # sort_keys=True makes the hash independent of dict key order
        blob = json.dumps({"model": model, "messages": messages, **kwargs}, sort_keys=True)
        return hashlib.sha256(blob.encode()).hexdigest()

    def get(self, model: str, messages: list, **kwargs) -> Optional[dict]:
        k = self._key(model, messages, **kwargs)
        if k in self._cache:
            data, ts = self._cache[k]
            if time.time() - ts < self._ttl:
                self.hits += 1
                return data
            # Entry is older than the TTL: evict it and fall through to a miss
            del self._cache[k]
        self.misses += 1
        return None

    def set(self, mode...
```
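The overview also mentions persistent caching with TTL. One way that could look is sketched below, assuming SQLite from the Python standard library as the store. The `cache_key` helper, the `PersistentCache` class, the table layout, and the optional `now` argument (included only to make expiry easy to test) are all hypothetical names chosen for illustration and are not taken from the skill itself.

```python
import hashlib
import json
import sqlite3
import time
from typing import Optional


def cache_key(model: str, messages: list, **params) -> str:
    """Hash model, messages, and parameters; sorted keys keep the hash stable."""
    blob = json.dumps({"model": model, "messages": messages, **params}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()


class PersistentCache:
    """Hypothetical SQLite-backed response cache with a fixed TTL (assumption)."""

    def __init__(self, path: str = ":memory:", ttl_seconds: int = 3600):
        # ":memory:" keeps the sketch self-contained; pass a file path to persist.
        self._db = sqlite3.connect(path)
        self._ttl = ttl_seconds
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, body TEXT NOT NULL, created REAL NOT NULL)"
        )

    def get(self, key: str, now: Optional[float] = None) -> Optional[dict]:
        now = time.time() if now is None else now
        row = self._db.execute(
            "SELECT body, created FROM responses WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        body, created = row
        if now - created >= self._ttl:
            # Expired: delete the row so stale entries do not accumulate.
            self._db.execute("DELETE FROM responses WHERE key = ?", (key,))
            self._db.commit()
            return None
        return json.loads(body)

    def set(self, key: str, response: dict, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._db.execute(
            "INSERT OR REPLACE INTO responses (key, body, created) VALUES (?, ?, ?)",
            (key, json.dumps(response), now),
        )
        self._db.commit()
```

In use, a caller would compute `cache_key(...)` from the request, call `get` first, and only send the request to OpenRouter on a miss, storing the JSON-serializable response with `set` afterwards. Because `temperature` and any other parameters are part of the key, changing any of them produces a separate cache entry.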

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI · AI
Anthropic · AI

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

openrouter-performance-tuning

Optimize OpenRouter request latency and throughput. Use when building real-time applications, reducing TTFT, or scaling request volume. Triggers: 'openrouter performance', 'openrouter latency', 'openrouter speed', 'optimize openrouter throughput'.

2,266 Updated today

jeremylongshore

AI & Automation Featured

openrouter-reference-architecture

Design production architectures using OpenRouter as the LLM gateway. Use when planning system design, reviewing architecture, or scaling AI applications. Triggers: 'openrouter architecture', 'openrouter system design', 'openrouter at scale', 'llm gateway architecture'.

2,266 Updated today

jeremylongshore

AI & Automation Featured

openrouter-model-routing

Implement intelligent model routing to optimize cost, quality, and latency on OpenRouter. Use when building multi-model systems or optimizing spend across task types. Triggers: 'openrouter routing', 'model routing', 'route to model', 'model selection openrouter'.

2,266 Updated today

jeremylongshore

AI & Automation Listed

caching-strategies

When improving read performance and reducing database load.

4 Updated 6 days ago

KraitDev

AI & Automation Featured

prompt-caching

Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation)

39,227 Updated today

sickn33