anth-performance-tuning

Optimize Claude API performance with prompt caching, model selection, streaming, and latency reduction techniques. Use when experiencing slow responses, optimizing token usage, or reducing time-to-first-token in production. Trigger with phrases like "anthropic performance", "claude speed", "optimize claude latency", "anthropic caching", "faster claude responses".

AI & Automation 2,266 stars 315 forks Updated today MIT



Skill Content

# Anthropic Performance Tuning

## Overview

Optimize Claude API latency and throughput via prompt caching, model selection, streaming, and request optimization. The biggest wins come from prompt caching (cached input tokens cost 90% less on a cache hit) and model selection (Haiku is 4x faster than Sonnet).

## Prompt Caching (Biggest Win)

```python
import anthropic

client = anthropic.Anthropic()

# Mark long, reusable content with cache_control.
# Cached content: 90% cheaper on subsequent requests, lower latency for the cached portion.
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert on the following 50-page document: ...<long document>...",
            "cache_control": {"type": "ephemeral"},  # Cache this block
        }
    ],
    messages=[{"role": "user", "content": "What does section 3.2 say?"}],
)

# Check cache performance
print(f"Cache read tokens: {message.usage.cache_read_input_tokens}")  # Billed at the discounted cache-read rate
print(f"Cache creation tokens: {message.usage.cache_creation_input_tokens}")  # Nonzero only when the cache is written (first call or after expiry)
print(f"Uncached input tokens: {message.usage.input_tokens}")
```

**Cache requirements:** Minimum 1,024 tokens for Sonnet/Opus, 2,048 for Haiku. Cache lives for 5 minutes (refreshed on each hit).

## Model Selection for Speed

| Model | Speed | Cost (per MTok in/out) | Best For |
|-------|-------|----------------------|----------|
| Claude Haiku | Fastest | $0.80 / ...
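
To judge whether caching is paying off, the three usage counters printed in the prompt caching example can be turned into a hit ratio and a rough input cost. The following is a minimal sketch, not part of the original skill. `Usage` is a hypothetical stand-in for the SDK's usage object. The multipliers are assumptions: `0.10` reflects the 90% discount on cache reads stated above, and `1.25` assumes a 25% premium on cache writes. The base price of `3.0` per million tokens is an illustrative placeholder, so check current pricing before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in for the usage fields returned with a message."""
    input_tokens: int                 # uncached input tokens
    cache_read_input_tokens: int      # tokens served from the cache
    cache_creation_input_tokens: int  # tokens written to the cache


def cache_hit_ratio(usage: Usage) -> float:
    """Fraction of all input tokens that were served from the cache."""
    total = (
        usage.input_tokens
        + usage.cache_read_input_tokens
        + usage.cache_creation_input_tokens
    )
    return usage.cache_read_input_tokens / total if total else 0.0


def estimated_input_cost(
    usage: Usage,
    base_price_per_mtok: float,
    read_multiplier: float = 0.10,   # assumption: cache reads cost 10% of base
    write_multiplier: float = 1.25,  # assumption: cache writes cost 125% of base
) -> float:
    """Rough input-side cost in the same currency as base_price_per_mtok."""
    weighted_tokens = (
        usage.input_tokens
        + usage.cache_read_input_tokens * read_multiplier
        + usage.cache_creation_input_tokens * write_multiplier
    )
    return weighted_tokens * base_price_per_mtok / 1_000_000


if __name__ == "__main__":
    # Illustrative numbers only: a 10,000-token document plus a short question.
    first_call = Usage(input_tokens=20, cache_read_input_tokens=0,
                       cache_creation_input_tokens=10_000)
    later_call = Usage(input_tokens=20, cache_read_input_tokens=10_000,
                       cache_creation_input_tokens=0)
    for label, usage in (("first", first_call), ("later", later_call)):
        print(label, cache_hit_ratio(usage), estimated_input_cost(usage, 3.0))
```

Under these assumptions the first call costs slightly more than an uncached request because of the write premium, so caching only pays off when the same prefix is reused before the cache expires.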

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
