clade-incident-runbook

Featured

Respond to Anthropic API incidents in production: outages, degraded performance, error spikes, and rate limit issues. Use when working with incident-runbook patterns. Trigger with "anthropic down", "claude outage", "anthropic incident", "claude not responding", "anthropic 529".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100

Stars (weight 20%): 100
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Anthropic Incident Runbook

## Overview

Respond to Anthropic API incidents in production: outages, sustained 529 errors, authentication failures, and timeouts. Covers status page checking, severity classification, model fallback activation, communication, and post-incident review.

## Step 1: Confirm the Issue

```bash
# Check Anthropic status
curl -s https://status.anthropic.com/api/v2/status.json | python3 -c "
import json, sys
d = json.load(sys.stdin)
print(f\"Status: {d['status']['description']} ({d['status']['indicator']})\")"

# Test API directly (the API expects the anthropic-version header)
curl -s -w "\nHTTP %{http_code} in %{time_total}s\n" \
  https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":5,"messages":[{"role":"user","content":"ping"}]}'
```

## Step 2: Classify Severity

| Symptom | Severity | Action |
|---------|----------|--------|
| 529 overloaded (intermittent) | Low | SDK auto-retries handle this |
| 529 overloaded (sustained 5+ min) | Medium | Switch to fallback model |
| 401/403 on all requests | High | API key issue; check console |
| All requests timing out | High | Check status page, activate fallback |
| Status page shows incident | Varies | Follow status page updates |

## Step 3: Activate Fallback

```typescript
async function callWithFallback(params: Anthropic.MessageCreateParams) {
  try {
    return await client.messages.create(pa...
```
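The Step 2 severity table can also be expressed as a small lookup, which may help when wiring the runbook into alerting. The following is a minimal sketch rather than part of the skill itself: the `classify` function, the `Triage` type, and the `SUSTAINED_529_SECONDS` constant are hypothetical names introduced here, and the sketch assumes the caller has already aggregated symptoms (for example, that a 401/403 is being seen on all requests and how long 529s have persisted). The "Status page shows incident" row is left out because its severity varies.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "sustained 5+ min" from the Step 2 table, expressed in seconds.
SUSTAINED_529_SECONDS = 5 * 60


@dataclass(frozen=True)
class Triage:
    """One row of the Step 2 table: a severity and the suggested action."""
    severity: str
    action: str


def classify(status_code: Optional[int],
             seconds_failing: float = 0.0,
             timed_out: bool = False) -> Optional[Triage]:
    """Map an aggregated symptom to a Step 2 row (hypothetical helper).

    Returns None when the symptom is not covered by the table.
    """
    if timed_out:
        return Triage("High", "Check status page, activate fallback")
    if status_code in (401, 403):
        return Triage("High", "API key issue; check console")
    if status_code == 529:
        if seconds_failing >= SUSTAINED_529_SECONDS:
            return Triage("Medium", "Switch to fallback model")
        return Triage("Low", "SDK auto-retries handle this")
    return None


if __name__ == "__main__":
    # Example: 529 errors that have persisted for six minutes.
    print(classify(529, seconds_failing=360))
```

Keeping the mapping as data-like branches makes it straightforward to adjust the threshold or add rows if your own incident policy differs from the table.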

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

anth-incident-runbook

Execute incident response procedures for Claude API outages and degradation. Use when Claude API is returning errors, experiencing high latency, or showing degraded performance in production. Trigger with phrases like "anthropic incident", "claude api down", "anthropic outage", "claude degraded", "anthropic runbook".

2,266 Updated today
jeremylongshore
AI & Automation Featured

clade-common-errors

Diagnose and fix Anthropic API errors: authentication, rate limits, overloaded, context length, and content policy issues. Use when working with common-errors patterns. Trigger with "anthropic error", "claude 429", "claude overloaded", "anthropic not working", "debug claude api".

2,266 Updated today
jeremylongshore
AI & Automation Featured

anth-common-errors

Diagnose and fix Anthropic Claude API errors by HTTP status code. Use when encountering API errors, debugging failed requests, or troubleshooting authentication, rate limiting, or input validation issues. Trigger with phrases like "anthropic error", "claude api error", "fix anthropic 429", "claude not working", "debug claude api".

2,266 Updated today
jeremylongshore
AI & Automation Featured

apollo-incident-runbook

Apollo.io incident response procedures. Use when handling Apollo outages, debugging production issues, or responding to integration failures. Trigger with phrases like "apollo incident", "apollo outage", "apollo down", "apollo production issue", "apollo emergency".

2,266 Updated today
jeremylongshore
AI & Automation Featured

clade-reliability-patterns

Build fault-tolerant Claude integrations: retries, circuit breakers, fallbacks, timeouts, and graceful degradation. Use when working with reliability-patterns patterns. Trigger with "anthropic reliability", "claude fault tolerance", "anthropic circuit breaker", "claude fallback".

2,266 Updated today
jeremylongshore