posthog-incident-runbook

Featured

PostHog incident response: triage decision tree, immediate actions for 401/429/500 errors, graceful degradation, evidence collection, and postmortem. Trigger: "posthog incident", "posthog outage", "posthog down", "posthog on-call", "posthog emergency", "posthog broken production".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# PostHog Incident Runbook ## Overview Rapid incident response for PostHog integration failures. PostHog Cloud has its own status page (status.posthog.com) — the first step is always determining whether the issue is PostHog-side or your integration. ## Severity Levels | Level | Definition | Response Time | Examples | |-------|------------|---------------|----------| | P1 | Analytics completely down | < 15 min | All capture calls failing, feature flags returning defaults | | P2 | Degraded analytics | < 1 hour | High latency, partial event loss, slow flag eval | | P3 | Minor impact | < 4 hours | Webhook delays, specific event type missing | | P4 | No user impact | Next day | Monitoring gaps, dashboard stale data | ## Quick Triage (Run First) ```bash set -euo pipefail echo "=== PostHog Triage ===" echo "" # 1. Is PostHog Cloud up? echo -n "PostHog US Cloud: " curl -sf -o /dev/null -w "%{http_code}" https://us.i.posthog.com/healthz || echo "UNREACHABLE" echo "" # 2. Can we capture events? echo -n "Event capture: " curl -sf -o /dev/null -w "%{http_code}" -X POST 'https://us.i.posthog.com/capture/' \ -H 'Content-Type: application/json' \ -d "{\"api_key\":\"${NEXT_PUBLIC_POSTHOG_KEY}\",\"event\":\"triage_test\",\"distinct_id\":\"triage\"}" || echo "FAILED" echo "" # 3. Can we evaluate flags? echo -n "Flag evaluation: " curl -sf -o /dev/null -w "%{http_code}" -X POST 'https://us.i.posthog.com/decide/?v=3' \ -H 'Content-Type: application/json' \ -d "{\"api_key\":\"${N...

Details

Author
jeremylongshore
Repository
jeremylongshore/claude-code-plugins-plus-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

hubspot-incident-runbook

Execute HubSpot incident response with triage, mitigation, and postmortem. Use when responding to HubSpot API outages, investigating CRM errors, or running post-incident reviews for HubSpot integration failures. Trigger with phrases like "hubspot incident", "hubspot outage", "hubspot down", "hubspot on-call", "hubspot emergency", "hubspot broken".

2,266 Updated today
jeremylongshore
AI & Automation Featured

fireflies-incident-runbook

Execute Fireflies.ai incident response with triage, remediation, and postmortem. Use when responding to Fireflies.ai API outages, auth failures, or webhook delivery problems. Trigger with phrases like "fireflies incident", "fireflies outage", "fireflies down", "fireflies on-call", "fireflies emergency", "fireflies broken".

2,266 Updated today
jeremylongshore
AI & Automation Featured

posthog-observability

Monitor PostHog integration health: event ingestion rates, feature flag evaluation latency, billing volume tracking, and Prometheus/Grafana alerting. Trigger: "posthog monitoring", "posthog metrics", "posthog observability", "monitor posthog", "posthog alerts", "posthog dashboard".

2,266 Updated today
jeremylongshore
AI & Automation Featured

cohere-incident-runbook

Execute Cohere incident response procedures with triage, mitigation, and postmortem. Use when responding to Cohere API outages, investigating errors, or running post-incident reviews for Cohere integration failures. Trigger with phrases like "cohere incident", "cohere outage", "cohere down", "cohere on-call", "cohere emergency", "cohere broken".

2,266 Updated today
jeremylongshore
AI & Automation Featured

intercom-incident-runbook

Execute Intercom incident response procedures with triage, mitigation, and postmortem. Use when responding to Intercom API outages, investigating integration errors, or running post-incident reviews for Intercom failures. Trigger with phrases like "intercom incident", "intercom outage", "intercom down", "intercom on-call", "intercom emergency", "intercom broken".

2,266 Updated today
jeremylongshore