langfuse-incident-runbook

Troubleshoot and respond to Langfuse-related incidents and outages. Use when experiencing Langfuse outages, debugging production issues, or responding to LLM observability incidents. Trigger with phrases like "langfuse incident", "langfuse outage", "langfuse down", "langfuse production issue", "langfuse troubleshoot".

Skill Content

# Langfuse Incident Runbook

## Overview

Step-by-step procedures for Langfuse-related incidents, from initial triage (2 min) through resolution and post-incident review. Your application should keep working without Langfuse, so these procedures focus on restoring observability.

## Severity Classification

| Severity | Description | Response Time | Example |
|----------|-------------|---------------|---------|
| P1 | Application impacted by tracing | 15 min | SDK throwing unhandled errors, blocking requests |
| P2 | Traces not appearing, no app impact | 1 hour | Missing observability data |
| P3 | Degraded performance from tracing | 4 hours | High latency from flush backlog |
| P4 | Minor issues | 24 hours | Occasional missing traces |

## Instructions

### Step 1: Initial Assessment (2 Minutes)

```bash
set -euo pipefail

echo "=== Langfuse Incident Triage ==="
echo "Time: $(date -u)"

# 1. Check Langfuse cloud status
echo -n "Status page: "
curl -s -o /dev/null -w "%{http_code}" https://status.langfuse.com || echo "UNREACHABLE"
echo ""

# 2. Test API connectivity (LANGFUSE_BASE_URL wins, then LANGFUSE_HOST, then the cloud default)
HOST="${LANGFUSE_BASE_URL:-${LANGFUSE_HOST:-https://cloud.langfuse.com}}"
echo -n "API health: "
curl -s -o /dev/null -w "%{http_code} (%{time_total}s)" "$HOST/api/public/health" || echo "FAILED"
echo ""

# 3. Test auth
if [ -n "${LANGFUSE_PUBLIC_KEY:-}" ] && [ -n "${LANGFUSE_SECRET_KEY:-}" ]; then
  # Strip newlines: some base64 implementations wrap long output, which would corrupt the header value
  AUTH=$(echo -n "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" | base64 | tr -d '\n')
  echo -n "Auth test: "
  curl -s -o /dev/null -w "%{http...
```
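The Severity Classification table can be encoded as a small lookup so that triage scripts and paging hooks report the same severity and response time. The following is a minimal sketch, not part of the original runbook: the function name `classify_incident` and the symptom keywords (`app_blocked`, `traces_missing`, `latency_degraded`, `minor`) are hypothetical labels chosen for illustration, while the severities and response times come from the table.

```bash
#!/usr/bin/env bash
# Sketch: map an observed symptom to "<severity> <response time>".
# Symptom keywords are hypothetical; severities and times follow the
# Severity Classification table in this runbook.
classify_incident() {
  case "${1:-}" in
    app_blocked)      echo "P1 15min" ;;  # application impacted by tracing
    traces_missing)   echo "P2 1h" ;;     # traces not appearing, no app impact
    latency_degraded) echo "P3 4h" ;;     # degraded performance from tracing
    minor)            echo "P4 24h" ;;    # occasional missing traces
    *)                echo "UNKNOWN"; return 1 ;;  # unrecognized symptom
  esac
}
```

For example, `classify_incident traces_missing` prints `P2 1h`. An unrecognized keyword prints `UNKNOWN` and returns a non-zero status, so callers running under `set -e` stop instead of silently assigning a severity.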

Details

Author
jeremylongshore
Repository
jeremylongshore/claude-code-plugins-plus-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT
