principle-observabilitylisted
Install: claude install-skill lugassawan/swe-workbench
# Observability
Three signals, each with a distinct job. Use the right one for the right question.
## The Three Pillars
**Logs** — discrete events with context. Right for: debugging specific incidents, audit trails, free-text search.
- Emit at decision points and error boundaries, not in loops.
- Use structured key-value format; never interpolate variables into free-form strings.
- Include a correlation ID on every line — trace parent ID if a trace exists.
**Metrics** — numeric aggregations over time. Right for: dashboards, SLOs, alerting thresholds.
- Pre-aggregate; do not stream raw events into a metrics backend.
- Label only stable, low-cardinality dimensions (region, status_code, service) — never user ID, request ID, or email.
**Traces** — causal chain of operations across services. Right for: latency attribution, dependency mapping.
- A span represents one unit of work; parent/child links form the trace.
- Propagate context headers at every service boundary — OpenTelemetry W3C trace context.
- Put high-cardinality attributes (user ID, document ID) in span attributes, not metric labels.
## Structured Logging
Emit events as key-value pairs. The log processor should never parse a free-form string.
- Correlation ID ties log lines to a trace and to a business operation.
- Log level discipline: DEBUG (local dev), INFO (normal flow), WARN (recoverable anomaly), ERROR (action required).
- No `log.debug("entering function X")` in hot paths — cost without signal.
## Cardin