Install: claude install-skill Tibsfox/gsd-skill-creator
# Chaos Engineering
Best practices for systematically injecting failures to discover weaknesses before they cause outages, using steady-state hypotheses, controlled experiments, and progressive blast radius expansion.
## Chaos Engineering Principles
Chaos engineering is not random destruction. It is disciplined experimentation on distributed systems to build confidence in their resilience.
```
Define Steady State --> Form Hypothesis --> Design Experiment --> Control Blast Radius --> Run --> Analyze --> Fix --> Repeat
```
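The loop above can be sketched in a few lines of Python. This is a minimal illustration rather than a real chaos tool: `SteadyState`, `run_experiment`, and `FakeService` are hypothetical names, the thresholds are made-up values, and the metrics come from an in-memory stand-in instead of a monitoring system.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Metrics = Dict[str, float]


@dataclass
class SteadyState:
    """Measurable definition of 'normal' (threshold values are assumptions)."""
    max_error_rate: float       # fraction of failed requests
    max_p99_latency_ms: float   # 99th percentile latency in milliseconds

    def holds(self, m: Metrics) -> bool:
        return (m["error_rate"] <= self.max_error_rate
                and m["p99_latency_ms"] <= self.max_p99_latency_ms)


def run_experiment(steady_state: SteadyState,
                   read_metrics: Callable[[], Metrics],
                   inject_fault: Callable[[], None],
                   rollback: Callable[[], None]) -> str:
    """Run one experiment: check baseline, inject, observe, always roll back."""
    # Never inject into a system that is already unhealthy.
    if not steady_state.holds(read_metrics()):
        return "aborted: system not in steady state"
    inject_fault()
    try:
        # Hypothesis: steady state still holds while the fault is active.
        if steady_state.holds(read_metrics()):
            return "hypothesis held"
        return "weakness found"
    finally:
        # Kill switch: the fault is removed even if observation raises.
        rollback()


class FakeService:
    """In-memory stand-in for a real system (purely illustrative)."""

    def __init__(self, degrades_under_fault: bool) -> None:
        self.degrades_under_fault = degrades_under_fault
        self.fault_active = False

    def inject_fault(self) -> None:
        self.fault_active = True

    def rollback(self) -> None:
        self.fault_active = False

    def read_metrics(self) -> Metrics:
        if self.fault_active and self.degrades_under_fault:
            return {"error_rate": 0.20, "p99_latency_ms": 900.0}
        return {"error_rate": 0.001, "p99_latency_ms": 120.0}


steady = SteadyState(max_error_rate=0.01, max_p99_latency_ms=300.0)
resilient = FakeService(degrades_under_fault=False)
fragile = FakeService(degrades_under_fault=True)

resilient_result = run_experiment(
    steady, resilient.read_metrics, resilient.inject_fault, resilient.rollback)
fragile_result = run_experiment(
    steady, fragile.read_metrics, fragile.inject_fault, fragile.rollback)

print(resilient_result)  # hypothesis held
print(fragile_result)    # weakness found
```

Putting the rollback in a `finally` block is one way to make the kill switch unconditional: the fault is removed whether the hypothesis holds, fails, or the observation step itself errors out.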
| Principle | Description | Why It Matters |
|-----------|-------------|----------------|
| Define steady state | Identify measurable normal behavior (latency, error rate, throughput) | Without a baseline, you cannot detect degradation |
| Hypothesize around steady state | Predict that the system will maintain steady state while the fault is active | Forces explicit thinking about expected behavior |
| Vary real-world events | Inject failures that actually happen (network, disk, process, dependency) | Simulated failures must map to real failure modes |
| Run in production | Test where real complexity exists (with safeguards) | Staging rarely matches production topology |
| Minimize blast radius | Start small, expand gradually, have kill switches | Chaos should reveal problems, not cause outages |
| Automate experiments | Repeatable experiments run in CI/CD or on schedule | Manual experiments don't scale and introduce bias |
| Build a hypothesis backlog | Track what you wa