release-it

Featured

Build production-ready systems with stability patterns: circuit breakers, bulkheads, timeouts, and retry logic. Use when the user mentions "production outage", "circuit breaker", "deployment pipeline", "chaos engineering", "retry storm", "health checks", "my service keeps crashing", "prevent cascading failures", or "make it resilient". Also trigger when designing resilient microservices, planning zero-downtime deployments, or capacity-planning for peak load. Covers stability patterns, capacity planning, deploy/release decoupling, and observability. For data systems, see ddia-systems. For system architecture, see system-design.

Data & Documents · 1,754 stars · 179 forks · Updated 5 days ago · MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): score not shown
Documentation (15%): 100
Issue Health (10%): score not shown
License (10%): 100
Description (5%): 100

Skill Content

# Release It! Framework

Framework for designing, deploying, and operating production-ready software. The software that passes QA is not the software that survives production: production is hostile, and systems must expect and handle failure at every level.

## Core Principle

**Every system will eventually be pushed beyond its design limits.** The question is not whether failures happen, but whether your system degrades gracefully or collapses catastrophically. Production-ready software is not just correct; it is resilient, observable, and operates through partial failures without human intervention.

## Scoring

**Goal: 8/8.** Score a production system by the Quick Diagnostic: **1 point per row answered "yes"** across the 8 checks (timeouts, circuit breakers, bulkheads, zero-downtime deploy, deep health checks, correlated telemetry, load-tested past peak, failure injection). Bands:

- **7-8** = every integration point is bounded, isolated, observable, and deploy/release are decoupled
- **4-5** = some patterns present but ≥3 diagnostic rows fail (e.g. unbounded retries, shared pools, shallow health checks)
- **≤2** = relies on the happy path with no breakers, no capacity model, no failure testing

Always state the current score, the failing rows, and the specific fix for each.

## The Release It! Framework

Six areas that determine whether software survives contact with production:

### 1. Stability Anti-Patterns

**Core concept:** Failures propagate through integration points ...
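
The Scoring rubric above (1 point per "yes" across the 8 checks, then a band) can be sketched in a few lines of Python. This is an illustrative sketch, not part of the skill itself: the names `CHECKS`, `Diagnostic`, `band_for`, and `score_system` are hypothetical, and because the rubric as quoted defines no band for scores of 3 or 6, the sketch reports those as `"unspecified"` rather than guessing.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple

# The eight Quick Diagnostic checks, in the order the rubric lists them.
CHECKS: Tuple[str, ...] = (
    "timeouts",
    "circuit breakers",
    "bulkheads",
    "zero-downtime deploy",
    "deep health checks",
    "correlated telemetry",
    "load-tested past peak",
    "failure injection",
)


@dataclass(frozen=True)
class Diagnostic:
    score: int                 # number of checks answered "yes"
    band: str                  # "7-8", "4-5", "<=2", or "unspecified"
    failing: Tuple[str, ...]   # checks answered "no", in rubric order


def band_for(score: int) -> str:
    """Map a 0-8 score to the bands named in the rubric."""
    if not 0 <= score <= len(CHECKS):
        raise ValueError(f"score must be between 0 and {len(CHECKS)}")
    if score >= 7:
        return "7-8"
    if score in (4, 5):
        return "4-5"
    if score <= 2:
        return "<=2"
    # Assumption: the rubric does not define a band for 3 or 6.
    return "unspecified"


def score_system(answers: Mapping[str, bool]) -> Diagnostic:
    """Score one system: 1 point per check answered True."""
    missing = [check for check in CHECKS if check not in answers]
    if missing:
        raise ValueError(f"no answer given for: {missing}")
    failing = tuple(check for check in CHECKS if not answers[check])
    score = len(CHECKS) - len(failing)
    return Diagnostic(score=score, band=band_for(score), failing=failing)


# Example: a system that passes everything except three checks.
answers = {check: True for check in CHECKS}
for check in ("bulkheads", "deep health checks", "failure injection"):
    answers[check] = False

example = score_system(answers)
print(f"score {example.score}/8, band {example.band}")
print("failing rows:", ", ".join(example.failing))
```

Returning the failing rows alongside the score mirrors the rubric's instruction to always state the current score and the failing rows; the specific fix for each row is left to the reviewer.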

Details

Author: wondelai
Repository: wondelai/skills
Created: 5 months ago
Last Updated: 5 days ago
Language: Shell
License: MIT

Bundled in these plugins

skills

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

ddia-systems

Design data systems by understanding storage engines, replication, partitioning, transactions, and consistency models. Use when the user mentions "database choice", "which database should I use", "SQL or NoSQL", "replication lag", "partitioning strategy", "consistency vs availability", "stream processing", "ACID transactions", "eventual consistency", "my queries are slow at scale", or "data is inconsistent across replicas". Also trigger when choosing a datastore, designing data pipelines, or debugging distributed-system consistency issues. Covers data models, batch/stream processing, and distributed consensus. For system design, see system-design. For resilience, see release-it.

1,754 Updated 5 days ago

wondelai

AI & Automation Solid

chaos-and-resilience

Chaos engineering, resilience patterns, failure recovery, and fault tolerance for the {{PROJECT_NAME}}. Covers circuit breaker patterns (HTTP, WS, per-venue), reconnect gate design, graceful shutdown protocol, backpressure strategies (PubSub 1MB/4MB, persistence queues), worker lifecycle management, IPC channel health detection, failure scenario matrix, bounded queue design, and event loop stall recovery. Use when reviewing or writing any code that touches error handling, retry logic, reconnection, circuit breakers, worker management, IPC channels, queue management, backpressure, graceful shutdown, health checks, or any failure-adjacent code path.

3 Updated today

Canhada-Labs

DevOps & Infrastructure Listed

stability

Audits production readiness and incident-response process -- Pillar 3 of The Five Pillars. Use before a major release or migration, after a production incident as part of the retrospective, or when evaluating whether a team is ready to increase deployment frequency.

0 Updated today

ClearMeasureLabs