mytechsonamy
A Claude Code plugin that governs the full SDLC through multi-AI consensus, testability-gated requirements, and risk-proportional quality gates. State is 100% local and file-backed.
Indexed Skills (42)
apply-arbiter-patch
Reads the patch manifest produced by consensus-arbiter, shows a git-style diff summary, archives every target file to .vibeflow/archive/ (a timestamped version trail) AND snapshots it for one-command rollback before touching anything, asks for explicit operator confirmation (unless --yes), and applies via `git apply`. Updates `arbiter-decisions.md` with applied:true + timestamp. Optional `--run-tests` runs the project's test command afterward and reminds the operator how to revert if they break.
consensus-arbiter
Reads the verdict + reviewer suggestions from a consensus session, evaluates each suggestion against the current SDLC phase + domain, and emits diff-first patches for the ones that should APPLY. Never writes to source files — only to `.vibeflow/state/patches/<session>/` and `.vibeflow/reports/arbiter-decisions.md`. Invoke via `/vibeflow:apply-arbiter-patch` to actually apply.
architecture-validator
Validates a proposed software architecture against domain policies, the approved PRD, and (optionally) the current codebase's import graph. Writes .vibeflow/reports/architecture-report.md and one .vibeflow/reports/adr-NNN-<slug>.md per accepted decision. Blocks advance if any criticalPolicyViolations > 0. PIPELINE-1 step 2.
business-rule-validator
Extracts business rules from the PRD as a structured catalog, generates a test case per rule, and runs semantic gap analysis against existing tests. Produces business-rules.md + br-test-suite.test.ts + semantic-gaps.md. Gate contract — zero uncovered P0 rules. PIPELINE-1 step 4.
chaos-injector
Injects controlled failures (network, dependency, clock, resource) into a running test environment at one of three intensity profiles (gentle/moderate/brutal), observes whether the system degrades gracefully, and computes a resilience score. Every injection has a mandatory recovery step; blast-radius overflow aborts the run. Gate contract — production is forbidden, every injection must have a verified recovery, no cascading failures on the gentle profile. PIPELINE-3 step 2.
checklist-generator
Emits context-aware review checklists (PR review, release, feature sign-off, accessibility) driven by platform (web / mobile / backend / all) and enriched with scenario-set.md coverage gaps. Every item must be verifiable (action + source of truth + binary outcome) — vague prose is rejected. PIPELINE-2 step 3.
component-test-writer
Generates component-level unit tests from source files and (optionally) a scenario-set.md. Framework-aware (vitest | jest) via repo-fingerprint, never guesses. Enforces strict Arrange-Act-Assert structure, preserves every scenario as a test case, and refuses to overwrite existing tests. PIPELINE-1 step 4.
coverage-analyzer
Parses vitest/jest/istanbul coverage JSON, rolls up line/branch/function coverage to the requirement level via RTM, ranks uncovered gaps by risk, and enforces domain-specific thresholds. Gate contract — every P0 requirement has 100% coverage of its mapped lines, overall coverage meets the domain threshold, and no unjustified exclusions. PIPELINE-5 step 5 / PIPELINE-6 step 4.
cross-run-consistency
Runs the same test N times in one session, diffs the outputs, and classifies non-determinism by root cause. Complements observability's historical flake tracking with an immediate "does this test agree with itself right now?" answer. Gate contract — P0 scenarios must be strict-consistent (same output on N/N runs), no tolerance fuzzing, no silent averaging. PIPELINE-5 step 3.
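The strict-consistency idea above can be sketched in a few lines. This is only an illustration: it assumes a test can be modeled as a function returning a JSON-serializable output, and `checkStrictConsistency` and its result shape are hypothetical names, not the plugin's API.

```typescript
// Hypothetical result shape, for illustration only.
type ConsistencyResult = {
  runs: number;
  distinctOutputs: number;
  strictConsistent: boolean;
};

// Run the same test function N times and compare serialized outputs exactly.
// No tolerance and no averaging: any second distinct output fails the check.
function checkStrictConsistency(testFn: () => unknown, runs: number): ConsistencyResult {
  if (runs < 1) throw new Error("runs must be >= 1");
  const seen = new Set<string>();
  for (let i = 0; i < runs; i++) {
    seen.add(JSON.stringify(testFn()) ?? "undefined");
  }
  return { runs, distinctOutputs: seen.size, strictConsistent: seen.size === 1 };
}

// A deterministic test agrees with itself on 5/5 runs.
const stableRun = checkStrictConsistency(() => ({ total: 42 }), 5);

// A test leaking shared state produces a different output each time.
let leakedCounter = 0;
const unstableRun = checkStrictConsistency(() => ({ total: leakedCounter++ }), 5);

console.log(stableRun.strictConsistent, unstableRun.distinctOutputs); // true 5
```

Classifying the root cause of the non-determinism, which the skill description also mentions, is outside the scope of this sketch.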
decision-recommender
Produces structured decision packages (problem statement + options + trade-offs + recommendation + effort estimate) from any findings report. Used when a decision needs framing, not auto-gating. Consumes learning-loop-engine recommendations + L2 skill reports + team context, and emits decision-package.md. Gate contract — every recommendation cites specific findings, every option has trade-offs in both directions, every recommendation carries an effort estimate, 'do nothing' is always included as option zero. PIPELINE-4 step 2 (conditional).
e2e-test-writer
Generates end-to-end tests from scenario-set.md. Web target → Playwright; mobile target → Detox. Every test imports a Page Object (never touches raw selectors), uses a named auth strategy, waits on observable conditions (never sleeps), and preserves scenario ids as trace anchors. Gate contract — zero raw selectors in the test body, zero sleep-based waits, zero xpath selectors. PIPELINE-3 step 2.
environment-orchestrator
Produces env-setup.md — a reproducible, teardown-safe environment recipe for a given test profile (unit/integration/e2e/uat/perf) and platform. Assembles components from a catalog, pins image versions, declares healthchecks, and never inlines secrets. Gate contract — every component has a healthcheck, every setup has a teardown, secrets flow through references only. PIPELINE-3 step 1.
invariant-formalizer
Turns natural-language invariants from business-rules.md / the PRD into machine-checkable predicates (Zod refinements, runtime guards, Z3 SMT constraints, property-based generators). Emits invariant-matrix.md + invariants.ts. Gate contract — zero unformalized P0 invariants. PIPELINE-3 step 2.
learning-loop-engine
Consumes the full history of reports from every L2 skill, detects recurring patterns, traces production bugs back to missed test opportunities, detects quality drift across sprint baselines, mines the cross-session consensus telemetry for slow-converging phases and recurring reviewer themes, and recommends the next maturity-stage improvements. Operates in four modes — test-history / production-feedback / drift-analysis / consensus-history — each with its own pattern-detection flow. Gate contract — every recommendation must be actionable, every pattern must carry ≥ 3 supporting observations, every production bug must trace to a specific test gap or be marked irreducible with justification. PIPELINE-6 step 1 / PIPELINE-7 step 1.
mutation-test-runner
Generates code mutations from a fixed operator catalog, runs the test suite against each mutant, and computes the mutation score. A surviving mutant points directly at an assertion that doesn't actually check what it claims. Gate contract — zero surviving mutants in P0 code + domain-specific mutation score threshold. PIPELINE-2 step 2 (conditional) / PIPELINE-6 step 2.
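As a rough sketch of that gate, the mutation score is the fraction of mutants the suite kills. The `MutantResult` shape, the `evaluateMutationGate` name, and the 0.7 threshold below are assumptions made for illustration; the source does not show the plugin's report format or its per-domain thresholds.

```typescript
// Hypothetical shapes; the plugin's real report format is not shown in the source.
type MutantResult = { id: string; killed: boolean; inP0Code: boolean };
type MutationGate = { score: number; survivingP0: string[]; pass: boolean };

// Mutation score = killed mutants / total mutants.
function evaluateMutationGate(results: MutantResult[], threshold: number): MutationGate {
  const total = results.length;
  const killed = results.filter((r) => r.killed).length;
  // Assumption: an empty run scores 0 so it cannot pass the gate by accident.
  const score = total === 0 ? 0 : killed / total;
  // Any surviving mutant in P0 code fails the gate regardless of the score.
  const survivingP0 = results.filter((r) => !r.killed && r.inP0Code).map((r) => r.id);
  return { score, survivingP0, pass: survivingP0.length === 0 && score >= threshold };
}

const mutationRun: MutantResult[] = [
  { id: "m1", killed: true, inP0Code: true },
  { id: "m2", killed: true, inP0Code: false },
  { id: "m3", killed: true, inP0Code: false },
  { id: "m4", killed: false, inP0Code: false },
];

// 3 of 4 mutants killed, none surviving in P0 code.
const mutationGate = evaluateMutationGate(mutationRun, 0.7);
console.log(mutationGate.score, mutationGate.pass); // 0.75 true
```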
observability-analyzer
Parses HAR files, Playwright traces, browser console logs, and Chrome DevTools Protocol exports, detects anomalies against a fixed catalog, and emits observability-report.md. Complementary to the observability MCP (which tracks cross-run metrics) — this skill looks at the artifacts a single run produced. Gate contract — zero critical anomalies in P0 scenarios, no console errors above the severity threshold, web vitals meet the domain budget. PIPELINE-5 step 6 / PIPELINE-6 step 5.
prd-quality-analyzer
Analyzes PRD documents for ambiguity, conflicts, and missing flows. Produces a testability score (0-100). Writes a report to .vibeflow/reports/prd-quality-report.md and a cost-avoidance breakdown to .vibeflow/reports/prd-cost-avoidance.md. Use when reviewing requirements, validating PRDs, or before starting development. Blocks development if the testability score is below 60.
reconciliation-simulator
Simulates concurrent ledger operations against a canonical set of financial invariants (double-entry, conservation, sign convention, monetary precision), detects balance drift under contention, and generates reproducible reconciliation test cases. Financial-domain-only — blocks for every other domain. Gate contract — zero invariant violations across every tested concurrency pattern, deterministic simulation (same seed → same outcome), every violation traces to a specific operation sequence. PIPELINE-3 step 4 (financial-only).
regression-test-runner
Runs the project's existing test suite at the scope appropriate for the trigger (smoke on PR, full on release, incremental on file change), diffs results against regression-baseline.json, classifies every test as passing / new-failure / still-failing / fixed / flaky, and enforces a P0 pass-rate gate. Gate contract — P0 pass rate must be exactly 100% before a run can update the baseline. PIPELINE-2 step 4 / PIPELINE-5 step 2.
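The five-way classification and the P0 gate might look roughly like the sketch below. Several things here are assumptions, not documented behavior: the `TestRun` shape, treating a test as flaky when repeated attempts in the same run disagree, counting only `passing` and `fixed` toward the P0 pass rate, and treating a test with no baseline entry as previously passing.

```typescript
type Status = "pass" | "fail";
type Classification = "passing" | "new-failure" | "still-failing" | "fixed" | "flaky";

// Hypothetical record: baseline status (if any) plus this run's attempts.
type TestRun = { id: string; p0: boolean; baseline?: Status; attempts: Status[] };

function classifyTest(run: TestRun): Classification {
  const passed = run.attempts.filter((a) => a === "pass").length;
  // Assumption: mixed results across attempts in one run mean "flaky".
  if (passed > 0 && passed < run.attempts.length) return "flaky";
  const nowPass = run.attempts.length > 0 && passed === run.attempts.length;
  if (nowPass) return run.baseline === "fail" ? "fixed" : "passing";
  return run.baseline === "fail" ? "still-failing" : "new-failure";
}

// Fraction of P0 tests that are green in this run.
function p0PassRate(runs: TestRun[]): number {
  const p0 = runs.filter((r) => r.p0);
  if (p0.length === 0) return 1; // assumption: no P0 tests means nothing blocks
  const green = p0.filter((r) => {
    const c = classifyTest(r);
    return c === "passing" || c === "fixed";
  }).length;
  return green / p0.length;
}

const regressionRuns: TestRun[] = [
  { id: "login", p0: true, baseline: "pass", attempts: ["pass"] },
  { id: "checkout", p0: true, baseline: "fail", attempts: ["pass"] },
  { id: "search", p0: false, baseline: "pass", attempts: ["fail"] },
  { id: "upload", p0: false, baseline: "pass", attempts: ["fail", "pass"] },
];

// The baseline may only be updated when the P0 pass rate is exactly 100%.
const canUpdateBaseline = p0PassRate(regressionRuns) === 1;
console.log(regressionRuns.map(classifyTest), canUpdateBaseline);
```

In this example the non-P0 failures are reported but do not block the baseline update, because only P0 tests feed the gate.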
release-decision-engine
Aggregates all quality signals into a deterministic release decision (GO/CONDITIONAL/BLOCKED). Uses domain-specific weighted scoring. Always runs LAST in staging-uat and release-decision pipelines. Use when deciding whether a release is safe.
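A weighted-scoring decision of this kind could be sketched as follows. Everything specific here is hypothetical: the signal names, the weights, the `hardFail` flag, and the 0.9 / 0.75 cut-offs are placeholders, since the source does not state the real domain weights or thresholds.

```typescript
type Decision = "GO" | "CONDITIONAL" | "BLOCKED";

// Hypothetical signal shape: score in [0, 1], weight chosen per domain.
type QualitySignal = { name: string; score: number; weight: number; hardFail?: boolean };

// Deterministic: the same signals always yield the same decision.
function decideRelease(
  signals: QualitySignal[],
  goAt = 0.9, // assumed cut-off for GO
  conditionalAt = 0.75, // assumed cut-off for CONDITIONAL
): { score: number; decision: Decision } {
  const totalWeight = signals.reduce((sum, s) => sum + s.weight, 0);
  if (totalWeight <= 0) return { score: 0, decision: "BLOCKED" };
  const score = signals.reduce((sum, s) => sum + s.score * s.weight, 0) / totalWeight;
  // Assumption: a hard failure blocks the release whatever the weighted score is.
  if (signals.some((s) => s.hardFail)) return { score, decision: "BLOCKED" };
  if (score >= goAt) return { score, decision: "GO" };
  if (score >= conditionalAt) return { score, decision: "CONDITIONAL" };
  return { score, decision: "BLOCKED" };
}

const releaseSignals: QualitySignal[] = [
  { name: "coverage", score: 0.9, weight: 2 },
  { name: "mutation", score: 0.8, weight: 1 },
  { name: "uat", score: 1.0, weight: 2 },
];

// Weighted mean is (1.8 + 0.8 + 2.0) / 5, roughly 0.92, which clears the assumed GO cut-off.
const releaseResult = decideRelease(releaseSignals);
console.log(releaseResult.decision); // GO
```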
test-data-manager
Generates deterministic test data factories and fixtures from TypeScript types, Zod schemas, or JSON Schema. Seeded RNG means same seed → same data on every machine. Injects edge cases (boundary values, nulls, unicode, empty collections) from a canonical catalog and respects declared invariants. Produces <domain>.factory.ts + fixtures/<domain>.json. PIPELINE-1 step 4.
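The "same seed → same data" property comes from replacing `Math.random` with a seeded generator. The sketch below uses a small mulberry32-style PRNG as a stand-in; the `User` type, the `makeUsers` factory, and the edge-case list are hypothetical and are not the plugin's generated output.

```typescript
type User = { id: string; name: string; balanceCents: number };

// Small mulberry32-style PRNG: deterministic for a given 32-bit seed.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Illustrative edge-case names: empty, accented, non-Latin, and long strings.
const EDGE_NAMES: string[] = ["", "Zoë", "名前", "a".repeat(255)];

function makeUsers(seed: number, count: number): User[] {
  const rand = seededRandom(seed);
  const users: User[] = [];
  for (let i = 0; i < count; i++) {
    // Roughly one in four records draws its name from the edge-case list.
    const useEdge = rand() < 0.25;
    const name = useEdge
      ? EDGE_NAMES[Math.floor(rand() * EDGE_NAMES.length)] ?? ""
      : `user-${Math.floor(rand() * 1000000)}`;
    users.push({ id: `u${i + 1}`, name, balanceCents: Math.floor(rand() * 100000) });
  }
  return users;
}

// Two independent calls with the same seed yield identical fixtures.
const firstBatch = makeUsers(42, 5);
const secondBatch = makeUsers(42, 5);
console.log(JSON.stringify(firstBatch) === JSON.stringify(secondBatch)); // true
```

Because all randomness flows through one seeded generator, a failing test can be reproduced on another machine by passing the same seed.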
test-priority-engine
Ranks the test suite by risk so the highest-leverage tests run first. Consumes changed files + regression-baseline.json + ob_track_flaky history, applies a deterministic risk model, and emits priority-plan.md. Gate contract — every affected P0 test appears in the plan, regardless of mode budget. PIPELINE-2 step 1 / PIPELINE-5 step 1.
test-result-analyzer
Classifies test failures into bug / flaky / environment / test-defect, links each failure back to its RTM requirement, and generates ready-to-import backlog tickets for the real bugs. Consumes uat-raw-report.md or raw runner JSON via ob_collect_metrics. Gate contract — every failure is classified (no UNCLASSIFIED leaks to downstream), every 'bug' classification has a confidence ≥ 0.7, every generated ticket traces back to a scenario id. PIPELINE-5 step 4 / PIPELINE-6 step 3.
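The three gate conditions can be checked mechanically once failures are classified. The sketch below assumes a `ClassifiedFailure` record with a numeric confidence and an optional scenario id; those field names, the `findGateViolations` helper, and the `SC-001` id format are illustrative and not taken from the plugin.

```typescript
type FailureKind = "bug" | "flaky" | "environment" | "test-defect" | "UNCLASSIFIED";

// Hypothetical record for one classified failure.
type ClassifiedFailure = {
  testId: string;
  kind: FailureKind;
  confidence: number;
  scenarioId?: string;
};

// Returns a human-readable message per violated gate condition.
function findGateViolations(failures: ClassifiedFailure[], minBugConfidence = 0.7): string[] {
  const violations: string[] = [];
  for (const f of failures) {
    if (f.kind === "UNCLASSIFIED") {
      violations.push(`${f.testId}: unclassified failure`);
    }
    if (f.kind === "bug" && f.confidence < minBugConfidence) {
      violations.push(`${f.testId}: bug confidence ${f.confidence} is below ${minBugConfidence}`);
    }
    // Bugs become tickets, and every ticket must trace back to a scenario id.
    if (f.kind === "bug" && !f.scenarioId) {
      violations.push(`${f.testId}: bug has no scenario id to trace to`);
    }
  }
  return violations;
}

const classifiedFailures: ClassifiedFailure[] = [
  { testId: "t1", kind: "bug", confidence: 0.9, scenarioId: "SC-001" },
  { testId: "t2", kind: "flaky", confidence: 0.6 },
  { testId: "t3", kind: "bug", confidence: 0.5 },
];

// t3 violates two conditions: low confidence and a missing scenario id.
const gateViolations = findGateViolations(classifiedFailures);
console.log(gateViolations.length); // 2
```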
test-strategy-planner
Creates a comprehensive test strategy, scenario set, and requirements traceability matrix from the PRD. Writes .vibeflow/reports/scenario-set.md (universal input for ALL downstream test skills) and .vibeflow/reports/test-strategy.md. Must run before any test-writing skill.
traceability-engine
Maps PRD requirements to test scenarios to source code. Detects untested requirements, orphan tests, and stale traces. Writes the Requirements Traceability Matrix to .vibeflow/reports/rtm.md and a gap summary to .vibeflow/reports/traceability-gaps.md. Use for coverage gap analysis, requirement validation, and audit trails.
uat-executor
Executes UAT scenarios against a live staging environment, walks automated steps via a runner (playwright/detox) and human-in-the-loop steps via prompts, collects evidence (screenshots + timings + console) on every step, and emits uat-raw-report.md. Gate contract — every failed step must carry evidence, every P0 scenario must be executed. PIPELINE-3 step 3.
visual-ai-analyzer
Uses Claude vision to inspect screenshots for layout regressions, accessibility issues, typography drift, and design fidelity. Complementary to design-bridge's db_compare_impl (which does dimension + byte identity) — this skill actually SEES the images and describes what changed. Gate contract — zero critical visual regressions in P0 scenarios, accessibility findings require remediation, design-diff above tolerance needs human review. PIPELINE-5 step 7 / PIPELINE-6 step 6.
advance
Advance the project to the next SDLC phase. Reads projectId from vibeflow.config.json, invokes the sdlc_advance_phase MCP tool, and surfaces the phase-gate's pass/fail reason. Supports humanOverrideNote for advancing under HUMAN_APPROVAL_REQUIRED consensus (Sprint 17-C).
consensus-orchestrator
Orchestrates the multi-AI review process. Coordinates Claude, ChatGPT (codex CLI), and Gemini (gemini CLI) reviews. Each CLI's verdict is appended to the session jsonl the aggregator reads, and each CLI runs under its own timeout (consensus.cliTimeoutSeconds, default 90) so the aggregator never falls through to its 600s global wait. On NEEDS_REVISION verdicts the skill auto-invokes consensus-arbiter to decide which reviewer suggestions are applicable to the current phase.
consensus-specialist
Dispatches a phase-specific specialist subagent (prd-rewriter, adr-author, test-strategy-refiner, coverage-gap-filler, design-spec-refiner, or runbook-editor) to deeply rewrite an artifact based on multi-AI reviewer feedback. Diff-first contract — never writes to source directly. Use when the arbiter's mechanical patches aren't enough and the phase needs structural rework.
deploy-verifier
Verify a completed deployment by cross-checking the CI pipeline status (dev-ops MCP) + service health dashboard (observability MCP) + optional smoke-test endpoints. Auto-satisfies the DEPLOYMENT phase's `deployment.verified` and `health.checks.passed` exit criteria when every declared check returns green. Writes a consensus-needed marker so the third DEPLOYMENT criterion (`consensus.deployment.approved`) still gates through multi-AI review.
flow-status
Shows current VibeFlow project status including SDLC phase, pending tasks, quality metrics, and recent review results. Use when asking about project progress. Renamed from `status` in v2.20.0 to avoid colliding with Claude Code's built-in `/status` (usage/quota).
loop-status
The autonomous-loop dashboard. Consolidates every loop surface — phase-runner progress, auto-apply outcomes + per-key revert rates, cooled-down keys, the armed revert watch, learning-report findings, and the opt-in cross-project signal — into one read-only view, and writes a durable .vibeflow/reports/loop-status.md audit. Where /vibeflow:flow-status gives a quick inline glance, /vibeflow:loop-status is the full pane of glass. Read-only — never changes config or source.
onboard
Onboard a new VibeFlow project (renamed from init in v2.19.0 to avoid colliding with Claude Code's built-in /init). Sets up vibeflow.config.json with domain and tech stack, creates the .vibeflow/ state directory, and records an optional brownfield fingerprint. Runs exactly once at the start of REQUIREMENTS.
phase-runner
End-to-end phase orchestrator. For the current SDLC phase, runs every registered analyzer, then drives a real cross-AI consensus verdict HEADLESSLY via hooks/scripts/consensus-run.sh (runs the codex/gemini reviewer CLIs + finalises verdict.json — no dependency on the disable-model-invocation consensus skills, so phase-runner no longer deadlocks). On APPROVED it records consensus + auto-advances via the sdlc-engine MCP tools; on NEEDS_REVISION/REJECTED it stops with an explicit operator breadcrumb (the deep-rewrite specialist→arbiter→apply chain edits the primary artifact and stays operator-confirmed by design). In TESTING it first generates the coverage artifact (Sprint 29). One command replaces the manual analyzer → consensus → advance walk.
architecture-bootstrap
The ARCHITECTURE-phase author-side guide — the counterpart to design-bootstrap. architecture-validator VALIDATES docs/architecture.md but nothing authors it, so a greenfield ARCHITECTURE phase had no skill to produce the doc. This authors docs/architecture.md (logical/component architecture, data model, integration contracts, a domain-policy-aware security & privacy architecture, NFRs, deployment baseline) + one revisitable ADR per consequential decision, after ASKING the operator the technology/deployment baseline (propose sensible defaults / specify the stack / stack-agnostic). Then arms the consensus marker so architecture-validator + consensus review it. Use at the start of ARCHITECTURE.
brownfield-intake
The brownfield on-ramp. After onboard on an existing codebase, this fingerprints the repo (languages, frameworks, structure, hotspots, tech debt via codebase-intel) so you don't have to explain the architecture, then ASKS how you want to define the work — describe it in plain language, point to an existing requirements/spec doc, answer a few guided questions, or pull a GitHub/GitLab issue — and synthesizes a code-anchored, increment-scoped requirements doc that feeds prd-quality-analyzer and the rest of the SDLC. Use right after onboard on a project that already has code.
design-bootstrap
The DESIGN-phase guide. Produces the design artifacts (design spec, design tokens, wireframes) that DESIGN consensus reviews. It classifies whether the increment is UI-facing, then ASKS the operator which design source — Claude-native (zero-setup attractive accessible UI spec + tokens + wireframes), an existing Figma file (via the design-bridge MCP), designing from scratch in Figma (official Figma MCP), BOTH side-by-side for comparison (design-comparison.md → pick one), or a technical design (low/no-UI backend/infra increments). Writes everything under design/ and arms the consensus marker. Use at the start of DESIGN.
input-validation-matrix
For every input field in the UI, generates and runs a systematic data-validation test matrix — required/presence, type (numeric vs alphanumeric vs email/date/currency), boundary (min/max value + length, just-inside/just-outside), format (regex/IBAN/phone/precision), type-mismatch, and injection-safety (XSS/SQL-ish strings escaped or rejected) — plus output-control checks (formatting/masking of the displayed value). Writes real framework tests into the suite and a validation-matrix.md report. Use in TESTING for any UI-facing change; pairs with e2e-test-writer + visual-ai-analyzer in the front-end battery.
mobile-stability-runner
Mobile crash & stability testing for a UI increment on iOS/Android. Expo-aware: auto-detects Expo (managed/bare) and picks the runner (Expo → Maestro on Expo Go / dev-client, bare React Native → Detox). Runs a crash-focused battery beyond functional E2E (cold-start smoke, background/foreground, low-memory, deep links, permission denial, network loss mid-flow, rotation, rapid navigation) and detects native crashes / JS redbox / ANR, surfacing anything captured by a configured crash reporter (Sentry / Crashlytics / Expo). Writes mobile-stability-report.md. Degrades gracefully to a local-run breadcrumb when no simulator/device is available. Use in TESTING when the platform is mobile.
arch-guardrails
Validates proposed changes against architectural rules — layering, allowed dependencies, forbidden imports, naming conventions. Use before merging or when reviewing a refactor that touches cross-module boundaries. Blocks work that violates ADR-recorded constraints.
repo-fingerprint
Produces a brownfield repository fingerprint — languages, frameworks, test runners, build tools, module layout, and risk hotspots. Run once when adopting VibeFlow on an existing codebase; the fingerprint is consumed by planning and test-strategy skills. Never assumes frameworks — detects them.