23ag1
User
Quality-first harness for autonomous AI coding agents: deterministic gates plus a default-FAIL evaluator over a Beads task spine. Claude Code plugin; done is earned, not asserted.
Indexed Skills (15)
completely:init
Scaffold the completely thin layer into the current repository — Definition of Done, the harness rules snippet in CLAUDE.md, and an optional project-specific quality command. Use when setting up claude-harness in a new project, or when the user says "harness init", "set up the harness", or "wire up the quality gates here".
completely:auto
Run completely AUTONOMOUSLY — loop the FULL task engine over the Beads queue, a fresh `claude -p` per task, until the queue is empty. Same complete recipe as /completely:control. Backed by `cmpl auto`. Run it in the FOREGROUND (blocking) — nested workers only run while the launching session is active; it is idempotent/resumable.
completely:plan
Plan a phase or feature DIRECTLY into Beads — no PLAN.md, no markdown bridge. Runs GSD-style socratic discovery + decomposition + a goal-backward self-check, then emits a structured plan that `cmpl plan-apply` materializes as a Beads epic + worker-contract tasks + dependency waves + human checkpoints. One source of truth from the first planning act. Use to turn an idea or phase into ready Beads work.
completely:run
Drive the Beads queue (`bd ready`) with one engine in two autonomy modes — supervised (GSD wave subagents, human gates at phase boundaries) or unattended (Ralph-style fresh-context loop, stops when the queue is empty). Quality gates + the default-FAIL evaluator run underneath both. Use to execute planned work. Backed by `cmpl run`.
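As a rough illustration of the "loop until the queue is empty" behaviour with a default-FAIL evaluator, here is a minimal sketch. It does not call `bd ready`, `cmpl run`, or `claude -p`; the names `drain`, `worker`, and `evaluator` are hypothetical stand-ins, and the in-memory deque only imitates a task queue.

```python
from collections import deque


def drain(queue, worker, evaluator):
    """Process tasks until the queue is empty.

    Hypothetical sketch: `worker` stands in for a fresh-context agent run,
    `evaluator` for the default-FAIL evaluator. Returns (closed, failed).
    """
    closed, failed = [], []
    while queue:
        task = queue.popleft()
        result = worker(task)
        verdict = evaluator(task, result)
        # Default-FAIL: only an explicit True closes the task.
        # None, False, or any other value counts as a failure.
        if verdict is True:
            closed.append(task)
        else:
            failed.append(task)
    return closed, failed


if __name__ == "__main__":
    tasks = deque(["a", "b", "c"])
    done, not_done = drain(
        tasks,
        worker=lambda t: t.upper(),
        evaluator=lambda t, r: True if r == "A" else None,
    )
    print("closed:", done, "failed:", not_done)
```

The point of the `is True` comparison is that a missing or ambiguous verdict never counts as a pass, which is one way to read "done is earned, not asserted".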
completely:sync
MIGRATION (one-time). Import existing markdown task state (Ralph IMPLEMENTATION_PLAN.md, checkbox task lists) into Beads, idempotently. Use when adopting completely in a repo that has markdown plans, or after an upstream update, to keep Beads the single source of truth. Backed by `cmpl sync`.
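To make "import checkbox task lists idempotently" concrete, the sketch below parses markdown checkboxes and imports them into a plain dict keyed by a hash of the task title, so a second run adds nothing. This is an assumption-laden illustration, not how `cmpl sync` works: the dict is a stand-in for Beads, and `parse_tasks`, `sync`, and the title-hash key are all hypothetical.

```python
import hashlib
import re

# Matches lines such as "- [ ] write parser" or "* [x] add tests".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*\S)\s*$")


def parse_tasks(markdown):
    """Extract checkbox items from markdown text as title/done dicts."""
    tasks = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            tasks.append({
                "title": match.group(2),
                "done": match.group(1).lower() == "x",
            })
    return tasks


def sync(markdown, store):
    """Import tasks into `store` (a dict standing in for the task database).

    Each task is keyed by a stable hash of its title, so re-running the
    import on the same file adds nothing. Returns the number of new tasks.
    """
    added = 0
    for task in parse_tasks(markdown):
        key = hashlib.sha256(task["title"].encode("utf-8")).hexdigest()[:12]
        if key not in store:
            store[key] = task
            added += 1
    return added


if __name__ == "__main__":
    plan = "# Plan\n- [ ] write parser\n- [x] add tests\n"
    db = {}
    print(sync(plan, db), sync(plan, db))
```

Keying on the title is the simplest stable identity; a real importer would likely need something more robust once titles get edited.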
agent-eval
Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
ai-regression-testing
Regression testing strategies for AI-assisted development. Sandbox-mode API testing without database dependencies, automated bug-check workflows, and patterns to catch AI blind spots where the same model writes and reviews code.
benchmark-optimization-loop
Use when the user asks to make something faster, try many variants, run recursive optimization, benchmark latency/throughput/cost, or choose the best implementation by repeated measured tests.
e2e-testing
Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.
plan-orchestrate
Read a plan document, decompose it into steps, design a per-step agent chain from the ECC catalogue, and emit ready-to-paste /orchestrate custom prompts. Generative only — never invokes /orchestrate itself. Use when the user has a multi-step plan and wants to drive it through orchestrate without composing chains by hand.
research-ops
Evidence-first current-state research workflow for ECC. Use when the user wants fresh facts, comparisons, enrichment, or a recommendation built from current public evidence and any supplied local context.
security-review
Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns.
completely:check
Run all configured quality checks (lint, types, tests) in one pass with terse output — reports "clean" when green, and only the failing check's output when not. Token-frugal; configured via completely.toml [check] or auto-detected per stack (front+back). Use before committing or to verify a change. Backed by `cmpl check`.
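The terse reporting described above ("clean" when green, otherwise only the failing check's output) can be sketched as follows. This is a hypothetical illustration rather than the `cmpl check` implementation: `run_checks` and the `(name, argv)` check format are assumptions, and no `completely.toml` parsing or stack auto-detection is attempted.

```python
import subprocess
import sys


def run_checks(checks):
    """Run every (name, argv) check and return a terse report.

    Output from passing checks is discarded. If all checks pass the
    report is just "clean"; otherwise it contains only the output of
    the checks that failed, each prefixed with its name.
    """
    failures = []
    for name, argv in checks:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            output = (result.stdout + result.stderr).strip()
            failures.append(f"[{name}] {output}")
    return "clean" if not failures else "\n".join(failures)


if __name__ == "__main__":
    # Placeholder commands; a real setup would list its lint, type, and test commands.
    demo = [
        ("lint", [sys.executable, "-c", "print('lots of noise')"]),
        ("tests", [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(1)"]),
    ]
    print(run_checks(demo))
```

Dropping the output of passing checks is what keeps the report token-frugal: a green run costs a single word.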
completely
Overview and entry point for the completely harness — a quality-first agent workflow unifying GSD (planning depth), Ralph (autonomous loop), and Beads (the spine), under deterministic gate hooks and a default-FAIL evaluator. Use to see what's installed, the command surface, and how to start.
completely:control
Run completely UNDER CONTROL — execute the SINGLE next Beads task through the FULL task engine (understand → map → plan-check → parallel subagents → TDD → checks → reviewers → verifier → evaluator → debug-on-fail → commit → close), in this session, showing every step and pausing at human gates. One task done excellently, then stop. control = one observed step of auto, same engine, no cuts.
Bio shown is the top-scored skill's repo description as a fallback — real GitHub bios land in a future update.