genesis-evals

Use this skill to run the genesis maintainer-side eval suite against a target model (default: claude-opus-4.7). Activate when validating a genesis PR, when changing the genesis catalogue (architectural-patterns, primitives, design-patterns, refactor-patterns, composition-substrate, pattern-tradeoffs, SKILL.md), or when the operator asks to "run evals" or "regenerate the eval matrix". This skill orchestrates parallel cold sub-agent spawns via the harness's task tool, scores deterministically, and iterates until P>=0.8 / N>=0.8 / R==1.0, with a maximum of 3 iteration loops. This skill is contributor-only: it lives under dev/skills/ (OUTSIDE .apm/) and is NOT shipped inside the user-facing skills/genesis/ bundle (BUNDLE LEAKAGE discipline). See "Why this lives outside .apm/" below.
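The convergence gate in the description can be sketched as a small deterministic check. This is a minimal illustration, not the skill's actual scorer: the listing does not define P, N, and R, so the sketch treats them simply as three aggregate scores in [0, 1], and the names `Scores`, `converged`, and `first_converged_iteration` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Thresholds and loop cap quoted from the skill description.
P_MIN = 0.8
N_MIN = 0.8
R_REQUIRED = 1.0
MAX_ITERATIONS = 3


@dataclass(frozen=True)
class Scores:
    """Aggregate scores for one eval iteration (hypothetical container)."""
    p: float
    n: float
    r: float


def converged(s: Scores) -> bool:
    """True when all three gates hold: P>=0.8, N>=0.8, R==1.0."""
    return s.p >= P_MIN and s.n >= N_MIN and s.r == R_REQUIRED


def first_converged_iteration(history: Sequence[Scores]) -> Optional[int]:
    """Return the 1-based iteration that first converged, or None.

    Only the first MAX_ITERATIONS entries count; later ones are ignored.
    """
    for i, s in enumerate(history[:MAX_ITERATIONS], start=1):
        if converged(s):
            return i
    return None
```

Under these assumptions, a run whose scores only pass on a fourth loop is reported as not converged, because the cap of 3 is applied before the gate is checked.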
danielmeppiel/genesis · ★ 27 · AI & Automation · score 83
Install: claude install-skill danielmeppiel/genesis
# genesis-evals: maintainer-side eval runner

Run the genesis self-eval suite. Steers the parent LLM session to orchestrate cold sub-agent spawns, capture responses, score deterministically, and report convergence.

## Why this lives outside `.apm/`

Genesis ships to USERS via npx / `apm install`. Eval scenarios LOOK LIKE real user requests (that is the point). Colocating them under `skills/genesis/evals/` would risk DISPATCH CONTAMINATION (an over-eager harness loader pulling scenario prompts into the active context) and PAYLOAD BLOAT for users who never run evals.

We also keep this OUTSIDE `.apm/` because APM treats `.apm/` as the publishable source root: its local-content scanner picks up anything under `.apm/skills/` regardless of dev-marker, so `apm pack --format plugin` would leak this maintainer-only skill into the shipped artifact. Living under `dev/skills/` keeps it scanner-invisible while still letting `apm install --dev` deploy it via the local-path devDependency in the root `apm.yml`.

This is the inverse of PHANTOM DEPENDENCY (referenced-but-not-bundled): BUNDLE LEAKAGE (bundled-but-not-consumed-at-runtime). See `skills/genesis/assets/composition-substrate.md` "Anti-patterns flagged at this step".

## When to activate

- Validating a genesis PR before merge
- Any change to a file under `skills/genesis/` (catalogue or SKILL.md)
- Operator says "run evals", "regenerate eval matrix", "score on Opus"
- Adding a new scenario (run validate first)

## Hard rules

- The