Install: claude install-skill kagura-ai/kagura-engineer
# kagura-engineer: eval
Thin wrapper around the `kagura-engineer eval` CLI verb (moat lever M3). It drives the
**same** fixed issue set through two arms — **grounded** (the normal `run` loop with
recall + pinned + graph-expanded memory) and **control** (the identical loop with
grounding disabled) — and prints an A/B table on objective signals already in the
pipeline: PR-reached rate, gate-verdict rate, and (with `--review`) review findings and
re-fix-loop iterations. It reimplements nothing — the orchestration lives in the CLI;
this skill discovers config, gates on `doctor`, shells out, and surfaces the result.
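
To make the A/B table concrete, here is a minimal sketch of how per-arm rates could be aggregated from per-issue outcomes. It is not the CLI's implementation: the `ArmOutcome` record, its field names, and the `ab_table` helper are hypothetical, and it assumes "gate-verdict rate" means the share of issues whose gate verdict was a pass. The sample rows are made-up placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass
class ArmOutcome:
    """Hypothetical record of one issue run in one arm (names are illustrative)."""
    issue: str
    arm: str           # "grounded" or "control"
    pr_reached: bool   # did the run get as far as opening a PR?
    gate_passed: bool  # assumption: the gate verdict reduced to pass/fail


def ab_table(outcomes):
    """Aggregate the two always-on signals into one row per arm."""
    table = {}
    for arm in ("grounded", "control"):
        rows = [o for o in outcomes if o.arm == arm]
        n = len(rows)
        table[arm] = {
            "issues": n,
            # Guard against an empty arm so the sketch never divides by zero.
            "pr_reached_rate": sum(o.pr_reached for o in rows) / n if n else 0.0,
            "gate_pass_rate": sum(o.gate_passed for o in rows) / n if n else 0.0,
        }
    return table


# Placeholder outcomes for two hypothetical issues, one row per (issue, arm).
sample = [
    ArmOutcome("issue-a", "grounded", pr_reached=True, gate_passed=True),
    ArmOutcome("issue-b", "grounded", pr_reached=True, gate_passed=False),
    ArmOutcome("issue-a", "control", pr_reached=True, gate_passed=False),
    ArmOutcome("issue-b", "control", pr_reached=False, gate_passed=False),
]
table = ab_table(sample)
```

Because both arms run the same fixed issue set, the two rows share a denominator, so the rates can be compared directly.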
> ⚠️ **This is a Harness.** Before launching, confirm the user understands:
> - **Repo mutation × 2N** — the full run loop runs twice per issue (a worktree, commits,
> and a PR per arm). With `--review` the auto-fix loop also mutates each arm's branch.
> - **Cost** — `run`'s `claude -p` budget, doubled across both arms of every issue.
> - **Disposable issue set** — run it on a pinned, throwaway issue set, not production work.
**Announce:** "Using the kagura-engineer:eval skill — this is a Harness that runs the loop twice per issue."
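
The "× 2N" in the warning above follows from pairing every issue with both arms. A small sketch of that expansion, assuming a hypothetical `plan_runs` helper and placeholder issue identifiers (the real CLI may schedule runs differently):

```python
from itertools import product

ARMS = ("grounded", "control")


def plan_runs(issues):
    """Return every (issue, arm) pair: N issues yield 2N full run loops."""
    if not issues:
        # Mirrors the skill's rule: with no issue set, stop rather than guess.
        raise ValueError("no issue set given")
    return list(product(issues, ARMS))


# Three placeholder issues expand to six runs, each with its own worktree and PR.
runs = plan_runs(["issue-a", "issue-b", "issue-c"])
```

Counting the pairs before launching is a cheap way to show the user the repo-mutation and cost footprint they are agreeing to.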
## No-argument usage
If invoked without an issue set, do NOT guess or shell out — print this and stop:
```
kagura-engineer:eval <issue> [<issue> ...]
Measure memory-grounded uplift: run the same issues with recall ON vs OFF, print an A/B table.
⚠ Harness (high cost): the full run loop runs twice per issue; mutates the repo.
Exam