Install: claude install-skill neuralforge-labs/tlmforge
# Live evaluator — fresh-context skeptical QA
The implementer's context is full of "I just made this work", so the implementer tends to
be optimistic about what actually works. Stage 6 verification needs an adversarial
perspective from a reviewer that did not write the code. This skill defines that pattern.
## When to use
**Triggers:**
- feature-development Stage 6 ("Live verification + operator tooling")
- User says "live verify this" / "QA this against the deployed env"
- After deploying a feature to a staging/canary environment, before promoting to prod
**When NOT to use:**
- Unit / integration test verification — that's Stage 4 / Stage 5 territory
- Pure code review — code-reviewer / red-team-reviewer handle that
- Smoke tests on synthetic features — direct execution is fine; this skill is for real-environment verification
## How it works
The skill is launched via `Agent(subagent_type="general-purpose", model="sonnet", ...)` with
a fresh context (the launch prompt is self-contained — no prior conversation history).
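
As a rough illustration of what "self-contained" can mean in practice, the sketch below assembles a launch prompt from plain strings, so nothing in it depends on prior conversation. It is not this skill's actual implementation: `extract_criteria`, `build_launch_prompt`, and the assumption that criteria appear as bullet lines under a "Verification criteria" heading are all hypothetical, chosen only for the example.

```python
import re

# Framing sentence quoted from this skill's description.
SKEPTICAL_FRAMING = (
    "You are a skeptical QA engineer; assume every check is wrong "
    "until you've reproduced it yourself."
)


def extract_criteria(spec_text: str, heading: str = "Verification criteria") -> list[str]:
    """Return the bullet lines under `heading`, verbatim (hypothetical helper).

    Assumes the spec is markdown and the criteria are `- ` or `* ` bullets.
    """
    criteria = []
    in_section = False
    for line in spec_text.splitlines():
        if re.match(r"^#{1,6}\s", line):
            # Any heading either opens the target section or closes it.
            in_section = line.lstrip("#").strip().lower() == heading.lower()
            continue
        if in_section and line.strip().startswith(("- ", "* ")):
            criteria.append(line.strip())
    return criteria


def build_launch_prompt(feature: str, criteria: list[str]) -> str:
    """Assemble a prompt that carries everything the fresh agent needs."""
    if not criteria:
        # Launching without criteria would leave the reviewer nothing to verify.
        raise ValueError("no acceptance criteria found; refusing to launch")
    parts = [
        SKEPTICAL_FRAMING,
        "",
        f"Feature under verification: {feature}",
        "",
        "Acceptance criteria (verbatim):",
        *criteria,
    ]
    return "\n".join(parts)
```

The resulting string would then be passed as the prompt of the `Agent(...)` call. Copying the criteria rather than paraphrasing them keeps the reviewer anchored to the spec instead of the implementer's summary of it.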
Inside the prompt:
1. **Skeptical-QA framing** ("you are a skeptical QA engineer; assume every check is wrong
until you've reproduced it yourself") — counteracts implementer-praises-own-work bias.
2. **Acceptance criteria** are listed verbatim from `specs/<feature>/STATUS.md` or
`phase-N-spec.md`'s "Verification criteria" section.
3. **Tool-use plan** — the agent uses Bash for backend (`curl`, `pytest`, log greps) and
Playwright MCP for UI (clicks the actual UI, validates rend