← ClaudeAtlas

subagent-pre-existing-misattributionlisted

Catch subagent reviewers misattributing test failures across the pre-existing / PR-introduced boundary — in BOTH directions. Use when: (1) a multi-task subagent-driven plan reports "N pre-existing failures, unrelated" across consecutive tasks and the count never changes, (2) an architectural / final reviewer at the end catches assertions that should have been updated 5 tasks ago, (3) an implementer reports "verified pre-existing on clean tree by running git stash" but the breaking change is already committed on the branch, (4) you're building or executing a multi-task plan with subagent reviewers that compare suite-wide pass/fail counts to a "baseline", (5) **a single code-review subagent reviewing one PR flags failures as "Critical / PR-introduced" and reasons "the PR correctly updated X but forgot the test" — without ever diffing against origin/main** (the over-flagging mirror image: the reviewer infers causation from the *current file state* and assumes your PR made a change it never made). Root cause (bot
wan-huiyan/agent-traffic-control · ★ 2 · Code & Development · score 79
Install: claude install-skill wan-huiyan/agent-traffic-control
# Subagent-Driven Development: Pre-Existing Failure Misattribution ## Problem In subagent-driven development, each task's reviewer evaluates spec compliance + code quality for THAT task in isolation. When the implementer or reviewer runs the full suite and finds N failures, they look up whether those failures are "new from this task" or "pre-existing baseline." If they classify them wrong, the failures get carried forward as "known unrelated failures" through every subsequent task — and the final architectural reviewer is the only checkpoint that catches the misattribution. The specific failure mode: an early task removes an item from a list (e.g., a sidebar nav entry, a collection of dashboard cards). Other test files have HARDCODED COUNT ASSERTIONS for that list (`assert count == 5`) that nobody notices because: 1. The implementer of the breaking task fixes the parametrize lists they SEE but misses count assertions in OTHER test files. 2. The implementer reports "8 failures, all pre-existing — verified on clean tree." 3. Subsequent task implementers/reviewers see "baseline = 8 failures" and treat anything matching that count as expected. 4. The architectural final reviewer compares against the actual mainline branch (not the branch tip pre-commit) and discovers N of those "pre-existing" failures are actually caused by the early task. ## Context / Trigger Conditions - Multi-task plan executed via `superpowers:subagent-driven-development` or similar - Reviewer report co