← ClaudeAtlas

time-benchmarklisted

walk the user through a low/medium/high effort A/B/C throughput benchmark on their current model
anipotts/claude-code-tips · ★ 25 · AI & Automation · score 79
Install: claude install-skill anipotts/claude-code-tips
<!-- tested with: claude code v2.1.122 --> # /time-benchmark guided A/B/C benchmark: run the same fixed task three times at different effort levels, record active time and tool count, compare to the time rule's matrix. **this skill never switches effort automatically.** it prompts the user to run `/effort <level>` between runs and reads back the session's actual effort (which may differ from what was requested, per the rule's fallback behavior). ## what to do when invoked 1. **explain the plan to the user.** three runs, same task, the user changes effort between each. plan the task first (something reproducible, 3-5 min active at low). 2. **pick a fixed task.** default: "read `plugins/cc/rules/time.md` and summarize the effort matrix in 5 bullets". the user can substitute anything bounded and read-heavy. 3. **run A (low):** - ask the user: "please run `/effort low` then say 'go'". - wait. - on 'go', confirm active effort by the same resolution chain as `/time-estimate` (env → flag → session → settings → default). record the effort the session actually reports. - start a timer. run the fixed task. stop the timer. - record: wall seconds, estimated active seconds (subtract user-idle gaps when visible), tool count, any thinking indicators. 4. **run B (medium):** repeat with `/effort medium`. 5. **run C (high):** repeat with `/effort high`. 6. **report**: ``` benchmark: model=<model>, task="<task>" run requested actual wall(s) active(s) too