time-benchmark

Quality Score: 80/100

Stars (20%): 48
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# /time-benchmark

guided A/B/C benchmark: run the same fixed task three times at different effort levels, record active time and tool count, compare to the time rule's matrix.

**this skill never switches effort automatically.** it prompts the user to run `/effort <level>` between runs and reads back the session's actual effort (which may differ from what was requested, per the rule's fallback behavior).

## what to do when invoked

1. **explain the plan to the user.** three runs, same task, the user changes effort between each. plan the task first (something reproducible, 3-5 min active at low).
2. **pick a fixed task.** default: "read `plugins/cc/rules/time.md` and summarize the effort matrix in 5 bullets". the user can substitute anything bounded and read-heavy.
3. **run A (low):**
   - ask the user: "please run `/effort low` then say 'go'".
   - wait.
   - on 'go', confirm active effort by the same resolution chain as `/time-estimate` (env → flag → session → settings → default). record the effort the session actually reports.
   - start a timer. run the fixed task. stop the timer.
   - record: wall seconds, estimated active seconds (subtract user-idle gaps when visible), tool count, any thinking indicators.
4. **run B (medium):** repeat with `/effort medium`.
5. **run C (high):** repeat with `/effort high`.
6. **report**:

   ```
   benchmark: model=<model>, task="<task>"
   run requested actual wall(s) active(s) too...
   ```
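The bookkeeping described in the steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's implementation: `resolve_effort`, `Run`, `timed`, and `summarize` are hypothetical names, the effort sources are passed in as plain values rather than read from any real environment variable or settings file, and the one-line-per-run output is an invented layout (the skill's own report template is truncated in the source). The sample numbers are placeholders, not measurements.

```python
import time
from dataclasses import dataclass


def resolve_effort(env, flag, session, settings, default):
    """Return the first non-empty value, checked in the order
    env -> flag -> session -> settings -> default (hypothetical helper)."""
    for value in (env, flag, session, settings):
        if value:
            return value
    return default


@dataclass
class Run:
    label: str               # "A", "B", or "C"
    requested: str           # effort the user was asked to set
    actual: str              # effort the session actually reported
    wall_s: float            # wall-clock seconds for the fixed task
    idle_gaps_s: tuple = ()  # visible user-idle gaps, in seconds
    tools: int = 0           # number of tool calls observed

    @property
    def active_s(self) -> float:
        # Estimated active time: wall time minus visible idle gaps.
        return max(0.0, self.wall_s - sum(self.idle_gaps_s))

    @property
    def fell_back(self) -> bool:
        # True when the reported effort differs from the requested one.
        return self.requested != self.actual


def timed(task):
    """Run a zero-argument callable and return (result, wall seconds)."""
    start = time.perf_counter()
    result = task()
    return result, time.perf_counter() - start


def summarize(runs):
    """One plain-text line per run; the layout is illustrative only."""
    lines = []
    for r in runs:
        note = " (fallback)" if r.fell_back else ""
        lines.append(
            f"{r.label} requested={r.requested} actual={r.actual}{note} "
            f"wall={r.wall_s:.0f}s active={r.active_s:.0f}s tools={r.tools}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration, not benchmark results.
    runs = [
        Run("A", "low", "low", wall_s=200.0, idle_gaps_s=(20.0,), tools=4),
        Run("B", "medium", "medium", wall_s=260.0, tools=5),
        Run("C", "high", "medium", wall_s=300.0, idle_gaps_s=(15.0, 5.0), tools=6),
    ]
    print(summarize(runs))
```

Keeping `requested` and `actual` as separate fields makes a fallback visible in the output instead of silently attributing a run to the wrong effort level, which matters when the three runs are later compared against the time rule's matrix.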

Details

Author: anipotts
Repository: anipotts/claude-code-tips
Created: 5 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

time-estimate

benchmark

benchmark-models