
add-bench-scenario

Use when adding a new scenario to remindb's benchmark suite — symptoms include "compare X tool against grep/cat", "add a token-savings benchmark for Y", "extend `internal/bench/scenarios.go`", "wire a new scenario into `bench.Run`", or any task that adds a row to the `scenario / naive (tok) / remindb (tok) / saved` output table. Distinct from Go `testing.B` benchmarks in `*_bench_test.go`.
radimsem/remindb · ★ 114 · AI & Automation · score 83
Install: claude install-skill radimsem/remindb
# Add a benchmark scenario

`internal/bench/` is remindb's external-facing benchmark. It compares "how many tokens does an agent consume to do X via remindb tools" against "how many would they consume doing X with `grep` / `cat` / `find`". The output is a token-savings table rendered via `text/tabwriter`. It's invoked from the `remindb bench` CLI subcommand and exercised end-to-end by `scripts/bench-agents.sh`.

This is *not* the Go `testing.B` benchmark surface: those live as `Benchmark*` functions inside `pkg/*/bench_test.go` and have their own discipline.

## Where it lands

Two files for a typical scenario, three if it needs CLI flag plumbing.

| File | What changes |
|---|---|
| `internal/bench/scenarios.go` | New `benchXxx(ctx, session, srcDir, ...) (scenarioResult, error)` function |
| `internal/bench/bench.go` | Wire the new scenario into `Run`, append to `results` |
| `cmd/remindb/...` (only if new flag needed) | Surface a new flag on the `bench` subcommand and pass it into `bench.Config` |

## The scenario function shape

Every scenario implements the same contract: produce one (or more) `scenarioResult{name, naiveTok, remindbTok}` by measuring **two paths to the same answer**: the naive path (token count of what an agent would have to read using shell tools) and the remindb path (token count of the tool's response). Mirror `benchTree`, `benchSearch`, or `benchFetch` in `scenarios.go`:

```go
func benchExample(ctx context.Context, s *gomcp.ClientSession, srcDir