benchlisted
Install: claude install-skill zapgun-ai/clawback
# clawback benchmark analyzer
Analyze per-turn NDJSON logs emitted by a running clawback proxy.
## Usage
```bash
node benchmark/bin/analyze.js --out runs/report-$(date +%s) <turn-log-path>...
```
Inputs may be individual `.ndjson` files or directories containing them.
Each record's `arm` field determines whether it counts as treatment,
passthrough, or `treatment-ping` (keep-alive overhead).
## Inputs
Turn-logs are produced by clawback when started with `--turn-log <path>`:
```bash
clawback --turn-log ./runs/turns.ndjson ... # treatment
clawback --turn-log ./runs/turns.ndjson --passthrough ... # baseline
```
Both arms write to the same file — arm label is embedded per record.
## Outputs
Written to the `--out` directory:
- `report.md` — leads with the **billable input tokens reclaimed vs
passthrough** headline (per-turn rate + bootstrap CI; full-rate quota
clawback keeps off your bill — no pricing). The `$`/turn cost detail is
a demoted appendix below it. Then two diagnostic sections:
- **Prefix fragmentation** — distinct clawback SESSION KEYs seen per
stable system prefix, per knobProfile. `1` = one logical context maps
to one Anthropic cache key (ideal); `>1` (flagged ⚠️) means the same
context was split across keys, each cold-starting Anthropic's cache —
strip-ephemeral collapses this toward 1. This is the headline finding
on hot loops where passthrough fragments but the stack does not.
- **Keep-alive ping coverage** — shar