eval-cost
Install: claude install-skill Galileo-Agent-Labs/eval-engineer
# Eval Cost
Use this skill for tokenomics root-cause analysis (RCA). Accept a cost change
only when Galileo quality metrics do not regress.
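The acceptance rule above can be sketched as a small gate. This is a minimal illustration under stated assumptions, not the logic of `compare_tokenomics_packets.py`: the `Metric` type, the `regressed` and `accept_cost_change` helpers, the `tolerance` parameter, and the metric and segment names in the usage note are all hypothetical. The sketch also checks per-segment gates, which the Do list in this document treats as required.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One quality metric measured on baseline and candidate (hypothetical shape)."""
    name: str
    baseline: float
    candidate: float
    higher_is_better: bool = True  # desired direction must be stated, not guessed


def regressed(metric: Metric, tolerance: float = 0.0) -> bool:
    """Return True when the candidate is worse than baseline beyond the tolerance."""
    delta = metric.candidate - metric.baseline
    if not metric.higher_is_better:
        delta = -delta  # normalize so that a negative delta always means "worse"
    return delta < -tolerance


def accept_cost_change(
    aggregate: list[Metric],
    segment_gates: dict[str, list[Metric]],
    tolerance: float = 0.0,
) -> tuple[bool, list[str]]:
    """Accept only if no aggregate metric and no segment gate regresses.

    Returns (accepted, regressions), where each regression is labeled
    "aggregate/<metric>" or "<segment>/<metric>".
    """
    regressions = [
        f"aggregate/{m.name}" for m in aggregate if regressed(m, tolerance)
    ]
    for segment, metrics in segment_gates.items():
        regressions.extend(
            f"{segment}/{m.name}" for m in metrics if regressed(m, tolerance)
        )
    return (not regressions, regressions)
```

For example, with an aggregate `Metric("correctness", 0.90, 0.91)` and a segment gate `{"billing": [Metric("correctness", 0.88, 0.80)]}`, `accept_cost_change` returns `(False, ["billing/correctness"])`: the aggregate holds, but the segment regression alone is enough to reject the cheaper candidate.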
## Conditional References
- Load `skills/eval-engineer/references/tokenomics-rca.md` when choosing the
tokenomics workflow or diagnosing why cost moved.
- Run `skills/eval-engineer/scripts/compare_tokenomics_packets.py` when both
baseline and verification packets exist.
- Load `skills/eval-engineer/assets/cost-diagnosis-template.md`,
`skills/eval-engineer/assets/tokenomics-fix-plan-template.md`, and
`skills/eval-engineer/assets/quality-preserving-verification-template.md`
only when writing those artifacts.
## Do
- Compare cost, latency, tokens, retrieved context, tool calls, retries,
rerank/self-check spans, model spans, and evaluator cost.
- When packets use custom quality names, first run
`compare_tokenomics_packets.py` without explicit quality metrics, then inspect
the inferred `Quality metrics compared` list before accepting the decision.
- Treat behavior counters such as handoff count, tool count, step count,
retry count, and self-check count as efficiency or workflow evidence, not
quality gates by default. Promote one to quality only when the metric profile
states the desired direction for that route or segment.
- Protect named quality metrics and segment gates.
- Reject cheaper candidates when aggregate quality holds but any required
segment gate regresses.
- Treat lower traffic volume as inconclusive unless per-trace ef