omniroute-cli-eval
SolidRun and manage OmniRoute eval suites from the CLI — create suites, run benchmarks, watch live results, view scorecards, and compare model performance. Use when the user wants to benchmark models, validate quality regressions, or automate LLM evals in CI.
Install
Quality Score: 93/100
Skill Content
Details
- Author
- diegosouzapw
- Repository
- diegosouzapw/OmniRoute
- Created
- 3 months ago
- Last Updated
- today
- Language
- TypeScript
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
cli-eval
Create and run evaluation suites, watch live benchmark progress, view scorecards, compare model performance, and integrate eval runs with CI workflows from the CLI.
run-evals
Run the Nomos eval suite -- recall@5, per-user isolation, and the end-to-end agent eval with the Opus-4.8 DB-content audit + the spec-driven feature-manifest audit. Use /run-evals when asked to run the evals, verify the memory system, check tenant isolation, or audit that features are actually wired and their DB effects land.
eval-runner
Run eval scenarios to benchmark Mycelium effectiveness. Execute tasks using reflexion loop, validate against success criteria, record metrics.