os-eval-runner

Stateless evaluation engine that scores and gates skill improvement iterations using headless Python evaluation scripts. Use when the user says "evaluate this skill", "run autoresearch loop on", "optimize this skill", "run the eval loop", or when another agent proposes a change to an existing skill and needs empirical validation before applying it. Supports autonomous loop mode for iterative improvement and single-shot QA mode for validating one specific proposed change. Requires Python 3.8+ and a git repository.
richfrem/agent-plugins-skills · ★ 3 · AI & Automation · score 67
Install: claude install-skill richfrem/agent-plugins-skills
<example>
<commentary>Start autonomous improvement loop on a skill.</commentary>
user: "Run the autoresearch loop on plugins/dev-utils/skills/link-checker-agent for 20 iterations"
assistant: [triggers os-eval-runner, runs Mode 1 intake, establishes baseline, begins iteration loop]
</example>

<example>
<commentary>Incomplete optimize request: runs intake interview first.</commentary>
user: "Optimize the commit skill"
assistant: [triggers os-eval-runner, runs Phase 0 intake interview to gather path, mode, and iteration count]
</example>

<example>
<commentary>Another agent proposes a skill edit and needs validation.</commentary>
assistant: [autonomously] "Before I apply this description change, I'll run os-eval-runner to confirm the score doesn't regress."
</example>

<example>
<commentary>Negative: user is asking about a skill, not evaluating a proposed change.</commentary>
user: "Tell me about the os-clean-locks skill."
assistant: "It cleans up stale lock files..." [does NOT trigger os-eval-runner]
</example>

## Dependencies

This skill requires **Python 3.8+** and standard library only. No external packages needed.

**To install this skill's dependencies:**

```bash
pip-compile ./requirements.in
pip install -r ./requirements.txt
```

See `./requirements.txt` for the dependency lockfile (currently empty; standard library only).

> **Prerequisites:** The target skill must be inside a **git repository** (`git init` first if needed). Python 3.8+ must be available as `python`
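
The two prerequisites above (Python 3.8+ and a target skill inside a git repository) can be verified before starting a run. The following is a minimal sketch, not part of the skill itself: the function name `check_prerequisites` is hypothetical, and it inspects the interpreter that runs the script rather than whatever `python` resolves to on `PATH`.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def check_prerequisites(skill_dir):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration only, not shipped with os-eval-runner.
    """
    problems = []

    # The listing states that Python 3.8+ is required.
    if sys.version_info < (3, 8):
        found = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
        problems.append("Python 3.8+ required, found " + found)

    path = Path(skill_dir)
    if not path.is_dir():
        problems.append("{} is not a directory".format(skill_dir))
        return problems

    if shutil.which("git") is None:
        problems.append("git executable not found on PATH")
        return problems

    # `git rev-parse --is-inside-work-tree` prints "true" inside a work tree
    # and exits non-zero outside any repository.
    result = subprocess.run(
        ["git", "rev-parse", "--is-inside-work-tree"],
        cwd=str(path),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0 or result.stdout.strip() != "true":
        problems.append(
            "{} is not inside a git repository (run `git init` first)".format(skill_dir)
        )

    return problems
```

Calling `check_prerequisites("plugins/dev-utils/skills/link-checker-agent")` (the path from the first example) returns an empty list when both conditions hold, and otherwise a list of messages describing what is missing.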