Install: claude install-skill Fr-e-d/GAAI-framework
# Eval Run
## Purpose / When to Activate
Activate when:
- The Discovery Agent runs the Skill Optimize protocol and needs to score a skill output
- A skill's instructions have been modified and a before/after quality comparison is needed
- A baseline score is being established for a skill that has never been evaluated
This skill is generic: it accepts any output file and any evals.yaml, regardless of skill domain.
It follows the GAAI principle "skills never chain" — it evaluates the output it receives; it does not invoke the skill that produced the output.
---
## Process
### Step 1 — Load inputs
1. Read the `output_file` path. Confirm the file exists and is non-empty. If missing: FAIL immediately with error "output_file not found: {path}".
2. Read the `evals_file` path. Confirm the file exists and is valid YAML. If missing: FAIL immediately with error "evals_file not found: {path}".
3. Parse the `evals.yaml` structure. Validate:
   - `skill`, `version`, `description`, and `assertions` fields are present
   - `assertions` list is non-empty
   - Each assertion has `id`, `type`, and `description` fields
   - If any required field is missing: FAIL with error "evals.yaml validation error: {details}"
For the full `evals.yaml` format spec, see `references/evals-format.md`.
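The Step 1 checks can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the names `EvalRunError`, `check_output_file`, `validate_evals`, and `require_valid_evals` are hypothetical, and the sketch assumes the YAML has already been parsed into a plain mapping (for example with a YAML library's safe loader). The wording of the empty-file error and of the individual `{details}` entries is also an assumption, since the text above specifies only the "not found" and "validation error" prefixes.

```python
from __future__ import annotations

from pathlib import Path

# Required fields, taken from the Step 1 validation list.
REQUIRED_TOP_LEVEL = ("skill", "version", "description", "assertions")
REQUIRED_ASSERTION = ("id", "type", "description")


class EvalRunError(Exception):
    """Hypothetical error type used to signal an immediate FAIL."""


def check_output_file(path: str) -> str:
    """Return the file's text, or FAIL if it is missing or empty."""
    p = Path(path)
    if not p.is_file():
        raise EvalRunError(f"output_file not found: {path}")
    text = p.read_text(encoding="utf-8")
    if not text.strip():
        # Assumed message: the source requires a non-empty file
        # but does not give the error text for this case.
        raise EvalRunError(f"output_file is empty: {path}")
    return text


def validate_evals(evals: object) -> list[str]:
    """Collect validation problems for an already-parsed evals mapping."""
    if not isinstance(evals, dict):
        return ["top level is not a mapping"]
    problems = [
        f"missing field '{name}'"
        for name in REQUIRED_TOP_LEVEL
        if name not in evals
    ]
    if "assertions" in evals:
        assertions = evals["assertions"]
        if not isinstance(assertions, list) or not assertions:
            problems.append("'assertions' must be a non-empty list")
        else:
            for i, item in enumerate(assertions):
                if not isinstance(item, dict):
                    problems.append(f"assertions[{i}] is not a mapping")
                    continue
                for name in REQUIRED_ASSERTION:
                    if name not in item:
                        problems.append(
                            f"assertions[{i}] missing field '{name}'"
                        )
    return problems


def require_valid_evals(evals: object) -> None:
    """FAIL with the documented error prefix if any problem is found."""
    problems = validate_evals(evals)
    if problems:
        raise EvalRunError(
            "evals.yaml validation error: " + "; ".join(problems)
        )
```

Collecting every problem before failing, rather than stopping at the first, is a design choice of this sketch: it lets the `{details}` portion of the error report all missing fields in one pass.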
### Step 2 — Run `code` assertions
For each assertion where `type: code`:
1. Read the `check` field. Execute the corresponding mechanical verification:
| `check` | Verification method |
|---|---|