eval-result-interpreter
Install: `claude install-skill varunk130/AI-Eval-Skills`
## Purpose
This skill takes eval results - a Copilot Studio evaluation CSV file (the primary worked example), an export from your own evaluator/harness, a pasted summary, or a plain-English description of results - and produces a structured triage report. It is the final step in the eval lifecycle: plan → generate → run → **interpret**. The output tells you whether to ship, what broke, why it broke, and what to fix first.
> **Platform context.** All the operational examples below use Microsoft Copilot Studio because its evaluation CSV format is well-documented and concrete. The triage framework itself (4 layers, 3 root cause types, SHIP/ITERATE/BLOCK verdict) is platform-agnostic - point it at any evaluator output with per-case results and the same analysis applies.
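
As a rough illustration of the platform-agnostic claim above, here is a minimal Python sketch that turns per-case results into a SHIP/ITERATE/BLOCK verdict. Everything specific in it is an assumption made for illustration: the column names `case_id` and `result`, the `pass` value, and both thresholds are hypothetical. They are not the Copilot Studio CSV schema or the skill's actual cut-offs, and the sketch does not attempt the 4-layer or root-cause analysis.

```python
import csv
import io

# Hypothetical thresholds, chosen only for this example.
SHIP_THRESHOLD = 0.90   # at or above this pass rate: SHIP
BLOCK_THRESHOLD = 0.60  # below this pass rate: BLOCK


def summarize(csv_text, result_column="result", pass_value="pass"):
    """Return (pass_rate, verdict, failed_rows) for per-case eval results.

    The column name and pass value are assumptions; adapt them to the
    headers of your own evaluator export.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no eval cases found")

    # A case counts as failed when its result is anything other than pass_value.
    failed = [r for r in rows if r[result_column].strip().lower() != pass_value]
    pass_rate = 1 - len(failed) / len(rows)

    if pass_rate >= SHIP_THRESHOLD:
        verdict = "SHIP"
    elif pass_rate < BLOCK_THRESHOLD:
        verdict = "BLOCK"
    else:
        verdict = "ITERATE"
    return pass_rate, verdict, failed


# Made-up sample data in the assumed format.
SAMPLE = """case_id,result
c1,pass
c2,pass
c3,fail
c4,pass
"""

rate, verdict, failed = summarize(SAMPLE)
failed_ids = [r["case_id"] for r in failed]
print(f"pass rate {rate:.0%}, verdict {verdict}, failed cases {failed_ids}")
```

The failed rows are returned alongside the verdict because the report is meant to say what broke and what to fix first, not only whether to ship.
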
This skill serves **Stages 2-4** of the [MS Learn 4-stage evaluation framework](https://learn.microsoft.com/en-us/microsoft-copilot-studio/guidance/evaluation-checklist). In Stage 2 (Set Baseline & Iterate), it interprets your first eval results and guides fixes. In Stage 3 (Systematic Expansion), it identifies coverage gaps worth expanding into. In Stage 4 (Operationalize), it triages regression failures after agent updates. Use the [evaluation checklist template](https://github.com/microsoft/PowerPnPGuidanceHub/tree/main/guidance/agentevalguidancekit) to track which stage you are in and what to interpret next.
**Knowledge source:** This skill's analysis framework is grounded in **Microsoft's Triage & Improve