
mission-control-evals-observability

Design or run evaluation, tracing, callback, telemetry, and regression-observability workflows for Mission Control agents.
MN755/Codex-Mission_Control · ★ 1 · AI & Automation · score 69
Install: claude install-skill MN755/Codex-Mission_Control
# Mission Control Evals And Observability

## Purpose

Make agent quality measurable with evals, traces, evidence, and regression checks instead of vibes in a trench coat.

The Codex chat agent is not the Mission Control Manager. It is the bridge between the user and the Mission Control Manager.

## Use when

- The user wants evals or quality gates.
- Agent behavior needs traceability.
- A workflow needs regression tests or benchmark cases.

## Workflow

1. Ask Mission Control to identify success criteria and failure modes.
2. Define eval cases, expected outputs, evidence checks, and scoring.
3. Capture trace points for model calls, tools, approvals, and handoffs.
4. Summarize results and recommend gates.

## Mission Control calls

Tools:

- `mission_control_start_task`
- `mission_control_get_event_digest`

Resources:

- `mission-control://projects/{project_id}/validation-summary`
- `mission-control://projects/{project_id}/orchestrations/{orchestration_id}/events`
- `mission-control://projects/{project_id}/handoff`

## User-facing output

- Include eval cases, pass/fail status, trace coverage, regressions, and evidence gaps.

## Approval behavior

Ask before running costly model/API eval suites or uploading traces externally.

## Never do

- Do not call an eval meaningful without representative cases.
- Do not log secrets in traces.
- Do not reduce quality to a single opaque score.

## Failure and fallback

If automated evals are unavailable, produce a manual eval rubric and se
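
As a rough illustration of workflow step 2 (eval cases, expected outputs, evidence checks, and scoring), the following Python sketch shows one way such cases might be represented and scored. Everything in it is an assumption made for illustration: `EvalCase`, `EvalResult`, `score_case`, `summarize`, the exact-match comparison, and the sample data are hypothetical and are not part of Mission Control or its tools. The summary reports per-case status and evidence gaps instead of collapsing results into one number, in line with the "Never do" list.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """Hypothetical eval case: a prompt, the expected output, and required evidence."""
    case_id: str
    prompt: str
    expected: str
    required_evidence: list[str] = field(default_factory=list)


@dataclass
class EvalResult:
    """Outcome of one case: pass/fail plus any evidence that was missing."""
    case_id: str
    passed: bool
    evidence_gaps: list[str]


def score_case(case: EvalCase, actual: str, evidence: set[str]) -> EvalResult:
    """Pass only if the output matches exactly and all required evidence is present."""
    gaps = [name for name in case.required_evidence if name not in evidence]
    output_ok = actual.strip() == case.expected.strip()
    return EvalResult(case.case_id, output_ok and not gaps, gaps)


def summarize(results: list[EvalResult]) -> dict:
    """Report per-case status and evidence gaps rather than a single opaque score."""
    return {
        "passed": [r.case_id for r in results if r.passed],
        "failed": [r.case_id for r in results if not r.passed],
        "evidence_gaps": {r.case_id: r.evidence_gaps for r in results if r.evidence_gaps},
    }


# Hypothetical cases; the evidence names mirror the trace points in workflow step 3.
cases = [
    EvalCase("greet", "Say hello", "hello", ["model_call"]),
    EvalCase("handoff", "Hand off to reviewer", "handed off", ["model_call", "approval"]),
]

# Hypothetical agent outputs paired with the trace points observed for each case.
observed = {
    "greet": ("hello", {"model_call"}),
    "handoff": ("handed off", {"model_call"}),
}

results = [score_case(case, *observed[case.case_id]) for case in cases]
summary = summarize(results)
print(summary)
```

In this sample the `handoff` case fails even though its output matches, because the `approval` trace point was never observed. A real suite would likely replace the exact-match check with a comparison suited to the task.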