retro

Solid

Post-run retrospective: reads the .experiments/ JSONL log, computes Wilcoxon significance, detects dead iterations, flags suspicious jumps, and generates a next-hypothesis queue for the --hypothesis flag.

AI & Automation · 23 stars · 3 forks · Updated yesterday · Apache-2.0

Quality Score: 87/100

Stars (20%)
Recency (20%): 100
Frontmatter (20%)
Documentation (15%): 100
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

<objective>
Post-run retrospective analysis. After `/research:run` completes, reads `.experiments/state/<run-id>/experiments.jsonl`, computes statistical significance, detects dead iterations, flags suspicious metric jumps, generates learning summary with next-hypothesis queue.

NOT for:
- running experiments (use `/research:run`)
- designing experiments (use `/research:plan`)
- validating methodology (use `/research:judge`)
- verifying paper implementation (use `/research:verify`)
- comparing runs from different programs/goals; `--compare` valid only for same-program, same-metric runs

Read-only: never modifies code, commits, or experiment state.
</objective>

<workflow>

## Agent Resolution

**Agent resolution**: load and follow the protocol below. Contains: foundry check + fallback table. `research:scientist` in same plugin; no fallback needed if research plugin installed.

```bash
_RESEARCH_SHARED=$(python "${CLAUDE_PLUGIN_ROOT:-plugins/cc_research}/bin/resolve_shared.py" 2>/dev/null)  # timeout: 5000
[ -z "$_RESEARCH_SHARED" ] && { echo "! Plugin path resolution failed — ensure research plugin installed and CLAUDE_PLUGIN_ROOT set, or invoke /research:retro from project root."; exit 1; }
cat "$_RESEARCH_SHARED/agent-resolution.md"
```

## Retro Mode (Steps T1–T7)

Triggered by `retro`, `retro <run-id>`, or `retro <run-id> --compare <run-id-2>`.

**Defaults**: `--threshold 0.001`, `--alpha 0.05`.

**Unsupported flag check**: load and follow the protocol below. Supported flag...
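
The two detection checks named in the objective (dead iterations and suspicious metric jumps) might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the skill's actual implementation: the `metric` field name, the higher-is-better convention, the reading of `--threshold` as a minimum gain over the best value so far, and the `factor` rule for jumps are all hypothetical, since the excerpt does not show the log schema or the step definitions. The Wilcoxon test is left out because the excerpt does not say which variant is used or which samples are compared.

```python
import json
from statistics import median


def load_metrics(lines, key="metric"):
    """Parse JSONL lines and return each record's metric value, in file order.

    The field name `key` is hypothetical; the real log schema is not shown here.
    """
    return [float(json.loads(line)[key]) for line in lines if line.strip()]


def dead_iterations(values, threshold=0.001):
    """Indices whose gain over the best value seen so far is below `threshold`.

    Assumes higher is better; negate the values for a lower-is-better metric.
    """
    if not values:
        return []
    dead, best = [], values[0]
    for i, value in enumerate(values[1:], start=1):
        if value - best < threshold:
            dead.append(i)
        best = max(best, value)
    return dead


def suspicious_jumps(values, factor=5.0):
    """Indices whose step change exceeds `factor` times the median absolute step.

    `factor` is an assumed heuristic, not a documented flag of the skill.
    """
    steps = [b - a for a, b in zip(values, values[1:])]
    if not steps:
        return []
    typical = median(abs(s) for s in steps)
    if typical == 0:
        return []
    return [i + 1 for i, s in enumerate(steps) if abs(s) > factor * typical]


# Made-up sample records, for illustration only.
sample = [
    '{"iteration": 0, "metric": 0.700}',
    '{"iteration": 1, "metric": 0.710}',
    '{"iteration": 2, "metric": 0.7102}',
    '{"iteration": 3, "metric": 0.705}',
    '{"iteration": 4, "metric": 0.720}',
    '{"iteration": 5, "metric": 0.900}',
]
values = load_metrics(sample)
print(dead_iterations(values))   # [2, 3]
print(suspicious_jumps(values))  # [5]
```

Comparing each step against the median absolute step keeps a single large jump from inflating its own baseline, which a mean-based rule would allow.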

Details

Author: Borda
Repository: Borda/AI-Rig
Created: 5 months ago
Last Updated: yesterday
Language: Python
License: Apache-2.0

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

judge

Research-supervisor review of program.md — validates experimental methodology (hypothesis clarity, measurement validity, control adequacy, scope, strategy fit), emits APPROVED / NEEDS-REVISION / BLOCKED verdict before expensive run loop.

23 stars · Updated yesterday

Borda

AI & Automation Listed

retro

Use when a Claude Code session ends, a friction needs fixing, a reusable learning needs capturing, local memory needs promoting upward, or for cross-session audits — detect friction AND learnings and route each to the right destination. Triggers: /retro, 'retrospective', 'capture this learning', 'fix this skill', 'promote memory', 'audit'.

3 stars · Updated 2 days ago

netresearch

AI & Automation Solid

run

Sustained metric-improvement loop with atomic commits, auto-rollback, and experiment logging. Iterates with specialist agents, commits atomically, auto-rolls back on regression. Accepts a program.md file path. Supports --resume, --team, --colab, --codex, --researcher, --architect, --journal, --hypothesis.

23 stars · Updated yesterday

Borda