autoresearch

Scaffold and run Karpathy-style autoresearch loops in any git repo. This skill should be used when setting up autonomous code improvement, generating adversarial eval harnesses, running hypothesis-implement-eval-keep/discard loops, or checking autoresearch progress. Triggers on "autoresearch", "autonomous improvement", "eval loop", "hypothesis loop", "self-improvement loop".
tdimino/claude-code-minoan · ★ 32 · Code & Development · score 85

Install: claude install-skill tdimino/claude-code-minoan
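
As a rough illustration of the hypothesis-implement-eval-keep/discard loop named in the description, the sketch below simulates only the decision logic, in memory. The names `LoopState`, `decide`, and `run_loop` are hypothetical and do not come from the skill; the points where a real runner would commit or revert in git are marked with comments rather than executed.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class LoopState:
    """Hypothetical container: best score so far plus a log of every experiment."""
    best_score: float
    history: List[Tuple[str, float, str]] = field(default_factory=list)


def decide(best: float, candidate: float) -> str:
    """Binary keep/discard: keep only when the scalar score strictly improves."""
    return "keep" if candidate > best else "discard"


def run_loop(
    hypotheses: Iterable[str],
    evaluate: Callable[[str], float],
    baseline: float,
) -> LoopState:
    """Run one experiment per hypothesis and record the verdict for each."""
    state = LoopState(best_score=baseline)
    for hypothesis in hypotheses:
        # A real runner would implement the hypothesis and commit it here.
        score = evaluate(hypothesis)  # assumed: one bounded, offline eval call
        verdict = decide(state.best_score, score)
        if verdict == "keep":
            state.best_score = score
        # On "discard", a real runner would revert the commit at this point.
        state.history.append((hypothesis, score, verdict))
    return state


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    fake_scores = {"cache results": 0.70, "drop retries": 0.65, "tune ranker": 0.90}
    final = run_loop(fake_scores, fake_scores.__getitem__, baseline=0.69)
    for name, score, verdict in final.history:
        print(f"{verdict:8s} {score:.2f}  {name}")
```

Keeping the decision in a pure function such as `decide` makes the keep/discard rule easy to test separately from git and from the eval harness.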

<critical>

## Five Invariants (never violate)

1. **Single mutable surface**: one hypothesis per iteration, one change per experiment
2. **Fixed eval budget**: eval runs in bounded time, no network calls in gates
3. **One scalar metric**: composite score drives keep/discard, not vibes
4. **Binary keep/discard**: improved = keep, else revert with `git reset --hard HEAD~1`
5. **Git-as-memory**: every experiment is a commit, discards are reverts, history is the log

## Safety rules

- Never modify `.lab/` contents during hypothesis implementation
- Never skip eval: every commit must be evaluated before keep/discard
- Always revert on crash: an `atexit` handler restores git state
- Runner uses **subscription auth** (`claude -p` with `ANTHROPIC_API_KEY` stripped)

</critical>

# Autoresearch

Scaffold and run autonomous code improvement loops in any git repo. The pattern: generate a hypothesis via `claude -p`, implement it, run programmatic eval gates, keep if the composite score improves, discard if it doesn't. Proven across 50+ iterations on two codebases (shadow-engine: 0.69 to 1.0, perplexity-clone: search quality optimization).

## Category

**Runbooks**: mechanical process with clear steps, not cognitive reasoning.

## Quick Start

```
/autoresearch init     # scaffold .lab/ in your repo
/autoresearch run      # start the loop (default: 50 iterations)
/autoresearch status   # check progress
/autoresearch resume   # recover interrupted run
```

## Command D