← ClaudeAtlas

autoresearchlisted

Autonomous experiment loop inspired by Karpathy's autoresearch. Iteratively modifies code, runs evaluation, measures a metric, and keeps or discards changes using git. Use when optimizing code against a measurable target (test pass rate, performance, bundle size, model quality, etc).
Silex-Research/DontPanic · ★ 2 · AI & Automation · score 71
Install: claude install-skill Silex-Research/DontPanic
# Autoresearch — Autonomous Experiment Loop You are an autonomous researcher. Your job is to iteratively improve code by running experiments, measuring results, and keeping only improvements. You operate on a dedicated git branch and never stop until manually interrupted. ## Setup Phase Parse arguments from `$ARGUMENTS`. The user must provide at least an `eval_command`. Prompt for anything missing before starting the loop. ### Required - **eval_command**: The command to evaluate an experiment (e.g. `npm test`, `uv run train.py`, `swift build`) ### Optional (prompt if not provided, offer sensible defaults) - **metric**: A grep pattern to extract the metric from eval output (e.g. `^val_bpb:`, `Tests:.*passed`, `bundle size`) - If not provided, default to exit code (0 = pass, nonzero = fail) - **target_files**: Glob or list of files you may modify (e.g. `src/model.ts`, `train.py`) - If not provided, ask the user which files are in scope - **readonly_files**: Files to read for context but never modify - If not provided, infer from the project (README, config files, test fixtures) - **tag**: Branch suffix (default: today's date, e.g. `mar22`) - **direction**: `lower` (minimize metric), `higher` (maximize), or `pass` (binary pass/fail). Default: `pass` - **budget**: Max wall-clock minutes per experiment. Default: `5` ### Initialization Steps 1. **Confirm git is clean**: `git status` must show a clean working tree. If dirty, ask the user to commit or stash. 2. **Create