meta-harness

Run a Meta-Harness-style optimization loop NATIVELY — automatically search over the scaffolding around a FIXED base model (memory, retrieval, context construction, prompt templates, summarization, tool-selection logic) by proposing candidate variants, scoring each on a cheap deterministic eval, and keeping a Pareto frontier of quality vs cost — using native Agent / Workflow / loop tools instead of a standalone Python harness. Use this whenever the user wants to optimize, evolve, tune, distill, or search over a harness, scaffold, prompt system, memory or retrieval policy, context-assembly code, or summarizer while keeping the model fixed; whenever they mention Meta-Harness, harness optimization, scaffold evolution, automatic prompt/memory optimization, an evolutionary or Pareto search over candidate implementations, or "make the harness/agent better without retraining"; and whenever the gain must come from the code AROUND the model rather than the model weights. Reproduces the Meta-Harness paper's method nativ
001TMF/harness-forge · ★ 56 · AI & Automation · score 84

Install: claude install-skill 001TMF/harness-forge
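
The description above keeps a Pareto frontier of quality vs cost over scored candidates. As a rough illustration of that one step, here is a minimal Python sketch. The `Candidate` type, its field names, and the sample scores are hypothetical and not taken from the skill or the Stanford repo. It assumes each candidate has already been scored, with higher quality and lower cost being better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A scored harness variant (hypothetical structure, for illustration only)."""
    name: str
    quality: float  # higher is better
    cost: float     # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.quality >= b.quality and a.cost <= b.cost
    strictly_better = a.quality > b.quality or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(candidates: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates, in input order."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores, purely to exercise the function.
scored = [
    Candidate("baseline", quality=0.70, cost=100.0),
    Candidate("v1", quality=0.75, cost=80.0),
    Candidate("v2", quality=0.72, cost=120.0),
    Candidate("v3", quality=0.80, cost=150.0),
]
frontier = pareto_frontier(scored)
frontier_names = [c.name for c in frontier]
```

In this made-up data, `v1` dominates both `baseline` and `v2`, while `v3` survives because nothing matches its quality at a lower cost. The quadratic scan is fine at the small candidate counts a cheap eval loop typically produces; a real run would replace the hard-coded scores with the scorer's output.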

# Meta-Harness (native)

## What this is

**Meta-Harness optimizes the *harness*, not the model.** The harness is the code around a fixed base model that decides what to store, retrieve, compress, and show while the model works. You hold the model frozen and search over that scaffolding: propose candidate variants, score each on a cheap deterministic eval, keep a **Pareto frontier** (quality up, cost down), and iterate. The proposer is an LLM agent writing code; the inner loop is a cheap scorer.

The Stanford repo (`stanford-iris-lab/meta-harness`) ships a Python driver, `claude_wrapper.py` (~720 lines) plus `meta_harness.py` (~540 lines), that **reimplements an agent runtime to drive a headless Claude**: spawn a session, parse stream-json, track tool calls, log everything, loop. **You already are that runtime.** So you run the same loop with native tools (`Agent`, `Workflow`, `/loop`) and keep only the irreducible domain logic: a $0 scorer. The orchestration was never the hard part; your harness provides it.

This skill is the **method**, reusable for any harness-optimization task. A fully worked example (optimizing proteus's campaign-memory summarizer) lives at `~/mh-proteus/` and is walked through in `references/proteus-example.md`.

## When to use this

Strong fit when **several** of these hold (full criteria in `references/method.md`):

- The base model is **fixed** and the opportunity is better retrieval / memory / context / prompting / tool scaffolding. (This is the