harness-design

Solid

Design and build multi-agent harness architectures for long-running AI application development. GAN-inspired Generator-Evaluator pattern, Sprint Contract negotiation, context management, quality criteria calibration. Based on Anthropic Engineering patterns. Use when: "build a harness", "multi-agent architecture", "agent orchestration", "generator-evaluator", "long-running app", "harness design", "agent pipeline", "quality evaluation loop", "sprint contract", "build app with agents", "Claude Agent SDK architecture", or when building complex full-stack apps that need planning → generation → evaluation cycles. Also use when discussing context degradation, self-evaluation bias, or assumption testing in AI workflows.

AI & Automation 126 stars 19 forks Updated 2 days ago MIT

Quality Score: 89/100

- Stars (20%): 70
- Recency (20%): 100
- Frontmatter (20%): 70
- Documentation (15%): 100
- Issue Health (10%): 80
- License (10%): 100
- Description (5%): 100

Skill Content

# Multi-Agent Harness Design

Sources:

- Anthropic Engineering: "Harness design for long-running apps"
- OpenClaw-RL paper (arxiv 2603.10165): personal agent verification
- DenisSergeevitch/repo-task-proof-loop: execution protocol with durable proof

See also: `references/proof-loop-research.md` for details of the paper and the repo mapping.

## When you need a harness and when a solo agent is enough

| Signal | Solo agent | Harness |
|--------|-----------|---------|
| Scope | One feature, bug fix, refactor | Full-stack app, multi-feature product |
| Duration | < 30 min | 1-6+ hours |
| Quality | Baseline is enough | Needs polish, originality, craft |
| Cost | ~$5-15 | ~$100-200+ |
| Verification | Manual review | Automated evaluation + Playwright |

**Rule:** An Evaluator is justified when the task is **beyond reliable solo performance**. This is not a fixed yes/no; it depends on the complexity tier.

---

## Architecture: Three-Agent System

### 1. Planner

- Expands the user's 1-4 sentences into a **detailed specification**
- Sets an ambitious scope and looks for opportunities for AI features
- Does **NOT** over-specify the implementation: only the what, not the how
- Fits AI features into the product organically

### 2. Generator

- Implements features iteratively
- Runs a **self-evaluation** before handoff (but it is unreliable; see below)
- Works within the Sprint Contract

### 3. Evaluator

- **Independent** of the generator: separate context, separate prompt
- Validates through Playwright MCP: screenshots, nav...
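
The three roles above can be sketched as a simple control loop. This is a minimal illustration, not the skill's actual implementation: `run_harness`, `Verdict`, `max_rounds`, and the stub agents are all hypothetical names, and real agents would call a model and a browser tool instead of manipulating strings. The one design point taken from the text is that the evaluator receives only the specification and the artifact, never the generator's internal context.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical evaluator result: pass/fail plus feedback for the generator."""
    passed: bool
    feedback: str


def run_harness(
    brief: str,
    planner: Callable[[str], str],
    generator: Callable[[str, str], str],
    evaluator: Callable[[str, str], Verdict],
    max_rounds: int = 3,  # assumed round cap; the source does not specify one
) -> tuple[str, list[Verdict]]:
    """Plan once, then alternate generation and independent evaluation."""
    spec = planner(brief)  # Planner: short brief -> detailed spec
    feedback = ""
    artifact = ""
    history: list[Verdict] = []
    for _ in range(max_rounds):
        artifact = generator(spec, feedback)  # Generator: build or revise
        verdict = evaluator(spec, artifact)   # Evaluator: sees spec + artifact only
        history.append(verdict)
        if verdict.passed:
            break
        feedback = verdict.feedback           # feed the critique into the next round
    return artifact, history


# Stub agents for illustration only; real ones would call an LLM.
def stub_planner(brief: str) -> str:
    return f"SPEC: {brief}"


def stub_generator(spec: str, feedback: str) -> str:
    return spec + (" +fix" if feedback else "")


def stub_evaluator(spec: str, artifact: str) -> Verdict:
    return Verdict(passed="+fix" in artifact, feedback="missing fix")


if __name__ == "__main__":
    result, rounds = run_harness("todo app", stub_planner, stub_generator, stub_evaluator)
    print(result, [v.passed for v in rounds])
```

Passing the agents in as plain callables keeps the loop itself free of any model or SDK dependency, so each role can be swapped or tested in isolation.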

Details

- Author: AnastasiyaW
- Repository: AnastasiyaW/claude-code-config
- Created: 2 months ago
- Last Updated: 2 days ago
- Language: Python
- License: MIT
