create-skill-autoresearch

Factory skill that creates production-grade, benchmarked, autonomously improved, and verified agent skills. Orchestrates a 5-phase pipeline: interview the user to discover purpose and gold standards, research domain materials with parallel subagents, draft the skill with a design-first approach, invoke autoresearch to iterate against gold-standard-driven LLM-as-judge evaluation, and verify quality through multi-agent consensus with a devil's advocate. Use when building a new skill, creating a skill from existing materials, or upgrading a skill to production quality with benchmarking and autonomous improvement.
a-tokyo/agent-skills · ★ 11 · AI & Automation · score 76

Install: claude install-skill a-tokyo/agent-skills
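
The verification step named in the description above (multi-agent consensus with a devil's advocate) can be illustrated with a minimal Python sketch. It is not the factory's implementation. Every name here (`PanelInput`, `verify`, the toy judges), the median-based consensus rule, and the 0.8 ship threshold are assumptions made for illustration. The one property taken from the source is that the panel sees only the skill output, the gold standards, and the rubric.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class PanelInput:
    """Hypothetical container: the only data a panel member receives.

    It has no field for builder notes or iteration history, so the
    builder's context cannot leak into the panel by construction.
    """
    skill_output: str
    gold_standards: Tuple[str, ...]
    rubric: Tuple[str, ...]


# A judge maps the panel input to a score in [0, 1]. In a real system this
# would be an LLM call; here the judges are deterministic toy functions.
Judge = Callable[[PanelInput], float]


def rubric_coverage(p: PanelInput) -> float:
    """Fraction of rubric items mentioned in the skill output."""
    text = p.skill_output.lower()
    hits = sum(1 for item in p.rubric if item.lower() in text)
    return hits / len(p.rubric)


def devils_advocate(p: PanelInput) -> float:
    """Assumed adversarial judge: same signal, scored more harshly."""
    return max(0.0, rubric_coverage(p) - 0.25)


def non_empty(p: PanelInput) -> float:
    """Trivial sanity judge: passes any non-blank output."""
    return 1.0 if p.skill_output.strip() else 0.0


def verify(p: PanelInput, judges: Sequence[Judge],
           ship_threshold: float = 0.8) -> Tuple[str, float]:
    """Score with every judge and decide ship vs. iterate.

    Consensus is the median score (an assumption, not the source's rule).
    """
    consensus = median(judge(p) for judge in judges)
    decision = "ship" if consensus >= ship_threshold else "iterate"
    return decision, consensus


panel = (rubric_coverage, devils_advocate, non_empty)
candidate = PanelInput(
    skill_output="Covers setup and testing.",
    gold_standards=("reference skill A",),
    rubric=("setup", "testing"),
)
decision, consensus = verify(candidate, panel)
print(decision, consensus)
```

Passing a frozen, narrowly typed object to the judges is one way to make the isolation constraint structural rather than a matter of discipline: a judge cannot read what the input type does not carry.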

# Create Skill via Autoresearch Factory

A factory for forging production-grade agent skills through gold-standard-driven autoresearch, multi-agent verification, and structured consensus.

The factory orchestrates 4 agent roles through 5 phases:

| Phase | What Happens | Agent Role |
|-------|-------------|------------|
| 1. Interview | Discover purpose, gold standards, scope | ORCHESTRATOR |
| 2. Research | Study domain materials, build dossier, propose rubric | RESEARCHER (N parallel) |
| 3. Draft | Design structure, generate SKILL.md, measure baseline | BUILDER |
| 4. Autoresearch | Iterate skill against gold standards (LLM-as-judge, or an objective real-world metric for procedural skills; see 3.4) | BUILDER + autoresearch skill |
| 5. Verify | Premortem, panel scoring, consensus, ship/iterate | PANEL (3 subagents) |

Key constraint: BUILDER and PANEL never share context. Panel receives only the skill output, gold standards, and rubric -- no bias from the building process.

## Relation to create-skill

This factory **extends** the official single-pass skill creators (Anthropic's Skills best-practices and `skill-creator`; Cursor's `create-skill`) rather than replacing them. It adds what a one-shot generator cannot: a research dossier, gold-standard benchmarking, an autonomous improvement loop, and independent multi-agent verification. The skills it produces follow the same official conventions -- see [references/skill-authoring-best-practices.md](references/skill-authoring