lythoskill-arena

Test play for skills and deck configurations. DEFAULT: the agent reads the config, spawns parallel subagents via the native Agent tool, and judges their outputs. Single-deck tests AND multi-deck A/B comparisons both run agent-orchestrated (no CLI runner). Cross-player comparison (kimi vs codex) is the ONLY case that needs the CLI runner. Always restores the parent deck. No install, no working-set pollution, no deck overwrite. Subagent-friendly: resumes interrupted runs from saved state. CRITICAL: experiments run in `/tmp`, never in committed directories. A subagent inherits the parent's CWD, so the prompt must set workDir explicitly.
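The workDir rule above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `makeArenaWorkDir` and `buildSubagentPrompt` are hypothetical helper names, the prompt wording is invented for the example, and `os.tmpdir()` stands in for `/tmp` (it resolves to `/tmp` on most Linux systems but may differ elsewhere).

```typescript
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not part of the skill): create a fresh scratch
// directory under the OS temp dir so no experiment touches the repository.
export function makeArenaWorkDir(label: string): string {
  return mkdtempSync(join(tmpdir(), `arena-${label}-`));
}

// Hypothetical helper: build a subagent prompt that states the workDir
// explicitly, because a subagent would otherwise inherit the parent's CWD.
export function buildSubagentPrompt(task: string, workDir: string): string {
  return [
    `Working directory: ${workDir}`,
    `Change into ${workDir} before doing anything else.`,
    `Write every artifact inside it and touch nothing outside it.`,
    ``,
    `Task: ${task}`,
  ].join("\n");
}
```

Each deck under test would get its own directory, for example `buildSubagentPrompt("summarize the README", makeArenaWorkDir("deck-a"))`, so parallel subagents cannot overwrite each other's artifacts.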
lythos-labs/lythoskill · ★ 2 · AI & Automation · score 81

Install: claude install-skill lythos-labs/lythoskill

# Skill Arena

> Test play for skills and deck configurations. Not "which is best" — "which is best for what."

## Decision Tree (READ FIRST)

```
User says: "test/compare/arena/benchmark/A vs B"
│
├── Cross-PLAYER? (kimi vs codex vs claude)
│   OR user explicitly says useAgent/specific player
│   OR platform doesn't support Agent tool subagents
│   → CLI runner REQUIRED (useAgent → Bun.spawn)
│   → bunx @lythos/skill-arena vs --config arena.toml
│   → Each side spawns its player CLI process
│
└── Same player, different DECKS? (DEFAULT)
    → Agent-orchestrated — NO CLI
    → YOU spawn subagents via Agent tool
    → CLI prepare-workdir + CLI archive + parallel dispatch
    → Judge subagent collects + scores
```

## Default: Agent-Orchestrated (single & cross-deck vs)

**This is how arena works 95% of the time.**

The agent and CLI operate as a two-way control transfer protocol. Agent delegates mechanical invariants to CLI. CLI hands control back via its exit paths (success → next step; error → fix command). Agent stays in its own main loop — the subagent pattern is container spawn, not external RPC.

```mermaid
flowchart TD
    A["🤖 Agent: parse request"] --> B{Cross-PLAYER?}
    B -->|Yes| C[🔧 CLI vs --config]
    B -->|No — DEFAULT| D["🤖 Agent → 🔧 CLI: prepare-workdir"]
    D -->|"✅ workdir ready"| E["🤖 Agent: spawn subagents"]
    E --> F["🤖 Subagents: execute + write artifacts"]
    F --> G["🤖 Agent: c