battle

Benchmark AI model combinations on the same coding task. Runs Claude, Antigravity CLI (agy — replaces the sunset Gemini CLI on consumer plans as of 2026-06-18), Codex CLI, Kimi (Moonshot API), and DeepSeek (OpenAI-compatible API) solo and in pairs, then scores each on tokens, cost, wall time, and output quality. Produces a leaderboard to inform future model selection. Use when asked to "battle", "benchmark models", "compare models", "which model is best", "pair programming battle", or "/battle".
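The leaderboard step described above can be sketched as follows. This is a minimal illustration and not the skill's actual implementation: the `RunResult` fields, the 0-10 quality scale, and the ranking rule (quality first, then cost, then wall time) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One contestant's measurements for a single task (hypothetical schema)."""
    contestant: str      # contestant ID, e.g. a solo or pair label
    tokens: int          # total tokens consumed
    cost_usd: float      # estimated spend in US dollars
    wall_seconds: float  # elapsed wall-clock time
    quality: float       # assumed 0-10 rubric score


def rank(results):
    """Order results best-first: higher quality, then lower cost, then less time."""
    return sorted(results, key=lambda r: (-r.quality, r.cost_usd, r.wall_seconds))


def render_leaderboard(results):
    """Render the ranked results as a Markdown table."""
    lines = [
        "| Rank | ID | Quality | Cost (USD) | Time (s) | Tokens |",
        "|------|----|---------|------------|----------|--------|",
    ]
    for position, r in enumerate(rank(results), start=1):
        lines.append(
            f"| {position} | {r.contestant} | {r.quality:.1f} | "
            f"{r.cost_usd:.2f} | {r.wall_seconds:.0f} | {r.tokens} |"
        )
    return "\n".join(lines)
```

Calling `render_leaderboard` on a list of `RunResult` objects returns the table as a string. A real scoring scheme would likely weight the four metrics rather than sort on them lexicographically; the tie-break order here is only one plausible choice.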
eprouveze/claude-skills · ★ 0 · AI & Automation · score 78

Install: claude install-skill eprouveze/claude-skills

# /battle: Pair-Programming Model Battle

Run the same coding task across different AI model combinations and score each on **time**, **tokens**, **cost**, and **quality** to find the optimal setup.

## When to Use

- User says "battle", "benchmark", "compare models", "which model combo is best"
- Before committing to a model strategy for a large project
- To validate whether multi-LLM delegation actually outperforms single-model
- Manual: `/battle <task description>`

## Contestants

### Solo Runs

| ID | Label | How It Runs |
|----|-------|-------------|
| C | **Claude Solo** | Claude writes code directly (native tools), Opus 4.7 |
| G | **Gemini Solo** | Antigravity CLI (`agy`) generates code, gemini-2.5-pro (model name unchanged) |
| X | **Codex Solo (gpt-5.5)** | Codex CLI default: `codex exec --model gpt-5.5` |
| X5 | **Codex Solo (gpt-5.5)** | Codex CLI premium: `codex exec --model gpt-5.5` |
| K | **Kimi Solo** | Moonshot K2.6 via API curl; needs `KIMI_API_KEY` |
| D | **DeepSeek Solo** | DeepSeek-v4-flash via OpenAI-compatible API; needs `DEEPSEEK_API_KEY` |

### Delegation Combos (from `/codex-write` pattern)

| ID | Label | How It Runs |
|----|-------|-------------|
| CG | **Claude + Gemini** | Claude architects, Gemini generates, Claude reviews |
| CX | **Claude + Codex** | Claude architects, Codex generates, Claude reviews |
| GX | **Gemini + Codex** | Gemini generates, Codex reviews |
| CGX | **All Three** | Claude architects, Gemini + Codex generate