← ClaudeAtlas

shipkit-semantic-qalisted

Semantic QA — define inputs/criteria, generate test scripts, Claude judges outputs or screenshots against criteria. Triggers: 'semantic qa', 'quality check', 'visual qa', 'judge outputs', 'QA suite'.
stefan-stepzero/shipkit · ★ 1 · Testing & QA · score 78
Install: claude install-skill stefan-stepzero/shipkit
# shipkit-semantic-qa - Semantic Quality Assurance **Purpose**: Define test inputs and quality criteria, generate test scripts, run them, and let Claude semantically judge outputs (API responses or UI screenshots) against human-defined criteria. **Pattern**: One skill, one loop — Setup → Run → Judge. Two suite types: backend (API/LLM pipeline) and frontend (visual components). --- ## When to Invoke **User triggers:** - "Semantic QA", "Set up QA", "Quality check" - "Visual QA", "Screenshot QA", "Check my UI" - "Judge outputs", "Run QA suite", "Check quality" - "Set up quality criteria", "Define test inputs" **Workflow position:** - After features are implemented (something to test) - Before verify/preflight (catches quality issues early) - Can run standalone against any API or UI --- ## Prerequisites **Required:** None (Setup mode creates everything) **Helpful:** - `.shipkit/stack.json` — Tech stack informs script generation - `.shipkit/specs/` — Acceptance criteria can seed quality criteria - Playwright installed (for frontend suites only) --- ## Process ### Completion Tracking In `--full` mode (all 3 phases sequential), create tasks at the start: - `TaskCreate`: "Setup: Define criteria + generate test script" - `TaskCreate`: "Run: Execute tests + verify output count" - `TaskCreate`: "Judge: Evaluate ALL outputs against ALL criteria" - `TaskCreate`: "Write judgment.md + judgment.json" `TaskUpdate` each task to `in_progress` when starting it, `completed` when do