bakw00ds

a11y-scan

Run axe-core / Pa11y / Lighthouse against changed pages and classify findings by WCAG level (A / AA / AAA)

about-page-scaffold

Drop in a 3-paragraph "what is this?" page for new users, per frontend framework; optionally collapses into the architecture page when both standards are enabled

adr-write

Scaffold a new Architecture Decision Record in Michael Nygard format (Context / Decision / Consequences) and update the ADR index

agent-audit

Scan dispatch-log for misuse patterns — lead-did-specialist-work, wrong-runtime, budget-violations, repeated-failures

api-diff

Diff an OpenAPI spec across commits and classify each change as major/minor/patch per SemVer-for-APIs

architecture-viz-scaffold

Generate a static architecture page that reads ADRs, Mermaid diagrams, CHANGELOG, SBOM, tech-debt log into one navigable UI; per frontend stack

Web & Frontend Listed

changelog-ui-scaffold

Drop in a UI component / page that surfaces the project's CHANGELOG.md and current version, per frontend framework

cve-triage

Pull CVEs against the current dependency set (osv.dev / GHSA) and classify each as exploitable / theoretical / not-applicable

DevOps & Infrastructure Listed

deploy-check

Pre-deploy verification — build, smoke, env-var sanity

design-tokens-audit

Scan codebase for hardcoded values that should be design tokens (colors, spacing, font-size, radius) and surface drift against the registry

dispatch-as-project-agent

Spawn a generic Agent wearing a project-specific agent's discipline (workaround for v0.1 runtime not discovering project agents)

evidence-based-debugging

Constrain debugging to cite runtime evidence before proposing fixes

feedback-scaffold

Drop in the user-feedback subsystem (DB migration + API handlers + UI submit widget + view-my-feedback page + CHANGELOG citation wiring) per project stack

flake-quarantine

Auto-tag tests that flaked >N times in M runs, move them to a quarantine suite, and cap how long they stay there

gather-feedback

Pull and triage user feedback from configured sources

hallucination-check

Citation-grounding verifier — for each claim in an LLM response, confirm support in the retrieved context, report ungrounded claims

hashed-edit

Hash-anchored line edits — refuse the edit if the target line content changed since the read

hook-bypass-review

Audit work/current/hook-bypass.md for stale entries, expired bypasses, and patterns suggesting a hook needs tuning

i18n-audit

Scan for hardcoded UI strings, missing translation keys, RTL layout breakages, untested locales, and date/number/currency hardcoding

interaction-patterns

Heuristic eval against Nielsen's 10 plus a WCAG 2.2 first-pass for keyboard, focus, contrast, and motion

iterate-until

Loop a work-then-verify cycle until a human-checkable verifier passes, with a hard iteration cap and an audit trail

kanban-tend

Maintain the project kanban — add tasks at dispatch, move on lifecycle events, archive completed work at release time

license-audit

Scan the dependency tree for license-policy violations — copyleft in proprietary, unknown licenses, license downgrades

llm-output-gate

CI hook that refuses to ship if prompt-eval golden set regresses past threshold or prompt-injection-test fails on HIGH severity

local-llm

Hand off bulk transformation work to a local LLM via Ollama

DevOps & Infrastructure Listed

logging-scaffold

One-shot scaffold dropping in a structured logger for the project's primary backend language; wires JSON output, sensible defaults, example call sites

mcp-as-agent

Wrap an MCP server as a yakOS agent so tool-side and LLM-side specialists share the same dispatch surface

mockup-review

Structured review of a mockup — info hierarchy, interaction states, responsive behavior, copy slots, design-token usage

monitor-scaffold

Drop in supervisor config + /healthz endpoint + restart runbook for each service in profile.monitors.targets, per supervisor (systemd / pm2 / k8s / docker-compose)

peer-handoff

Hand work to another developer cleanly in multi-dev co-pilot mode

peer-sync

Read recent peer activity for a yakOS multi-dev session; summarize for the lead at session-start time

perf-budget-check

Compare Core Web Vitals (LCP/INP/CLS) and bundle size against the project's perf budget; fail the PR if regressed

persona-write

Scaffold a user persona document with required fields — demographics, goals, frustrations, behaviors, JTBD, devices, accessibility considerations

phase-complete

Verify a phase's exit criteria before declaring it done

postmortem-write

Produce a blameless postmortem — UTC timeline, impact, root cause, what worked / didn't, action items with owners and due dates

pre-commit

Pre-commit verification — lint, secret scan, fast tests

project-init

Bootstrap a new project's .claude/ config; complement to `yakos init`

promote-branch

Promote code between environments (dev → test → prod) via PR with structured changelog and gated by the promotion-aware pre-push hook

prompt-eval

Run a prompt against a golden dataset and report per-rubric regression vs the prior pinned version, not just an aggregate score

prompt-injection-test

Run an OWASP LLM01 injection corpus against the system prompt + tool surface and report which payloads succeeded

Pre-release audit orchestrator — runs advisory multi-domain audits (Security, Code Quality, UI/UX, Docs, Performance, Regulated-Data, Mobile, Infra) across the project, produces versioned per-domain findings reports plus an executive summary with a lead-review prompt for human disposition. Two modes (audit / audit+remediate). Stack-aware — Phase 0 detects back-end / front-end / mobile profiles and dispatches only the relevant playbooks. Use whenever the user asks for a pre-release audit, release-readiness review, production-readiness check, security/QA/code-quality/doc/performance/regulated-data audit, compliance gate, or a "ship checklist" / "go-live review."

API & Backend Listed

runbook-author

Given an alert definition (or a postmortem), produce a runbook entry — symptom, first 5 actions, escalation

sbom-generate

Emit a CycloneDX 1.6 or SPDX 3.0 SBOM for the project's locked dependency set, suitable for EO 14028 / enterprise procurement

session-recovery

Reconstruct working context after a session interrupt or fresh shell

Data & Documents Listed

session-summary

End-of-session markdown report — agents dispatched, total cost, key decisions, time-to-completion

cost-summary

Roll up yakos dispatch-log into a per-runtime / per-agent / per-day cost summary, optionally posting to a webhook

finops-review

Analyze the dispatch-log for per-feature spend, cache hit rate, and model routing — surface optimization opportunities

runtime-pick

Given a task, recommend which yakOS agent + runtime to dispatch to (claude / codex / gemini)

plan-quality-eval

Score work/current/plan.md against a 6-dimension rubric using a 3-judge cross-vendor LLM panel; write a structured record to ~/.yakos-state/plan-quality-log.ndjson