Testing & QA
53 curated skills in this category
golang-benchmark
Golang benchmarking, profiling, and performance measurement. Use when writing, running, or comparing Go benchmarks, profiling hot paths with pprof, interpreting CPU/memory/trace profiles, analyzing results with benchstat, setting up CI benchmark regression detection, or investigating production performance with Prometheus runtime metrics. Also use when the developer needs deep analysis on a specific performance indicator - this skill provides the measurement methodology, while golang-performance provides the optimization patterns.
nw-abr-critique-dimensions
Review dimensions for validating agent quality - template compliance, safety, testing, and priority validation
exp-dotnet-test-frameworks
Reference data for .NET test framework detection patterns, assertion APIs, skip annotations, setup/teardown methods, and common test smell indicators across MSTest, xUnit, NUnit, and TUnit. DO NOT USE directly — loaded by test analysis skills (exp-test-smell-detection, exp-assertion-quality, exp-test-maintainability, exp-test-tagging) when they need framework-specific lookup tables.
dotnet-test-frameworks
Reference data for .NET test framework detection patterns, assertion APIs, skip annotations, setup/teardown methods, and common test smell indicators across MSTest, xUnit, NUnit, and TUnit. DO NOT USE directly — loaded by test analysis skills (test-anti-patterns, exp-test-smell-detection, exp-assertion-quality, exp-test-maintainability, exp-test-tagging) when they need framework-specific lookup tables.
browser-testing-with-devtools
Tests in real browsers. Use when building or debugging anything that runs in a browser. Use when you need to inspect the DOM, capture console errors, analyze network requests, profile performance, or verify visual output with real runtime data via Chrome DevTools MCP.
ab-test-analysis
Analyze A/B test results with statistical significance, sample size validation, confidence intervals, and ship/extend/stop recommendations. Use when evaluating experiment results, checking if a test reached significance, interpreting split test data, or deciding whether to ship a variant.
e2e-testing
AI-powered E2E testing for any app — Flutter, React Native, iOS, Android, Electron, Tauri, KMP, .NET MAUI. Test 8 platforms with natural language through MCP. No test code needed. Just describe what to test and the agent sees screenshots, taps elements, enters text, scrolls, and verifies UI state automatically.
openspec-ff-change
Fast-forward through OpenSpec artifact creation. Use when the user wants to quickly create all artifacts needed for implementation without stepping through each one individually.
playwright-skill
Battle-tested Playwright patterns for writing, debugging, and scaling reliable test suites. Use when you need guidance for E2E, API, component, visual, accessibility, or security testing, plus CI/CD, CLI automation, page objects, and migration from Cypress or Selenium. TypeScript and JavaScript.
cherry-pr-test
Test Cherry Studio PRs by checking out the branch, launching the Electron app in debug mode, and running interactive UI tests via CDP.
golang-testing
Go testing patterns, including table-driven tests, subtests, benchmarks, fuzzing, and test coverage. Follows a TDD methodology with idiomatic Go practices.
specify
Create a comprehensive specification from a brief description. Manages specification workflow including directory creation, README tracking, and phase transitions.
angular-testing
Write Angular component tests using TestBed, ComponentHarness, and HttpTestingController with proper signal input handling. Use when writing component tests, mocking HTTP calls, or testing signal inputs. (triggers: **/*.spec.ts, TestBed, ComponentFixture, TestHarness, provideHttpClientTesting)
playwright-core
Battle-tested Playwright patterns for writing and debugging reliable E2E, API, component, visual, accessibility, and security tests. Use when you need locator strategy, assertions, fixtures, network mocking, auth flows, trace debugging, or framework recipes for React, Next.js, Vue, and Angular. TypeScript and JavaScript.
test-harness-auditor
This skill should be used when auditing a repo's test, lint, type-check, static analysis, build, and debug infrastructure for AI coding agents. Use when entering a new repo, when asked to 'audit tests', 'audit harness', 'check test infrastructure', 'lint audit', 'what testing tools are configured', or when a repo has no .claude/lint-rules.json. Generates optimized configs for the lint-on-write hook.
code-spec-backfill
Backfill function-level contracts (docstrings, type annotations) where missing. Report unresolvable gaps with misuse scenarios. Incremental by default (state-driven).
dev-tpu-ray
Use the legacy `scripts/ray/dev_tpu.py` workflow to allocate a temporary Ray-backed TPU VM for fast debugging, testing, and benchmark iteration. Use only when you specifically need the Ray-backed dev TPU path.
swift-testing-expert
Expert guidance for Swift Testing: test structure, #expect/#require macros, traits and tags, parameterized tests, test plans, parallel execution, async waiting patterns, and XCTest migration. Use when writing new Swift tests, modernizing XCTest suites, debugging flaky tests, or improving test quality and maintainability in Apple-platform or Swift server projects.
crap-analyzer
Use to produce a risk-based refactor + test plan for recently-changed code on a diff/branch/PR by computing CRAP (complexity × untested) on changed methods. Multi-language — TypeScript, JavaScript, Python, Java, Kotlin, Go, Ruby, C#, Rust, PHP — auto-discovers how the repo generates coverage. Triggers — "/crap-analyzer", "analyze CRAP", "compute CRAP", "find risky methods", "find complex untested methods".
arch-check
Use to check a feature's code against the charter's architecture rules — dependency layering, cycles, forbidden patterns, file naming, file size. Triggers — "/engineer.arch-check", "architecture check", "check architecture fitness", "does this follow the charter", "check layering".
curate-tests
Phase 2: Generate and discover tests, validate against real library. Only invoke when explicitly requested by the user or by the yoink orchestrator.
playwright-skill
Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing.
secret-detection--prevention
Automated detection and prevention of leaked secrets, API keys, passwords, and tokens in code using tools like gitleaks, trufflehog, and pre-commit hooks.
speckit-analyze
Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
speckit-checklist
Generate a custom checklist for the current feature based on user requirements.
eslint-fixes
Resolve specific ESLint errors and warnings that appear in this project. Use when fixing lint failures, ESLint reported issues, or autofix conflicts (e.g. no-void, canonical/export-specifier-newline vs prettier, no-shadow trailing underscores, sonarjs/deprecation, you-dont-need-lodash-underscore, testing-library/prefer-screen-queries, testing-library/await-async-events, jest-dom/prefer-*).
speckit-implement
Execute the implementation plan by processing all tasks defined in tasks.md.
property-based-testing
Use when writing tests for serialization, validation, normalization, or pure functions - provides property catalog, pattern detection, and library reference for property-based testing
property-based-testing
Use when writing tests for serialization, validation, normalization, or pure functions - provides property catalog, pattern detection, and library reference for property-based testing
sddp-projectplan
Create or refine the canonical project-level Project Implementation Plan (`specs/project-plan.md`)
raam-audit
Audit mobile applications (iOS and Android) against RAAM 1.1 (Luxembourg Mobile Accessibility Assessment Framework). Use when reviewing existing mobile app code for accessibility compliance, generating audit reports, checking conformance levels, or preparing for Luxembourg accessibility certification. Covers all 15 themes with platform-specific test procedures. Default target: Level AA.
tdd
Use for every coding task. Enforce strict TDD workflow: activate Serena, investigate first, clarify+confirm requirements, write per-task REQUIREMENTS.md in .requirements/<datetime>_<feature_name>/, verify APIs via web search, then implement in tiny test-verified steps.
fec-component-testing
Use when authoring or reviewing frontend unit, component, or light integration tests close to UI code, including React Testing Library, Vue Test Utils, hooks/composables, props/emits, callbacks, accessible queries, user-event interactions, mocks, loading/error/empty states, and regression coverage. For layer planning, real-browser journeys, or existing validation failures, choose the matching testing or validation workflow first; Chinese triggers include 组件测试, 组件单测, 单元测试, 轻量集成测试.
migrate-to-rstest
Migrate Jest or Vitest test suites and configs to Rstest. Use when asked to move from Jest/Vitest to Rstest, replace framework APIs with `@rstest/core`, translate test config to `rstest.config.ts`, or update test scripts and setup files for Rstest equivalents.
misc
Capture private, user-specific conventions and one-off workflows that are not intended to be shared as standalone skills. Use when applying personal rules, preferences, or formatting conventions, especially when handling Google Maps links, `maps.app.goo.gl` shared URLs, redirect expansion, or when writing map URLs into Markdown or other durable text. Also use when running small local Python utilities for testing, validation, verification, or fixups, especially when `uv` is available and preferable to relying on the system Python environment.
typescript-strictest
TypeScript Strictest Standards
general-best-practices
General software development best practices covering code quality, testing, security, performance, and maintainability across technology stacks
solveit
Read, edit, and execute SolveIt dialogs (notebooks) via CLI. Add code/note/prompt cells, run them, and inspect outputs.
solveit
Read, edit, and execute SolveIt dialogs (notebooks) via CLI. Add code/note/prompt cells, run them, and inspect outputs.
integration-test
Integration test for repowire peer-to-peer messaging. Supports claude-code, opencode, or mixed-agent-type testing with circle boundaries and cross-agent-type communication. Can run all modes in parallel via agent teams.
openspec-apply-change
Implement tasks from an OpenSpec change. Use when the user wants to start implementing, continue implementation, or work through tasks.
openspec-archive-change
Archive a completed change in the experimental workflow. Use when the user wants to finalize and archive a change after implementation is complete.
perseus-logic
Business logic, race conditions, and AI security analysis
playwright-test-diagnosis
Analyze Playwright test results
spec-flow
Spec-driven development workflow. Interactive phase-by-phase confirmation from proposal to implementation. Trigger: 'spec-flow', 'spec mode', 'need a plan', 'structured development', 'write a spec', 'feature spec', 'technical spec', '需求文档', '技术方案', '任务拆解', '规格驱动', '写个方案', '做个规划', '结构化开发', 'plan this feature', 'break this down', 'design doc'. Creates .spec-flow/ directory with proposal, requirements, design, and tasks.
spec-creator
Turn a feature request into implementation-ready spec files for a coding agent. Use whenever the user wants to plan, scope, or write a spec for a feature about to be built — "write a spec for X", "plan feature Y", "how should we build Z", "draft an implementation plan". Produces a parent epic plus one file per independently-buildable slice. Prefer over write-spec when the audience is a coding agent, not an exec. Auto-applies the KeeForge overlay when the repo looks like KeeForge.
qa
Use when completing any task (final validation step), running audits, preparing for deployment, or when ESLint/TypeScript/build errors occur.
cqa-00b-directory-structure
Validates directory naming conventions for titles, assemblies, and modules. Use when restructuring content or adding new directories.
finclaw
AI-native quantitative finance toolkit for OpenClaw. Use when: querying stock/crypto prices, running backtests, scanning stocks (US + China A-shares + Crypto), generating trading strategies from natural language, detecting market regimes, or checking backtest plausibility. Triggers: stock quote, backtest, trading strategy, market analysis, RSI, MACD, portfolio, paper trading, regime detection, A-share scanner, crypto signals, DeFi yields. NOT for: general math, non-financial data analysis, or web scraping.
unit-testing
Unit testing patterns: Vitest config with v8 coverage, Testing Library behavior testing, MSW for HTTP mocking (vs jest.mock), it.each parametrized tests, spies vs mocks vs stubs, testing async code, snapshot testing guidelines. Use when writing unit and component tests.
dbt-coder
dbt (data build tool) patterns for model organization, incremental strategies, and testing.
e2e
Activate for any work in the tests/e2e/ directory: creating or editing test files (tests/*.test.ts), page objects (pages/), helpers (helpers/), or vitest config. Enforces agent-browser conventions specific to this project.
testing-tauri-apps
Guides developers through testing Tauri applications including unit testing with mock runtime, mocking Tauri APIs, WebDriver end-to-end testing with Selenium and WebdriverIO, and CI integration with GitHub Actions.