← ClaudeAtlas

ab-test-designlisted

Use before running any experiment that will inform a product decision. All six elements — hypothesis, metric, sample size, duration, stopping rule, and decision rule — must be defined before the test starts. Blocks "we'll stop when we see something" completions.
RBraga01/builder-product · ★ 2 · AI & Automation · score 75
Install: claude install-skill RBraga01/builder-product
# A/B Test Design ## The Law ``` A TEST WITHOUT A STOPPING RULE RUNS UNTIL THE RESULT LOOKS GOOD. "We'll stop when we see something significant" is peeking — it inflates false positive rates until the next meaningless spike ships to production. Hypothesis + metric + sample size + duration + stopping rule + decision rule IS an experiment. ``` ## When to Use Trigger before: - Running any A/B or multivariate test that will produce a ship/no-ship decision - Adding feature flag logic that will split users into groups - Launching any "let's try this and see what happens" experiment ## When NOT to Use - Observational studies with no control group (not an A/B test — use a different analysis method) - Rollouts to < 1% of users as a smoke test (no statistical inference intended — label it as such) - Shadow mode tests where both groups see the same experience (monitoring, not experimentation) ## The Six Required Elements ### 1 — Hypothesis One testable claim in the form: If [change], then [metric] will [increase/decrease] by [magnitude] because [mechanism]. ``` If we [specific change to control], then [primary metric] will [direction] by [minimum detectable effect], because [causal mechanism]. ``` **What does NOT count:** - "We think this will improve conversion" — no direction, no magnitude, no mechanism - "Let's test it and see" — not a hypothesis; a hypothesis is a prediction, not an openness to observation The mechanism matters. A hypothesis without a mechanism cannot be