experiment-design

Solid

A discipline for designing experiments (A/B tests, multivariate tests, holdouts) so the results actually answer the question you asked. Covers hypothesis writing, sample size, duration, segment analysis, interpretation, decision-making, and the common failure modes that produce confidently wrong shipping decisions.

AI & Automation · 280 stars · 37 forks · Updated 2 days ago · MIT

Quality Score: 94/100

Stars (20%): 82
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 80
License (10%): 100
Description (5%): 100

Skill Content

# Experiment Design

A senior product manager's playbook for running experiments that produce trustworthy decisions.

The default state of experimentation in most companies is sloppy. PMs run tests against vague hypotheses, look at results too early, ignore guardrails, stratify into noise, and ship features whose lift is mostly measurement error. The cost is real: ship the wrong thing, kill the right thing, learn the wrong lesson, repeat.

This skill is the discipline that prevents most of those mistakes. It assumes you have a working experimentation platform (Statsig, PostHog, GrowthBook, Optimizely, Amplitude, Eppo, Kameleoon; the platform does not matter for the principles). It assumes you have product-design and engineering pipelines that can deliver real treatment changes. The hard part is the thinking, and that is what is here.

When to use this skill: any time you are about to design or interpret an experiment. Read the relevant section before you start, not after the test is running.

---

## What this skill covers

The skill spans the full experiment lifecycle. Pre-experiment readiness (is this thing even worth testing). Hypothesis design (cause, effect, magnitude, mechanism). Sample size and minimum detectable effect (do you have enough traffic to learn anything). Duration (how long is long enough, when does the cycle bias the result). Running discipline (no peeking, guardrails, sequential testing). Interpretation (the three buckets and the inconclusive case). Decis...
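The sample size, minimum detectable effect, and duration questions named above can be made concrete with a short calculation. What follows is a minimal sketch, not the skill's own method. It assumes a two-arm test on a conversion rate, a two-sided z-test with the standard normal approximation, and a weekly usage cycle, so duration is rounded up to whole weeks. The function names `sample_size_per_arm` and `days_needed` are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect a relative lift.

    Assumption: two-sided z-test on two proportions, equal allocation.
    baseline_rate is the control conversion rate (e.g. 0.10),
    relative_mde is the smallest relative lift worth detecting (e.g. 0.10).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("relative_mde must be non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-arm variances
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def days_needed(n_per_arm, daily_users_per_arm):
    """Days to reach the sample, rounded up to full weeks.

    Assumption: behaviour follows a weekly cycle, so partial weeks
    would over-represent some weekdays.
    """
    days = math.ceil(n_per_arm / daily_users_per_arm)
    return math.ceil(days / 7) * 7


if __name__ == "__main__":
    n = sample_size_per_arm(baseline_rate=0.10, relative_mde=0.10)
    print("users per arm:", n)
    print("days to run:", days_needed(n, daily_users_per_arm=1000))
```

Halving the minimum detectable effect roughly quadruples the required sample, which is why the effect size should be fixed before the test starts rather than tuned to fit the traffic available.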

Details

Author
rampstackco
Repository
rampstackco/claude-skills
Created
1 month ago
Last Updated
2 days ago
Language
Python
License
MIT

Similar Skills

Semantically similar based on skill content — not just same category

Web & Frontend Listed

experiment-design

Design hypothesis-driven experiments and A/B tests with proper methodology. Use when asked to design an A/B test, validate a hypothesis, plan an experiment, or set up a test for a product change. Covers hypothesis writing, sample size, and common mistakes.

2 Updated today
AashutoshR2062
AI & Automation Solid

experiment-designer

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.

16,642 Updated yesterday
alirezarezvani
Web & Frontend Solid

experiment-designer

Design statistically rigorous A/B tests and interpret experiment results. Use when asked to design an experiment, run an A/B test, calculate sample size, interpret test results, or assess whether an experiment was successful. Produces a complete experiment design with hypothesis, sample size, run time, success criteria, and risk flags — or a results interpretation with ship/iterate/kill recommendation.

915 Updated 3 days ago
mohitagw15856
AI & Automation Solid

experimentation-analytics

How to read experiment results without fooling yourself. Confidence intervals, p-values, multiple testing, sequential testing, CUPED, heterogeneous treatment effects, ratio metrics, network effects, dashboard reconciliation, and the interpretation failures that produce confidently wrong shipping decisions.

280 Updated 2 days ago
rampstackco
Web & Frontend Listed

experiment-designer

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.

0 Updated 2 days ago
SanctifiedOps