experimentation-analytics

Solid

How to read experiment results without fooling yourself. Confidence intervals, p-values, multiple testing, sequential testing, CUPED, heterogeneous treatment effects, ratio metrics, network effects, dashboard reconciliation, and the interpretation failures that produce confidently wrong shipping decisions.

AI & Automation 280 stars 37 forks Updated 2 days ago MIT

Install

View on GitHub

Quality Score: 94/100

Stars 20%
82
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
80
License 10%
100
Description 5%
100

Skill Content

# Experimentation Analytics A data-team-mentor's playbook for interpreting experiment results without fooling yourself. The result panel is the moment-of-truth for an experiment. The numbers on it determine whether you ship, kill, or iterate. They also expose every shortcut taken in the design phase: an underpowered test produces wide confidence intervals; a peeked test produces a too-narrow p-value; a ratio metric without delta-method correction produces overconfident lift estimates. Most ship-the-wrong-thing decisions trace back to misreading the result panel. This skill is the discipline that prevents misreading. It assumes the experiment was designed well (see the `experiment-design` skill). It assumes the platform's results panel is technically correct (most modern platforms are; some older ones are not). It assumes you can read a number off a screen. The hard part is knowing what each number actually means and what it does not, and that is what is here. When to use this skill: any time you are reading an experiment result panel and about to make a ship, kill, or iterate decision. --- ## What this skill is for This skill covers result interpretation, the statistical concepts that make the numbers trustworthy, and the dashboard reconciliation work that prevents executive-level confusion when the experiment number does not match the BI number. The audience is product managers and data analysts who read experiment results together and need a shared vocabulary that do...

Details

Author
rampstackco
Repository
rampstackco/claude-skills
Created
1 months ago
Last Updated
2 days ago
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

experiment-design

A discipline for designing experiments (A/B tests, multivariate, holdouts) so the results actually answer the question you asked. Hypothesis writing, sample size, duration, segment analysis, interpretation, decision-making, and the common failure modes that produce confidently wrong shipping decisions.

280 Updated 2 days ago
rampstackco
Web & Frontend Solid

experiment-designer

Design statistically rigorous A/B tests and interpret experiment results. Use when asked to design an experiment, run an A/B test, calculate sample size, interpret test results, or assess whether an experiment was successful. Produces a complete experiment design with hypothesis, sample size, run time, success criteria, and risk flags — or a results interpretation with ship/iterate/kill recommendation.

915 Updated 3 days ago
mohitagw15856
AI & Automation Solid

experimentation-platform-orchestrator

A platform decision framework for experimentation. When to use Statsig vs PostHog vs GrowthBook vs Optimizely vs Amplitude vs Eppo vs Kameleoon. How to migrate between them. How to coordinate when multi-platform is genuinely warranted. The decisions that compound for years and the ones you can defer. Triggers on which experimentation platform, choose Statsig vs PostHog, evaluate experimentation tools, switch experimentation platform, migrate from Optimizely, consolidate experimentation tools, multi-platform experimentation, experimentation platform decision, ab test platform selection, feature flag platform vs experiment platform, warehouse-native experiments, vendor lock-in experimentation. Also triggers when a team is asking about cost, governance, or migration cost across experimentation tools, or when an evaluation is starting.

280 Updated 2 days ago
rampstackco
AI & Automation Listed

experiment-results-interpreter

Interpret A/B test results in plain language and get a ship/rollback/extend recommendation with a stakeholder summary. Use when you have experiment results from any analytics tool and need a clear go/no-go decision. Triggers: 'interpret experiment results', 'read my A/B test results', 'should I ship this experiment', 'интерпретируй результаты эксперимента', 'помоги прочитать результаты A/B теста'.

8 Updated 2 days ago
KirKruglov
AI & Automation Solid

experiment-designer

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.

16,642 Updated yesterday
alirezarezvani