experimentation-analytics

Featured

How to read experiment results without fooling yourself. Confidence intervals, p-values, multiple testing, sequential testing, CUPED, heterogeneous treatment effects, ratio metrics, network effects, dashboard reconciliation, and the interpretation failures that produce confidently wrong shipping decisions.

AI & Automation · 501 stars · 64 forks · Updated 6 days ago · MIT


Quality Score: 97/100

Score components (weights): Stars 20%, Recency 20%, Frontmatter 20%, Documentation 15%, Issue Health 10%, License 10%, Description 5%

Skill Content

# Experimentation Analytics

A data-team-mentor's playbook for interpreting experiment results without fooling yourself.

The result panel is the moment of truth for an experiment. The numbers on it determine whether you ship, kill, or iterate. They also expose every shortcut taken in the design phase: an underpowered test produces wide confidence intervals; a peeked test produces a p-value that is too small, because repeated looks inflate the false-positive rate; a ratio metric without delta-method correction produces overconfident lift estimates. Most ship-the-wrong-thing decisions trace back to misreading the result panel.

This skill is the discipline that prevents misreading. It assumes the experiment was designed well (see the `experiment-design` skill). It assumes the platform's results panel is technically correct (most modern platforms are; some older ones are not). It assumes you can read a number off a screen. The hard part is knowing what each number actually means and what it does not, and that is what this skill covers.

When to use this skill: any time you are reading an experiment result panel and about to make a ship, kill, or iterate decision.

---

## What this skill is for

This skill covers result interpretation, the statistical concepts that make the numbers trustworthy, and the dashboard reconciliation work that prevents executive-level confusion when the experiment number does not match the BI number.

The audience is product managers and data analysts who read experiment results together and need a shared vocabulary that do...
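To make the ratio-metric point in the skill text concrete, here is a minimal sketch (not taken from the skill itself) that compares a delta-method standard error with a naive one for a clicks-per-view metric. It assumes randomization is by user, so the user is the independent unit. The function names `delta_method_se`, `naive_se`, and `simulate_users`, and the simulated data, are hypothetical and used only for illustration.

```python
import math
import random
from statistics import fmean


def delta_method_se(numerators, denominators):
    """Ratio of sums and its delta-method standard error.

    Each list holds one value per user (the assumed randomization unit).
    """
    n = len(numerators)
    mean_y = fmean(numerators)
    mean_x = fmean(denominators)
    ratio = mean_y / mean_x
    # Sample variances and covariance across users.
    var_y = sum((y - mean_y) ** 2 for y in numerators) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in denominators) / (n - 1)
    cov_xy = sum(
        (y - mean_y) * (x - mean_x) for y, x in zip(numerators, denominators)
    ) / (n - 1)
    # First-order (delta-method) approximation of Var(mean_y / mean_x).
    variance = (var_y - 2 * ratio * cov_xy + ratio ** 2 * var_x) / (n * mean_x ** 2)
    return ratio, math.sqrt(variance)


def naive_se(numerators, denominators):
    """Binomial standard error that wrongly treats every view as independent."""
    total_views = sum(denominators)
    p = sum(numerators) / total_views
    return math.sqrt(p * (1 - p) / total_views)


def simulate_users(n_users, seed=0):
    """Simulated users, each with a personal click rate (illustrative data only)."""
    rng = random.Random(seed)
    clicks, views = [], []
    for _ in range(n_users):
        user_views = rng.randint(1, 30)
        user_ctr = rng.betavariate(1, 9)  # per-user click rate, mean 0.1
        user_clicks = sum(rng.random() < user_ctr for _ in range(user_views))
        clicks.append(user_clicks)
        views.append(user_views)
    return clicks, views


clicks, views = simulate_users(2000)
ratio, se_delta = delta_method_se(clicks, views)
se_naive = naive_se(clicks, views)
print(f"ratio={ratio:.4f}  delta-method SE={se_delta:.5f}  naive SE={se_naive:.5f}")
```

Because views from the same user are correlated in this simulation, the naive standard error comes out smaller than the delta-method one. That gap is the overconfidence the skill text warns about: a confidence interval built from the naive figure would be too narrow.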

Details

Author: rampstackco
Repository: rampstackco/claude-skills
Created: 3 months ago
Last Updated: 6 days ago
Language: Python
License: MIT

Integrates with

Anthropic · AI

Similar Skills

Semantically similar based on skill content, not just the same category

AI & Automation Featured

experiment-design

A discipline for designing experiments (A/B tests, multivariate, holdouts) so the results actually answer the question you asked. Hypothesis writing, sample size, duration, segment analysis, interpretation, decision-making, and the common failure modes that produce confidently wrong shipping decisions.

501 Updated 6 days ago

rampstackco

AI & Automation Featured

experimentation-platform-orchestrator

A platform decision framework for experimentation. When to use Statsig vs PostHog vs GrowthBook vs Optimizely vs Amplitude vs Eppo vs Kameleoon. How to migrate between them. How to coordinate when multi-platform is genuinely warranted. The decisions that compound for years and the ones you can defer. Triggers on which experimentation platform, choose Statsig vs PostHog, evaluate experimentation tools, switch experimentation platform, migrate from Optimizely, consolidate experimentation tools, multi-platform experimentation, experimentation platform decision, ab test platform selection, feature flag platform vs experiment platform, warehouse-native experiments, vendor lock-in experimentation. Also triggers when a team is asking about cost, governance, or migration cost across experimentation tools, or when an evaluation is starting.

501 Updated 6 days ago

rampstackco

AI & Automation Solid

experiment-results-interpreter

Interpret A/B test results in plain language and get a ship/rollback/extend recommendation with a stakeholder summary. Use when you have experiment results from any analytics tool and need a clear go/no-go decision. Triggers: 'interpret experiment results', 'read my A/B test results', 'should I ship this experiment', 'интерпретируй результаты эксперимента', 'помоги прочитать результаты A/B теста'.

14 Updated yesterday

KirKruglov