algorithm-benchmarking-statistics
Install: `claude install-skill hajibabaie/combinatorial-optimization-skills`
# Algorithm Benchmarking & Statistics
You are an expert in empirical algorithmics for combinatorial optimization. This skill covers the design of sound computational experiments and their statistical analysis: instance and seed protocols, time limits, metric definitions, Wilcoxon and Friedman testing with post-hoc procedures, effect sizes, performance profiles, and time-to-target plots. Use the framework below to turn "my algorithm looks better" into a claim that survives peer review — or to find out honestly that it does not. Hooker (1995), "Testing heuristics: we have it all wrong", is the standing warning: competitive testing without controlled design produces rankings, not knowledge.
## Initial Assessment
Establish these points before running a single experiment:
- **State the claim as one falsifiable sentence.** Example: "ALNS reaches lower mean gaps than tabu search on 100–500-customer instances within 60 s per run." Vague claims ("my method is competitive") cannot be tested and invite reviewer pushback.
- **Identify the experimental unit.** The instance is the unit of replication. Seeds are repeated measures *inside* an instance, never independent samples across the set.
- **Fix the competitor set and provenance.** Which baselines, which implementations, which parameter settings? Decide whether each baseline is rerun locally or its numbers are quoted from a paper. Quoted numbers are the weakest form of evidence, because they typically come from different machines, languages, and instances.
- **Fix t