results-analysis

Solid

This skill should be used when the user asks to "analyze experimental results", "run strict statistical analysis", "compare model performance", "generate scientific figures", "check significance", "do ablation analysis", or mentions interpreting experiment data with rigorous statistics and visualization. It focuses on strict analysis bundles, not Results-section prose.

AI & Automation · 4,111 stars · 371 forks · Updated 3 days ago · MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Results Analysis

Run **strict, evidence-first experimental analysis** for ML/AI research.

Use this skill to produce a **strict analysis bundle**:

- `analysis-report.md`
- `stats-appendix.md`
- `figure-catalog.md`
- `figures/`

Do **not** use this skill to draft a paper `Results` section or a full experiment wrap-up report. Those belong to `ml-paper-writing` or `results-report`.

## Core contract

### This skill is responsible for

- validating experiment artifacts and comparison units,
- running rigorous descriptive and inferential statistics,
- generating **real scientific figures** when data/logs are available,
- writing figure purposes, caption requirements, and interpretation checklists,
- surfacing limits, blockers, and missing evidence explicitly.

### This skill is not responsible for

- paper-ready `Results` prose,
- manuscript narrative polishing,
- project-level experiment retrospectives.

If the user wants the complete post-experiment summary report, hand off to `results-report` after this bundle is ready.

## Non-negotiable quality bar

1. **Prefer real figures over figure specs.** If the data can be read, generate real figures. Do not stop at “recommended visualization”.
2. **Never fabricate statistics.** If sample size, seeds, or raw metrics are missing, state the blocker clearly.
3. **Report complete statistics.** Do not report only best scores or only p-values.
4. **Interpret every main figure.** Every major figure must have purpose, caption requ...
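The "never fabricate" and "report complete statistics" rules above can be illustrated with a minimal sketch. It assumes per-seed scores for two systems are available as plain Python lists evaluated on the same seeds. The function names `summarize` and `paired_comparison` are hypothetical and not part of the skill, and the exact sign-flip permutation test is only one reasonable choice of paired test; the skill text shown here does not say which test it uses. The scores in the usage lines are placeholders, not real results.

```python
from itertools import product
from statistics import mean, stdev


def summarize(scores):
    """Descriptive statistics for one system's per-seed scores."""
    return {
        "n": len(scores),
        "mean": mean(scores),
        "std": stdev(scores),  # sample standard deviation (n - 1)
        "min": min(scores),
        "max": max(scores),
    }


def paired_comparison(baseline, candidate):
    """Compare two systems evaluated on the same seeds.

    Returns a dict of statistics, or a dict with a "blocker" message
    when the inputs cannot support the analysis.
    """
    if len(baseline) != len(candidate):
        return {"blocker": "seed counts differ; runs cannot be paired"}
    if len(baseline) < 2:
        return {"blocker": "fewer than 2 seeds; variance is undefined"}

    diffs = [c - b for b, c in zip(baseline, candidate)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    # Paired effect size (Cohen's d_z); undefined when all differences are equal.
    effect = mean_diff / sd_diff if sd_diff > 0 else None

    # Exact two-sided sign-flip permutation test on the mean difference.
    # Enumerates 2**n sign patterns, so it is only practical for small n.
    observed = abs(mean_diff)
    flips = list(product((1, -1), repeat=len(diffs)))
    extreme = sum(
        abs(mean(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12
        for signs in flips
    )

    return {
        "baseline": summarize(baseline),
        "candidate": summarize(candidate),
        "mean_diff": mean_diff,
        "effect_size_dz": effect,
        "p_value": extreme / len(flips),
    }


# Placeholder scores for illustration only (5 seeds per system).
baseline_scores = [0.80, 0.82, 0.81, 0.79, 0.83]
candidate_scores = [0.84, 0.85, 0.83, 0.82, 0.86]
result = paired_comparison(baseline_scores, candidate_scores)
print(result)
```

Returning a `blocker` entry instead of a number keeps missing seeds or unpaired runs visible in the report. With five seeds the smallest two-sided p-value this test can produce is 2/32, which is itself a limit worth stating next to the result.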

Details

Author: Galaxy-Dawn
Repository: Galaxy-Dawn/claude-scholar
Created: 4 months ago
Last Updated: 3 days ago
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

results-report

This skill should be used when the user asks to "write an experiment report", "summarize experimental results", "do experiment retrospection", "write a results report", "写实验总结报告" (Chinese: "write an experiment summary report"), "写实验复盘" (Chinese: "write an experiment retrospective"), or mentions turning completed experiment artifacts into a structured, decision-oriented research report. It assumes strict analysis should come from `results-analysis` first.

4,111 Updated 3 days ago
Galaxy-Dawn
AI & Automation Listed

analysis

Use when analyzing experiment results, comparing models, interpreting metrics, debugging unexpected outputs, or performing ablation analysis. Trigger phrases include "analyze results", "compare models", "why is the loss", "debug training", "interpret", "ablation analysis", "what went wrong", "check metrics". Even if the user says "look at the numbers" or "explain these results", use this skill.

0 Updated 1 month ago
hyojun01
Data & Documents Listed

analysis-results-collector

Transform completed analysis work into structured, communication-ready documentation by conducting guided conversations that extract key findings, evidence, and recommendations. Use when a user has completed an analysis (typically following an analysis plan created by the analysis-planner skill) and needs to document results, create findings summaries, build executive reports, or prepare analysis outcomes for stakeholder communication. This skill systematically gathers what was tested, what was found, and what should be done next, then generates a professional markdown document ready for distribution.

23 Updated 3 months ago
florianbonnet14
AI & Automation Solid

analyze-results

Analyze ML experiment results, compute statistics, generate comparison tables and insights. Use when user says "analyze results", "compare", or needs to interpret experimental data.

11,051 Updated today
wanshuiyin
AI & Automation Listed

autoresearch

When the user wants a rigorous iteration loop for an artifact, prompt, briefing, content structure, or Agentic SEO skill. Also use for Karpathy-style experiment runs that need baseline scoring, explicit metrics, stop rules, and keep/reject decisions.

30 Updated today
agencia-conversion