analyze-results

Solid

Analyze ML experiment results, compute statistics, generate comparison tables and insights. Use when user says "analyze results", "compare", or needs to interpret experimental data.

AI & Automation · 11,051 stars · 1,037 forks · Updated today · MIT

Quality Score: 91/100

- Stars (20%): 100
- Recency (20%): 100
- Frontmatter (20%)
- Documentation (15%)
- Issue Health (10%)
- License (10%): 100
- Description (5%): 100

Skill Content

# Analyze Experiment Results

Analyze: $ARGUMENTS

## Workflow

### Step 1: Locate Results

Find all relevant JSON/CSV result files:

- Check `figures/`, `results/`, or project-specific output directories
- Parse JSON results into structured data

### Step 2: Build Comparison Table

Organize results by:

- **Independent variables**: model type, hyperparameters, data config
- **Dependent variables**: primary metric (e.g., perplexity, accuracy, loss), secondary metrics
- **Delta vs baseline**: always compute relative improvement

### Step 3: Statistical Analysis

- If multiple seeds: report mean +/- std, check reproducibility
- If sweeping a parameter: identify trends (monotonic, U-shaped, plateau)
- Flag outliers or suspicious results

### Step 4: Generate Insights

For each finding, structure as:

1. **Observation**: what the data shows (with numbers)
2. **Interpretation**: why this might be happening
3. **Implication**: what this means for the research question
4. **Next step**: what experiment would test the interpretation

### Step 5: Update Documentation

If findings are significant:

- Propose updates to project notes or experiment reports
- Draft a concise finding statement (1-2 sentences)

## Output Format

Always include:

1. Raw data table
2. Key findings (numbered, concise)
3. Suggested next experiments (if any)
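As an illustration of Steps 1 to 3 of the workflow above, here is a minimal Python sketch, not part of the original skill. It assumes a hypothetical result schema: each JSON file holds one run as a flat object with a `model` key and a numeric metric key (here `accuracy`), and one model is named `baseline`. The function names `load_results`, `summarize`, and `to_markdown` are invented for this example.

```python
import json
import statistics
from collections import defaultdict
from pathlib import Path


def load_results(root):
    """Step 1: collect every JSON result file under `root` as a list of dicts."""
    records = []
    for path in sorted(Path(root).rglob("*.json")):
        with open(path) as f:
            records.append(json.load(f))
    return records


def summarize(records, metric="accuracy", baseline="baseline"):
    """Steps 2-3: group runs by model, report mean, std, and delta vs. baseline."""
    by_model = defaultdict(list)
    for rec in records:
        # Assumed schema: {"model": str, <metric>: float, ...}
        by_model[rec["model"]].append(rec[metric])

    if baseline not in by_model:
        raise ValueError(f"no runs found for baseline model {baseline!r}")

    base_mean = statistics.mean(by_model[baseline])
    rows = []
    for model, values in sorted(by_model.items()):
        mean = statistics.mean(values)
        # Sample std needs at least two seeds; report 0.0 for a single run.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        rows.append({
            "model": model,
            "n_seeds": len(values),
            "mean": mean,
            "std": std,
            # Relative change in percent; the sign convention depends on the metric.
            "delta_pct": 100.0 * (mean - base_mean) / base_mean,
        })
    return rows


def to_markdown(rows):
    """Render the summary rows as a Markdown comparison table."""
    lines = [
        "| model | seeds | mean +/- std | delta vs baseline |",
        "|---|---|---|---|",
    ]
    for r in rows:
        lines.append(
            f"| {r['model']} | {r['n_seeds']} "
            f"| {r['mean']:.3f} +/- {r['std']:.3f} | {r['delta_pct']:+.1f}% |"
        )
    return "\n".join(lines)
```

A typical call would be `print(to_markdown(summarize(load_results("results/"))))`. For metrics where lower is better (loss, perplexity), a negative delta is the improvement, so the sign should be read in light of the metric.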

Details

Author: wanshuiyin
Repository: wanshuiyin/Auto-claude-code-research-in-sleep
Created: 2 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI · AI

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

analysis

Use when analyzing experiment results, comparing models, interpreting metrics, debugging unexpected outputs, or performing ablation analysis. Trigger phrases include "analyze results", "compare models", "why is the loss", "debug training", "interpret", "ablation analysis", "what went wrong", "check metrics". Even if the user says "look at the numbers" or "explain these results", use this skill.

0 Updated 1 month ago

hyojun01

AI & Automation Solid

monitor-experiment

Monitor running experiments, check progress, collect results. Use when user says "check results", "is it done", "monitor", or wants experiment output.

11,051 Updated today

wanshuiyin

AI & Automation Solid

results-analysis

This skill should be used when the user asks to "analyze experimental results", "run strict statistical analysis", "compare model performance", "generate scientific figures", "check significance", "do ablation analysis", or mentions interpreting experiment data with rigorous statistics and visualization. It focuses on strict analysis bundles, not Results-section prose.

4,111 Updated 3 days ago

Galaxy-Dawn

AI & Automation Listed

analyze

Answer data questions -- from quick lookups to full analyses. Use when looking up a single metric, investigating what's driving a trend or drop, comparing segments over time, or preparing a formal data report for stakeholders.

2 Updated yesterday

nota-america

AI & Automation Listed

analyze

15 Updated yesterday

charlieviettq