monitor-experiment

Solid

Monitor running experiments, check progress, collect results. Use when user says "check results", "is it done", "monitor", or wants experiment output.

AI & Automation · 11,051 stars · 1,037 forks · Updated today · MIT

Quality Score: 91/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%)
Documentation (15%)
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

# Monitor Experiment Results

Monitor: $ARGUMENTS

## Workflow

### Step 1: Check What's Running

```bash
ssh <server> "screen -ls"
```

### Step 2: Collect Output from Each Screen

For each screen session, capture the last N lines:

```bash
ssh <server> "screen -S <name> -X hardcopy /tmp/screen_<name>.txt && tail -50 /tmp/screen_<name>.txt"
```

If hardcopy fails, check for log files or tee output.

### Step 3: Check for JSON Result Files

```bash
ssh <server> "ls -lt <results_dir>/*.json 2>/dev/null | head -20"
```

If JSON results exist, fetch and parse them:

```bash
ssh <server> "cat <results_dir>/<latest>.json"
```

### Step 4: Summarize Results

Present results in a comparison table:

```
| Experiment | Metric | Delta vs Baseline | Status |
|-----------|--------|-------------------|--------|
| Baseline | X.XX | — | done |
| Method A | X.XX | +Y.Y | done |
```

### Step 5: Interpret

- Compare against known baselines
- Flag unexpected results (negative delta, NaN, divergence)
- Suggest next steps based on findings

### Step 6: Feishu Notification (if configured)

After results are collected, check `~/.codex/feishu.json`:

- Send `experiment_done` notification: results summary table, delta vs baseline
- If config absent or mode `"off"`: skip entirely (no-op)

## Key Rules

- Always show raw numbers before interpretation
- Compare against the correct baseline (same config)
- Note if experiments are still running (check progress bars, iteratio...
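
Steps 4 and 5 of the workflow can be sketched in Python as follows. This is a minimal illustration, not the skill's own implementation. The flat `{experiment name: metric}` JSON layout, the `summarize` function, its `higher_is_better` parameter, and the convention that a `null` metric means "still running" are all assumptions made for the example; the skill does not specify a result schema.

```python
import json
import math


def summarize(results, baseline="Baseline", higher_is_better=True):
    """Return (markdown_table, flags) for a {name: metric-or-None} mapping.

    Hypothetical helper: the input layout is an assumption, not the skill's schema.
    """
    base = results.get(baseline)
    base_ok = isinstance(base, (int, float)) and math.isfinite(base)
    rows = [
        "| Experiment | Metric | Delta vs Baseline | Status |",
        "|------------|--------|-------------------|--------|",
    ]
    flags = []
    for name, value in results.items():
        if value is None:  # no metric yet: assume the run is still in progress
            rows.append(f"| {name} | n/a | n/a | running |")
        elif not math.isfinite(value):  # NaN or +/-inf
            rows.append(f"| {name} | {value} | n/a | done |")
            flags.append(f"{name}: non-finite metric ({value})")
        elif name == baseline or not base_ok:
            # The baseline row, or no usable baseline to compare against
            rows.append(f"| {name} | {value:.2f} | n/a | done |")
        else:
            delta = value - base
            rows.append(f"| {name} | {value:.2f} | {delta:+.2f} | done |")
            worse = delta < 0 if higher_is_better else delta > 0
            if worse:
                flags.append(f"{name}: worse than baseline ({delta:+.2f})")
    return "\n".join(rows), flags


# Hypothetical payload; Python's json module accepts the non-standard NaN literal.
raw = ('{"Baseline": 0.80, "Method A": 0.85, "Method B": NaN, '
       '"Method C": null, "Method D": 0.70}')
table, flags = summarize(json.loads(raw))
print(table)             # raw numbers first
print("\n".join(flags))  # interpretation second
```

Printing the table before the flags follows the rule that raw numbers come before interpretation. Divergence detection is left out because it needs a training curve rather than a single final metric.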

Details

Author: wanshuiyin
Repository: wanshuiyin/Auto-claude-code-research-in-sleep
Created: 2 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI · AI

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

analyze-results

Analyze ML experiment results, compute statistics, generate comparison tables and insights. Use when user says "analyze results", "compare", or needs to interpret experimental data.

11,051 Updated today

wanshuiyin

Web & Frontend Listed

experiment

Automated optimization loop with scalar fitness function. Proposes changes in isolated worktrees, measures with a metric command, keeps improvements, discards failures. Supports convergence detection and diminishing returns.

1 Updated today

allysgrandiose674

AI & Automation Solid

status

Show experiment dashboard with results, active loops, and progress.

16,642 Updated yesterday

alirezarezvani

AI & Automation Solid

experiment-loop

Autonomous experiment loop: hypothesize > modify > test > evaluate > keep/discard > repeat. Run N experiments automatically with measurable metrics. Works for performance optimization, A/B testing, prompt engineering, and any measurable improvement task.

495 Updated 1 month ago

vibeeval

AI & Automation Solid

run-experiment

Deploy and run ML experiments on local or remote GPU servers. Use when user says "run experiment", "deploy to server", "跑实验", or needs to launch training jobs.

11,051 Updated today

wanshuiyin