
ml-experiment-reproducibility

Pin seeds/config/dataset versions and provide a deterministic rerun path.
authenticfake/clike · ★ 1 · AI & Automation · score 70
Install: claude install-skill authenticfake/clike
# Skill: ML Experiment Reproducibility

## Intent

Ensure ML, data science, model evaluation, and experiment requirements are reproducible, measurable, and traceable. This skill separates product code from experiments and prevents unverifiable model-quality claims.

## Use when

Use this skill when a REQ touches ML models, datasets, training, fine-tuning, feature engineering, evaluation metrics, model comparison, notebooks, pipelines, data quality, experiment tracking, or batch inference.

## Do not use when

Do not use this skill for generic LLM prompt/RAG work unless the REQ includes ML datasets, model metrics, training, offline evaluation, or experiment comparison.

## Signals

- The REQ mentions ML, model training, fine-tuning, dataset, feature, label, metric, accuracy, precision, recall, F1, ROC, drift, experiment, notebook, inference, pipeline, validation split, baseline, or model registry.
- Acceptance criteria include measurable model quality.
- Generated files include notebooks, data loaders, evaluation scripts, model wrappers, or dataset fixtures.

## Required behavior

- Define datasets, fixtures, or sample data boundaries explicitly.
- Define metrics and thresholds before implementation.
- Keep training/experiment code separate from production inference code when practical.
- Make evaluation commands reproducible.
- Record assumptions about data availability, privacy, sampling, and labels.
- Include deterministic smoke tests for data loading and metric computation
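
The behaviors listed above (a pinned seed and dataset version, a threshold fixed up front, and a deterministic smoke test for splitting and metric computation) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not part of the skill itself: `ExperimentConfig`, `config_fingerprint`, `split_indices`, `f1_score`, and `smoke_test` are hypothetical names, and the seed, version label, and threshold values are placeholders.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical config: every value that affects results is pinned here."""
    seed: int = 1234                     # placeholder seed
    dataset_version: str = "fixture-v1"  # placeholder dataset version label
    test_fraction: float = 0.25
    min_f1: float = 0.5                  # threshold chosen before implementation


def config_fingerprint(cfg):
    """Return a stable SHA-256 hash of the config for logging next to results."""
    payload = json.dumps(asdict(cfg), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def split_indices(n_rows, cfg):
    """Split row indices into (train, test) using only the pinned seed."""
    rng = random.Random(cfg.seed)  # local RNG, so global random state is untouched
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(n_rows * cfg.test_fraction)
    return indices[n_test:], indices[:n_test]


def f1_score(y_true, y_pred):
    """Binary F1 for 0/1 labels; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def smoke_test(cfg):
    """Deterministic checks: same config gives same split; metric on a known fixture."""
    first = split_indices(8, cfg)
    second = split_indices(8, cfg)
    assert first == second, "split must be reproducible for a fixed seed"
    train, test = first
    assert sorted(train + test) == list(range(8)), "no rows lost or duplicated"
    # Hand-checked fixture: tp=1, fp=1, fn=1, so precision = recall = F1 = 0.5.
    assert f1_score([1, 1, 0, 0], [1, 0, 1, 0]) == 0.5


if __name__ == "__main__":
    cfg = ExperimentConfig()
    smoke_test(cfg)
    print("config fingerprint:", config_fingerprint(cfg))
```

Logging the fingerprint alongside each reported metric is one way to keep a model-quality claim traceable to the exact configuration that produced it. A real project would likely also need to seed whichever ML framework it uses, which this sketch does not cover.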