ml-training-recipes

Solid

Battle-tested PyTorch training recipes for all domains — LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.

AI & Automation · 9,609 stars · 724 forks · Updated 1 month ago · MIT

Quality Score: 94/100

Stars (weight 20%): 100
Recency (weight 20%): 75
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# ML Training Recipes

Battle-tested patterns for PyTorch training across domains. Drawn from production codebases (Karpathy's autoresearch/nanochat, torchvision, HuggingFace) and modern training practice.

## Reference files (read when needed)

- `references/architecture.md`: Transformer/LLM architecture code patterns, weight init
- `references/optimizers.md`: Muon, AdamW hybrid, per-group LR, compiled optimizer steps
- `references/domain-specific.md`: Vision, diffusion, contrastive, distributed, checkpointing, data loading
- `references/scaling-and-selection.md`: Scaling laws, compute budget tables, decision trees, DGX Spark
- `references/biomedical.md`: Drug discovery, protein models, medical imaging, genomics, clinical NLP
- `references/experiment-loop.md`: Autonomous experiment loop (autoresearch keep/discard/revert)

---

## Architecture Selection

Pick the right model by **data type** and **data scale**:

| Data Type | < 10K samples | 10K-100K | > 100K |
|-----------|--------------|----------|--------|
| **Images** | Pretrained CNN + fine-tune | Fine-tune ViT or CNN | ViT from scratch |
| **Text (gen)** | Few-shot prompting | Fine-tune GPT/LLaMA (LoRA) | Pretrain from scratch |
| **Tabular** | XGBoost/LightGBM | Still XGBoost | Neural viable |
| **Audio** | Pretrained Whisper | Fine-tune AST | Train from scratch |
| **Molecules** | Pretrained GNN | Fine-tune molecular LM | Train GNN from scratch |
| **Proteins** | ESM-2 embeddings + hea...
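The architecture-selection table in the excerpt above can be read as a two-key lookup: data type first, then a sample-count bucket. The sketch below is one possible way to encode it, not code from the skill itself. The names `RECOMMENDATIONS`, `scale_bucket`, and `recommend` are hypothetical, and only the rows that appear in full in the excerpt are included (the Proteins row is truncated, so it is left out).

```python
# Hypothetical encoding of the fully visible rows of the Architecture Selection table.
# Tuple order follows the table columns: (< 10K samples, 10K-100K, > 100K).
RECOMMENDATIONS = {
    "images": ("Pretrained CNN + fine-tune", "Fine-tune ViT or CNN", "ViT from scratch"),
    "text_gen": ("Few-shot prompting", "Fine-tune GPT/LLaMA (LoRA)", "Pretrain from scratch"),
    "tabular": ("XGBoost/LightGBM", "Still XGBoost", "Neural viable"),
    "audio": ("Pretrained Whisper", "Fine-tune AST", "Train from scratch"),
    "molecules": ("Pretrained GNN", "Fine-tune molecular LM", "Train GNN from scratch"),
}


def scale_bucket(n_samples: int) -> int:
    """Map a sample count to the table's column index (0, 1, or 2)."""
    if n_samples < 10_000:
        return 0
    if n_samples <= 100_000:
        return 1
    return 2


def recommend(data_type: str, n_samples: int) -> str:
    """Return the table's suggestion for a data type and dataset size."""
    if data_type not in RECOMMENDATIONS:
        raise KeyError(f"no table row for data type: {data_type!r}")
    return RECOMMENDATIONS[data_type][scale_bucket(n_samples)]


if __name__ == "__main__":
    print(recommend("tabular", 50_000))
    print(recommend("images", 500))
```

Treating the boundaries as inclusive of 10K and 100K in the middle bucket is an assumption; the table's column headers do not say which side the exact boundary values fall on.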

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 7 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT