miles-rl-training
FeaturedProvides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.
Install
Quality Score: 99/100
Skill Content
Details
- Author
- davila7
- Repository
- davila7/claude-code-templates
- Created
- 11 months ago
- Last Updated
- today
- Language
- Python
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
miles-rl-training
Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.
slime-rl-training
Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.
slime-rl-training
Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.