slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

AI & Automation · 27,984 stars · 2,901 forks · Updated today · MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# slime: LLM Post-Training Framework for RL Scaling

slime is an LLM post-training framework from Tsinghua's THUDM team, powering GLM-4.5, GLM-4.6, and GLM-4.7. It connects Megatron-LM for training with SGLang for high-throughput rollout generation.

## When to Use slime

**Choose slime when you need:**

- Megatron-LM native training with SGLang inference
- Custom data generation workflows with flexible data buffers
- Training GLM, Qwen3, DeepSeek V3, or Llama 3 models
- Research-grade framework with production backing (Z.ai)

**Consider alternatives when:**

- You need enterprise-grade stability features → use **miles**
- You want flexible backend swapping → use **verl**
- You need PyTorch-native abstractions → use **torchforge**

## Key Features

- **Training**: Megatron-LM with full parallelism support (TP, PP, DP, SP)
- **Rollout**: SGLang-based high-throughput generation with router
- **Data Buffer**: Flexible prompt management and sample storage
- **Models**: GLM-4.x, Qwen3, DeepSeek V3/R1, Llama 3

## Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Data Buffer                       │
│  - Prompt initialization and management                 │
│  - Custom data generation and filtering                 │
│  - Rollout sample storage                               │
└─────────────┬───────────────────────────┬───────────────┘
              │                           │
┌─────────────▼───────────┐ ┌─────────────▼────...
```
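The three Data Buffer responsibilities listed in the architecture overview (prompt management, filtering, and rollout sample storage) can be illustrated with a toy class. This is only a sketch of the concept, not slime's actual API: `ToyDataBuffer`, `Sample`, and every method name below are hypothetical, and the rollout step is replaced by placeholder data.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Sample:
    """One rollout result: a prompt, the generated response, and its reward."""
    prompt: str
    response: str
    reward: float


class ToyDataBuffer:
    """Hypothetical stand-in for a data buffer; not slime's real class."""

    def __init__(self, prompts: List[str]) -> None:
        self._prompts: Deque[str] = deque(prompts)  # prompts awaiting rollout
        self._samples: List[Sample] = []            # stored rollout samples

    def next_prompts(self, n: int) -> List[str]:
        """Pop up to n prompts to hand to the rollout engine."""
        count = min(n, len(self._prompts))
        return [self._prompts.popleft() for _ in range(count)]

    def add_samples(
        self,
        samples: List[Sample],
        keep: Callable[[Sample], bool] = lambda s: True,
    ) -> None:
        """Store only the rollout samples that pass the filter."""
        self._samples.extend(s for s in samples if keep(s))

    def take_batch(self, n: int) -> List[Sample]:
        """Remove and return up to n stored samples for a training step."""
        batch, self._samples = self._samples[:n], self._samples[n:]
        return batch


buffer = ToyDataBuffer(["2+2=?", "Capital of France?", "3*3=?"])
prompts = buffer.next_prompts(2)
# Stand-in for rollout generation plus reward scoring (assumed, not real output).
rollouts = [Sample(p, "placeholder", float(i)) for i, p in enumerate(prompts)]
buffer.add_samples(rollouts, keep=lambda s: s.reward > 0.0)
batch = buffer.take_batch(8)
```

In a real setup the rollout engine would generate the responses and a reward function would score them; the buffer sits between that step and the trainer, so filtering policy can change without touching either side.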

Details

Author: davila7
Repository: davila7/claude-code-templates
Created: 11 months ago
Last Updated: today
Language: Python
License: MIT
