slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

AI & Automation · 191,515 stars · 33,299 forks · Updated today · MIT

Skill Content

# slime: LLM Post-Training Framework for RL Scaling

slime is an LLM post-training framework from Tsinghua's THUDM team, powering GLM-4.5, GLM-4.6, and GLM-4.7. It connects Megatron-LM for training with SGLang for high-throughput rollout generation.

## When to Use slime

**Choose slime when you need:**

- Megatron-LM native training with SGLang inference
- Custom data generation workflows with flexible data buffers
- Training GLM, Qwen3, DeepSeek V3, or Llama 3 models
- Research-grade framework with production backing (Z.ai)

**Consider alternatives when:**

- You need enterprise-grade stability features → use **miles**
- You want flexible backend swapping → use **verl**
- You need PyTorch-native abstractions → use **torchforge**

## Key Features

- **Training**: Megatron-LM with full parallelism support (TP, PP, DP, SP)
- **Rollout**: SGLang-based high-throughput generation with router
- **Data Buffer**: Flexible prompt management and sample storage
- **Models**: GLM-4.x, Qwen3, DeepSeek V3/R1, Llama 3

## Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Data Buffer                       │
│ - Prompt initialization and management                  │
│ - Custom data generation and filtering                  │
│ - Rollout sample storage                                │
└─────────────┬───────────────────────────┬───────────────┘
              │                           │
┌─────────────▼───────────┐ ┌─────────────▼────...
```
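The three Data Buffer responsibilities named in the architecture overview (prompt management, custom filtering, rollout sample storage) can be illustrated with a small in-memory sketch. This is not slime's API: `ToyDataBuffer`, `Sample`, `fake_rollout`, and every method name below are hypothetical, and the "rollout" is a placeholder function rather than an SGLang call. It only shows how the three roles might fit together in one object.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Sample:
    """One rollout result: a prompt, the generated response, and its reward."""
    prompt: str
    response: str
    reward: float


class ToyDataBuffer:
    """Hypothetical stand-in for a data buffer; NOT slime's actual class."""

    def __init__(self, prompts: List[str]) -> None:
        # Prompt initialization and management: queue prompts in order.
        self._prompts: Deque[str] = deque(prompts)
        # Rollout sample storage: completed samples waiting for the trainer.
        self._samples: List[Sample] = []

    def next_prompts(self, n: int) -> List[str]:
        """Pop up to n prompts to hand to the rollout engine."""
        batch: List[str] = []
        while self._prompts and len(batch) < n:
            batch.append(self._prompts.popleft())
        return batch

    def add_samples(self, samples: List[Sample],
                    keep: Callable[[Sample], bool]) -> int:
        """Store only samples that pass the custom filter; return the count kept."""
        kept = [s for s in samples if keep(s)]
        self._samples.extend(kept)
        return len(kept)

    def drain(self) -> List[Sample]:
        """Return all stored samples for one training step and clear the store."""
        out, self._samples = self._samples, []
        return out


def fake_rollout(prompt: str) -> Sample:
    """Placeholder for generation (assumption): reward is prompt-length parity."""
    return Sample(prompt=prompt, response=prompt.upper(),
                  reward=float(len(prompt) % 2))


if __name__ == "__main__":
    buffer = ToyDataBuffer(["2+2=?", "name a prime", "hi"])
    prompts = buffer.next_prompts(2)
    kept = buffer.add_samples([fake_rollout(p) for p in prompts],
                              keep=lambda s: s.reward > 0)
    print(kept, [s.prompt for s in buffer.drain()])
```

In this sketch the filter is passed in as a plain callable, so swapping in a different filtering rule needs no change to the buffer itself. How slime exposes this kind of hook is not shown in the excerpt above.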

Details

Author: NousResearch
Repository: NousResearch/hermes-agent
Created: 10 months ago
Last Updated: today
Language: Python
License: MIT
