
vllm-reasoning-parsers

vLLM reasoning-parser operator + developer reference. `--reasoning-parser` CLI wiring, `ReasoningParser` contract (non-streaming `extract_reasoning` + per-delta `extract_reasoning_streaming`), `is_reasoning_end` xgrammar gating, `--structured-outputs-config.enable_in_reasoning` bypass, 22 built-in parsers with per-model quirks, 15 production pitfalls, authoring custom parsers via `@ReasoningParserManager.register_module` or plugin.
air-gapped/skills · ★ 2 · AI & Automation · score 78
Install: claude install-skill air-gapped/skills
# vLLM reasoning parsers

Target: operators wiring up `--reasoning-parser NAME` on a chat-completion endpoint, or developers authoring a parser for a new thinking model. Source of truth: `vllm/reasoning/` on `main`.

## What a reasoning parser actually does

When a reasoning-trained model emits a single token stream like

```
<think>user asked X, let me check Y...</think>The answer is 42.
```

vLLM splits this into two fields on the chat-completion response: `reasoning` (the CoT) and `content` (the final answer). `--reasoning-parser NAME` selects the class that does the split. Without it, the whole stream lands in `content`.

> **Field-name note.** On current `main` the field is `reasoning` (see `ChatCompletionResponseMessage.reasoning` / `DeltaMessage.reasoning` in `vllm/entrypoints/openai/chat_completion/protocol.py`). Pre-v0.19 code and many third-party docs / clients call it `reasoning_content`. A client that reads `reasoning_content` against a current-main server sees `null` every time, even when the parser ran correctly.

The parser is also the gating authority for **xgrammar / structured output**: by default, grammar enforcement is held off until `is_reasoning_end(input_ids)` flips true, so the model thinks freely before being constrained to JSON. Flip that default with `--structured-outputs-config.enable_in_reasoning=true`; the grammar then applies from token 0 regardless of reasoning state (useful for structured CoT).

## The contract (`ReasoningParser` ABC