
transformers-config-tokenizers-expert

Preflight reference for HuggingFace snapshots — what vLLM, sglang, and transformers.generate see at runtime. Covers config-file precedence (tokenizer.json, tokenizer_config.json, generation_config.json, chat_template.jinja), transformers v5 tokenizer-class taxonomy (TokenizersBackend, PythonBackend, MistralCommonBackend, TikTokenTokenizer), special-token discovery (all_special_ids, added_tokens_decoder, extra_special_tokens, backend_tokenizer.get_added_tokens_decoder), chat-template Jinja contract (ImmutableSandboxedEnvironment, loopcontrols, raise_exception, strftime_now, tojson, add_generation_prompt), and engine knobs (skip_special_tokens, trust_request_chat_template, chat_template_kwargs allowlist, adjust_request, incremental detokenizer, EOS merge). Ships verified 2026 hall-of-shame for Kimi-K2.6, GLM-5.1, Gemma-4, Qwen3, DeepSeek-V3, plus drop-in Python for resolving markers to IDs, detecting turn-primer-as-EOS leaks, and cross-referencing tokenizer.json vs tokenizer_config.json.
air-gapped/skills · ★ 2 · AI & Automation · score 78
Install: claude install-skill air-gapped/skills
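The description above mentions cross-referencing `tokenizer.json` against `tokenizer_config.json`. A minimal sketch of that check follows, using only the standard library. It assumes the commonly seen layouts: an `added_tokens` list in `tokenizer.json` and an `added_tokens_decoder` mapping keyed by string ID in `tokenizer_config.json`. The function name `cross_reference_added_tokens` and the shape of its return value are hypothetical, not the skill's own API. Serialization shapes vary by transformers version, so a missing key is treated as empty and every entry on the other side is then reported as presence drift.

```python
import json
from pathlib import Path


def cross_reference_added_tokens(snapshot_dir):
    """Report added-token drift between tokenizer.json and tokenizer_config.json.

    Hypothetical helper. Returns a list of
    (token_id, field, tokenizer_json_value, tokenizer_config_value) tuples;
    None marks an entry that is missing on that side.
    """
    root = Path(snapshot_dir)
    tok = json.loads((root / "tokenizer.json").read_text(encoding="utf-8"))
    cfg = json.loads((root / "tokenizer_config.json").read_text(encoding="utf-8"))

    # Assumed shape: [{"id": int, "content": str, "special": bool, ...}, ...]
    json_side = {int(e["id"]): e for e in tok.get("added_tokens", [])}
    # Assumed shape: {"<id>": {"content": str, "special": bool, ...}, ...}
    config_side = {int(k): v for k, v in cfg.get("added_tokens_decoder", {}).items()}

    drift = []
    for token_id in sorted(json_side.keys() | config_side.keys()):
        a = json_side.get(token_id)
        b = config_side.get(token_id)
        if a is None or b is None:
            # Token exists in only one of the two files.
            drift.append((
                token_id,
                "presence",
                a.get("content") if a is not None else None,
                b.get("content") if b is not None else None,
            ))
            continue
        # Token exists in both files: compare the fields that matter for decoding.
        for field in ("content", "special"):
            if a.get(field) != b.get(field):
                drift.append((token_id, field, a.get(field), b.get(field)))
    return drift
```

Calling `cross_reference_added_tokens("/path/to/snapshot")` returns an empty list when the two files agree. This compares only the on-disk JSON; it does not load the tokenizer, so it says nothing about the state the Rust backend holds after instantiation.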
# Transformers config + tokenizers expert

Target: engineers writing a preflight tool (or a vLLM/sglang operator) that must decide, before handing a HuggingFace snapshot to an inference engine, *which* files win, *which* tokens are structural, and *which* class will actually instantiate.

Almost every major 2026 release has shipped with drift between `tokenizer_config.json`, `generation_config.json`, `config.json`, and the Rust-backed tokenizer state. The skill exists so a preflight tool can answer that drift authoritatively, not guess.

---

## Stance

- **Cite, don't paraphrase.** Every load-bearing claim has a file:line or URL citation in `references/`. Point at the source.
- **Version-gate.** Transformers v5 (GA 2026-01-26) renamed the tokenizer classes and changed serialization shapes. Pre-5.0 and post-5.0 diverge; check `transformers.__version__` before claiming.
- **Rust is truth.** For any model with `tokenizer.json`, the authoritative added-token state is `tokenizer.backend_tokenizer.get_added_tokens_decoder()`. Python-side `all_special_ids` / `special_tokens_map` / `added_tokens_decoder` are views; treat them as such.
- **Engines disagree.** vLLM and sglang both union-merge `generation_config.eos_token_id`, but apply it through different pipelines (see `engine-knobs.md`). Predict per engine, not in the abstract.

---

## Triage: symptom → layer → reference

Use this table first. Deep dives live in `references/`.

| Symptom | Layer | Open |
|---