transformers-config-tokenizers-expert

Solid

Preflight reference for HuggingFace snapshots — what vLLM, sglang, and transformers.generate see at runtime. Covers config-file precedence (tokenizer.json, tokenizer_config.json, generation_config.json, chat_template.jinja), transformers v5 tokenizer-class taxonomy (TokenizersBackend, PythonBackend, MistralCommonBackend, TikTokenTokenizer), special-token discovery (all_special_ids, added_tokens_decoder, extra_special_tokens, backend_tokenizer.get_added_tokens_decoder), chat-template Jinja contract (ImmutableSandboxedEnvironment, loopcontrols, raise_exception, strftime_now, tojson, add_generation_prompt), and engine knobs (skip_special_tokens, trust_request_chat_template, chat_template_kwargs allowlist, adjust_request, incremental detokenizer, EOS merge). Ships verified 2026 hall-of-shame for Kimi-K2.6, GLM-5.1, Gemma-4, Qwen3, DeepSeek-V3, plus drop-in Python for resolving markers to IDs, detecting turn-primer-as-EOS leaks, and cross-referencing tokenizer.json vs tokenizer_config.json.

AI & Automation 3 stars 1 forks Updated yesterday MIT

Install

View on GitHub

Quality Score: 79/100

Stars 20%

Recency 20%

100

Frontmatter 20%

Documentation 15%

100

Issue Health 10%

License 10%

100

Description 5%

100

Skill Content

# Transformers config + tokenizers expert Target: engineers writing a preflight tool (or a vLLM/sglang operator) that must decide, before handing a HuggingFace snapshot to an inference engine, *which* files win, *which* tokens are structural, and *which* class will actually instantiate. Almost every major 2026 release has shipped with drift between `tokenizer_config.json`, `generation_config.json`, `config.json`, and the Rust-backed tokenizer state. The skill exists so a preflight tool can answer that drift authoritatively — not guess. --- ## Stance - **Cite, don't paraphrase.** Every load-bearing claim has a file:line or URL citation in `references/`. Point at the source. - **Version-gate.** Transformers v5 (GA 2026-01-26) renamed the tokenizer classes and changed serialization shapes. Pre-5.0 and post-5.0 diverge — check `transformers.__version__` before claiming. - **Rust is truth.** For any model with `tokenizer.json`, the authoritative added-token state is `tokenizer.backend_tokenizer.get_added_tokens_decoder()`. Python-side `all_special_ids` / `special_tokens_map` / `added_tokens_decoder` are views; treat them as such. - **Engines disagree.** vLLM and sglang both union-merge `generation_config.eos_token_id`, but apply it through different pipelines (see `engine-knobs.md`). Predict per engine, not in the abstract. --- ## Triage: symptom → layer → reference Use this table first. Deep dives live in `references/`. | Symptom | Layer | Open | |---...

Details

Author: air-gapped
Repository: air-gapped/skills
Created: 3 months ago
Last Updated: yesterday
Language: Python
License: MIT

Integrates with

Anthropic · AI Hugging Face · AI Kubernetes · Infrastructure

Bundled in these plugins

skills

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

vllm-configuration

Configure vLLM completely — YAML config file format, CLI arg precedence, full VLLM_*/HF_*/TRANSFORMERS_* env-var catalog, end-to-end recipe for air-gapped environments (internal HF mirrors, hf-mirror.com, ModelScope, HF_HUB_OFFLINE with pre-seeded cache, gated models offline, trust_remote_code supply-chain implications). VLLM_HOST_IP vs API-host confusion, Kubernetes-service-named-`vllm` env-var poisoning, usage-stats triple opt-out, YAML precedence surprises.

3 Updated yesterday

air-gapped

AI & Automation Solid

jinja-expert

Author, read, and debug Jinja2 templates across the three places Jinja lives in 2026 — HuggingFace `chat_template.jinja` (rendered by `apply_chat_template` for vLLM / sglang), Ansible playbooks + `.j2` files, and Jinja-adjacent Kubernetes workflows (`values.yaml.j2`, `kubernetes.core.k8s + template`, Helm post-renderers). Companion to the `helm` skill — Helm charts are Go `text/template` + Sprig, not Jinja, and this skill makes that disambiguation explicit.

3 Updated yesterday

air-gapped

AI & Automation Featured

huggingface-tokenizers

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.

221,168 Updated today

NousResearch