
open-webui-valkey-websocket

Deploy Open WebUI multi-pod with WebSockets and Valkey/Redis Sentinel at 1000+ user scale on Kubernetes. Centerpiece is the structural Socket.IO+Redis frame-amplification bug (#23733) that cripples multi-pod streaming, and the maintainer-endorsed mitigation (`CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE`). Covers all multi-pod env vars, the custom-model-icon perf history (base64-in-/api/models, fixed late 2025–Apr 2026), the official helm chart's gaps (bundled Redis is unsuitable for production; no HPA/PDB/probes/sticky sessions), and the catalog of known multi-pod issues with current status.
air-gapped/skills · ★ 2 · AI & Automation · score 78
Install: claude install-skill air-gapped/skills
# Open WebUI multi-pod with Valkey Sentinel + WebSockets: operator reference

Target: deploying Open WebUI on Kubernetes with 3+ replicas, WebSocket support enabled, Valkey Sentinel for shared state and Socket.IO pub/sub, at 1000+ user scale. Sentinel is the topology: not a recommendation, just the operating reality.

The single most important thing to internalize before going multi-pod: **issue #23733 (Socket.IO frame amplification) is open and structural**. It is the most likely cause of "we had to turn off multi-pod and websockets" in production. The mitigation is one env var. Read `references/issue-23733.md` first.

## The big bug, in 60 seconds (#23733)

**What it does:** Open WebUI streams assistant responses via Socket.IO. By design, **every single SSE token causes the backend to re-serialize the entire accumulated assistant message and emit it as a new Socket.IO frame**. The frame contains the full message-so-far, not the delta.

**Why it was designed that way (maintainer's words):** *"Every Socket.IO frame carries the complete rendered content of the assistant message. […] If a WebSocket frame is dropped, the connection flaps, the user switches tabs and comes back, or the browser GC causes a missed event, the very next frame self-corrects because it contains the complete truth."* Self-healing client, fully stateless frontend.

**Why it falls apart with `WEBSOCKET_MANAGER=redis` + multi-pod:** Each emit goes through `socketio.AsyncRedisManager`, which serializes the