
backend-hang-debug

Diagnose and fix FastAPI hangs caused by blocking ThreadPoolExecutor shutdown in the news stream route; includes py-spy capture and non-blocking executor pattern.
aiskillstore/marketplace · ★ 329 · API & Backend · score 79
Install: claude install-skill aiskillstore/marketplace
# Backend Hang Debug

## Purpose
- Detect and resolve event-loop hangs where the FastAPI app stops responding (e.g., `curl http://localhost:8000/` times out) due to synchronous executor shutdown in the SSE news stream.
- Provide a repeatable triage flow using `py-spy` to capture live stacks and pinpoint blocking code.

## Scope
- Backend: `backend/app/api/routes/stream.py` (news stream), `backend/app/services/rss_ingestion.py` (RSS workers), startup processes.
- Tooling: `py-spy` for live stack dumps; `curl` with timeouts for smoke tests.

## Quick Triage
1. **Reproduce hang**: `curl -m 5 http://localhost:8000/` and `curl -m 5 http://localhost:8000/health`; note timeouts.
2. **Process check**: `ss -tlnp | grep 8000` to confirm listener; `ls /proc/$(pgrep -f "uvicorn app.main")/fd | wc -l` to rule out FD leak.
3. **Stack capture** (inside backend venv): `uv pip install py-spy` then `sudo /home/bender/classwork/Thesis/backend/.venv/bin/py-spy dump --pid $(pgrep -f "uvicorn app.main")` (and worker pid if multiprocess). Look for `ThreadPoolExecutor.shutdown` in `api/routes/stream.py` frames.

## Fix Pattern (non-blocking executor)
- Replace synchronous context manager `with ThreadPoolExecutor(...):` inside `event_generator` with a long-lived executor plus explicit **non-blocking** shutdown:
  - Create executor outside the context manager.
  - On client disconnect, cancel pending futures instead of awaiting shutdown.
  - In `finally`, call `executor.shutdown(wait=False, cancel_fut
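
The fix pattern described above can be sketched without FastAPI as a plain async generator. This is a minimal illustration under stated assumptions, not the project's actual `stream.py`: `fetch_feed`, `FEEDS`, `STARTED`, `demo`, and the `is_disconnected` callable are hypothetical stand-ins, and passing `cancel_futures=True` (a standard-library flag available from Python 3.9) is an assumption about what the route's `finally` block should do.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical feeds: (name, seconds the blocking fetch takes).
FEEDS = [("fast", 0.05), ("slow", 0.5), ("never", 0.5)]
STARTED = []  # names of fetches that actually began running


def fetch_feed(name, delay):
    """Hypothetical stand-in for a blocking RSS fetch."""
    STARTED.append(name)
    time.sleep(delay)
    return f"{name}: ok"


async def event_generator(feeds, is_disconnected):
    """Yield one SSE-style event per finished feed."""
    loop = asyncio.get_running_loop()
    # Created without `with`, so leaving this generator never joins the workers.
    executor = ThreadPoolExecutor(max_workers=1)
    tasks = [loop.run_in_executor(executor, fetch_feed, n, d) for n, d in feeds]
    try:
        for next_done in asyncio.as_completed(tasks):
            result = await next_done
            yield f"data: {result}\n\n"
            if await is_disconnected():
                break  # client went away: stop streaming immediately
    finally:
        # Cancel whatever has not started; work already running cannot be cancelled.
        for task in tasks:
            task.cancel()
        # Return at once instead of blocking the event loop on worker threads.
        executor.shutdown(wait=False, cancel_futures=True)


async def demo():
    """Simulate a client that disconnects after the first event."""
    received = []

    async def is_disconnected():
        return len(received) >= 1

    start = time.monotonic()
    async for event in event_generator(FEEDS, is_disconnected):
        received.append(event)
    return received, time.monotonic() - start


if __name__ == "__main__":
    events, elapsed = asyncio.run(demo())
    print(events, f"{elapsed:.2f}s")
```

With the `with ThreadPoolExecutor(...)` form, leaving the block calls `shutdown(wait=True)`, so the generator would stall the event loop until every queued fetch had finished. Here the generator returns as soon as the disconnect is seen. The fetch that is already running still completes in its worker thread, while the queued one never starts.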