debug-cuda-crash

Tutorial for debugging CUDA crashes using API logging
aiskillstore/marketplace · ★ 329 · Code & Development · score 79

Install: claude install-skill aiskillstore/marketplace

# Tutorial: Debugging CUDA Crashes with API Logging

This tutorial shows you how to debug CUDA crashes and errors in FlashInfer using the `@flashinfer_api` logging decorator.

## Goal

When your code crashes with CUDA errors (illegal memory access, out-of-bounds, NaN/Inf), use API logging to:

- Capture input tensors BEFORE the crash occurs
- Understand what data caused the problem
- Track tensor shapes, dtypes, and values through your pipeline
- Detect numerical issues (NaN, Inf, wrong shapes)

## Why Use API Logging?

**Problem**: CUDA errors often crash the program, leaving no debugging information.

**Solution**: FlashInfer's `@flashinfer_api` decorator logs inputs BEFORE execution, so you can see what caused the crash even after the program terminates.

## Step 1: Enable API Logging

### Basic Logging (Function Names Only)

```bash
export FLASHINFER_LOGLEVEL=1      # Log function names
export FLASHINFER_LOGDEST=stdout  # Log to console
python my_script.py
```

Output:

```
[2025-12-18 10:30:45] FlashInfer API Call: batch_decode_with_padded_kv_cache
```

### Detailed Logging (Inputs/Outputs with Metadata)

```bash
export FLASHINFER_LOGLEVEL=3         # Log inputs/outputs with metadata
export FLASHINFER_LOGDEST=debug.log  # Save to file
python my_script.py
```

Output in `debug.log`:

```
================================================================================
[2025-12-18 10:30:45] FlashInfer API Logging - System Information
=======================================