debug-stuck-eval
Install: `claude install-skill METR/hawk`
## Quick Checklist
1. **Verify auth**: `hawk auth access-token > /dev/null || echo "Run 'hawk login' first"`
2. **Get eval-set-id** from user
3. **Check status**: `hawk status <eval-set-id>` - JSON report with pod state, logs, metrics
4. **View logs**: `hawk logs <eval-set-id>` or `hawk logs -f` for follow mode
5. **List samples**: `hawk list samples <eval-set-id>` - see completion status
6. **Look for error patterns** (see below)
7. **Test API directly** if logs show retries without clear errors
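
Steps 1 through 5 can be chained into one helper, sketched below. It uses only the `hawk` subcommands listed above. The function name `triage_eval_set` and the `HAWK_BIN` override are hypothetical, added for illustration (the override lets the helper run against a stub). Treat it as a starting point, not as part of the hawk CLI.

```bash
#!/usr/bin/env bash
# Sketch: run the first checklist steps in order for one eval set.
# HAWK_BIN is a hypothetical override (not a hawk feature); it defaults to `hawk`.
HAWK_BIN="${HAWK_BIN:-hawk}"

triage_eval_set() {
  local eval_set_id="$1"

  # Step 2: the eval-set-id must come from the user.
  if [ -z "$eval_set_id" ]; then
    echo "usage: triage_eval_set <eval-set-id>" >&2
    return 2
  fi

  # Step 1: verify auth before doing anything else.
  if ! "$HAWK_BIN" auth access-token > /dev/null 2>&1; then
    echo "Run 'hawk login' first" >&2
    return 1
  fi

  # Step 3: status report (pod state, logs, metrics).
  echo "== status =="
  "$HAWK_BIN" status "$eval_set_id"

  # Step 4: logs (one-shot; follow mode is better run interactively).
  echo "== logs =="
  "$HAWK_BIN" logs "$eval_set_id"

  # Step 5: per-sample completion status.
  echo "== samples =="
  "$HAWK_BIN" list samples "$eval_set_id"
}
```

Call it as `triage_eval_set <eval-set-id>`. Steps 6 and 7 stay manual, since they depend on what the logs show.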
## Error Patterns
| Log Pattern | Meaning | Resolution |
|-------------|---------|------------|
| `[uuid task/id/epoch model] Retrying request to /responses` | OpenAI SDK retry with sample context | Test API directly with curl to see real error |
| `[uuid task/id/epoch model] -> model retry N ... [ErrorType code]` | Inspect retry with error summary | Check error type; use curl for full details |
| `500 - Internal server error` | API issue | Download buffer, find failing request, test through middleman AND directly to provider |
| `400 - invalid_request_error` | Token/context limit exceeded | Check message count and model context window |
| `Pod UID mismatch` | Sandbox pod was killed and restarted | No fix needed; the sample errored out and Inspect will retry it |
| Empty output, `pending: true` | API returned malformed response | Restart the eval (it resumes from the buffer) |
| OOMKilled in pod status | Memory exhaustion | Increase pod memory limits |
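
When scanning a long log, the table can be turned into a rough line classifier, sketched below. The substrings come from the table; the function name `classify_log_line` and the short labels it prints are hypothetical. It does not cover the `pending: true` row, because that symptom appears in sample output and not in a single log line.

```bash
#!/usr/bin/env bash
# Sketch: map one log line to a table row, using substrings from the table.
# Labels such as "api-500" are invented here for illustration.
classify_log_line() {
  case "$1" in
    *"Retrying request to /responses"*)
      echo "sdk-retry: test the API directly with curl to see the real error" ;;
    *"-> model retry"*)
      echo "inspect-retry: check the error type; use curl for full details" ;;
    *"500 - Internal server error"*)
      echo "api-500: download the buffer, find the failing request, test via middleman and the provider" ;;
    *"400 - invalid_request_error"*)
      echo "api-400: check message count and model context window" ;;
    *"Pod UID mismatch"*)
      echo "pod-restart: no fix needed, Inspect will retry the sample" ;;
    *"OOMKilled"*)
      echo "oom: increase pod memory limits" ;;
    *)
      echo "unmatched" ;;
  esac
}
```

One way to use it is to pipe logs through the function and tally the labels, for example `hawk logs <eval-set-id> | while IFS= read -r line; do classify_log_line "$line"; done | sort | uniq -c`.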
## Key Techniques
1. **Retry messages have sample con