pod-troubleshooting

Systematic diagnosis of Kubernetes pod failures — CrashLoopBackOff, OOMKilled, Pending, ImagePullBackOff, and service connectivity issues. Use when the user encounters pods not starting, container restart loops, scheduling failures, or service unreachability in a K8s cluster.

AI & Automation 14 stars 3 forks Updated 3 days ago MIT

Skill Content

# Skill: Pod Troubleshooting

> **Expertise:** Systematic K8s failure diagnosis, from symptom to root cause in under 10 commands.

## When to load

When a pod is not Running, a service is unreachable, or a deployment is stuck.

## Diagnostic Decision Tree

```
Pod not Running?
├── Status: Pending
│   ├── No nodes match → check node selectors, taints, resource requests
│   └── PVC not bound → check StorageClass, PV availability
├── Status: CrashLoopBackOff
│   ├── Exit code 0 → process exited cleanly but K8s restarts it → check command
│   ├── Exit code 1 → app error → check logs
│   ├── Exit code 137 → OOMKilled → increase memory limit
│   └── Exit code 143 → SIGTERM not handled → fix graceful shutdown
├── Status: ImagePullBackOff
│   ├── Image doesn't exist → check tag/digest
│   └── Registry auth fails → check imagePullSecret
└── Status: Error / Init:Error
    └── Init container failed → check init container logs
```

## Command Cheatsheet

```bash
# 1. Overview: what's wrong
kubectl get pods -n <ns> -o wide
kubectl describe pod <pod> -n <ns>          # events section is the first place to look

# 2. Logs
kubectl logs <pod> -n <ns>                  # current container
kubectl logs <pod> -n <ns> --previous       # last crashed container (CrashLoop)
kubectl logs <pod> -n <ns> -c <container>   # specific container in multi-container pod

# 3. Exec into running pod
kubectl exec -it <pod> -n <ns> -- /bin/sh

# 4. Resource pressure check
kubectl top nodes
kubectl top pods -n <n...
```
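The CrashLoopBackOff branch of the decision tree can be sketched as a small shell helper. This is only an illustration: `explain_exit_code` is a hypothetical name that does not appear in the skill, and the hints are copied from the tree rather than derived from cluster state. The commented usage assumes a single-container pod whose previous container has already terminated.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original skill): maps a container
# exit code to the hint listed in the decision tree.
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly but restarted: check command" ;;
    1)   echo "app error: check logs" ;;
    137) echo "OOMKilled: increase memory limit" ;;
    143) echo "SIGTERM not handled: fix graceful shutdown" ;;
    *)   echo "not covered by the decision tree: check logs and events" ;;
  esac
}

# Usage sketch (assumes kubectl access; <pod> and <ns> are placeholders):
# code=$(kubectl get pod <pod> -n <ns> \
#   -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
# explain_exit_code "$code"
```

Codes outside the tree fall through to a generic hint, so the helper never guesses at a cause the tree does not name.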

Details

Author: sawrus
Repository: sawrus/agent-guides
Created: 3 months ago
Last Updated: 3 days ago
Language: Shell
License: MIT
