9.12 Troubleshooting and Debugging
Diagnosing and Fixing Kubernetes Issues
Effective troubleshooting is systematic: start with status and events, read the logs, and then narrow the problem down to the pod, network, resource, application, or cluster layer covered below.
Pod Troubleshooting
Common Pod Issues
# Check pod status
kubectl get pods
kubectl describe pod problematic-pod
kubectl logs problematic-pod
# Logs from the previous container instance (useful after a crash or restart)
kubectl logs problematic-pod --previous
# Multiple containers
kubectl logs pod-name -c container-name
kubectl exec -it pod-name -c container-name -- sh
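When a namespace holds many pods, it can help to narrow the listing to the ones that need attention. The following is a minimal sketch rather than part of the original command set: unhealthy_pods is a hypothetical helper name, and the filter assumes the default single-namespace column layout of kubectl get pods.

```shell
# Hypothetical helper: list pods whose STATUS is neither Running nor Completed.
# Assumes the default column layout of `kubectl get pods` for one namespace
# (NAME READY STATUS RESTARTS AGE); with --all-namespaces the columns shift.
unhealthy_pods() {
  kubectl get pods --no-headers "$@" |
    awk '$3 != "Running" && $3 != "Completed" { print $1 "\t" $3 }'
}

# Example: unhealthy_pods -n my-namespace
```

A pod that is Running but not ready (READY 0/1) will not show up in this listing, so kubectl describe pod remains the next step for those.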
Pod Debugging
# Debug pod with tools
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod
spec:
  containers:
  - name: debug
    image: nicolaka/netshoot
    command: ["sleep", "3600"]
# Debug a running pod by attaching an ephemeral debug container
kubectl debug pod-name -it --image=busybox
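The debug pod manifest above can be wrapped in a small create, wait, exec, clean-up routine. This is a sketch under stated assumptions: debug_shell is a hypothetical helper name, the manifest file path is supplied by the caller, and the manifest is assumed to define a pod named debug-pod as in the example.

```shell
# Hypothetical helper: create the debug pod from a manifest file, wait until
# it is Ready, open a shell in it, and delete the pod afterwards.
# Assumes the manifest defines a pod named debug-pod.
debug_shell() {
  manifest="$1"
  kubectl apply -f "$manifest" || return 1
  if kubectl wait --for=condition=Ready pod/debug-pod --timeout=60s; then
    kubectl exec -it debug-pod -- sh
  fi
  # Clean up whether or not the pod became Ready.
  kubectl delete pod debug-pod --wait=false
}

# Example: debug_shell debug-pod.yaml
```

Deleting the pod at the end avoids leaving a pod that sleeps for an hour in the namespace.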
Service and Networking
Network Connectivity Issues
# Start a throwaway pod with a shell to test service connectivity
kubectl run test-pod --image=busybox --rm -it -- sh
# Then, inside the test pod's shell:
nslookup service-name
wget -qO- service-name:port
# Check endpoints
kubectl get endpoints service-name
kubectl describe service service-name
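An empty endpoints list usually means the Service selector matches no ready pods. The check can be scripted roughly as follows; service_has_endpoints is a hypothetical helper name, and the sketch reads only the ready addresses from the Endpoints object.

```shell
# Hypothetical helper: report whether a Service has any ready endpoint IPs.
service_has_endpoints() {
  ips="$(kubectl get endpoints "$1" -o jsonpath='{.subsets[*].addresses[*].ip}')"
  if [ -n "$ips" ]; then
    echo "ready endpoints: $ips"
  else
    echo "no ready endpoints for $1: compare the Service selector with the pod labels"
    return 1
  fi
}

# Example: service_has_endpoints service-name
```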
Network Policy Debugging
# Check network policies
kubectl get networkpolicies
kubectl describe networkpolicy policy-name
# Test connectivity between pods (ICMP only; -c 3 makes ping stop after three packets)
kubectl exec pod1 -- ping -c 3 pod2-ip
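Network policies are written in terms of TCP, UDP, and SCTP ports, so a ping result does not always reflect what a policy allows. A port-level check can be sketched as below. check_tcp is a hypothetical helper name, and the sketch assumes the source pod's image provides an nc that supports the -z and -w options, which not every image does.

```shell
# Hypothetical helper: test a TCP port from inside a source pod.
# Assumes the source pod's image provides an nc that supports -z (connect
# without sending data) and -w (timeout in seconds); not every image does.
check_tcp() {
  src_pod="$1"; target_ip="$2"; port="$3"
  if kubectl exec "$src_pod" -- nc -z -w 3 "$target_ip" "$port"; then
    echo "$target_ip:$port reachable from $src_pod"
  else
    echo "$target_ip:$port NOT reachable from $src_pod"
    return 1
  fi
}

# Example: check_tcp pod1 pod2-ip 8080
```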
Resource Issues
Resource Constraints
# Check resource usage
kubectl top nodes
kubectl top pods --all-namespaces
kubectl describe node node-name
# Check events for resource issues
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl describe pod pending-pod
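To spot the pods that consume the most memory, the output of kubectl top can be filtered against a threshold. This is only a sketch: pods_over_memory is a hypothetical helper name, and it assumes the MEMORY column is printed in Mi and that the listing covers a single namespace (NAME, CPU, MEMORY).

```shell
# Hypothetical helper: list pods using more than a given amount of memory (in Mi).
# Assumes `kubectl top pods` prints NAME CPU MEMORY with memory in Mi;
# with --all-namespaces the columns shift.
pods_over_memory() {
  limit_mi="$1"; shift
  kubectl top pods --no-headers "$@" |
    awk -v limit="$limit_mi" '{
      mem = $3
      sub(/Mi$/, "", mem)          # strip the unit suffix
      if (mem + 0 > limit + 0) print $1, $3
    }'
}

# Example: pods_over_memory 500 -n my-namespace
```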
Storage Issues
# Check PVC status
kubectl get pvc
kubectl describe pvc my-pvc
kubectl get pv
# Check storage class
kubectl get storageclass
kubectl describe storageclass standard
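A claim that stays Pending often points at its storage class, so it can be useful to list unbound claims together with the class they request. The sketch below uses a hypothetical helper name, unbound_pvcs, and reads the fields through jsonpath so that empty columns do not shift the output.

```shell
# Hypothetical helper: list PVCs that are not Bound, with their storage class.
# Output columns are tab-separated: name, phase, storageClassName.
unbound_pvcs() {
  kubectl get pvc "$@" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.spec.storageClassName}{"\n"}{end}' |
    awk -F'\t' '$2 != "Bound"'
}

# Example: unbound_pvcs -n my-namespace
```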
Application Issues
Health Check Failures
# Check probe configurations
kubectl describe pod app-pod
kubectl logs app-pod
# Test health endpoints manually (port-forward blocks, so run curl in a second terminal)
kubectl port-forward pod/app-pod 8080:8080
curl localhost:8080/health
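Because kubectl port-forward keeps running in the foreground, the manual check can also be scripted by running it in the background and stopping it afterwards. This is a rough sketch: check_health is a hypothetical helper name, the port and the /health path follow the example above and are assumptions about the application, and the fixed two-second wait is a simplification.

```shell
# Hypothetical helper: port-forward in the background, probe the endpoint,
# then stop the port-forward. Port 8080 and the /health path are assumptions
# about the application.
check_health() {
  pod="$1"
  kubectl port-forward "pod/$pod" 8080:8080 >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # crude wait for the tunnel to come up
  if curl -fsS --max-time 5 localhost:8080/health; then
    status=0
  else
    status=1
  fi
  kill "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Example: check_health app-pod
```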
Configuration Problems
# Check ConfigMaps and Secrets
kubectl get configmaps
kubectl describe configmap app-config
kubectl get secrets
# Verify environment variables
kubectl exec pod-name -- env
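kubectl get secrets only lists the Secret objects, and the values inside them are stored base64-encoded. To compare a stored value with what the application expects, a single key can be decoded as sketched below. secret_value is a hypothetical helper name, and app-secret and password are placeholder names, not taken from the original.

```shell
# Hypothetical helper: print the decoded value of one key in a Secret.
# Keys that contain dots need escaping in the jsonpath expression.
secret_value() {
  kubectl get secret "$1" -o jsonpath="{.data.$2}" | base64 --decode
}

# Example (app-secret and password are placeholder names):
# secret_value app-secret password
```

The decoded value is printed to the terminal, so this is best kept to debugging sessions on non-shared screens.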
Cluster-Level Issues
Node Problems
# Check node status
kubectl get nodes
kubectl describe node node-name
kubectl get events --field-selector involvedObject.kind=Node
# Check node resources
kubectl top node node-name
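The Conditions section of kubectl describe node can be reduced to just the conditions that indicate trouble. In the sketch below, node_problems is a hypothetical helper name; it treats Ready as healthy only when True and every other condition (such as MemoryPressure or DiskPressure) as healthy only when False.

```shell
# Hypothetical helper: print node conditions that indicate a problem.
# Ready should be True; all other conditions should be False.
node_problems() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' |
    awk -F= '($1 == "Ready" && $2 != "True") ||
             ($1 != "Ready" && $2 != "False") { print $1 " is " $2 }'
}

# Example: node_problems node-name
```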
API Server Issues
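# Check cluster components
# (componentstatuses is deprecated since Kubernetes v1.19; the readyz endpoint is the newer check)
kubectl get componentstatuses
kubectl get --raw='/readyz?verbose'
kubectl cluster-info
kubectl get events --all-namespaces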
Essential Debugging Commands
# General debugging
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl describe <resource-type> <resource-name>
kubectl logs -f deployment/app-name
# Resource inspection
kubectl get <resource> -o yaml
kubectl get <resource> -o wide
kubectl top nodes
kubectl top pods
# Interactive debugging (use sh if the image does not include bash)
kubectl exec -it pod-name -- /bin/bash
kubectl port-forward pod/pod-name 8080:8080
kubectl debug node/node-name -it --image=busybox
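The commands above can be combined into a snapshot that is saved before a failing pod is restarted or deleted. This is a minimal sketch, not a definitive tool: collect_pod_diagnostics and the output file names are hypothetical, and each step is allowed to fail so that one missing piece (for example, no previous container) does not stop the rest.

```shell
# Hypothetical helper: save describe output, logs, and events for one pod.
collect_pod_diagnostics() {
  pod="$1"
  out_dir="${2:-./diag-$pod}"
  mkdir -p "$out_dir" || return 1
  kubectl describe pod "$pod" > "$out_dir/describe.txt" 2>&1 || true
  kubectl logs "$pod" --all-containers > "$out_dir/logs.txt" 2>&1 || true
  # Fails harmlessly when no container has restarted yet.
  kubectl logs "$pod" --all-containers --previous > "$out_dir/logs-previous.txt" 2>&1 || true
  kubectl get events --field-selector "involvedObject.name=$pod" \
    --sort-by=.metadata.creationTimestamp > "$out_dir/events.txt" 2>&1 || true
  echo "wrote diagnostics to $out_dir"
}

# Example: collect_pod_diagnostics problematic-pod
```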
What’s Next?
Finally, we’ll explore Kubernetes Operators for extending Kubernetes functionality.