Runbooks Index¶
This directory contains focused diagnostic and troubleshooting procedures for the KELT testbed.
Quick Navigation¶
5G Protocol Diagnostics¶
| Runbook | Protocol | Interface | Purpose |
|---|---|---|---|
pfcp-diagnostics.md |
PFCP | N4 | SMF ↔ UPF control plane |
ngap-diagnostics.md |
NGAP | N2 | gNB ↔ AMF signaling |
gtpu-path.md |
GTP-U | N3 | gNB/UE ↔ UPF data plane |
Infrastructure Diagnostics¶
| Runbook | Component | Purpose |
|---|---|---|
multus-nad-ipam.md |
Multus/Whereabouts | Multiple network interfaces per pod |
ovs-vxlan-health.md |
OVS/VXLAN | Overlay network infrastructure |
kubectl: All commands run from master (
vagrant ssh master) usingsudo k3s kubectl.
Usage¶
Each runbook follows a consistent structure:
- Quick Health Check - One-liner to verify status
- Step-by-Step Diagnostics - Detailed troubleshooting procedure
- Common Issues & Solutions - Frequent problems and fixes
- Advanced Diagnostics - Deep troubleshooting and performance testing
Quick Commands¶
Check All 5G Interfaces¶
# PFCP (N4) - SMF ↔ UPF
sudo k3s kubectl -n 5g exec deploy/smf -- bash -lc 'ss -unap | grep 8805 && echo "SMF listening" || echo "SMF not listening"'
# NGAP (N2) - gNB ↔ AMF
sudo k3s kubectl -n 5g exec deploy/amf -- bash -lc 'ss -S -na | grep 38412 && echo "AMF SCTP listening" || echo "AMF SCTP not listening"'
# GTP-U (N3) - gNB/UE ↔ UPF
sudo k3s kubectl -n 5g exec deploy/upf-cloud -- bash -lc 'ss -unap | grep 2152 && echo "UPF GTP-U listening" || echo "UPF GTP-U not listening"'
Check Infrastructure¶
# Multus and NADs
sudo k3s kubectl -n kube-system get ds kube-multus-ds && echo "Multus OK" || echo "Multus FAIL"
# OVS and VXLAN
sudo k3s kubectl -n kube-system get ds | grep -E "ds-net-setup|kube-multus-ds" && echo "DaemonSets OK" || echo "DaemonSets FAIL"
Troubleshooting Flow¶
- Start with Quick Health Check - Identify which component is failing
- Follow Step-by-Step Diagnostics - Systematic approach to isolate the issue
- Check Common Issues & Solutions - Look for known problems and fixes
- Use Advanced Diagnostics - Deep dive if needed
Related Documentation¶
- Handbook:
../operations/handbook.md- operator cheat-sheet (IPs, ports, commands) - Documentation index:
../README.md- links to every doc, grouped by area - Troubleshooting:
../operations/troubleshooting.md- start here when something is wrong
Contributing¶
When adding new runbooks:
- Follow the established structure
- Include Quick Health Check one-liner
- Provide step-by-step diagnostics
- Document common issues and solutions
- Add advanced diagnostics for complex scenarios
- Update this index
Examples¶
Real-time Log Monitoring¶
# AMF logs with NGAP context
sudo k3s kubectl -n 5g logs deploy/amf -c amf -f | grep -i ngap
# SMF logs with PFCP context
sudo k3s kubectl -n 5g logs deploy/smf -c smf -f | grep -i pfcp
# UPF logs with GTP-U context
sudo k3s kubectl -n 5g logs deploy/upf-cloud -c upf-cloud -f | grep -i gtpu
Network Interface Verification¶
# Check all interfaces on AMF
sudo k3s kubectl -n 5g exec deploy/amf -- ip addr show
# Check specific interface
sudo k3s kubectl -n 5g exec deploy/amf -- ip -o -4 addr show dev n2
# Check network status annotation
sudo k3s kubectl -n 5g get pod -l app=amf -o json | jq -r '.items[0].metadata.annotations["k8s.v1.cni.cncf.io/network-status"]' | jq '.'
Performance Testing¶
# Test VXLAN tunnel performance
sudo k3s kubectl -n 5g exec deploy/gnb -- iperf3 -c 10.203.0.102 -t 30
# Test PFCP message exchange
sudo k3s kubectl -n 5g exec deploy/smf -- bash -c 'for i in {1..10}; do nc -zuvw1 10.204.0.101 8805 && echo "Success $i" || echo "Failed $i"; sleep 1; done'