OVS/VXLAN Health Diagnostics¶
Open vSwitch (OVS) provides the overlay network, with VXLAN tunnels connecting the worker and edge nodes. This runbook covers diagnostics for OVS and VXLAN issues.
Quick Health Check¶
# One-liner: Check OVS and VXLAN status
sudo k3s kubectl -n kube-system get ds | grep -E "ds-net-setup|kube-multus-ds" && echo "DaemonSets OK" || echo "DaemonSets FAIL"; sudo ovs-vsctl show | grep -c "Bridge br-n" && echo "Bridges found"
Step-by-Step Diagnostics¶
1. Kubernetes DaemonSet Status¶
# Check OVS DaemonSets
sudo k3s kubectl -n kube-system get ds | grep -E "ds-net-setup|kube-multus-ds"
# Expected output:
# ds-net-setup-edge 2/2 2 2 2d
# ds-net-setup-worker 2/2 2 2 2d
# kube-multus-ds 2/2 2 2 2d
# Check DaemonSet details
sudo k3s kubectl -n kube-system describe ds ds-net-setup-worker
sudo k3s kubectl -n kube-system describe ds ds-net-setup-edge
# Check DaemonSet pods
sudo k3s kubectl -n kube-system get pods -l app=ds-net-setup-worker
sudo k3s kubectl -n kube-system get pods -l app=ds-net-setup-edge
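The readiness check above can also be scripted. The following is a minimal sketch, not part of the testbed: the helper name `unready_daemonsets` is hypothetical, and it assumes its input is one `NAME DESIRED READY` row per DaemonSet, as produced by `kubectl get ds` with `custom-columns` over the `.status.desiredNumberScheduled` and `.status.numberReady` fields.

```shell
# Hypothetical helper (not part of the testbed).
# Reads "NAME DESIRED READY" rows on stdin and prints every DaemonSet
# whose ready count is below its desired count.
# Exit status: 1 if any such row was found, 0 otherwise.
unready_daemonsets() {
  awk '($3 + 0) < ($2 + 0) { printf "%s: %s/%s ready\n", $1, $3, $2; bad = 1 }
       END { exit bad + 0 }'
}

# Example invocation on a node:
#   sudo k3s kubectl -n kube-system get ds --no-headers \
#     -o custom-columns=NAME:.metadata.name,DESIRED:.status.desiredNumberScheduled,READY:.status.numberReady \
#     | grep -E "ds-net-setup|kube-multus-ds" | unready_daemonsets
```

A non-zero exit status makes the helper usable in a larger health script.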
2. OVS DaemonSet Logs¶
# Check OVS DaemonSet logs
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=200
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-edge --tail=200
# Check OVS DaemonSet logs with timestamps
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=200 --timestamps
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-edge --tail=200 --timestamps
# Follow OVS DaemonSet logs in real-time
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker -f
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-edge -f
3. OVS Bridge Configuration¶
# Check OVS configuration
sudo ovs-vsctl show
# Expected output should show:
# - Bridge br-n1, br-n2, br-n3, br-n4, br-n6e, br-n6c
# - VXLAN interfaces (vxlan-n1, vxlan-n2, etc.)
# - Port mappings
# List all bridges
sudo ovs-vsctl list-br
# Expected: br-n1, br-n2, br-n3, br-n4, br-n6e, br-n6c
# Check specific bridge details (ovs-vsctl show does not accept a bridge argument)
sudo ovs-vsctl list bridge br-n3
sudo ovs-vsctl list bridge br-n4
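Comparing the bridge list against the expected set by eye is error-prone. This sketch shows one way to automate it, assuming the six bridge names listed above are expected on the node being checked. The helper name `missing_bridges` and the `EXPECTED_BRIDGES` variable are illustrative, not part of the testbed.

```shell
# Expected bridge names, taken from the list in this runbook.
EXPECTED_BRIDGES="br-n1 br-n2 br-n3 br-n4 br-n6e br-n6c"

# Hypothetical helper (not part of the testbed).
# Reads `ovs-vsctl list-br` output (one bridge per line) on stdin and
# prints each expected bridge that is absent. Exits 1 if any is missing.
missing_bridges() (
  present=$(cat)
  status=0
  for br in $EXPECTED_BRIDGES; do
    # -x matches the whole line, -F treats the name as a fixed string
    if ! printf '%s\n' "$present" | grep -qxF -- "$br"; then
      echo "missing: $br"
      status=1
    fi
  done
  exit "$status"
)

# Example invocation on a node:
#   sudo ovs-vsctl list-br | missing_bridges
```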
4. VXLAN Tunnel Status¶
# Check VXLAN interfaces
ip -d link show | grep -A2 vxlan-
# Expected output should show:
# - vxlan-n1, vxlan-n2, vxlan-n3, vxlan-n4, vxlan-n6
# - VXLAN configuration details
# Check VXLAN tunnel configuration
sudo ovs-vsctl list interface | grep -A10 vxlan
# Check VXLAN tunnel status
ip -d link show vxlan-n3
ip -d link show vxlan-n4
5. Bridge Port Configuration¶
# List ports for each bridge
sudo ovs-vsctl list-ports br-n1
sudo ovs-vsctl list-ports br-n2
sudo ovs-vsctl list-ports br-n3
sudo ovs-vsctl list-ports br-n4
sudo ovs-vsctl list-ports br-n6e
sudo ovs-vsctl list-ports br-n6c
# Check port details
sudo ovs-vsctl list port vxlan-n3
sudo ovs-vsctl list port vxlan-n4
# Check interface details
sudo ovs-vsctl list interface vxlan-n3
sudo ovs-vsctl list interface vxlan-n4
6. VXLAN Tunnel Connectivity¶
# Check VXLAN tunnel endpoints
sudo ovs-vsctl list interface vxlan-n3 | grep -E "remote_ip|local_ip"
# Check VXLAN tunnel status
sudo ovs-vsctl list interface vxlan-n3 | grep -E "status|admin_state"
# Check VXLAN tunnel statistics
sudo ovs-vsctl list interface vxlan-n3 | grep -E "statistics"
# Test VXLAN tunnel connectivity
ping -c 3 10.203.0.102 # UPF-edge N3 IP
ping -c 3 10.203.0.101 # UPF-cloud N3 IP
7. OVS Flow Table¶
# Check OVS flow table
sudo ovs-ofctl dump-flows br-n3
sudo ovs-ofctl dump-flows br-n4
# Show only flows that have matched traffic (dump-flows prints n_packets/n_bytes by default)
sudo ovs-ofctl dump-flows br-n3 | grep -v "n_packets=0,"
# Check per-port counters
sudo ovs-ofctl dump-ports br-n3
Common Issues & Solutions¶
1. OVS DaemonSet Not Running¶
Symptoms:
- kubectl -n kube-system get ds ds-net-setup-worker shows 0/2 Ready
- No OVS bridges created
Diagnosis:
# Check DaemonSet status
sudo k3s kubectl -n kube-system get ds ds-net-setup-worker
sudo k3s kubectl -n kube-system get ds ds-net-setup-edge
# Check pod logs
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=100
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-edge --tail=100
# Check node resources
sudo k3s kubectl describe node worker
sudo k3s kubectl describe node edge
Solution:
- Check node resources and taints
- Restart DaemonSet: sudo k3s kubectl -n kube-system rollout restart ds ds-net-setup-worker
- Check OVS installation on nodes
2. OVS Bridges Missing¶
Symptoms:
- sudo ovs-vsctl list-br shows missing bridges
- No br-n1, br-n2, br-n3, etc.
Diagnosis:
# Check OVS bridges
sudo ovs-vsctl list-br
# Check OVS DaemonSet logs
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=200
# Check OVS service status
sudo systemctl status openvswitch-switch
Solution:
- Restart OVS service: sudo systemctl restart openvswitch-switch
- Restart OVS DaemonSet
- Check OVS installation
3. VXLAN Tunnels Not Created¶
Symptoms:
- No VXLAN interfaces in ip -d link show
- No connectivity between worker↔edge
Diagnosis:
# Check VXLAN interfaces
ip -d link show | grep vxlan
# Check OVS interface list
sudo ovs-vsctl list interface | grep vxlan
# Check OVS DaemonSet logs
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=200
Solution:
- Check worker↔edge connectivity
- Verify VXLAN configuration
- Restart OVS DaemonSet
4. VXLAN Tunnel Connectivity Issues¶
Symptoms:
- VXLAN interfaces exist but no connectivity
- Ping fails between worker↔edge
Diagnosis:
# Check VXLAN tunnel endpoints
sudo ovs-vsctl list interface vxlan-n3 | grep -E "remote_ip|local_ip"
# Check VXLAN tunnel status
sudo ovs-vsctl list interface vxlan-n3 | grep -E "status|admin_state"
# Test connectivity
ping -c 3 10.203.0.102
ping -c 3 10.203.0.101
Solution:
- Verify VXLAN tunnel configuration
- Check firewall rules
- Restart VXLAN interfaces
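When overlay pings fail, it helps to rule out the underlay first by pinging the address the tunnel actually points at. The sketch below reads `options:remote_ip` from the OVS database and pings it. The function names are illustrative and not part of the testbed. It assumes `ovs-vsctl get` prints the address wrapped in double quotes, which is the usual output for string values.

```shell
# `ovs-vsctl get` typically prints string values in double quotes,
# e.g. "192.168.56.12"; strip them so the value can be passed to ping.
unquote() {
  printf '%s\n' "$1" | sed 's/^"//; s/"$//'
}

# Hypothetical helper (not part of the testbed).
# Pings the underlay address configured as remote_ip on a tunnel port.
# Argument: OVS interface name (for example vxlan-n3).
ping_tunnel_peer() {
  raw=$(sudo ovs-vsctl get interface "$1" options:remote_ip) || return 1
  peer=$(unquote "$raw")
  echo "underlay peer for $1: $peer"
  ping -c 3 "$peer"
}

# Example invocation on a node:
#   ping_tunnel_peer vxlan-n3
```

If the underlay peer answers but the overlay addresses do not, the problem is more likely in the tunnel configuration or in a firewall dropping VXLAN traffic (UDP port 4789 by default) than in basic node connectivity.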
Advanced Diagnostics¶
OVS Database Analysis¶
# Check OVS database
sudo ovs-vsctl list bridge
sudo ovs-vsctl list port
sudo ovs-vsctl list interface
# Check OVS database with details
sudo ovs-vsctl list bridge br-n3
sudo ovs-vsctl list port vxlan-n3
sudo ovs-vsctl list interface vxlan-n3
VXLAN Tunnel Statistics¶
# Check VXLAN tunnel statistics
sudo ovs-vsctl list interface vxlan-n3 | grep -A20 statistics
# Check VXLAN tunnel port counters
sudo ovs-ofctl dump-ports br-n3 vxlan-n3
# Check VXLAN tunnel errors
sudo ovs-vsctl list interface vxlan-n3 | grep -E "error|drop"
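Because the `statistics` column is a single map, the `grep` above prints the whole map whenever any key matches. As a sketch, the map can be split into one counter per line so that only non-zero error and drop counters are shown. The helper name `nonzero_error_counters` is illustrative. It assumes input in the `{key=value, key=value}` form printed by `ovs-vsctl get interface <name> statistics`.

```shell
# Hypothetical helper (not part of the testbed).
# Reads an OVS statistics map such as
#   {collisions=0, rx_bytes=1234, rx_errors=2, tx_dropped=5}
# on stdin and prints only the error/drop counters that are non-zero.
nonzero_error_counters() {
  tr -d '{} ' | tr ',' '\n' |
    awk -F= '$1 ~ /err|drop/ && ($2 + 0) > 0 { print $1 "=" $2 }'
}

# Example invocation on a node:
#   sudo ovs-vsctl get interface vxlan-n3 statistics | nonzero_error_counters
```

Empty output means no error or drop counter has incremented on that interface.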
Network Performance Testing¶
# Test VXLAN tunnel performance
iperf3 -c 10.203.0.102 -t 10
# Test VXLAN tunnel with a large payload (1472 bytes of ICMP data makes a 1500-byte IP packet)
ping -c 10 -s 1472 10.203.0.102
# Test VXLAN tunnel with UDP
iperf3 -u -c 10.203.0.102 -b 100M -t 10
OVS Flow Table Analysis¶
# Check OVS flow table
sudo ovs-ofctl dump-flows br-n3
# Show only flows that have matched traffic (dump-flows prints n_packets/n_bytes by default)
sudo ovs-ofctl dump-flows br-n3 | grep -v "n_packets=0,"
# Check per-port counters
sudo ovs-ofctl dump-ports br-n3
# Check OVS flow table for specific flows
sudo ovs-ofctl dump-flows br-n3 | grep vxlan
Troubleshooting Commands¶
Quick Fixes¶
# Restart OVS DaemonSet
sudo k3s kubectl -n kube-system rollout restart ds ds-net-setup-worker
sudo k3s kubectl -n kube-system rollout restart ds ds-net-setup-edge
# Restart OVS service
sudo systemctl restart openvswitch-switch
# Recreate VXLAN port (check that key and remote_ip match your deployment before running)
sudo ovs-vsctl del-port br-n3 vxlan-n3
sudo ovs-vsctl add-port br-n3 vxlan-n3 -- set interface vxlan-n3 type=vxlan options:key=3 options:remote_ip=192.168.56.12
Configuration Validation¶
# Validate OVS configuration
sudo ovs-vsctl show
# Validate VXLAN tunnel configuration
sudo ovs-vsctl list interface vxlan-n3 | grep -E "remote_ip|local_ip|key"
# Validate bridge configuration
sudo ovs-vsctl list bridge br-n3
sudo ovs-vsctl list port vxlan-n3
Log Analysis¶
# OVS DaemonSet logs with timestamps
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-worker --tail=200 --timestamps
sudo k3s kubectl -n kube-system logs -l app=ds-net-setup-edge --tail=200 --timestamps
# OVS service logs (journalctl uses -n, not --tail)
sudo journalctl -u openvswitch-switch -n 100
# OVS daemon logs
sudo journalctl -u ovs-vswitchd -n 100
sudo journalctl -u ovsdb-server -n 100
Health Monitoring¶
# Monitor OVS bridges
watch -n 1 'sudo ovs-vsctl list-br'
# Monitor VXLAN interfaces
watch -n 1 'ip -d link show | grep vxlan'
# Monitor OVS flow table
watch -n 1 'sudo ovs-ofctl dump-flows br-n3'
OVS Garbage Collection¶
The testbed includes automated OVS garbage collection to clean up stale ports:
# Check OVS GC DaemonSet
sudo k3s kubectl -n kube-system get ds ds-ovs-gc
# Check OVS GC logs
sudo k3s kubectl -n kube-system logs -l app=ds-ovs-gc --tail=100
# Check OVS GC configuration
sudo k3s kubectl -n kube-system get configmap ovs-scripts -o yaml
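To see which ports a cleanup pass might consider stale, one option is to look at the `error` column OVS records for interfaces it could not open. This is only an illustrative sketch. It is not the testbed's GC logic, which lives in the `ovs-scripts` ConfigMap, and the helper name `ports_with_errors` is hypothetical. It assumes CSV rows of `name,error` as printed by `ovs-vsctl` with `--format=csv --data=bare --no-headings --columns=name,error`.

```shell
# Hypothetical helper (not the testbed's GC script).
# Reads "name,error" CSV rows on stdin and prints the name of every
# interface whose error column is non-empty.
ports_with_errors() {
  awk -F, 'NF >= 2 {
    err = $2
    gsub(/"/, "", err)          # ignore CSV quoting around the message
    if (err != "") print $1
  }'
}

# Example invocation on a node:
#   sudo ovs-vsctl --format=csv --data=bare --no-headings \
#     --columns=name,error list interface | ports_with_errors
```

The output is a candidate list only. Confirm that each listed port is really unused before removing it.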
Performance Tuning¶
VXLAN Tunnel Optimization¶
# Check VXLAN tunnel MTU
ip link show vxlan-n3 | grep mtu
# Set VXLAN tunnel MTU
sudo ip link set vxlan-n3 mtu 1450
# Check VXLAN tunnel offload
sudo ethtool -k vxlan-n3 | grep -E "tx-checksumming|tx-checksum-ipv4"
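The 1450 value above follows from VXLAN encapsulation overhead on a 1500-byte underlay. The arithmetic is sketched below, assuming an IPv4 underlay with no extra encapsulation such as VLAN tags or IPsec. The function names are illustrative.

```shell
# VXLAN over IPv4 adds 50 bytes per packet:
# outer IPv4 (20) + UDP (8) + VXLAN header (8) + inner Ethernet (14).
VXLAN_OVERHEAD=50
# An ICMP echo over IPv4 adds 28 bytes: IPv4 (20) + ICMP header (8).
ICMP_OVERHEAD=28

# Largest overlay MTU that fits in the given underlay MTU.
overlay_mtu() { echo $(( $1 - VXLAN_OVERHEAD )); }

# Largest `ping -s` payload that fits in the given MTU without fragmentation.
max_ping_payload() { echo $(( $1 - ICMP_OVERHEAD )); }

# Example: probe the overlay path with fragmentation prohibited
#   ping -c 3 -M do -s "$(max_ping_payload "$(overlay_mtu 1500)")" 10.203.0.102
```

With a 1500-byte underlay, this gives an overlay MTU of 1450 and a maximum unfragmented ping payload of 1422 bytes. A `ping -M do` that fails at that size but succeeds at smaller sizes suggests the path MTU is lower than assumed.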
OVS Flow Table Optimization¶
# Check OVS flow table size
sudo ovs-ofctl dump-flows br-n3 | wc -l
# Clear OVS flow table (removes every flow on the bridge; traffic is dropped until flows are re-added)
sudo ovs-ofctl del-flows br-n3
# Add specific flows
sudo ovs-ofctl add-flow br-n3 "priority=100,actions=normal"