Network Troubleshooting Guide: Latency, Packet Loss & Bandwidth Issues
High latency, packet loss, and bandwidth saturation degrade application responsiveness and user experience. This guide shows how to measure, diagnose, and resolve each one using standard Linux network tools and systematic analysis.
What You'll Learn
Why It Matters
Slow networks cost money. By some industry estimates, a 100ms increase in latency can reduce conversion rates by 7%. Knowing how to isolate whether the problem is your ISP, your router, your DNS, or your application code is essential for any Linux administrator.
Real-World Use
When your video conferencing app stutters, database queries take 10x longer than usual, or your CI/CD pipeline times out pulling dependencies, these techniques find the bottleneck and fix it.
Common Network Performance Issues Table
| Issue | Symptom | Cause | Diagnostic Tool |
|---|---|---|---|
| High latency | Slow page loads, laggy SSH | Long network path, slow DNS, bufferbloat | mtr, ping |
| Packet loss | Retransmissions, TCP timeouts | Faulty cable, bad NIC, congestion | ping -f, iperf3 |
| Bandwidth saturation | Slow transfers, high interface utilization | Application consuming all capacity | nload, iftop |
| Interface flapping | "Link up/down" in logs | Faulty cable, duplex mismatch, driver bug | dmesg, ethtool |
| DNS slow resolution | Delayed first-byte on all connections | Slow upstream DNS, too many lookups | dig, resolvectl |
| TCP window scaling issue | Slow throughput on high-latency paths despite spare bandwidth | Window scaling disabled or mismatched TCP buffer settings | ss -i, tcpdump |
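The window scaling row comes down to simple arithmetic: a TCP sender can have at most one window of data in flight per round trip, so throughput is capped at window size divided by RTT. The sketch below works that out for an assumed 100 Mbit/s link with 80 ms RTT; the function names `bdp_bytes` and `max_mbit_unscaled` are illustrative, not part of any tool.

```bash
#!/bin/bash
# Bandwidth-delay product: bytes that must be in flight to fill a link.
# usage: bdp_bytes <link_mbit> <rtt_ms>
bdp_bytes() {
  awk -v mbit="$1" -v rtt="$2" 'BEGIN { printf "%d\n", mbit * 1000000 / 8 * rtt / 1000 }'
}

# Throughput ceiling (Mbit/s) with an unscaled 65535-byte TCP window.
# usage: max_mbit_unscaled <rtt_ms>
max_mbit_unscaled() {
  awk -v rtt="$1" 'BEGIN { printf "%.2f\n", 65535 * 8 / (rtt / 1000) / 1000000 }'
}

bdp_bytes 100 80          # assumed link: 100 Mbit/s, 80 ms RTT -> 1000000
max_mbit_unscaled 80      # -> 6.55
```

In this example the link needs about 1 MB in flight, but an unscaled window tops out near 6.55 Mbit/s. On Linux, `sysctl net.ipv4.tcp_window_scaling` shows whether scaling is enabled, and `ss -ti` reports the negotiated `wscale` per connection.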
Step-by-Step Fixes
### Fix 1: Diagnose High Latency with MTR
```bash
# Run MTR to see the route and latency per hop
mtr --report --report-cycles 10 google.com
# Continuous real-time MTR
mtr -t google.com
# Show only hops whose average latency (column 6) exceeds 100 ms
mtr --report --report-cycles 20 8.8.8.8 | awk 'NR > 2 && $6 > 100'
```
Expected output:
```
Start: 2026-06-23T10:00:00+0000
HOST: host.example.com Loss% Snt Last Avg Best Wrst StDev
1. 192.168.1.1 0.0% 10 1.2 1.5 0.8 3.2 0.7
2. 10.0.0.1 0.0% 10 5.1 5.8 4.2 8.1 1.2
3. 203.0.113.1 0.0% 10 12.3 13.1 11.2 18.4 2.3
4. 72.14.237.1 10.0% 10 145.2 156.3 134.1 189.2 18.9
```
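When reading a report like this, the useful signal is usually the hop where latency first jumps and stays high. A minimal sketch of automating that check follows; `find_latency_jump` is a hypothetical helper name, and the 50 ms default is an assumption to tune against your own baseline.

```bash
#!/bin/bash
# find_latency_jump: read `mtr --report` output on stdin and print the first
# hop whose average latency exceeds the previous hop's by more than N ms.
# usage: mtr --report ... | find_latency_jump [N]   (N defaults to 50)
find_latency_jump() {
  local jump="${1:-50}"
  awk -v jump="$jump" '
    $1 ~ /^[0-9]+\./ {                 # hop rows start with "1.", "2.", ...
      if ($3 + 0 >= 100) next          # skip hops that never answered
      avg = $6 + 0                     # column 6 is Avg
      if (seen && avg - prev > jump) {
        hop = $1; sub(/\..*/, "", hop)
        printf "hop %s (%s): avg %.1f ms, +%.1f ms over previous hop\n", hop, $2, avg, avg - prev
        exit
      }
      prev = avg; seen = 1
    }'
}

# Demo on the sample report from this section
find_latency_jump 50 <<'EOF'
HOST: host.example.com Loss% Snt Last Avg Best Wrst StDev
1. 192.168.1.1 0.0% 10 1.2 1.5 0.8 3.2 0.7
2. 10.0.0.1 0.0% 10 5.1 5.8 4.2 8.1 1.2
3. 203.0.113.1 0.0% 10 12.3 13.1 11.2 18.4 2.3
4. 72.14.237.1 10.0% 10 145.2 156.3 134.1 189.2 18.9
EOF
```

On the sample data this prints `hop 4 (72.14.237.1): avg 156.3 ms, +143.2 ms over previous hop`. Keep in mind that loss or latency shown at an intermediate hop that does not carry through to the final hop often reflects ICMP rate limiting on that router rather than a real problem.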
### Fix 2: Detect Packet Loss
```bash
# Flood ping to a destination (requires root)
sudo ping -f -c 1000 8.8.8.8
# Check interface statistics for errors
ip -s link show eth0
# Run iperf3 for 30 seconds to measure retransmits
iperf3 -c 10.0.0.2 -t 30
# Check for TCP retransmissions (-B1 also prints the connection line above each match)
ss -ti | grep -B1 retrans
```
Expected output:
```
--- 8.8.8.8 ping statistics ---
1000 packets transmitted, 987 received, 1.3% packet loss, time 1042ms
rtt min/avg/max/mdev = 8.142/8.456/12.345/0.678 ms
```
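To use the loss figure in a script rather than reading it by eye, the summary line can be parsed. The sketch below assumes the Linux iputils summary format shown in this section; `ping_loss` and `loss_exceeds` are hypothetical helper names, and the 1% default threshold is an assumption.

```bash
#!/bin/bash
# ping_loss: read ping output on stdin and print the packet-loss percentage.
ping_loss() {
  awk -F', ' '/packet loss/ {
    for (i = 1; i <= NF; i++)
      if ($i ~ /packet loss/) { sub(/%.*/, "", $i); print $i }
  }'
}

# loss_exceeds: succeed (exit 0) when loss on stdin is above MAX percent.
# usage: ping -c 100 -q host | loss_exceeds [MAX]   (MAX defaults to 1)
loss_exceeds() {
  local max="${1:-1}" loss
  loss=$(ping_loss)
  [ -n "$loss" ] && awk -v l="$loss" -v m="$max" 'BEGIN { exit !(l > m) }'
}

# Demo on the summary line from this section
summary='1000 packets transmitted, 987 received, 1.3% packet loss, time 1042ms'
echo "$summary" | ping_loss
if echo "$summary" | loss_exceeds 1; then echo "loss above 1%"; fi
```

The demo prints `1.3` followed by `loss above 1%`. In practice you would pipe a real run into it, for example `ping -c 100 -q 10.0.0.2 | loss_exceeds 1`.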
### Fix 3: Find Bandwidth Hogs
```bash
# Monitor bandwidth per interface in real time
nload
# See traffic per connection
sudo iftop -i eth0
# Find top talkers by connection count (-H drops the header row)
ss -Htupn | awk '{print $6}' | cut -d: -f1 | sort | uniq -c | sort -rn | head -10
# Limit bandwidth with tc (rate limit to 10Mbps)
sudo tc qdisc add dev eth0 root tbf rate 10mbit burst 32kbit latency 50ms
```
Expected output:
```
Device eth0 (10.0.0.1/24) ════════════════════════════════
Incoming: Total: 45.2 MB/s
Outgoing: Total: 12.8 MB/s
```
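A rate limit applied with `tc` stays in place until it is removed, so it helps to have the apply, inspect, and undo steps side by side. The following is a sketch under a few assumptions: the helper names and the `DRY_RUN` switch are hypothetical, and `tc qdisc replace` is used instead of `add` so the command can be re-run on an interface that already has a root qdisc configured.

```bash
#!/bin/bash
# Print commands instead of running them when DRY_RUN is set (handy for review).
run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }

# limit_egress <dev> <rate>: cap outbound traffic with a token bucket filter.
# `replace` creates the qdisc or updates an existing one.
limit_egress() {
  run sudo tc qdisc replace dev "$1" root tbf rate "$2" burst 32kbit latency 50ms
}

# show_limit <dev>: show the active qdisc with byte and drop counters.
show_limit() { run tc -s qdisc show dev "$1"; }

# clear_limit <dev>: remove the cap and fall back to the default qdisc.
clear_limit() { run sudo tc qdisc del dev "$1" root; }

# Dry-run demo: prints the commands without touching the interface
DRY_RUN=1 limit_egress eth0 10mbit
DRY_RUN=1 clear_limit eth0
```

Note that a `tbf` qdisc on the root of an interface shapes outbound traffic only; inbound traffic is not affected by it.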
### Fix 4: Fix Interface Flapping
```bash
# Check kernel messages for link state changes
dmesg | grep -iE "link.*(up|down)" | tail -10
# Check interface speed and duplex
ethtool eth0
# Pin speed and duplex on both ends (1000BASE-T requires autonegotiation, so keep autoneg on; use autoneg off only for 10/100 links)
sudo ethtool -s eth0 speed 1000 duplex full autoneg on
# Check and reset the interface
sudo ip link set eth0 down && sleep 2 && sudo ip link set eth0 up
```
Expected output:
```
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full
                          100baseT/Half 100baseT/Full
                          1000baseT/Full
    Speed: 1000Mb/s
    Duplex: Full
    Auto-negotiation: on
    Link detected: yes
```
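When checking many hosts, it can be easier to reduce the `ethtool` output to one line and flag the settings most often tied to flapping. The sketch below assumes the field labels shown in the output in this section; `link_summary` is a hypothetical helper name.

```bash
#!/bin/bash
# link_summary: read `ethtool <dev>` output on stdin, print the negotiated
# settings on one line, and return non-zero if the link is down or half duplex.
link_summary() {
  awk -F': *' '
    /^[ \t]*Speed:/            { speed  = $2 }
    /^[ \t]*Duplex:/           { duplex = $2 }
    /^[ \t]*Auto-negotiation:/ { an     = $2 }
    /^[ \t]*Link detected:/    { link   = $2 }
    END {
      printf "speed=%s duplex=%s autoneg=%s link=%s\n", speed, duplex, an, link
      exit (duplex == "Half" || link == "no")
    }'
}

# Demo on the relevant lines of the sample output from this section
link_summary <<'EOF'
Settings for eth0:
    Speed: 1000Mb/s
    Duplex: Full
    Auto-negotiation: on
    Link detected: yes
EOF
```

The demo prints `speed=1000Mb/s duplex=Full autoneg=on link=yes`. Against a live interface the call would be `ethtool eth0 | link_summary`. A half-duplex result on one end of a link whose other end reports full duplex is the classic sign of a duplex mismatch.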
### Fix 5: Debug Slow DNS Resolution
```bash
# Measure DNS lookup time
dig google.com | grep "Query time"
# Test specific DNS servers
dig @8.8.8.8 google.com +stats
# Compare DNS server performance
for dns in 8.8.8.8 1.1.1.1 208.67.222.222; do
  echo -n "$dns: "
  dig "@$dns" google.com +stats 2>&1 | grep "Query time"
done
# Clear the local DNS cache
sudo resolvectl flush-caches
```
Expected output:
```
; <<>> DiG 9.18.1-1ubuntu1 <<>> google.com
;; Query time: 12 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
```
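To pick a resolver from the comparison loop automatically, the query time can be pulled out of each `dig` run and the results sorted. This is a minimal sketch: `query_ms` and `fastest` are hypothetical helper names, and the timings in the demo are made-up values for illustration only.

```bash
#!/bin/bash
# query_ms: read dig output on stdin, print the query time in milliseconds.
query_ms() { awk '/Query time:/ { print $4 }'; }

# fastest: read "<server> <ms>" lines on stdin, print the lowest-latency one.
fastest() { sort -k2,2n | head -n 1; }

echo ';; Query time: 12 msec' | query_ms

# Hypothetical timings, for illustration only
printf '%s\n' '8.8.8.8 12' '1.1.1.1 9' '208.67.222.222 20' | fastest

# Live usage (needs network access):
#   for dns in 8.8.8.8 1.1.1.1 208.67.222.222; do
#     echo "$dns $(dig "@$dns" google.com | query_ms)"
#   done | fastest
```

The demo prints `12` and then `1.1.1.1 9`. A single query per server is a noisy measurement and can be skewed by whether the answer was already cached upstream, so repeat the comparison a few times before switching resolvers.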
Network Troubleshooting Flowchart
```mermaid
flowchart TD
    A[Network Performance Issue] --> B{Check latency}
    B -->|High| C[Run mtr to find slow hop]
    C --> D[Check for bufferbloat or ISP issue]
    B -->|Normal| E{Check packet loss}
    E -->|Loss detected| F[Check cables and interface stats]
    F --> G[Run iperf3 for retransmit test]
    E -->|No loss| H{Check bandwidth}
    H -->|Saturated| I[Run nload and iftop]
    I --> J[Identify top talker and rate limit]
    H -->|Available| K{Check DNS}
    K -->|Slow| L[Measure with dig +stats]
    L --> M[Change DNS server or cache locally]
    K -->|Fast| N{Interface flapping}
    N -->|Yes| O[Check ethtool and dmesg]
    O --> P[Force speed/duplex and replace cable]
    N -->|No| Q[Baseline looks normal]
```
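The same decision order can be sketched as a small shell function that takes measurements you have already collected and prints the first step that applies. The function name `triage` and every cut-off in it (100 ms latency, any loss, 90% utilization, 100 ms DNS) are assumptions chosen for illustration; substitute values from your own baseline.

```bash
#!/bin/bash
# triage <latency_ms> <loss_pct> <util_pct> <dns_ms> <flaps>
# Walks the flowchart top to bottom and prints the first step that applies.
# All inputs are whole numbers; the cut-offs below are illustrative assumptions.
triage() {
  local latency_ms="$1" loss_pct="$2" util_pct="$3" dns_ms="$4" flaps="$5"
  if   [ "$latency_ms" -gt 100 ]; then echo "Run mtr to find slow hop"
  elif [ "$loss_pct" -gt 0 ];    then echo "Check cables and interface stats"
  elif [ "$util_pct" -ge 90 ];   then echo "Run nload and iftop"
  elif [ "$dns_ms" -gt 100 ];    then echo "Measure with dig +stats"
  elif [ "$flaps" -gt 0 ];       then echo "Check ethtool and dmesg"
  else                                echo "Baseline looks normal"
  fi
}

triage 20 0 95 10 0    # saturated link -> Run nload and iftop
triage 20 0 40 10 0    # nothing wrong  -> Baseline looks normal
```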
Prevention Tips
- Set up latency monitoring with Prometheus and alert on anything above 50ms baseline
- Use `tc` qdiscs with `fq_codel` to reduce bufferbloat on egress interfaces
- Deploy a local DNS cache like `dnsmasq` or `unbound` to reduce lookup times
- Replace copper cables every 3 years and keep spare SFPs for fiber links
- Document your baseline latency, throughput, and packet loss numbers so anomalies stand out
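One lightweight way to document a baseline is to append a CSV row per measurement so later numbers can be compared against it. The sketch below assumes the Linux iputils summary format; `baseline_row` and the column order are hypothetical choices, not a standard.

```bash
#!/bin/bash
# baseline_row <target>: read ping output on stdin and print one CSV row:
# timestamp,target,loss_pct,min_ms,avg_ms,max_ms
baseline_row() {
  awk -v target="$1" -v ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)" '
    /packet loss/ {
      for (i = 1; i <= NF; i++) if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
    }
    /min\/avg\/max/ {
      split($0, p, "[=/ ]+"); min = p[6]; avg = p[7]; max = p[8]
    }
    END { printf "%s,%s,%s,%s,%s,%s\n", ts, target, loss, min, avg, max }'
}

# Demo on a sample ping summary
baseline_row 8.8.8.8 <<'EOF'
--- 8.8.8.8 ping statistics ---
1000 packets transmitted, 987 received, 1.3% packet loss, time 1042ms
rtt min/avg/max/mdev = 8.142/8.456/12.345/0.678 ms
EOF
```

Run from cron, something like `ping -c 20 -q 8.8.8.8 | baseline_row 8.8.8.8 >> baseline.csv` would build up a history; the file name here is only an example.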
Practice Questions
What is the difference between a connection timeout and high latency on a network?
Answer: A timeout means the connection never completed: packets are not reaching the destination or responses are not returning. High latency means the connection succeeds but with noticeable delay: packets are arriving but taking longer than normal.
How do you determine whether network slowness is caused by the application or the network itself?
Answer: Run `mtr` to check latency and loss between client and server. Run `iperf3` to measure raw throughput independent of the application. If the network baseline is good, the issue is likely in the application layer (slow queries, large payloads, inefficient protocols).
What causes interface flapping and how do you fix it?
Answer: Interface flapping is caused by faulty cables, duplex mismatch, bad NIC hardware, or driver issues. Fix by checking `dmesg` for link state changes, forcing speed/duplex with `ethtool -s`, replacing the cable, and checking the NIC driver for updates.
Challenge: Write a script that pings 5 common destinations (google.com, cloudflare.com, your DNS server, your gateway, and a remote server), records min/avg/max latency for each, and alerts if any exceed a threshold.
Answer:
```bash
#!/bin/bash
# Alert when average round-trip latency to any target exceeds the threshold (ms)
threshold=200
for target in google.com cloudflare.com 8.8.8.8 192.168.1.1 server.internal; do
  # Last line of quiet ping output: rtt min/avg/max/mdev = a/b/c/d ms
  result=$(ping -c 10 -q "$target" 2>/dev/null | tail -1)
  read -r min avg max <<< "$(echo "$result" | awk -F'[=/ ]+' '/min\/avg\/max/ {print $6, $7, $8}')"
  if [ -z "$avg" ]; then
    # No rtt line means the host did not resolve or never replied
    echo "ALERT: $target is unreachable or returned no replies"
  elif [ "$(printf "%.0f" "$avg")" -gt "$threshold" ]; then
    echo "ALERT: $target avg latency ${avg}ms exceeds ${threshold}ms (min ${min}ms, max ${max}ms)"
  else
    echo "OK: $target latency min/avg/max ${min}/${avg}/${max} ms"
  fi
done
```
Quick Reference
| Issue | Diagnostic | Resolution |
|-------|-----------|------------|
| High latency | mtr --report target | Identify slow hop, check with ISP |
| Packet loss | ping -f target | Replace cable, check NIC stats |
| Bandwidth saturation | nload or iftop | Rate-limit with tc |
| Interface flapping | dmesg \| grep link | Force speed/duplex, replace cable |
| Slow DNS | dig target +stats | Switch DNS server, use local cache |
FAQ
What is bufferbloat and how does it affect network performance?
Bufferbloat happens when network buffers are too large, causing latency to spike under load. A normally 20ms connection can jump to 500ms+ when a large download starts. Fix it by using fq_codel or cake qdiscs on your router or Linux gateway: sudo tc qdisc replace dev eth0 root fq_codel.
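A rough way to quantify bufferbloat is to compare average ping latency on an idle link with the average measured while the link is saturated. The sketch below only does the arithmetic on two numbers you have already collected; `latency_under_load` is a hypothetical helper name.

```bash
#!/bin/bash
# latency_under_load <idle_ms> <loaded_ms>: report the extra delay added by load.
latency_under_load() {
  awk -v idle="$1" -v loaded="$2" 'BEGIN {
    printf "+%.1f ms under load (%.1fx idle)\n", loaded - idle, loaded / idle
  }'
}

latency_under_load 20 500   # the 20 ms idle, 500 ms loaded case described above
```

This prints `+480.0 ms under load (25.0x idle)`. One way to collect the two inputs is to run `ping -c 20 -q <host>` once on a quiet link and once while an `iperf3` transfer is running, then compare the averages. `tc qdisc show dev eth0` shows which qdisc is currently active.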
How do you measure real-world network throughput that matches user experience?
Use iperf3 with parallel streams (-P 4) to simulate multi-threaded traffic, and set a realistic TCP maximum segment size (-M 1460, the MSS that corresponds to a standard 1500-byte MTU). For web-specific testing, use curl -w "@format.txt" -o /dev/null -s URL with a custom format file that shows time_connect, time_starttransfer, and time_total.
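The format file itself is a short text template of curl write-out variables. Here is one possible version as a sketch: the row labels (`dns_lookup`, `tcp_connect`, and so on) and the helper names are arbitrary choices, and `time_namelookup` is added alongside the three variables named above because it separates DNS delay from connect delay.

```bash
#!/bin/bash
# write_curl_format [path]: create the format file that curl -w "@path" reads.
# Each row ends in a literal \n, which curl turns into a line break.
write_curl_format() {
  cat > "${1:-format.txt}" <<'EOF'
dns_lookup:   %{time_namelookup}s\n
tcp_connect:  %{time_connect}s\n
first_byte:   %{time_starttransfer}s\n
total:        %{time_total}s\n
EOF
}

# time_url <url> [path]: fetch the URL, discard the body, print the timings.
time_url() { curl -w "@${2:-format.txt}" -o /dev/null -s "$1"; }

write_curl_format format.txt
# time_url https://example.com/    # uncomment to run against a real URL
```

Each value is cumulative from the start of the request, so the gap between `tcp_connect` and `first_byte` approximates server processing plus one round trip.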
What is the difference between latency and jitter, and why does jitter matter?
Latency is the time for a packet to travel from source to destination. Jitter is the variation in latency over time -- the difference between consecutive packet arrival times. Jitter matters most for real-time applications like VoIP and video conferencing, where uneven delivery causes gaps and stuttering in audio/video playback.
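As a concrete illustration, jitter can be approximated from a series of ping round-trip times by averaging the absolute difference between consecutive samples. This is a simple estimate, not the smoothed estimator that RTP implementations use; `jitter` is a hypothetical helper name and the four samples in the demo are made-up values.

```bash
#!/bin/bash
# jitter: read one RTT sample (ms) per line on stdin and print the mean
# absolute difference between consecutive samples.
jitter() {
  awk '
    NR > 1 { d = $1 - prev; if (d < 0) d = -d; sum += d; n++ }
    { prev = $1 }
    END { if (n) printf "%.2f\n", sum / n; else print "n/a" }'
}

# Made-up samples: consecutive differences are 2, 3, and 11 ms
printf '%s\n' 20 22 19 30 | jitter
```

The demo prints `5.33`. With live data, the samples can be extracted from ping output, for example `ping -c 20 8.8.8.8 | sed -n 's/.*time=\([0-9.]*\) ms.*/\1/p' | jitter`.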
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.