Troubleshooting a slow network

Troubleshooting a slow network can be challenging; a complete failure is often easier to diagnose. There is no single process to follow, because slowness can have many different causes. A slow network can be due to:

  • network congestion resulting from misconfigured QoS, suboptimal routing, or simply a period of high network usage
  • data corruption - a high number of corrupted packets or frames may be arriving at ports, which drop them, forcing applications to resend a large amount of data
  • collisions - typically caused by an incorrectly configured network (for example, a duplex mismatch), once again forcing applications to resend data
  • faulty physical infrastructure such as cables
  • too many STP topology change notifications (TCNs), often triggered by flapping interfaces, which cause switches to age out MAC address table entries early and flood traffic
  • non-optimal routing configuration
  • hardware failure
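
For causes such as data corruption or faulty cabling, a useful first check is how fast an interface's error counters are growing. The following is a minimal sketch, assuming you have already collected two samples of an interface's input packet and input error counters (for example, via SNMP or the CLI). The function name, dictionary keys, and sample values are hypothetical, and counter wrap-around is not handled.

```python
def input_error_rate(prev: dict, curr: dict) -> float:
    """Return input errors as a percentage of packets received between two samples."""
    packets = curr["in_packets"] - prev["in_packets"]
    errors = curr["in_errors"] - prev["in_errors"]
    if packets <= 0:
        # No traffic between samples (or a counter reset): nothing to report.
        return 0.0
    return 100.0 * errors / packets


# Hypothetical counter samples taken a few minutes apart.
earlier = {"in_packets": 1_000_000, "in_errors": 120}
later = {"in_packets": 1_050_000, "in_errors": 620}

rate = input_error_rate(earlier, later)
print(f"Input error rate: {rate:.2f}%")
```

An error rate that keeps growing between samples on one port suggests looking at that port's cable, transceiver, or duplex settings before anything else.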

Several tools and strategies can be used to diagnose these problems, including:

  • Ping and Traceroute are utilities that report measured round-trip delays and the path taken by transmitted packets. Comparing the end-to-end delay with the delay to each individual hop gives you an idea of where the problem may lie.
  • Monitoring suites such as SolarWinds, Observium, and LibreNMS use protocols including SNMP to monitor specific aspects of the network and its devices, identifying errors and events that can help you troubleshoot.
  • Other tools such as NetFlow and NBAR are also useful for gaining insight into what is happening on your network: NetFlow reports per-flow traffic statistics, and NBAR classifies traffic by application.
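
To illustrate how per-hop delays from Traceroute can point to a problem area, here is a minimal sketch that finds the hop with the largest increase in round-trip time over the previous answering hop. It assumes the hop numbers and RTTs have already been extracted from the Traceroute output; the function name and sample values are hypothetical, and `None` stands for a hop that did not reply.

```python
from typing import List, Optional, Tuple

Hop = Tuple[int, Optional[float]]  # (hop number, RTT in ms or None on timeout)


def largest_latency_jump(hops: List[Hop]) -> Optional[Tuple[int, float]]:
    """Return (hop number, increase in ms) for the biggest RTT rise between answering hops."""
    # Ignore hops that timed out so they do not break the comparison.
    answered = [(n, rtt) for n, rtt in hops if rtt is not None]
    best = None
    for (_, prev_rtt), (hop, rtt) in zip(answered, answered[1:]):
        jump = rtt - prev_rtt
        if best is None or jump > best[1]:
            best = (hop, jump)
    return best


# Hypothetical per-hop results; hop 3 did not answer.
sample_hops = [(1, 1.2), (2, 2.0), (3, None), (4, 48.5), (5, 50.1)]

result = largest_latency_jump(sample_hops)
if result is not None:
    print(f"Largest increase at hop {result[0]}: +{result[1]:.1f} ms")
```

Treat the result as a hint rather than proof. Routers often give low priority to answering Traceroute probes, so a high delay at one hop matters most when it persists through the later hops as well.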

Having a network monitoring system is critical in such situations. Beyond very basic diagnostic tasks, these problems are not easy to troubleshoot from the CLI alone. A monitoring system increases visibility and warns you whenever a threshold you set, such as an upper limit on acceptable latency, is exceeded.
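
The threshold idea can be sketched in a few lines. This is only an illustration of the concept, not how any particular monitoring suite implements it; the metric names, limits, and measured values are hypothetical.

```python
from typing import Dict, List


def exceeded_thresholds(metrics: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Return the names of metrics whose measured value is above its configured limit."""
    return [
        name
        for name, value in metrics.items()
        if name in limits and value > limits[name]
    ]


# Hypothetical upper limits chosen by the administrator.
limits = {"latency_ms": 150.0, "packet_loss_pct": 1.0}

# Hypothetical values from the latest polling cycle.
measured = {"latency_ms": 180.0, "packet_loss_pct": 0.2}

alerts = exceeded_thresholds(measured, limits)
for name in alerts:
    print(f"WARNING: {name} = {measured[name]} exceeds limit {limits[name]}")
```

A real monitoring system would run a check like this on every polling cycle and send a notification instead of printing.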