We often think of network engineering performance in terms of either metrics (MTTR, network availability, packet loss) or equipment and software (routers, switches, modems, portals, and so on). However, even in today’s information-rich landscape, with its abundance of monitoring systems, network issues are taking longer than ever to resolve.
This is because in many ways, people dictate network performance.
Increased Pressure on Network Engineers Is Creating a Knowledge Problem
A traditional service model for a Network Operations Center might allocate a certain number of nodes to each technician or engineer to monitor. But as the number of nodes and associated monitoring tools has grown, so too has the number of alerts, making this model less viable and leaving NOC techs and network engineers overwhelmed.
Consider the astonishing scale and scope of the modern enterprise network: at the edge, this might include any number of systems, from SD-WAN appliances to point-of-sale systems to network security systems, spanning multiple points of presence (PoPs) and multiple connection types, each with its own unique configuration and challenges, each monitored individually. The sheer volume of alerts from these systems creates a torrent of information that technicians and engineers must sift through to determine the significance of each alert and the appropriate course of action, if any.
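To make that sifting concrete, here is a minimal sketch of one way alerts from several monitoring tools could be collapsed and ranked before a person looks at them. Everything in it is an assumption made for illustration: the Alert fields, the tool and site names, and the three-level severity scale do not come from any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical severity scale; real monitoring tools define their own.
SEVERITY_RANK = {"critical": 3, "warning": 2, "info": 1}


@dataclass(frozen=True)
class Alert:
    source: str    # monitoring tool that raised the alert (made-up names)
    site: str      # point of presence or branch location
    device: str    # device the alert refers to
    severity: str  # one of the keys in SEVERITY_RANK


def triage(alerts):
    """Collapse alerts per (site, device) and list the worst offenders first."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.site, alert.device)].append(alert)

    summary = []
    for (site, device), items in groups.items():
        worst = max(items, key=lambda a: SEVERITY_RANK[a.severity])
        summary.append((site, device, worst.severity, len(items)))

    # Highest severity first, then the device generating the most alerts.
    summary.sort(key=lambda row: (-SEVERITY_RANK[row[2]], -row[3]))
    return summary


alerts = [
    Alert("sdwan-monitor", "store-12", "edge-router", "warning"),
    Alert("snmp-poller", "store-12", "edge-router", "critical"),
    Alert("pos-monitor", "store-12", "pos-terminal-3", "info"),
    Alert("snmp-poller", "store-40", "firewall", "warning"),
    Alert("sdwan-monitor", "store-40", "firewall", "warning"),
]

for row in triage(alerts):
    print(row)
```

In this toy data set, five alerts collapse into three rows, with the critical edge router at the top. Real triage is far messier, but the sketch shows why per-system monitoring multiplies the work: every tool reports the same device separately.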
This triage consumes valuable time and requires significant mental effort that takes a very real toll on engineers: the pressure to resolve issues quickly and maintain network stability can lead to high levels of stress and burnout. Turnover in the industry is rampant. The average tenure of a network engineer is only 18 to 24 months. This churn exacerbates knowledge loss and human error—and the implications are significant. Nearly 40% of organizations have suffered a major outage caused by human error over the past three years. In 2022, over 60% of outages cost organizations more than $100,000, and 15% of these surpassed $1 million.
Only Human: When Providers Dodge Accountability During Network Outages
While network engineers play a crucial role in managing complex networks, there are also external actors involved—service providers, for example.
It’s not unusual for a single circuit to cross a half-dozen or more providers. When an outage occurs, often the first step that a network engineer will take is to reach out to each of these providers to determine the point of failure. Unfortunately, human nature sometimes leads people to evade accountability. Service providers may try to shift blame or downplay their responsibility when WAN issues arise. A typical response is, “We’ve looked into the issue, and the problem is outside of our area.”
Providers, of course, have their own interests at stake when doing this—like avoiding any penalties associated with failing to uphold service-level agreements (SLAs) regarding network uptime. Shifting the blame also buys providers time to conduct their own internal investigations and quietly repair any issues, leaving network managers holding the risks of financial loss and reputational damage.
Without comprehensive network-wide visibility, it’s difficult for a network engineer to challenge such claims with concrete evidence. The engineer is left to navigate a labyrinth of finger-pointing and delays while every provider continues to pass the metaphorical buck and MTTI increases with every fruitless call.
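As an illustration of what concrete evidence might look like, the sketch below takes packet-loss measurements for each hop along a circuit and reports where loss first appears and then persists all the way to the destination. The HopSample structure, the provider and hop names, and the 5% threshold are all hypothetical, and the approach assumes you can measure loss to every hop in the first place, which is often the hard part.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed threshold for illustration; choose one that fits your own SLAs.
LOSS_THRESHOLD_PCT = 5.0


@dataclass(frozen=True)
class HopSample:
    provider: str    # provider responsible for this hop (made-up names)
    hop: str         # hop label or address
    loss_pct: float  # packet loss measured to this hop, in percent


def first_degraded_hop(
    path: Sequence[HopSample], threshold: float = LOSS_THRESHOLD_PCT
) -> Optional[HopSample]:
    """Return the hop where loss begins and continues to the end of the path.

    Loss seen at one hop that disappears further downstream is ignored,
    since it usually means that hop deprioritizes probe replies rather
    than dropping real traffic.
    """
    suspect = None
    for sample in path:
        if sample.loss_pct >= threshold:
            if suspect is None:
                suspect = sample
        else:
            suspect = None  # loss did not persist, so reset the suspect
    return suspect


path = [
    HopSample("provider-a", "a-edge", 0.0),
    HopSample("provider-a", "a-core", 20.0),  # isolated loss, clears at next hop
    HopSample("provider-b", "b-ingress", 0.0),
    HopSample("provider-b", "b-core", 12.0),
    HopSample("provider-c", "c-ingress", 13.0),
    HopSample("provider-c", "c-edge", 12.0),
]

suspect = first_degraded_hop(path)
if suspect is not None:
    print(f"Loss begins at {suspect.hop} ({suspect.provider})")
```

With measurements like these in hand, the conversation with a provider changes from "is it you?" to "loss starts at this hop in your network."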
The repercussions go beyond business productivity. Network outages create fertile ground for cybercriminals to exploit network vulnerabilities. Extended downtime can disrupt security monitoring protocols, leaving businesses more susceptible to cybersecurity risks—the longer the outage, the more opportunity for malicious actors to strike.
For Better Business Performance, Enhance Employee Efforts (And Morale) with Connectivity Intelligence
Having talented network engineers on staff can keep operations running smoothly and expedite resolution in the case of a serious issue. But talented engineers are becoming harder to find, more expensive to hire, and more difficult to retain. Ensuring that employee morale and satisfaction remain high isn’t just a fluffy, feelings-driven action: it’s smart business. People dictate network performance.
Connectivity intelligence, which goes beyond simple monitoring to provide actionable insight, can supercharge your team, increase node density per engineer, reduce the associated mental load, pinpoint failures, and reduce the churn that threatens business performance.
Stay tuned for our next article, where we’ll examine how your technology stack plays a crucial role in successful WAN management. In the meantime, you can download our industry report for a more detailed look at the WAN issues faced today.