Drew Golden, Director, Product Management
In the IT industry, we understand that more automation and Machine Learning (ML) will get IT operations to the next level. Many providers are anxious to make the leap from Service to Value, as illustrated in the Gartner chart below. Automation is truly the only way to get there.
The key to a healthy, efficient NOC is the seamless flow of information that leads to an automated solution—before a customer ever feels the impact of an outage.
However, many NOCs experience internal friction that trickles down to the customer and back up through tickets and angry calls. Why? There are a few common reasons:
At Federos, we understand these pains all too well (having sat in the NOC ourselves), which is why we created a holistic, unified service assurance solution, Assure1®.
Before we dive into the solution to these problems, we need to take a closer look at how we, and the industry as a whole, think about automation.
There’s an aspirational goal in the industry when it comes to automation: a “lights-out NOC,” or fully automated NOC. You can imagine a completely virtualized environment that runs on its own, with little to no human involvement needed.
Is this possible? The future seems to be headed in that direction, but we do know that our present and near-future state is not quite there yet.
The reality is that only 10-15% of the work can be fully automated. The other 85-90% still relies on humans to take action.
Why? Most NOCs have a mix of legacy equipment, modern equipment and tech, and virtualized systems (where everything is in the cloud). Not only are these tools separate, but they also do not communicate with one another, which creates a “swivel-chair” effect for NOC workers. There may be a world where nearly everything is virtualized and fully automated, but for now, this is aspirational.
The NOC needs processes that automate how the network identifies and resolves service-impacting incidents in real time or, even better, that prevent incidents before they happen. Reacting to negative events or customer tickets is inefficient and costly. Automation and Machine Learning can scale your ability to predict and prevent issues before they occur.
The need to consolidate and process information quickly is paramount to the success of any network operations team. Until now, Communication Service Providers (CSPs), Managed Service Providers (MSPs) and other enterprises have struggled to visualize their expanding networks quickly and accurately in a single view, relying on legacy tools and manual practices to monitor critical network functions and services. The proliferation of inventory systems, siloed applications, and fractured network infrastructures brought together through acquisitions has created significant visibility gaps for the NOC, hurting productivity and increasing costs.
Once you have consolidated data in one platform, you need to quickly pinpoint, analyze and resolve the root cause of service-impacting events. A system like Assure1® helps you suppress massive amounts of noise so that your operations team can act on the incidents that typically result in impacted services.
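To make the idea of noise suppression concrete, here is a minimal sketch of one common approach: dropping repeated alarms for the same device within a time window. This is an illustration only, not how Assure1® works internally. The `Event` fields, the `suppress_duplicates` function, and the five-minute window are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical alarm record; field names are assumptions for this sketch."""
    timestamp: float  # seconds since some reference point
    device: str
    alarm: str


def suppress_duplicates(events, window_seconds=300.0):
    """Keep the first event per (device, alarm) pair and drop repeats that
    arrive less than window_seconds after the last kept event for that pair."""
    last_kept = {}  # (device, alarm) -> timestamp of the last event we kept
    kept = []
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.device, event.alarm)
        previous = last_kept.get(key)
        if previous is None or event.timestamp - previous >= window_seconds:
            kept.append(event)
            last_kept[key] = event.timestamp
    return kept
```

With a storm of identical `LINK_DOWN` alarms from one router, an operator would see a single event per window instead of every repeat. A production system would typically combine this with topology and correlation rules rather than rely on a fixed window alone.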
With ML and event analytics, you can leverage industry-standard ML algorithms with special data filters to normalize data, ensuring correct patterns are fed into the ML engine.
Using these data streams, the solution helps you detect anomalies, such as temporal deviations, statistical rarities and unusual behaviors, to generate a single root-cause event. Root-cause events contain suppression patterns that filter out noise, so NOC operators can anticipate and resolve problems instead of responding to a storm of event alarms (again, allowing you to be proactive instead of reactive).
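As a rough illustration of what detecting a "statistical rarity" can mean, the sketch below flags intervals whose event count sits unusually far from the average, using a simple z-score. This is a generic textbook technique used here as a stand-in, not a description of the algorithms in Assure1®. The function name `find_rare_counts` and the threshold of 3.0 are assumptions for the example.

```python
from statistics import mean, pstdev


def find_rare_counts(counts, threshold=3.0):
    """Return the indices of counts that lie more than `threshold`
    population standard deviations away from the mean of all counts."""
    if len(counts) < 2:
        return []
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        # All intervals look identical, so nothing stands out.
        return []
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]
```

For example, given per-minute alarm counts that hover around 10 and one minute that jumps to 95, the function returns the index of that minute. A plain z-score is sensitive to the outliers it is trying to find, so real deployments often prefer more robust baselines, such as medians or seasonal models.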
At Federos, we talk a lot about actionability because it is the key to effective automation. Operations teams must shift to an actionability mindset in order to drive automation.
ML and event analytics round out the three-pronged Assure1® strategy for providing customers with industry-leading root cause analysis (RCA). Federos delivers three types of RCA, and the final one is tied to actionability that requires a human:
So, now we ask you: how much time are you spending in reactive mode or on manual, time-consuming processes? Are you being asked to do more with less information?
Unfortunately, those are typical NOC conditions—and they shouldn’t be.
Assure1® collects and normalizes fault, performance, topology, service, and other external data into a single, unified platform. Advanced correlation and analysis, including AI/Machine Learning, produces actionable insights that drive automation and improve operational efficiency while significantly lowering costs.