As operators embrace multi-cloud and cloud-native architectures, cloud-native function validation is proving to be a continuous requirement
As operators have migrated to multi-cloud environments, the reliability of network functions has become a new cause for concern. Many experience container failures, jitter and latency, and resource constraints as they settle into this new cloud-native base.
The failures have a specific cause: unfamiliarity with new container-based technologies. The previous architecture featured monolithic applications, whereas the new cloud-native network functions (CNFs) run in containers and are built on cloud-native principles. As a result, the new infrastructure is less predictable, which often makes it less reliable.
The problem arises from the multi-vendor model, which on one hand frees operators from vendor lock-in and offers significant cost savings, but on the other introduces a new level of complexity.
“The fundamental difference is [operators] are still getting the cloud-native functions from their [previous] vendor, but they’re getting the cloud from a different vendor, and usually getting the hardware from even a different vendor. What that means is now operators themselves assume the responsibility of systems integration,” said Bill Clark, principal product manager, 5G Cloud-native Deployment Validation at Spirent.
But few operators prepare for failures when moving to a platform that is famous for performance and resiliency, an oversight that carries a high cost.
“In this new world, there’re things like pods and nodes and containers. None of those things ever existed before Kubernetes and cloud native…They have to be very conscious about what if something fails in the middle of the cloud,” Clark cautioned.
When a CNF failure happens in this new stack, identifying the source isn’t easy. To get around the problem, CNF validation has emerged as an essential business strategy. Validating CNF resiliency ensures that the functions behave as expected when pushed into production in a public cloud environment. More importantly, it answers critical questions, such as whether the cloud provider can meet the CNF’s performance demands within service-level agreements (SLAs) and, if it cannot, what must be done to close the gap.
To avoid service degradation and costly outages, providers recommend lab and pre-production testing. This involves proactive and continuous validation of CNFs against real-world failure scenarios, whether those are failures at the container, pod or node level, latency and packet loss, or resource constraints.
These scenarios are simulated in a safe lab environment to test the CNFs’ ability to recover from a failure and resume service. The tests offer visibility into key fault indicators that point to hidden risks of failure, and they verify redundancy, failover, and policy responses to ensure high availability (HA).
Spirent offers a platform-agnostic solution that performs CNF validation out of the box. Spirent positions the solution as a more affordable alternative to open-source testing tools. Clark described it as “something that hasn’t really existed before.” In practice, this allows CNF validation to move from a one-time event to an ongoing exercise.
