NFV and SDN – The death knell for the huge clunky routers?

At the last IETF I ran into a couple of hallway discussions where folks were having a lively debate on whether Network Function Virtualization (NFV) and Software Defined Networking (SDN) will eventually sound the death knell for the vendors of huge clunky hardware like Cisco, Juniper, Alcatel-Lucent, etc. I was quickly apprised of some Wall Street analyst’s report that projected a significant drop in Cisco’s revenue over the next couple of years as service providers moved to SDN and NFV solutions. I heard claims about how physical routers (that I so lovingly build at Alcatel-Lucent) will get replaced by virtual routers (vRouters) and other server-based software that even small startups could build. The barrier to entry in the service provider market had suddenly been lowered, and the dominance of the big three was being ominously challenged. There was talk of capex reductions in service provider networks and of a few operators holding on to their purchase orders to see how the SDN and NFV story unfurled. A different camp believed that while SDN and NFV promised several things, it would take time before they were really deployed and started affecting capex spending and OEMs’ revenues.

So what’s the deal?

Based on my conversations with several folks actively looking into SDN/NFV and a good bit of reading, I understand that operators are NOT interested in replacing their edge aggregation and core routers with software-driven vRouters. They still want to continue with those huge clunky beasts with full control plane intelligence embedded alongside their packet-pushing data plane. These routers are required to respond to network events in real time (remember FRR?) to prevent outages and slowdowns. Despite all the performance improvements, general-purpose processors can typically process no more than 2-3 Gbps per core (Intel, with DPDK and its APIs for Open vSwitch, promises better throughput), which is embarrassingly slow compared to the 400-600 Gbps that is possible with NPUs and ASICs today. Additionally, routers using non-Ethernet ports (DSL, PON, coherent optical, etc.) cannot be easily virtualized, since general-purpose CPUs cannot perform the network functions along with the DSP functions required to support these ports.
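
To put that gap in perspective, here is a back-of-the-envelope sketch in Python using the per-core and NPU/ASIC figures quoted above. It assumes throughput scales linearly with the number of cores, which is optimistic (memory bandwidth, NIC and bus limits are all ignored), and the function name `cores_needed` is purely illustrative.

```python
import math

def cores_needed(target_gbps: float, per_core_gbps: float) -> int:
    """Cores required to reach target_gbps, assuming perfectly linear scaling."""
    return math.ceil(target_gbps / per_core_gbps)

if __name__ == "__main__":
    for target in (400, 600):        # NPU/ASIC throughput range from the text
        for per_core in (2, 3):      # per-core software forwarding range from the text
            print(f"{target} Gbps at {per_core} Gbps/core -> "
                  f"{cores_needed(target, per_core)} cores")
```

Even under this generous assumption, matching a single 400 Gbps forwarding engine takes well over a hundred cores.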

So while a mobile gateway that essentially forwards packets can be virtualized, it would only make sense to do this where the amount of traffic it’s handling is relatively small.

So where can we deploy these NFV-controlled, server-based vRouters?

The Provider Edge (PE) router does several things today, a few of which could easily be moved out and implemented on standard server hardware. ETSI’s NFV use cases document (case #2) identifies vPE as a potential NFV use case. The PE routers in the MPLS world connect the customer edge (CE) router at the customer premises to the P routers in the provider network. The PE router serves as the service delimiter, where it provides L3 VPNs, VPLS, VLL, CDNs and other services to the customers.

The ETSI NFV use-case document (case #2) describes how enterprises are deploying multiple services in branch offices; several of these enterprises use dedicated standalone appliances to provide these services (firewalls, IDS/IPS, WAN optimization, etc.), which is “cost prohibitive, inflexible, slow to install and difficult to maintain”.

As a result, many enterprises are looking at virtualizing the enterprise CPE (access router) and outsourcing it into the operator’s network.

Increased capex and opex pressure is nudging enterprises and providers to look at the virtualization capabilities made possible by NFV. So, let’s look at what can be virtualized by NFV.

The ETSI NFV use-case document states that “Traditional IP routers based on custom hardware and software are amongst the most capital-intensive portions of service-provider infrastructure. PE routers run out of control plane resources before they run out of data plane resources and virtualization of control plane functions improves scalability.”

It further states that moving some of the control plane to equivalent functionality implemented in standard commercial servers deploying NFV can result in significant savings.

The figure below gives an idea of the components that can be moved out of the PE router and onto an NFV-powered server.

Network functions/services that can be offloaded from the PE router

If we’re able to push out the functions/services shown in the figure above, the PE effectively splits into two parts: a router that mainly pushes packets, and the vPE, the device for service delivery. NFV appears to be most effective at the edge of the network where customers are served. This also happens to be mostly Ethernet, which works in favor of NFV since other port types cannot be served as effectively.

Operators believe NFV can be used for mobile packet core functions for 3G and LTE EPC. LTE operators believe that while basic packet pushing functions must still reside in the routers, the other ancillary functions that have been added to the routers over time are good candidates for NFV. We can keep BRAS, firewalls, IDS, WAN optimizers, and other service functions separate and use the physical router merely for transferring the packets.

Clearly, the vPE can handle many network functions that are currently handled by conventional physical routers. While the PE may still push the packets, the intelligence for many of the services typically handled by the PE can be moved to the vPE. This is a paradigm shift from what PE routers have been doing until now. The network functions and services that can be moved to the vPE are:

  • Mobile packet core functions for 3G and LTE EPC
  • Firewalls (FW) and IDS/IPS (Intrusion Detection and Intrusion Prevention systems)
  • Deep Packet Inspection (DPI)
  • CDNs (content delivery networks) and caching
  • IP VPNs – control plane to set up the MPLS L3 VPNs
  • VLLs and VPLS – control plane to set up the MPLS L2 VPNs

These functions can be virtualized to run on servers under NFV, or they can be SDN-controlled. Where they reside in the network will depend on the QoS and QoE (Quality of Experience) required by the customers. If latency and speed are an issue, the functions should reside in servers close to the customers. But if latency is not an issue, the functions could reside deep in the provider network or in a remote data center.
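
The placement rule in the paragraph above can be sketched roughly as follows. Everything here is hypothetical: the site names, the round-trip times and the `place_function` helper are made up for illustration, and round-trip time is used as a crude stand-in for how deep in the network a site sits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Site:
    name: str
    rtt_ms: float  # round-trip time from the customer (made-up figures)

# Hypothetical sites, ordered from closest to the customer to deepest in the network.
SITES = [
    Site("edge PoP", 2.0),
    Site("regional data center", 15.0),
    Site("remote data center", 60.0),
]

def place_function(latency_budget_ms: float, sites=SITES) -> Site:
    """Pick the most centralized site that still meets the latency budget."""
    eligible = [s for s in sites if s.rtt_ms <= latency_budget_ms]
    if not eligible:
        # Nothing meets the budget: fall back to the site closest to the customer.
        return min(sites, key=lambda s: s.rtt_ms)
    return max(eligible, key=lambda s: s.rtt_ms)

if __name__ == "__main__":
    for budget in (5, 20, 100):
        print(f"{budget} ms budget -> {place_function(budget).name}")
```

A latency-sensitive function with a tight budget lands at the edge, while one that tolerates a longer round trip can be consolidated in a remote data center.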

Conclusion

Operators will deploy NFV and SDN, which will impact their buying decisions. It’s clear that they will not be replacing their core and edge aggregation routers with NFV-driven software solutions. Instead, NFV will be used at the edge to offload service functions from the hardware PE router onto servers running the vPE in the NFV environment, so that new services can be delivered quickly to end users and generate higher revenue.

There is thus no need for the Ciscos, Junipers and Alcatel-Lucents of the world to worry about falling revenues, since NFV-powered solutions are not targeting their highest-margin businesses. At least not yet!

iOS7’s impact on networks worldwide

Apple releases an iOS update and networks all across the world witness a spike of almost 100% in the average traffic that they receive. Apple delivers its content using Akamai, which allegedly handles 20% of the world’s total web traffic. Akamai is thus in a unique position to provide a view of what’s happening on the web at any given instant in time. Akamai’s logs clearly show an overall increase in Internet traffic, with hotspots in Europe, soon after Apple released iOS7.

Akamai showing traffic hotspot in Europe

Most service providers saw Akamai and Limelight traffic up by an average of 300-700% immediately after iOS7 was released.

Being an Android user myself, I found iOS7’s release, with the massive increase in Internet traffic reported all over the world, quite ominous. Honestly, I was a trifle concerned about what iOS7 was doing internally to cause this.

It turned out to be quite an anti-climax when I realized that the spurt in network traffic was just because of Apple devices upgrading to the newer iOS. The iOS7 upgrade for phones is around 900MB, and that for iPads is around 1.2GB. Given that there are quite a few of these devices out there, one only needs to multiply their number by the upgrade size to realize the traffic volumes that service providers all across the world are grappling with.

It’s well known that Apple fans don’t want to wait before they go in for an upgrade. The iOS7 adoption rate has been the highest ever for any platform (beating their own iOS6 rate, which was in itself phenomenal in all respects). It’s claimed that within two days of its release, iOS7 was already running on more than half of all Apple devices out there (a number which, btw, is itself quite high).

Google is still puzzling over how it can improve the miserably low adoption rate of new versions of its Android OS. This seems to stem from the fact that most Android devices just do not receive updates in a timely manner, and the ones that do only get an update roughly six months after a new version is released.

Jelly Bean (the latest version of Android) is currently on fewer Android devices than iOS 7 is on iOS devices. This may not seem mind boggling until you realize that iOS 7 has been out for only 5 days (as of this post), whereas Android Jelly Bean has been around for a little more than a year and a half.

iOS’s high adoption rate is a headache for several service providers since, let’s face it, all of them oversubscribe their access links. This is done by design, since it’s assumed that not everyone will demand full bandwidth at the same time. Usually it works well; sometimes it doesn’t, as we’ll see.
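
As a rough illustration of why oversubscription usually works and occasionally doesn’t, here is a small Python sketch. The subscriber count, access rate and uplink size are hypothetical, and the model (an equal share of one shared uplink) is deliberately simplistic.

```python
def oversubscription_ratio(subscribers: int, access_mbps: float, uplink_mbps: float) -> float:
    """Total sold access bandwidth divided by the shared uplink capacity."""
    return subscribers * access_mbps / uplink_mbps

def per_user_mbps(active_users: int, access_mbps: float, uplink_mbps: float) -> float:
    """Equal share of the uplink per active user, capped at the access rate."""
    if active_users == 0:
        return access_mbps
    return min(access_mbps, uplink_mbps / active_users)

if __name__ == "__main__":
    # Hypothetical numbers: 1000 subscribers at 20 Mbps behind a 1 Gbps uplink.
    print("ratio:", oversubscription_ratio(1000, 20, 1000))
    print("50 active users:", per_user_mbps(50, 20, 1000), "Mbps each")
    print("500 active users:", per_user_mbps(500, 20, 1000), "Mbps each")
```

With only a small fraction of users active, everyone still gets the full access rate. Once half the subscribers start downloading at the same time, each one gets a tenth of it.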

Most homes have multiple iOS devices, so this translates to each household doing 5-6 GB worth of iOS updates in a single day. Multiply this by thousands and you’ll see the volume of traffic each provider sees in the week that a new iOS version is released.
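
The arithmetic is easy to sketch in Python. The upgrade sizes come from the figures quoted earlier in this post; the device mix per home and the number of households are assumptions picked only to land in the 5-6 GB range, and decimal gigabytes are used throughout.

```python
PHONE_GB = 0.9  # iOS7 upgrade size for phones, as quoted in this post
IPAD_GB = 1.2   # iOS7 upgrade size for iPads, as quoted in this post

def household_gb(phones: int, ipads: int) -> float:
    """Update volume for one household, in decimal gigabytes."""
    return phones * PHONE_GB + ipads * IPAD_GB

def average_gbps(total_gb: float, hours: float) -> float:
    """Average rate if total_gb is spread perfectly evenly over the given hours."""
    bits = total_gb * 1e9 * 8
    return bits / (hours * 3600) / 1e9

if __name__ == "__main__":
    per_home = household_gb(phones=3, ipads=2)  # hypothetical device mix
    total = per_home * 10_000                   # hypothetical household count
    print(f"{per_home:.1f} GB per home, {total / 1000:.0f} TB in total")
    print(f"about {average_gbps(total, 24):.1f} Gbps averaged over a day")
```

Real downloads are nowhere near evenly spread, so the peak a provider sees would be well above this average.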

Having a CDN that caches the iOS7 update would definitely help in any large deployment. Some people suggest that it could also help if each of these Apple “i” devices advertised an “iOS update available” locally and other “i” devices merely downloaded the update from there, as long as the signature is valid (all images are signed).
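
The “download from a neighbour, but verify before installing” idea can be sketched in a few lines of Python. To be clear, this is not how Apple’s update mechanism works: a plain SHA-256 digest comparison stands in here for real signature verification, and the function names and the sample image bytes are made up.

```python
import hashlib
import hmac

def digest(image: bytes) -> str:
    """SHA-256 digest of an update image, as a hex string."""
    return hashlib.sha256(image).hexdigest()

def accept_peer_image(image: bytes, trusted_digest: str) -> bool:
    """Accept a copy fetched from a local peer only if it matches the trusted digest.

    Assumption: trusted_digest was obtained from the vendor over a trusted
    channel. A real scheme would verify a cryptographic signature instead.
    """
    return hmac.compare_digest(digest(image), trusted_digest)

if __name__ == "__main__":
    original = b"pretend this is the update image"
    trusted = digest(original)  # in practice this would come from the vendor
    print(accept_peer_image(original, trusted))            # untampered copy
    print(accept_peer_image(original + b"!", trusted))     # modified copy
```

The point is only that the bulk of the bytes can come from anywhere on the local network, as long as the small piece of trust data comes from the vendor.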

This, at the very least, can improve the user experience (no more Facebook/Twitter updates on how slow their iOS upgrade was) and can potentially help avoid clogging the Internet tubes.

A few service providers are furious with Apple as they see their customers complaining that their network/Internet access is slow. There is a camp that thinks it’s pretty dumb on Apple’s part to make their OS update available globally on the same day; Microsoft and others have a strategy where they provide incremental downloads. Others suggest that Apple should do this on weekends, when traffic volumes are low. I strongly disagree with this line of reasoning and believe it’s parochial to declare war on Apple. Remember, iOS updates are user pulls, not Apple pushes. It’s the operators who should update their infrastructure to gracefully handle such events: today it’s an iOS7 release, tomorrow it could be something else (Obama in a political sex scandal?). If this means getting fatter pipes, or talking to CDN vendors to put caches in their networks, or putting up their own caches, then this ought to be done. If they do not/cannot have a CDN cache then they could explore connecting to an Internet Exchange (IX) that does. IX peering, I am told, is not prohibitively expensive in most countries.

Ben quite succinctly sums it up on the NANOG mailing list: “Your (the service provider) user is paying you to push packets. If that’s causing you a problem, you either need to review your commercial structure (i.e. charge people more) or your technical network design. Face the facts, what with everyone jumping on the “cloud” bandwagon, the future is only going to see you pushing more packets, not less! So if you can’t stand the heat, get out of the kitchen (or the xSP industry).”