Why can't we live without a Telco Cloud, and how does one build one?

There are more mobile phone connections (~7.9 billion) than there are humans (~7.7 billion) colonising this planet.

Let me explain.

Clearly, not every person in the world has a mobile device. Here we're talking about mobile connections that come from people with multiple devices (dual SIMs, tablets), integrated devices like cars and other smart vehicles, and of course the myriad IoT devices. I don't have to go too far — my electric two-wheeler has a mobile connection that it uses to cheerfully download updated firmware and software patches every now and then.

While the global population is growing at 1.08% annually, mobile phone connections are growing at 2.0%. Mobile subscriptions will soon outnumber us by an ever-widening margin, all happily chatting, tweeting and in general sending data over the network. Some of that data will need low latency and low jitter, while some will be more tolerant of delays and jitter.

What’s the big deal with mobile connections growing?

Well, historically most people have used their mobile phones to talk; to catch up on all the gossip about their neighbours and relatives.

Not anymore.

Now, it’s primarily being used to watch video.

And lots of it; both cached and live.

And it will only grow.

Video traffic in mobile networks is forecast to grow by around 34 percent annually up to 2024 to account for nearly three-quarters of mobile data traffic, from approximately 60 percent currently.

Why is the mobile video traffic growing?

The growth is driven by the rise of embedded video in many online applications, the growth of video-on-demand (VoD) streaming services in terms of both subscribers and viewing time per subscriber, multiple video sharing platforms, and the evolution towards higher screen resolutions on smart devices. All of these factors have been influenced by the increasing penetration of video-capable smart devices.

India had (still has?) the highest average data usage per smartphone at around 9.8 GB per month by the end of 2018.

And Internet traffic hasn't even hit its peak yet.

It will go through the roof once 5G comes in, and will reach dizzying stratospheric heights when mobile content in Indian vernacular languages comes of age.

India is home to around 19,500 languages or dialects. Every state has its own primary language, which is often alarmingly different from that of the state bordering it. There is a popular Hindi saying:

Kos-kos par badle paani, chaar kos par baani

The languages spoken in India change every few kilometres, as does the taste of the water.

Currently, most of the mobile content is in a few popular Indian languages.

However, that's changing.

How is the Internet traffic related to the number of languages in India?

According to a Sharechat report, 2018 was the year when, for the first time, internet users in large numbers accessed social media in their regional languages and actively contributed user-generated content in their native languages.

A joint KPMG India and Google report claims that Indian-language internet users are expected to grow at a CAGR of 18%, versus a CAGR of 3% for English users.

This explains a flurry of investments in vernacular content startups in India.

When all these users come online, we are looking at prodigious growth in Internet traffic. More specifically, in user-generated traffic, which would primarily be video — again, video that is live or that could be cached.

In short, we’re looking at massive quantities of data being shipped at high speeds over the Internet.

And for this to happen, the telco networks need to change.

From a rigid, hardware-based network to a more agile, elastic, virtualized, cloud-based network. The most seismic changes will happen in the part of the service provider network closest to the customer — the edge network. In the Jurassic age, this would have meant more dedicated hardware at the telco edge. However, given the furious rate of innovation, locking into rigid hardware platforms may not be very prudent, since the networks will need to support a range of new devices, service types, and use cases. 5G, with its enhanced mobile data experience, will unleash innovation that's not possible for most ordinary mortals to imagine today. The networks, however, need to be ready for that onslaught. They need to be designed to accommodate an agility and flexibility that is not needed today, but will be tomorrow.

And how do the networks get that agility and flexibility?


The telcos will get that flexibility by virtualizing their network functions, and by, ahem, moving it all to the cloud.

Let me explain this.

Every node, every element in a network exists for a reason. It's there to serve a function (routing, firewall, intrusion detection, etc.). All this while, we had dedicated, proprietary hardware that was optimized and purpose-built to serve that one network function. These physical appliances had to be manually lugged and installed in the networks. I had written about this earlier here and here.

Now, replace this proprietary hardware with a pure software solution that runs on off-the-shelf, x86-based, server-grade hardware. One could run this software on a bare-metal server or inside a virtual machine running on a hypervisor.

Voila, you just "virtualized" the "network function"!

This is your VNF.

So much for the fancy acronym.

The networks get that flexibility and agility by replacing the physical appliances with a telco cloud running the VNFs. By bringing the VNFs closer to their customers' end devices. By distributing the processing and management of data to micro datacenters at the periphery of the network, closer to the customer end devices. Think of it as content caching 2.0.

The edge cloud will be the first point of contact and a lot of processing will happen there. The telco giants are pushing what’s known as edge computing: where VNFs run on a telco cloud closer to the end user, thereby cutting the distance to a computer making a given decision. These VNFs, distributed across different parts of a network, run at the “edges” of the network.

Because the VNFs run on virtual machines, one could potentially run several such virtual functions on a single hypervisor. Not only does this save on hardware costs, space and power, it also simplifies the process of wiring together different network functions, as it's all done virtually within a single device/server. Service function chaining just got a lot simpler!
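
To make that concrete, here is a toy Python sketch of a service chain once everything is software on one host: the "wiring" between functions is just an ordered list, and the functions themselves are trivial stand-ins for real VNFs.

```python
# Toy sketch of service function chaining once everything is software on one host:
# the "wiring" between network functions becomes an ordered list of callables.
# The functions below are trivial stand-ins for real VNFs.

def firewall(packet):
    return None if packet.get("dst_port") == 23 else packet   # drop telnet

def dpi(packet):
    packet["app"] = "video" if packet.get("dst_port") == 443 else "other"
    return packet

def router(packet):
    packet["next_hop"] = "10.0.0.254"
    return packet

service_chain = [firewall, dpi, router]       # re-ordering the chain is a list edit

def process(packet):
    for vnf in service_chain:
        packet = vnf(packet)
        if packet is None:                    # a VNF in the chain dropped it
            return None
    return packet

print(process({"dst_port": 443}))   # passes the whole chain
print(process({"dst_port": 23}))    # dropped by the firewall -> None
```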

While we can run multiple VNFs on a single server, we can also split the VNFs across different servers to gain additional capacity during demanding periods. The VNFs can scale up, and scale down, dynamically, as the demand ebbs and flows.

This just wasn’t possible in the old world where physical network functions (fancy word for network appliances) were used. The telco operators would usually over-provision the network to optimize around the peak demand.

In the new paradigm we could use artificial intelligence and deep learning algorithms to predict the network demand, and spin up VNFs in advance to meet it.

How can machine learning help?

Virtual Network Functions (VNFs) are easy to deploy, update, monitor, and manage. A VNF is, after all, just a special workload running on a VM, and it takes only a few seconds to spin up a new VM instance.

The number of VNF instances, much like generic computing resources in the cloud, can easily be scaled based on load. Auto-scaling (of resources without human intervention) has been investigated in both academia and industry. Prior studies on auto-scaling use the measured network traffic load to dynamically react to traffic changes.

There are several papers that explore using a Machine Learning (ML) based approach to perform auto-scaling of VNFs in response to dynamic traffic changes. The ML classifiers learn from past VNF scaling decisions and seasonal/spatial behavior of network traffic load to generate scaling decisions ahead of time. This leads to improved QoS and significant cost savings for the Telco operators.
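
To make the idea concrete, here is a minimal Python sketch of proactive scaling. The orchestrator hooks, the capacity-per-VNF figure and the naive seasonal-average predictor are all illustrative stand-ins for the trained ML classifiers those papers describe.

```python
# Minimal sketch of proactive VNF auto-scaling. A real deployment would use a
# trained ML classifier; here a naive seasonal average over past weeks stands in
# for the prediction step, and the numbers are made up.
from collections import defaultdict
from statistics import mean

CAPACITY_PER_VNF_GBPS = 10          # assumed capacity one VNF instance can handle
HEADROOM = 1.2                      # scale for 20% above the predicted load

history = defaultdict(list)         # (weekday, hour) -> list of observed Gbps

def record_load(weekday, hour, gbps):
    history[(weekday, hour)].append(gbps)

def predict_load(weekday, hour):
    """Predict next hour's load from the same hour on previous weeks."""
    samples = history.get((weekday, hour), [])
    return mean(samples) if samples else 0.0

def instances_needed(predicted_gbps):
    return max(1, int(-(-predicted_gbps * HEADROOM // CAPACITY_PER_VNF_GBPS)))

def scale_decision(current_instances, weekday, hour):
    """Return how many instances to add (+) or remove (-) ahead of time."""
    target = instances_needed(predict_load(weekday, hour))
    return target - current_instances

# Example: Mondays at 20:00 have historically carried ~35-45 Gbps.
for gbps in (35, 42, 44, 38):
    record_load("Mon", 20, gbps)
print(scale_decision(current_instances=2, weekday="Mon", hour=20))  # prints 3
```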

In a 2017 Heavy Reading survey, most respondents said that AI/ML would become a critical part of their network operations by 2020. AI/ML and Big data technologies would play a pivotal role in making real time decisions when managing virtualized 5G networks. I had briefly written about it here.

Is Telco Cloud the same as a data center?

Oh, the two are different.

Performance is key in the Telco Cloud. The workloads running on the Telco Cloud are extremely sensitive to delay, packet loss and latency. A lot of hard work goes into ensuring that a packet reaches the VNF as soon as it hits the server's NIC. You don't want the packet to slowly inch upwards through the host's OS (it's almost always Linux) before it finally reaches the VM hosting the VNF.

In a datacenter running regular enterprise workloads, a few milliseconds of delay may still be acceptable. However, on a telco cloud running a VNF, such a delay can be catastrophic.

Linux and its networking stack are optimized for general-purpose computing. This means that achieving high-performance networking inside the Linux kernel is not easy; it requires some, ok quite a bit, of customization and hacks to get past the 50K packets-per-second figure that's often (incorrectly) cited as an upper limit for Linux kernel performance. Routing packets through the kernel may work for regular data center workloads.

However, the VNFs need something better.

Because the Linux kernel's networking path is too slow for this, we need to bypass the kernel completely.

One could start with SR-IOV.

Very simply, with SR-IOV, a VM hosting the VNF gets direct access to a subset of the PCIe resources on a physical NIC. With an SR-IOV compliant driver, the VNF can DMA (Direct Memory Access) outgoing packets directly to the NIC hardware to achieve higher throughput and lower latency. DMA from the device into the VM's memory does not compromise the safety of the underlying hardware: Intel's I/O virtualization technology (VT-d) supports DMA and interrupt remapping, which restricts the NIC hardware to the subset of physical memory allocated to that particular VM. No hypervisor interaction is needed except for interrupt processing.
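
For the curious, on Linux the virtual functions (VFs) are typically carved out of the physical function via sysfs. Here is a minimal sketch of that step in Python; the interface name and VF count are illustrative, and you would run it as root on an SR-IOV capable NIC.

```python
# Minimal sketch: carve SR-IOV virtual functions (VFs) out of a physical NIC on
# Linux via sysfs, so that each VF can be passed through to a VM hosting a VNF.
# The interface name and VF count below are illustrative; run as root.
from pathlib import Path

PF = "enp3s0f0"                      # physical function (assumed SR-IOV capable)
WANTED_VFS = 4

base = Path(f"/sys/class/net/{PF}/device")
total = int((base / "sriov_totalvfs").read_text())   # max VFs the NIC supports
if WANTED_VFS > total:
    raise SystemExit(f"{PF} supports at most {total} VFs")

# Writing 0 first is a common precaution: the VF count can only be changed from 0.
(base / "sriov_numvfs").write_text("0")
(base / "sriov_numvfs").write_text(str(WANTED_VFS))

# Each VF now shows up as a PCI device (virtfn0, virtfn1, ...) that a hypervisor
# can hand to a VM, giving the VNF a direct DMA path to the NIC.
for vf in sorted(base.glob("virtfn*")):
    print(vf.name, "->", vf.resolve().name)
```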

However, there is a problem with SR-IOV.

Since the packet coming from the VNF goes out of the NIC unmodified, the telco operators would need some other hardware switch, or some other entity, to slap the VXLAN or other tunneling headers on top of the data packet so that it can reach the right remote VM. You need a local VTEP that all these packets hit when they come out of the NIC.

Having the VTEP outside complicates the design. The operators would like to push the VTEP into the compute node, and have a plain IP fabric that only does IP routing. There are ways to solve this problem as well, but SR-IOV limits the potential migration of the VM hosting the VNF from one physical server to another. This is a big problem: if the VM gets pinned to a server, then we lose the flexibility and the agility that we had spoken of before.
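
To see what that VTEP actually has to do, here is a rough sketch of the encapsulation step itself: prepend the 8-byte VXLAN header (per RFC 7348) and carry the frame over UDP port 4789 to the remote VTEP. The addresses and the VNI are made up; a real VTEP does this per flow and in the fast path.

```python
# Rough sketch of what a VTEP does on transmit: wrap the VM's Ethernet frame in a
# VXLAN header (RFC 7348) so it can be carried over UDP/IP to the remote VTEP.
import socket
import struct

VXLAN_PORT = 4789

def vxlan_encap(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header: flags (I bit set), 24-bit VNI."""
    flags = 0x08 << 24                     # 'I' flag: VNI is valid
    header = struct.pack("!II", flags, vni << 8)
    return header + inner_frame

def send_to_remote_vtep(inner_frame: bytes, vni: int, remote_vtep_ip: str):
    """Ship the encapsulated frame to the remote VTEP over UDP."""
    payload = vxlan_encap(inner_frame, vni)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, (remote_vtep_ip, VXLAN_PORT))
    sock.close()

# Illustrative only: a dummy inner frame tagged with VNI 5001.
send_to_remote_vtep(b"\x00" * 64, vni=5001, remote_vtep_ip="192.0.2.10")
```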

Can something else be used?

Yes.

There’s a bunch of kernel bypass techniques, and I’ll only look at a few.

Intel DPDK (Data Plane Development Kit) has been used in some solutions to bypass the kernel, and then there are new emerging initiatives such as FD.io (Fast Data Input Output) based on VPP (Vector Packet Processing). More will likely emerge in the future.

DPDK and FD.io move networking into Linux user space to address both speed and technology plug-in requirements. Since these are built in the Linux user space, there are no changes to the Linux kernel. This eliminates the extra effort required to convince the Linux kernel community about the usefulness of the patches, so their adoption can be accelerated.

DPDK bypasses the Linux kernel and manages the NIC and CPU assignment directly. It uses up some CPU cores for network processing: it has threads that handle the receiving and processing of packets from the assigned receive queues. They do this in a tight loop, and anything interrupting these threads can cause packets to be dropped. That is why these threads must run on dedicated CPU cores; that is, no other threads — including the various Linux kernel tasks — in the system should run on those cores.
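
DPDK itself is C, but the core idea (pin a worker to a dedicated core and busy-poll, never sleep, never take an interrupt) can be sketched in a few lines of Python, assuming Linux and using os.sched_setaffinity for the pinning. The rx_burst and process functions below are placeholders for the real poll-mode driver calls; this is purely illustrative.

```python
# Conceptual sketch of a DPDK-style poll-mode worker: pin a thread to a dedicated
# CPU core and spin in a tight loop pulling packet bursts, never sleeping and never
# taking interrupts. rx_burst()/process() are placeholders for the real poll-mode
# driver calls; core and queue ids are illustrative. Linux only.
import os
import threading

def rx_burst(queue_id, max_burst=32):
    """Placeholder for the PMD call that drains up to max_burst packets."""
    return []

def process(packet):
    """Placeholder for the actual VNF packet processing."""
    pass

def pmd_worker(core_id, queue_id, stop_event):
    # Pin this worker to its dedicated core; nothing else should run there.
    os.sched_setaffinity(0, {core_id})
    while not stop_event.is_set():
        for pkt in rx_burst(queue_id):      # busy-poll: no interrupts, no sleeping
            process(pkt)

stop = threading.Event()
worker = threading.Thread(target=pmd_worker, args=(2, 0, stop), daemon=True)
worker.start()
# ... run until shutdown ...
stop.set()
```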

Telcos consider this a "waste" of their CPU cores. Cores that could have run the VNFs have now been hijacked by DPDK to process packets from the NICs. It's also questioned whether we can get a throughput of 100Gbps and beyond with DPDK and other kernel bypass techniques. It might be asinine to dedicate 30 CPU cores in a 32-core server to packet processing, leaving only 2 cores for the VNF.

Looks almost impossible to get the 100Gbps+ that's needed for NFV.

Fortunately, no — things are a lot better.

Enter SmartNICs — the brainier cousins of regular NICs, or rather NICs on steroids. There are a lot more brains in the modern NIC – or at least in some of them – than we might realize. They take offloading capabilities to a whole new level. The NIC vendors are packing a lot of processors into their NIC ASICs to beef up their intelligence. Mellanox's ConnectX-5 adapter card, which is widely deployed by hyperscalers, has six different processors built into it that were designed by Mellanox.

Ok, so these are not CPUs in the normal sense, the ones you and I understand. These are purpose-built to let the NIC, for instance, look at fields in the incoming packets, look at the IP headers, the Layer 2 headers and the outer encapsulated tunnel headers for VXLAN, do flow identification from that, and then finally take actions.

This is history repeating itself.

Many many years ago, when dinosaurs still ruled the Earth, Cisco would use a MIPS processor to forward packets in software. And then the asteroid hit the Earth, and Cisco realized that to make the packet routing and forwarding more efficient and for it to scale, they needed custom ASICs, and they started making chips to forward packets.

This is exactly what is now happening in the Telco Cloud space. Open vSwitch was pure software that steered data between individual virtual machines and routed it, but the performance and scalability were bad enough that companies started asking why some of the processing couldn't be offloaded to the hardware. Perhaps down to the NIC, if you will. And that's what the latest and greatest SmartNICs do. You can download the OVS rules onto the NIC so that you completely bypass the Linux kernel and do all that heavy lifting in hardware.

DPDK and SmartNICs are very interesting and warrant a separate post, which I will write in some time.

So, what is the conclusion?

Aha, I meandered. I often do when I'm very excited.

The Internet traffic is exploding. It’s nowhere near the saturation point, and will increase manifold with 5G and other technologies coming in.

The telco network can only scale if it's virtualized, à la the telco cloud. A pure hardware-based, old-style network, especially at the edges, will fail miserably. It will not be able to keep pace with the rapid changes that 5G will bring in. Pure hardware will still rule in the network core, but not at the edge. The edge cloud is where most of the innovation (AI/ML, kernel bypass in software, SmartNICs, newer offloading capabilities, etc.) will take place.

Telco cloud is possible. We have all the building blocks, today. We have the technology to virtualize, and to ship packets at 100-200 Gbps to (and from) the VMs running the VNFs. Imagine the throughput that a rack full of commodity x86 servers, each doing 200Gbps, will get you.

I am very excited about the technology trendlines, and the fact that what we're working on at Nuage Networks is completely in line with where the networking industry is headed.

I am thoroughly enjoying this joyride. How about you?

Software defined WAN (SD-WAN) is really about Intelligence ..

Let's admit that most of us in the networking domain know as much about SD-WAN as an average 6th grader knows about fluid mechanics — which is to say, pretty much nothing. We take it to be something much grander and more exotic than it really is, and are obviously surrounded by friends and well-wishers who wink conspiratorially that they "know it all" and consider themselves on an intellectual high ground from which to educate us on matters of this rich and riveting biological social interaction. Like most others at that tender and impressionable age, I did get swayed by what I heard, and it's only later that I was able to sort things out in my head, till it all became somewhat clear.

The proverbial clock's been wound backwards, and I experience that feeling of déjà vu each time I read an article on SD-WAN that either extols its virtues or vilifies it as something that has always existed and is being speciously served on a platter, dressed up as something it is not. And, like the big boys back then, there are the men-who-know-it-all, who have already written SD-WAN off as something that has always existed and really presents nothing new. Clearly, I disagree with that view.

I presume, perhaps a trifle rashly, that you are already aware of the basic concepts of SDN and NFV (and this), and hence won't waste any more oxygen explaining those.

So what really is SD-WAN, and what is the precise problem it's trying to solve?

SD-WAN is a way of architecting, designing and deploying enterprise WANs using commodity Internet connections, in a manner that makes them "magically" appear as a private, "MPLS-like" connection. It's the claim that it can appear "MPLS-like" that really peeves the regular-big-MPLS-vendors-and-consultants. I will delve into the "MPLS-like" aspect a little later, so please hold on to your sabers till then. What makes the "magic" work is the control plane, which implements and enforces the network access policies (VoIP is high priority/low latency/low jitter, big-data sync is medium priority and all else is low priority, no VoIP via Afghanistan, etc.), and the data plane, which weaves an L2/L3 overlay on top of the existing consumer-grade Internet links (broadband links and, in a few cases, LTE/4G connections).

The SD-WAN evangelists want to wean enterprises off their dedicated, prohibitively priced private WAN connections (read: MPLS circuits) with commodity enterprise broadband links. Philosophically, adding a new branch should just mean shipping a CPE device (perhaps in a virtualized form factor) that auto-magically dials into a central controller when brought to life. Once that's done and the credentials verified, the branch should just come online (voila!) and should be visible to all the geo-separated branches. Contrast this with the provisioning time (which can go as high as a year in some remote locations) and the complexity it takes to get a remote branch online today with MPLS, and you will understand why most IT folks have ulcers and are perennially on anti-anxiety/depressant medicines. And btw, we've not even begun talking about the expenses and long-term contracts that come with MPLS connections!

Typically, SD-WAN solutions have a central SDN controller, which is really a cluster of x86 devices (servers, VMs, containers, take your pick) and hence has far more computing and analytical horsepower than a dedicated HW network device. The controller has complete visibility right from the source all the way to the destination, and can constantly analyze traffic and carve out optimal network paths for applications and individual flows based on the user and application policies. In the first mile, the Internet links are either coalesced to form a fatter pipe or used separately, as dictated by the customer policies. The customer traffic is continuously fingerprinted and routed dynamically based on real-time network conditions.
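
Here is a toy sketch of the kind of decision the controller keeps making for every application and flow. The links, the measurements and the policy thresholds are all made up; a real controller obviously does this continuously and at scale.

```python
# Toy sketch of SD-WAN path selection: pick the uplink that satisfies an
# application's policy, given live per-link measurements. Link names, numbers
# and policy thresholds are all made up for illustration.

links = {
    "broadband-1": {"latency_ms": 28, "jitter_ms": 4,  "loss_pct": 0.2, "cost": 1},
    "broadband-2": {"latency_ms": 45, "jitter_ms": 12, "loss_pct": 0.8, "cost": 1},
    "lte-backup":  {"latency_ms": 60, "jitter_ms": 20, "loss_pct": 1.5, "cost": 5},
}

policies = {
    "voip":     {"max_latency_ms": 40,  "max_jitter_ms": 10, "max_loss_pct": 1.0},
    "big-data": {"max_latency_ms": 200, "max_jitter_ms": 50, "max_loss_pct": 2.0},
}

def pick_link(app):
    policy = policies[app]
    eligible = [
        (name, m) for name, m in links.items()
        if m["latency_ms"] <= policy["max_latency_ms"]
        and m["jitter_ms"] <= policy["max_jitter_ms"]
        and m["loss_pct"] <= policy["max_loss_pct"]
    ]
    if not eligible:
        return None                       # policy says drop, or fall back to MPLS
    # Among eligible links, prefer the cheapest, then the lowest latency.
    return min(eligible, key=lambda x: (x[1]["cost"], x[1]["latency_ms"]))[0]

print(pick_link("voip"))       # -> broadband-1
print(pick_link("big-data"))   # -> broadband-1 (cheapest eligible)
```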

Where most people go wrong is when they believe that SD-WAN solutions lose control over the traffic once it leaves the customer premises or the SD-WAN edge node. Bear in mind that there is nothing in the SD-WAN technology that prevents further control over how the traffic is routed and this could perhaps be one aspect differentiating one SD-WAN offering from the other. Since SD-WAN is an overlay technology you will not have control over each physical hop, but you can surely do something more nuanced given the application and end-to-end network visibility that exists with the controller.

MPLS and SD-WAN !

It's "MPLS-like" in the sense that you can, in most cases, guarantee the available bandwidth and network uptime. The central controller can monitor each overlay circuit for loss/jitter/delay and can take corrective action when routing traffic. Patently, enterprise broadband connections in certain geographies don't come with the same level of reliability as MPLS, and it behooves us to ask ourselves if we really need that level of reliability (given the cost that we pay for such connections). An enterprise can always hedge its risks by commissioning a few backup enterprise broadband connections for those rainy days when the primary is out cold. Alternatively, enterprises can go in for a hybrid approach where they maintain a low-bandwidth MPLS connection for their mission-critical traffic and use the SD-WAN solution for everything else, OR implement a policy to revert to the MPLS connection when the Internet connections are not performing satisfactorily. This can also provide a plausible transition strategy for enterprises that may not be comfortable switching to SD-WAN, given that the technology is still relatively new.

And do note that even MPLS connections go down, so it's really not fair to say that SD-WAN solutions stand on tenuous ground with regard to reliability. Yes, I concede that there are SLAs with MPLS that just don't exist with regular Internet pipes. However, one could argue that you can get some extra reliability by throwing in an additional Internet link (with a different provider?) that's only there as a standby. Also note that with service providers now offering fiber connections, the size and quality of Internet links is only going to improve with time. A large site, for instance, can aggregate a 1Gbps Google Fiber and a 1Gbps Verizon FIOS connection and retain a small MPLS connection as the standby. If the enterprise discovers that its MPLS connection is underutilized, it can negotiate on pricing or go with a lower-bandwidth MPLS pipe and thereby save on its costs.

I recently read a blog which argued that an enterprise broadband link promising 350Mbps would mostly give only around 320Mbps on average. Sure, this might be true in a few geographies, but seriously, who cares? Given the cost difference between a broadband connection and an MPLS circuit, I will gladly assume that I only had a 300Mbps connection and derive utmost pleasure any time it gives me anything more than that!

The central controller in SD-WAN technologies, amongst other things (analyzing traffic and links), can also continually learn about the customer's network conditions, predict when link quality will deteriorate, and preemptively reroute traffic before the links start acting up. Given that the controller is monitoring paths end-to-end and is also monitoring and analyzing the traffic emanating from the branch sites, there are insights that enterprises can draw that they could never have imagined when using traditional WAN architectures, since in that world all connections are really only "dumb pipes". SD-WAN changes all that — it changes how the enterprise connections and the applications running over them are viewed. The WAN architecture is aligned to the application service requirements, and its management is greatly simplified. You can implement complex network policies and let the SD-WAN infrastructure sweat on your behalf (HINT: intent-driven networking).

So watch out before you disdainfully write off SD-WAN as a technology that's merely replacing your dumb MPLS pipes with regular Internet connections, since, I argue, it can really do a lot more than that. Perhaps a topic worth discussing some other day.

NFV – CPE vendors MUST evolve!

Customer Premises Equipment (CPE) devices have always been a pain point for service providers. One, they need to be installed in very large numbers (surely you remember the truck rolls that need to be sent out); and two, more importantly, they get more complex and costlier with time. As services and technology evolve, they need to be replaced with something uglier and meaner than what existed before. In a large network, managing all the CPEs — right from configuration, activation, monitoring and upgrading to efficiently adding more services — in itself becomes a full-time job (and not the most satisfying one, I must add).

Hate CPEs

ETSI's Use Case #2 describes how the CPE device can be virtualized. The idea is to replace the physical CPE, with all the services it supports, with an industry-standard server that is cheaper and easier to manage. Doing this can reduce the number and complexity of the CPE devices that need to be installed at customer sites.

The jury is still out on the specific functions that can be moved out of the CPE. What everybody clearly agrees on is the need for a device that physically connects the customer to the network; there will hence always be a device at the customer premises. If we can virtualize most of the things that a CPE does, then this device could be a plain L2 switch that takes packets from the customer and pushes them towards the network side.

So what do we gain by CPE virtualization?

You reduce the number of devices deployed at customer premises. Most enterprise customers, when adding new services, add more devices beyond the access point/demarcation device or NID. If the functions served by those devices can be virtualized, then you don't need to add those extra devices.

In residential markets, we can completely remove the set-top boxes (including the storage for video recorders) and the Layer 3 functions provided by home gateways, as these functions can be virtualized (on standard servers driven by highly scalable cloud-based software), leaving each home with a plain L2 switch. This apparently is already underway as we speak.

Since each site has a vanilla L2 switch, you don't need to replace it as long as it's potent enough to handle the incoming traffic onslaught. And since all the intelligence resides in the standard server, that server can easily be replaced or upgraded without involving the dreaded truck rolls.

Truck rolls

Your engineers don't have to visit customer premises for upgrades. Since most of the services are hosted in the cloud, all upgrades happen at the hosting location or the data center. Even if the virtualized services are deployed at the customer premises, you don't have to upgrade each CPE device; it's only the server at the customer premises that needs an upgrade.

Newer services and applications can be introduced easily, since they can be tested out at the hosting location or the data center. You don't have to worry about trying them out on all the different CPE devices. The barrier to entry into the network is suddenly lower, since the legacy CPE equipment doesn't need to be replaced. It also helps avoid vendor lock-in, if all CPE devices are plain L2 switches and all the "work" is being done in software on standard servers.

Scaling up becomes less of a headache. BGP routers, as they start scaling, run out of control plane memory long before they hit their data plane limits. If the control plane has been virtualized, then it's much easier to address this problem than when BGP is running on physical routers.

There are several vendors pushing for CPE virtualization. If you're a CPE vendor who believes that your services are far too complex to be virtualized, then beware that things are moving very fast in the NFV space. I had earlier posted about how virtual routers can replace existing hardware here. It's fairly easy to imagine CPEs going virtual — from being high-end devices to easily commoditized L2 switches! So if you don't evolve fast, you run the risk of going extinct!

NFV: Will vRouters ever replace hardware routers?


When I started looking at NFV, I always imagined it being relegated to places in the network that would receive only teeny-weeny amounts of data traffic, since the commodity hardware and software could only handle so much traffic. I also naively believed that it would be deployed in networks where customers were not uber-sensitive to latency and delay (broadband customers, etc.). So if somebody really wanted a loud bang for their buck, they had to use specialized hardware to support the network function. You couldn't really use Intel x86-based servers running software to serve customers for whom QoS and QoE were critical and vital. The two examples that leap to my mind are (i) Evolved Packet Core (EPC) functions such as the Mobility Management Entity (MME), and (ii) BNG environments where users need to be authorized before they can expect to receive any meaningful services.

While I understood that servers were getting more powerful and Intel was doing its bit with its Data Plane Development Kit (DPDK) architecture, it didn't occur to me till recently that we would be seeing servers handling traffic at 10G+ line rate. Vyatta, now a Brocade company, uses vRouters to implement real network functions. Vyatta started with its modest 5400 vRouter, which could only handle 1G worth of traffic at line rate. But then last year it announced the 5600 vRouter, which takes advantage of Intel multi-core and DPDK architecture to achieve 10x+ performance. Essentially, DPDK drastically improves performance by passing packets from the NIC directly to the code running in user space, completely bypassing the kernel's network stack and thus speeding up packet processing. Amongst other things, it also supports a lockless FIFO implementation for packet enqueue/dequeue, as semaphores and spinlocks are expensive.

The Vyatta 5600 vRouter can be installed on pretty much any x86-based server and can support a number of network functions such as dynamic routing, policy-based routing, firewalls, VPNs, etc. Vyatta redesigned its software to make use of multiple cores — the control plane runs on one core, while the data plane is distributed across multiple cores. Using a 4-core processor, they ran the control plane on one core, and three instances of line traffic were handled by the remaining three cores. This way Vyatta was able to handle 10G of traffic through a single processor.

Now imagine putting 3-4 such x86-based servers in a network. If you can split the data traffic equitably (and we will look at how in some other blog post), you can achieve close to 30-40G of throughput.
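
The "split the data traffic equitably" bit is essentially ECMP-style flow hashing. A small sketch, with made-up servers and flows: hash the 5-tuple so that a flow sticks to one server while flows as a whole spread evenly.

```python
# Small sketch of splitting traffic equitably across several x86 servers:
# hash each flow's 5-tuple so that all packets of a flow stick to one server
# (ECMP-style), while flows overall spread evenly. Servers and flows are made up.
import hashlib

servers = ["vrouter-1", "vrouter-2", "vrouter-3", "vrouter-4"]

def pick_server(src_ip, dst_ip, proto, src_port, dst_port):
    five_tuple = f"{src_ip},{dst_ip},{proto},{src_port},{dst_port}".encode()
    digest = hashlib.sha256(five_tuple).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

flows = [
    ("10.0.0.1", "203.0.113.7", "tcp", 51000 + i, 443) for i in range(8)
]
for flow in flows:
    print(flow[3], "->", pick_server(*flow))
```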

A few weeks ago, Wind River announced its new accelerated virtual switch (vSwitch) that could deliver 12 million packets per second to guest virtual machines (VMs) using only two processor cores on an industry-standard server platform, in a real-world use case involving bidirectional traffic.

Many people believe that NFV is best suited to be deployed at the edge of the network — basically close to the customers — and that it isn't yet ready for the core, or for places where the traffic volumes are high or the latency tolerance is low. I agree with this, and covered this aspect in great detail here.

What this shows is that it's patently possible for virtual routers to run at speeds comparable to regular hardware-based routers, and to replace them. This augurs well for NFV, since it means it can be deployed in many more places in the carrier network than most skeptics believed till some time back.

NFV and SDN – The death knell for the huge clunky routers?

At the last IETF I ran into a couple of hallway discussions where folks were having a lively debate on whether Network Function Virtualization (NFV) and Software Defined Networking (SDN) will eventually sound the death knell for huge clunky hardware vendors like Cisco, Juniper, Alcatel-Lucent, etc. I was quickly apprised of some Wall Street analyst's report that projected a significant drop in Cisco's revenue over the next couple of years as service providers moved to SDN and NFV solutions. I heard claims about how physical routers (that I so lovingly build at AlaLu) will get replaced by virtual routers (vRouters) and other server-based software that even small startups could build. The barrier to entry in the service provider markets had suddenly been lowered, and the monopoly of the big three was being ominously challenged. There was talk about capex spending reductions happening in the service provider networks, and how a few operators were holding on to their purchase orders to see how the SDN and NFV story unfolded. There was then a different camp that believed that while SDN and NFV promised several things, it would take time before things got really deployed and started affecting capex spending and OEMs' revenues.

So what's the deal?

Based on my conversations with several folks actively looking into SDN/NFV, and a good bit of reading, I understand that operators are NOT interested in replacing their edge aggregation and core routers with software-driven vRouters. They still want to continue with those huge clunky beasts with full control plane intelligence embedded alongside their packet-pushing data plane. These routers are required to respond to network events in real time (remember FRR?) to prevent outages and slowdowns. Despite all the performance improvements, general-purpose processors can typically process no more than 2-3 Gbps per core (Intel, with the DPDK module and APIs for Open vSwitch, promises better throughput), which is embarrassingly slow when compared to the 400-600 Gbps throughput that's possible with NPUs and ASICs today. Additionally, routers using non-Ethernet ports (DSL, PON, coherent optical, etc.) cannot be easily virtualized, since general-purpose CPUs cannot perform the network functions along with the DSP processing required to support these ports.

So while a mobile gateway that essentially forwards packets can be virtualized, it would only make sense to do this where the amount of traffic it's handling is relatively small.

So where can we deploy these NFV-controlled, server-based vRouters?

The Provider Edge (PE) router does several things today, a few of which could easily be moved out and implemented on standard server hardware. ETSI's NFV Use Cases document (case #2) identifies the vPE as a potential NFV use case. The "PE" router in the MPLS world connects the customer edge (CE) router at the customer premises to the P routers in the provider network. The PE router serves as the service delimiter, where it provides L3 VPNs, VPLS, VLL, CDNs and other services to the customers.

The ETSI NFV use-case document (case #2) describes how enterprises are deploying multiple services in branch offices; several of these enterprises use dedicated standalone appliances to provide these services (firewalls, IDS/IPS, WAN optimization, etc), which is “cost prohibitive, inflexible, slow to install and difficult to maintain”.

As a result, many enterprises are looking at outsourcing the virtualization of enterprise CPE (access router) into the operator’s network.

Increased capex and opex pressure is pushing enterprises and providers to look at the virtualization capabilities made possible by NFV. So, let's look at what can be virtualized with NFV.

The ETSI NFV use-case document states that "Traditional IP routers based on custom hardware and software are amongst the most capital-intensive portions of service-provider infrastructure. PE routers run out of control plane resources before they run out of data plane resources and virtualization of control plane functions improves scalability."

It further states that moving some of the control plane to equivalent functionality implemented in standard commercial servers deploying NFV can result in significant savings.

The figure below gives an idea of the components that can be moved out of the PE router and onto an NFV-powered server.

Network functions/services that can be offloaded from the PE router

If we're able to push out the functions/services shown in the figure above, the PE router effectively gets reduced to a device that's mainly pushing packets out, with the vPE becoming the device for service delivery. NFV appears to be most effective at the edge of the network where the customers are served — this also happens to be mostly Ethernet, which works in NFV's favor since other port types cannot be served as effectively.

Operators believe NFV can be used for mobile packet core functions for 3G and EPC. LTE operators believe that while basic packet-pushing functions must still reside in the routers, the other ancillary functions that have been added to the routers over time are good candidates for NFV. We can keep BRAS, firewalls, IDS, WAN optimizers, and other service functions separate and use the physical router for merely forwarding the packets.

Clearly, the vPE can handle many network functions that are currently performed by conventional physical routers. While the PE may still handle pushing the packets, the intelligence for many of the services typically handled by the PE can be moved to the vPE. This is a paradigm shift from what PE routers have been doing all this while. The network functions and services that can be moved to the vPE are:

  • Mobile packet core functions for 3G and LTE EPC
  • Firewalls (FW) and IDS/IPS (Intrusion Detection and Intrusion Prevention systems)
  • Deep Packet Inspection (DPI)
  • CDNs (content delivery networks) and caching
  • IP VPNs – control plane to set up the MPLS VPNs
  • VLLs and VPLS – control plane to set up the MPLS VPNs

These functions can be virtualized to run either on servers under NFV or be SDN-controlled. Where they reside in the network will depend upon the QoS and QoE (Quality of Experience) required by the customers. If latency and speed are an issue, the functions should reside in servers close to the customers. But if latency is not an issue, the functions could reside deep in the provider network or in a remote data center.

Conclusion

Operators will deploy NFV and SDN, and that will impact their buying decisions. It's clear that they will not be replacing their core and edge aggregation routers with NFV-driven software solutions. Instead, NFV will be used at the edge to offload service functions from the hardware PE router onto servers running the vPE in an NFV environment, to deliver new services agilely to end users and generate higher revenue.

There is thus no need for the Ciscos, Junipers and AlaLus of the world to worry about falling revenues, since the NFV-powered solutions are not targeting their highest-margin businesses — at least not yet!

OpenFlow, Controllers – Whats missing in Routing Protocols today?

There is a lot of hype around OpenFlow as a technology and as a protocol these days. Some envision this to be the most exciting innovation in the networking industry since vacuum tubes, diodes and transistors were miniaturized to form integrated circuits. This is obviously an exaggeration, but you get the drift, right?

The idea in itself is quite radical. It changes the classical IP forwarding model from one where all decisions are distributed to one where there is a centralized beast – the controller – that takes the forwarding decisions and pushes that state to all the devices (could be routers, switches, WiFi access points, remote access devices such as CPEs) in the network.

Before we get into the details, let’s look at the main components – the Management, Control and the Forwarding (Data) plane – of a networking device. The Management plane is used to manage (CLI, loading firmware, etc) and monitor the device through its connection to the network and also coordinates functions between the Control and the Forwarding plane. Examples of protocols processed in the management plane are SNMP, Telnet, HTTP, Secure HTTP (HTTPS), and SSH.

The Forwarding plane is responsible for forwarding frames – it receives frames from an ingress port, processes them, and sends those out on an egress port based on what’s programmed in the forwarding tables. The Control plane gathers and maintains network topology information, and passes it to the forwarding plane so that it knows where to forward the received frames. It’s in here that we run OSPF, LDP, BGP, STP, TRILL, etc – basically, whatever it takes us to program the forwarding tables.

Routing protocols gather information about all the devices and routes in the network and populate the Routing Information Base (RIB) with that information. The RIB then selects the best route from among all the routing protocols and populates the forwarding tables – and Routing thus becomes Forwarding.
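
In concrete terms, that selection step looks something like the toy sketch below: several protocols offer routes for the same prefix, and the RIB picks a winner (by protocol preference, then metric) to install into the FIB. The preference values are typical defaults and vary by vendor.

```python
# Toy sketch of RIB -> FIB selection: several protocols offer routes for the same
# prefix; the RIB picks one per prefix (lower protocol preference wins, then lower
# metric) and installs it into the forwarding table. Preference values are typical
# defaults but vary by vendor; routes are made up.
PREFERENCE = {"connected": 0, "static": 1, "ospf": 110, "isis": 115, "bgp": 170}

rib = [
    {"prefix": "10.1.0.0/16", "protocol": "ospf", "metric": 20, "next_hop": "192.0.2.1"},
    {"prefix": "10.1.0.0/16", "protocol": "bgp",  "metric": 0,  "next_hop": "192.0.2.9"},
    {"prefix": "10.2.0.0/16", "protocol": "isis", "metric": 30, "next_hop": "192.0.2.5"},
]

def build_fib(rib_entries):
    fib = {}
    for route in rib_entries:
        key = route["prefix"]
        best = fib.get(key)
        if best is None or (PREFERENCE[route["protocol"]], route["metric"]) < (
            PREFERENCE[best["protocol"]], best["metric"]
        ):
            fib[key] = route
    return {p: r["next_hop"] for p, r in fib.items()}

print(build_fib(rib))   # 10.1.0.0/16 via 192.0.2.1 (OSPF wins over BGP)
```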

So far, so good.

The question that keeps coming up is whether our routing protocols are good enough. Are ISIS, OSPF, BGP, STP, etc. the only protocols that we can use today to map the paths in the network? Are there other, better options – can we do better than what we have today?

Note that these protocols were designed more than 20 years ago (STP was invented in 1985 and the first version of OSPF in 1989), with the mathematics behind them going back even further. The code that we have running in our networks is highly reliable, practical, proven to be scalable – and it works. So, the question before us is – are there other, alternate, more efficient ways to program the network?

Let's start with what's good in the routing protocols today.

They are reliable – we've had them for the last 20+ years. They have proven themselves to be workable, and the code that we use to run them has proven itself to be reliable. There wouldn't be an Internet if these protocols weren't working.

They are deterministic, in that we know and understand them and they are highly predictable – we have experience with them. So when we configure OSPF, we know exactly what it will end up doing and exactly how it will work – there are no surprises.

Also important about today's protocols is that they are self-healing. In a network where there are multiple paths between the source and the destination, the loss of an interface or a device causes the network to self-heal. It will autonomously discover alternate paths and begin to forward frames along the secondary path. While this may not necessarily be the best path, the frames will get delivered.

We can also say that today's protocols are scalable. BGP has certainly proven itself to run at the Internet's scale, with an extraordinarily large number of routes. ISIS has, as per local folklore, proven to be more scalable than OSPF. Trust me when I say that the scalability aspect is not a limitation of the protocol, but rather a limitation of, perhaps, the implementation. More on this here.

And like everything else in the world, there are certain things that are not so good.

Routing protocols work on the idea that if you have a room full of people and you want them to agree on something, then they must speak the same language. This means that if we're running OSPFv3, then all the devices in the network must run the exact same version of OSPFv3 and must understand the same things. Throw in a lot of different devices with varying capabilities, and they must all still support OSPFv3 if they want to be heard.

Most of the protocols are change-resistant, i.e., we find it very difficult to extend OSPFv2 to, say, introduce newer types of LSAs. We find it difficult to make enhancements to STP to make it better, faster, more scalable, or to add more features. Nobody wants to radically change the design of these protocols.

Another argument that's often made is that the metrics used by these protocols are really not good enough. BGP, for example, considers the entire AS as one hop. In OSPF and ISIS, the metrics are a function of the bandwidth of the link. But is bandwidth really the best input for calculating an interface metric to feed into the computation that selects the best path?

When OSPF and all the other routing protocols that we use today were designed and built, they were never designed to forward data packets while they were still re-converging. They were designed to drop data, as that was the right thing to do at that time: the mathematical computations/algorithms took long enough that it was more important to avoid loops by dropping packets. To cite an example, when OSPF comes up, it installs routes only after it has exchanged the entire LSDB with its neighbors and has reached the FULL state. Given the volume of ancillary data that OSPF today exchanges via Opaque LSAs, this design is overkill, and folks at the IETF are already working on addressing this.

We also have poor multipath ability with our current protocols. We can load-balance across multiple interfaces, but we have problems with the return path, which does not necessarily come back the way we wanted. We work around that to some extent with network designs that adapt to it.

Current routing protocols forward data based on the destination address only. We send traffic to 192.168.1.1, but we don't care where it came from. In truth, as networks get more complex and applications get more sophisticated, we need a way to route by source as well as by destination. We need to be able to do more sophisticated forwarding. Is it enough to write somebody's address on an envelope, drop it in a post box and let it go, in the hope that it gets there? Shouldn't it also say, "Hey, this message is from the electricity department; it can go at a lower priority than, say, a birthday card from grandma, which goes at a higher priority"? They all go to the same address, but do we want to treat them with the same priority?
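
That richer forwarding decision is really just a richer match. A toy sketch, with made-up prefixes and traffic classes: rules key on source, destination and application, and carry an action, which, not coincidentally, is exactly the shape of the flow tables that OpenFlow programs, as we'll see shortly.

```python
# Toy sketch of forwarding on more than the destination: rules match on
# (source, destination, application) and carry an action, much like a flow table.
# Addresses and traffic classes are made up.
import ipaddress

rules = [
    # (src prefix, dst prefix, app) -> action
    (("203.0.113.0/24", "0.0.0.0/0", "billing"), "queue-low-priority"),
    (("0.0.0.0/0",      "0.0.0.0/0", "video"),   "queue-high-priority"),
    (("0.0.0.0/0",      "0.0.0.0/0", "any"),     "queue-best-effort"),
]

def classify(src, dst, app):
    for (src_pfx, dst_pfx, rule_app), action in rules:
        if (ipaddress.ip_address(src) in ipaddress.ip_network(src_pfx)
                and ipaddress.ip_address(dst) in ipaddress.ip_network(dst_pfx)
                and rule_app in (app, "any")):
            return action                     # first match wins
    return "drop"

print(classify("203.0.113.9", "192.168.1.1", "billing"))  # queue-low-priority
print(classify("198.51.100.4", "192.168.1.1", "video"))   # queue-high-priority
```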

So the question is: are our current protocols good enough? The answer is of course yes, but they do have some weaknesses, and that's the part which has been driving the next generation of networking, a part of which is where OpenFlow comes in ..

If we want to replace the routing protocols (OSPF, STP, LDP, RSVP-TE, etc.) then we need something to replace them with. We've seen that routing protocols have only one purpose for their existence, and that's to update the forwarding tables in the networking devices. The software that runs the whole system today is reasonably complex, i.e., software like OSPF, LDP, BGP and multicast is all sitting inside each device in an attempt to load the data into the forwarding tables. So a reasonably complex Control Plane layer sits inside each device in the network to load the correct data into the forwarding tables, so that correct forwarding decisions are taken.

Now imagine for a moment that we can replace all this Control Plane with some central controller that can update the forwarding tables on all the devices in the network. This is essentially the OpenFlow idea, or the OpenFlow model.

In the OpenFlow model there is an OpenFlow controller that sends the forwarding table data to the OpenFlow client in each device. The device firmware then loads that into the forwarding path. So now we've taken all that complexity around the Control Plane in the networking device and replaced it with a simple client that merely receives and processes data from the controller. The OpenFlow controller loads data directly into the OpenFlow client, which then loads it directly into the FIB. In this situation the only software in the device is the chip firmware to load the data into the FIB or TCAM memories and to run the simple device management functions: the CLI, the flash, and monitoring of the system environmentals. All the complexity around generating the forwarding table has been abstracted away into an external controller. It's also possible that the device can still maintain a complex Control Plane alongside its OpenFlow support; OpenFlow in such cases would load data into the FIB in addition to the RIB that's maintained by the Control Plane.

The networking OS would change a little to handle all the device operations such as boot, flash, memory management, the OpenFlow protocol handler, the SNMP agent, etc. This device will have no OSPF, ISIS, RSVP or multicast – none of the complex protocols running. Typically, routers spend close to 30+% of their CPU cycles doing topology discovery. If this information is already available in some central server, then this frees up significant CPU cycles on all routers in the network. There will also be no code bloat – we will only keep what we need on the devices. Clearly, the smaller the code running on the devices, the fewer the bugs and the resources required to maintain it – all translating into lower cost.

If we have a controller that's dumping data into the FIB of a network device, then it's a piece of software – it's an application. It's a program that sits on a computer somewhere: it could be an appliance, a virtual machine (VM), or it could reside somewhere on a router. The controller needs connectivity to all the networking devices so that it can write out and send the FIB updates to all devices, and it would need to receive data back from them. It is envisioned that the controller would build a topology of the network in memory and run some algorithm to decide how the forwarding tables should be programmed in each networking device. Once the algorithm has been executed across the network topology, it could dispatch the updates to the forwarding tables using OpenFlow.

OpenFlow is an API and a protocol which decides how to map the FIB entries out of the controller and into the device. In this sense a controller is, if we look back at what we understand today, very similar to the Stack Master in a Cisco stack. If one has five switches in a stack, then one of them becomes the Stack Master. It takes all of the data about the forwarding table; it's the one that runs the STP algorithm, decides what the FIB looks like, and sends the FIB data over the stacking backplane to each of the devices so that each has a local FIB (that was decided by the Stack Master).

To better understand the Controllers we need to think of 5 elements as shown in the figure.

Controller

At the bottom we have the network with all the devices. The OpenFlow protocol is what the Controller uses to communicate with these devices. The Controller has its own model of the network (as shown on the right) and presents a User Interface to the user so that configuration data can come in. Via the User Interface the admin selects the rules, does some configuration, and instructs the Controller on how it wants the network to look. The Controller then looks at the model of the network that it has constructed by gathering information from the network, and proceeds to program the forwarding tables in all the network devices to achieve that outcome. OpenFlow is a protocol – it's not software or a platform – it's a defined way of exchanging information that allows for dynamic configuration of the networking devices.

A controller could build a model of the network, keep it in a database, and then run SPF, RSVP-TE, etc. algorithms across that model to produce the same results as OSPF and RSVP-TE running on live devices. We could build an SPF model inside the controller, run SPF over that model, and load the forwarding tables in all the devices in the network. This would free each device in the network from running OSPF, etc.
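
Here is a tiny sketch of that idea: the controller holds the whole topology as a graph, runs Dijkstra from each device, and derives the next-hop entries it would then push down (over OpenFlow, say). The topology and link costs are made up.

```python
# Tiny sketch of the "SPF in the controller" idea: the controller holds the whole
# topology, runs Dijkstra per device, and derives next-hop entries it would then
# push down (e.g. over OpenFlow). Topology and link costs are made up.
import heapq

topology = {                 # node -> {neighbor: link cost}
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

def spf(root):
    """Dijkstra from root; return {destination: next hop as seen from root}."""
    dist, next_hop, heap = {root: 0}, {}, [(0, root, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                          # stale heap entry
        for neigh, link_cost in topology[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neigh, float("inf")):
                dist[neigh] = new_cost
                next_hop[neigh] = first_hop or neigh
                heapq.heappush(heap, (new_cost, neigh, first_hop or neigh))
    return next_hop

# Forwarding entries the controller would program into device "A".
print(spf("A"))   # {'B': 'B', 'C': 'B', 'D': 'B'}
```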

The controller has real-time visibility of the network in terms of topology, preferences, faults, performance, capacity, etc. This data can be aggregated by the controller and made available to network applications. Modern network applications can be made adaptive, with the potential to become more network-efficient and achieve better application performance (e.g., accelerated download rates, higher-resolution video), by leveraging better network-provided information.

Theoretically these concepts can be used for saving energy by identifying underused devices and shutting them down when they are not needed.

So, for one last time, let's see what OpenFlow is.

OpenFlow is a protocol between networking devices and an external controller, or in other words a standard method of interfacing between the control and data planes. In today's network switches, the data forwarding path and the control path execute on the same device. The OpenFlow specification defines a new operational model for these devices that separates these two functions, with the packet processing path remaining on the switch but the control functions, such as routing protocols and ACL definition, moved from the switch to a separate controller. The OpenFlow specification defines the protocol and messages that are communicated between the controller and the network elements to manage their forwarding operation.

Added later: Network Function Virtualization is not directly SDN. However, if you're interested, I have covered it here and here.