There is a lot of hype around OpenFlow as a technology and as a protocol these days. Few envision this to be the most exciting innovation in the networking industry after the vaccum tubes, diodes and transistors were miniaturized to form integrated circuits. This is obviously an exaggeration, but you get the drift, right?
The idea in itself is quite radical. It changes the classical IP forwarding model from one where all decisions are distributed to one where there is a centralized beast – the controller – that takes the forwarding decisions and pushes that state to all the devices (could be routers, switches, WiFi access points, remote access devices such as CPEs) in the network.
Before we get into the details, let’s look at the main components – the Management, Control and the Forwarding (Data) plane – of a networking device. The Management plane is used to manage (CLI, loading firmware, etc) and monitor the device through its connection to the network and also coordinates functions between the Control and the Forwarding plane. Examples of protocols processed in the management plane are SNMP, Telnet, HTTP, Secure HTTP (HTTPS), and SSH.
The Forwarding plane is responsible for forwarding frames – it receives frames from an ingress port, processes them, and sends those out on an egress port based on what’s programmed in the forwarding tables. The Control plane gathers and maintains network topology information, and passes it to the forwarding plane so that it knows where to forward the received frames. It’s in here that we run OSPF, LDP, BGP, STP, TRILL, etc – basically, whatever it takes us to program the forwarding tables.
Routing Protocols gather information about all the devices and the routes in the network and populate the Routing Information Base (RIB) with that information. The RIB then selects the best route from all the routing protocols and populates the forwarding tables – and Routing thus becomes Forwarding.
So far, so good.
The question that keeps coming up is whether our routing protocols are good enough? Are ISIS, OSPF, BGP, STP, etc the only protocols that we can use today to map the paths in the network? Are there other, better options – Can we do better than what we have today?
Note that these protocols were designed more than 20 years ago (STP was invented in 1985 and the first version of OSPF in 1989) with the mathematics that goes in behind these protocols even further. The code that we have running in our networks is highly reliable, practical, proven to be scalable – and it works. So, the question before us is – Are there other, alternate, efficient ways to program the network?
Lets start with what’s good in the Routing Protocols today.
They are reliable – We’ve had them since last 20+ years. They have proven themselves to be workable. The code that we use to run them has proven itself to be reliable. There wouldn’t be an Internet if these protocols weren’t working.
They are deterministic in that we know and understand them and are highly predictable – we have experience with them. So we know that when we configure OSPF, what exactly will it end up doing and how exactly will it work – there are no surprises.
Also what’s important about today’s protocols are that they are self healing. In a network where there are multiple paths between the source and the destination, a loss of an interface or a device causes the network to self heal. It will autonomously discover alternate paths and will begin to forward frames along the secondary path. While this may not necessarily be the best path, the frames will get delivered.
We can also say that today’s protocols are scalable. BGP certainly has proven itself to run at the Internet’s scale with extraordinarily large number of routes. ISIS has as per the local folklore proven to be more scalable than OSPF. Trust me when i say that the scalability aspect is not the limitation of the protocol, but is rather the limitation of perhaps the implementation. More on this here.
And like everything else in the world, there are certain things that are not so good.
Routing Protocols work under the idea that if you have a room full of people and you want them to agree on something then they must speak the same language. This means that if we’re running OSPFv3, then all the devices in the network must run the exact same version of OSPFv3 and must understand the same thing. This means that if you throw in a lot of different devices with varying capabilities in the network then they must all support OSPFv3 if they want to be heard.
Most of the protocols are change resistant, i.e., we find it very difficult to extend OSPFv2 to say introduce newer types of LSAs. We find it difficult to make enhancements to STP to make it better, faster – more scalable, to add more features. Nobody wants to radically change the design of these protocols.
Another argument that’s often discussed is that the metrics used by these protocols are really not good enough. BGP for example considers the entire AS as one hop. In OSPF and ISIS, the metrics are a function of the BW of the link. But is BW really the best way to calculate a metric of an interface to feed in to the computation to select the best path?
When OSPF and all the routing protocols that we use today were designed and built they were never designed to forward data packets while they were still re-converging. They were designed to drop data as that was the right thing to do at that time because the mathematical computation/algorithms took long enough and it was more important to avoid loops by dropping packets. To cite an example, when OSPF comes up, it installs the routes only after it has exchanged the entire LSDB with its neighbors and has reached a FULL state. Given the volume of ancillary data that OSPF today exchanges via Opaque LSAs this design is an over-kill and folks at IETF are already working on addressing this.
We also have poor multipath ability with our current protocols today. We can load balance between multiple interfaces, but we have problems with the return path which does not necessarily come back the way you wanted. We work around that to some extent by network designs that adapt to that.
Current routing protocols forward data based on destination address only. We send traffic to 192.168.1.1 but we don’t care where it came from. In truth as networks get more complex and applications get more sophisticated, we need a way to route by source as well by destination. We need to be able to do more sophisticated forwarding. Is it just enough to send an envelope by writing somebody’s address on an envelope and putting it in a post box and letting it go in the hope that it gets there? Shouldn’t it say that Hey this message is from the electricity deptt. That can go at a lower priority than say a birthday card from grandma that goes at a higher priority. They all go to the same address but do we want to treat them with the same priority?
So the question is that are our current protocols good enough – The answer is of course Yes, but they do have some weaknesses and that’s the part which has been driving the next generation of networking and a part of which is where OpenFlow comes in ..
If we want to replace the Routing protocols (OSPF, STP, LDP, RSVP-TE, etc) then we need something to replace those with. We’ve seen that Routing protocols have only one purpose for their existence, and that’s to update the forwarding tables in the networking devices. The SW that runs the whole system today is reasonably complex, i.e., SW like OSPF, LDP, BGP, multicast is all sitting inside the SW in an attempt to load the data into the forwarding tables. So a reasonably complex layer of Control Plane is sitting inside each device in the network to load the correct data into the forwarding tables so that correct forwarding decisions are taken.
Now imagine for a moment that we can replace all this Control Plane with some central controller that can update the forwarding tables on all the devices in the network. This is essentially the OpenFlow idea, or the OpenFlow model.
In the OpenFlow model there is an OpenFlow controller that sends the Forwarding table data to the OpenFlow client in each device. The device firmware then loads that into the forwarding path. So now we’ve taken all that complexity around the Control Plane in the networking device and replaced it with a simple client that merely receives and processes data from the Controller. The OpenFlow controller loads data directly into the OpenFlow client which then loads it directly into the FIB. In this situation the only SW in the device is the chip firmware to load the data into the FIB or TCAM memories and to run the simple device management functions, the CLI, to run the flash and monitor the system environmentals. All the complexity around generating the forwarding table has been abstracted away into an external controller. Now its also possible that the device can still maintain the complex Control Plane and have OpenFlow support. OpenFlow in such cases would load data into the FIBs in addition to the RIB that’s maintained by the Control Plane.
The Networking OS would change a little to handle all device operations such as Boot, Flash, Memory Management, OpenFlow protocol handler, SNMP agent, etc. This device will have no OSPF, ISIS,RSVP or Multicast – none of the complex protocols running. Typically, routers spend close to 30+% of CPU cycles doing topology discovery. If this information is already available in some central server, then this frees up significant CPU cycles on all routers in the network. There will also be no code bloat – we will only keep what we need on the devices. Clearly, smaller the code running on the devices, lesser is the bugs, resources required to maintain it – all translating into lower cost.
If we have a controller that’s dumping data into the FIB of a network device then it’s a piece of SW – its an application. It’s a SW program that sits on a computer somewhere. It could be an appliance, a virtual machine (VM) or could reside somewhere on a router. The controller needs to have connectivity to all the networking devices so that it can write out, send the FIB updates to all devices. And it would need to receive data back from the devices. It is envisioned that the controller would build a topology of the network in memory and run some algorithm to decide how the forwarding tables should be programmed in each networking device. Once the algorithm has been executed across the network topology then it could dispatch topology updates to the forwarding tables using OpenFlow.
OpenFlow is an API and a protocol which decides how to map the FIB entries out of the controller and into the device. In this sense a controller is, if we look back at what we understand today, very similar to Stack Master in Cisco. So if one has 5 switches in a stack then one of them becomes the Stack Master. It takes all of the data about the forwarding table. It’s the one that runs the STP algo, decides what the FIB looks like and sends the FIB data on the stacking backplane to each of the devices so that each has a local FIB (that was decided by the Stack Master).
To better understand the Controllers we need to think of 5 elements as shown in the figure.
At the bottom we have the network with all the devices. The OpenFlow protocol communicates with these devices and the Controller. The Controller has its own model of the network (as shown on the right) and presents the User Interface out to the user so that the config data can come in. Via the User Interface the admin selects the rules, does some configuration, instructs on how it wants the network to look like. The Controller then looks at its model of the network that it has constructed by gathering information from the network and then proceeds with programming the forwarding tables in all the network devices to be able to achieve that successful outcome. OpenFlow is a protocol – its not a SW or a platform – it’s a defined information style that allows for dynamic configuration of the networking devices.
A controller could build a model of the network and have a database and then run SPF, RSVP-TE, etc algorithms across the network to produce the same results as OSPF, RSVP-TE running on live devices. We could build an SPF model inside the controller and run SPF over that model and load the forwarding tables in all devices in the network. This would free up each device in the network from running OSPF, etc.
The controller has real time visibility of the network in terms of the topology, preferences, faults, performance, capacity, etc. This data can be aggregated by the controller and made available to the network applications. The modern network applications can be made adaptive, with the potential to become more network-efficient and achieve better application performance (e.g., accelerated download rates, higher resolution videos), by leveraging better network provided information.
Theoretically these concepts can be used for saving energy by identifying underused devices and shutting them down when they are not needed.
So for one last time, lets see what OpenFlow is.
OpenFlow is a protocol between networking devices and an external controller, or in other words a standard method to interface between the control and data planes. In today’s network switches, the data forwarding path and the control path execute in the same device. The OpenFlow specification defines a new operational model for these devices that separates these two functions with the packet processing path on the switch but with the control functions such as routing protocols, ACL definition moved from the switch to a separate controller. The OpenFlow specification defines the protocol and messages that are communicated between the controller and network elements to manage their forwarding operation.