How bad is the OSPF vulnerability exposed by Black Hat?

I was asked a few weeks ago by our field engineers to provide a fix for the OSPF vulnerability exposed by Black Hat last month. Prima facie there appeared nothing new in this attack as everyone knows that OSPF (or ISIS) networks can be brought down by insider attacks. This isnt the first time that OSPF vulnerability has been announced at Black Hat. Way back in 2011 Gabi Nakibly, the researcher at Israel’s Electronic Warfare Research and Simulation Center, had demonstrated how OSPF could be brought down using insider attacks. Folks were not impressed, as anybody who had access to one of the routers could launch attacks on the routing infrastructure. So it was with certain skepticism that i started looking at yet another OSPF vulnerability exposed by Gabi, again at Black Hat. Its only when i started delving deep into the attack vector that the real scale of the attack dawned on me. This attack evades OSPF’s natural fight back mechanism against malacious LSAs which makes it a bit more insidious than the other attacks reported so far.

I exchanged a few emails with Gabi when i heard about his latest exposé. I wanted to understand how this attack was really different from the numerous other insider attacks that have been published in the past. Insider attacks are not very interesting, really. Well, if you were careless enough to let somebody access your trusted router, or somebody was smart enough to masquerade as one of your routers and was able to inject malicious LSAs then the least that you can expect is a little turbulence in your routing infrastructure. However, this attack stands apart from the others as we shall soon see.

OSPF (and ISIS too) has a natural fight back mechanism against any malacious LSA that has been injected in a network. When an OSPF router receives an LSA that lists that router as the originating router (referred to as a self-originated LSA) it looks at the contents of the LSA (just in case you didnt realize this). If the received LSA looks newer than the LSA that this router had last originated, the router advances the LSA’s LS sequence number one past the received LS sequence number and originates a new instance of this LSA. In case its not interested in this LSA, it flushes the LSA by originating a new LSA with age set to MaxAge.

All other routers in the network now update their LS database with this new instance and the malacious LSA effectively gets purged from the network. Viola,its that simple!

As a result of this, the attacker can only flood malacious LSAs inside the network till the router that the malacious LSA purports to come from (victim router) receives a copy. As soon as this router floods an updated copy, it doesnt take long for other routers in the network to update their LS DB as well – the flooding process is very efficient in disseminating information since network diameters are typically not huge, and yes, packets travel with the speed of light. Did you know that?

In the attack that Gabi described, the victim router does not recognize the malacious LSA as its own and thus never attempts at refreshing it. As a result the malicious LSA remains stealthily hidden in the routing domain and can go undetected for a really long time. Thus by controlling a single router inside an AS (the one that will flood the malacious LSA), an attacker can gain control over the entire routing domain. In fact, an attacker need not even gain control of an entire router inside the AS. Its enough if it can somehow inject the malacious LSAs over a link such that one of the OSPF routers in the network accept this. In the media release, Black Hat claimed ” The new attack allows an attacker that owns just a single router within an AS to effectively own the routing tables of ALL the routers in that AS without actually owning the routers themselves. This may be utilized to induce routing loops, network cuts or longer routes in order to facilitate DoS of the routing domain or to gain access to information flows which otherwise the attacker had no access to.”

So what is this attack?

Lets start by looking at what the LS header looks like.

In this attack we are only interested in the two fields, the Link State ID and the Advertising Router, in the LS Header. In the context of a Router LSA, the Link State ID identifies the router whose links are listed in the LSA. Its always populated with the router ID of that router. The Advertising Router field identifies the router that initially advertised (originated) the LSA. The OSPF spec dictates that only a router itself can originate its own LSA (i.e. no router is expected to originate a LSA on behalf of other routers), therefore in Router LSAs the two fields – ‘Link State ID’ and ‘Advertising Router’ – must have the exact same value. However, the OSPF spec does not specify a check to verify this equality on Router LSA reception.

Unlike several other IETF standards, the OSPF spec is very detailed, leaving little room for any ambiguity in interpreting and implementing the standard. This is usually good as it results in interoperable implementations where everybody does the right thing. The flip side however is that since everybody follows the spec to the tittle, a potential bug or an omission in the standard, would very likely affect several vendor implementations.

This attack exploits a potential omission (or a bug if you will) in the standard where it does not mandate that the receiving router verifies that the Link State ID and the Advertising Router fields in the Router LSA are the exact same value.

This attack sends malacious Router LSAs with two different values in the LS header. The Link State ID carries the Router ID of the router that is being attacked (the victim) and the Advertising Router is set to some different (any) value.

When the victim receives the malacious Router LSA, it does not refresh this LSA as it doesnt recognize this as its own self generated LSA. This is because the OSPF spec clearly says in Sec 13.4 that “A self-originated LSA is detected when either 1) The LSA’s Advertising Router is equal to the router’s own Router ID or 2) the LSA is a network LSA .. “.

This means that OSPF’s natural fight back mechanism is NOT triggered by the victim router as long as the field ‘Advertising Router’ of a LSA is NOT equal to the victim’s Router ID. This is true even if the ‘Link State ID’ of that LSA is equal to the victim’s Router ID. Going further it means no LSA refresh is triggered even if the malacious LSA claims to describe the links of the victim router!

When this LSA is flooded all the routers accept and install this LSA in their LS database. This exists along side the valid LSA originated by the victim router. Thus each router in the network now has two Router LSAs for the victim router – the first that was genuinely originated by the victim router and the second that has been inserted by the attacker.

When computing the shortest path first algorithm, the OSPF spec in Sec 16.1 requires implementations to pick up the LSA from the LS DB by doing a lookup “based on the Vertex ID“. The Vertex ID refers to the Link State ID field in the Router LSAs. This means that when computing SPF, routers only identify the LSAs based on their Link State ID. This creates an ambiguity on which LSA will be picked up from the LS database. Will it be the genuine one originated by the victim router or will it be the malacious LSA injected by the attacker? The answer depends on how the data structures for LS DB lookup have been implemented in the vendor’s routers. Ones that pick up the wrong LSA will be susceptible to the attack. The ones that dont, would be oblivious to the malacious LSA sitting in their LSA DBs.

Most router implementations are vulnerable to this attack since nobody expects the scenario where multiple LSAs with the same Link State ID will exist in the LS DB. It turns out that at least 3 major router vendors (Cisco, Juniper and Alcatel-Lucent) have already released advisories and announced fixes/patches that fixes this issue. The fix for 7210 would be out soon ..

Once again, the attacker does not need to have an OSPF adjacency to inject the forged LSAs.

Doing this is not as difficult as we might think it is. There is no need for the attacker to access the LS DB sequence number – all it needs to do is to send an LSA with a reasonably high sequence number, say something like MAX_SEQUENCE – 1 to get this LSA accepted.

The attack can also be performed without complete information about the OSPF topology. But, this is highly dependent on the attack scenario and what piece of false information the attacker wishes to advertise on behalf of the victim. For example, if the attacker wishes to disconnect the victim router from the OSPF topology then merely sending an empty LSA without knowing the OSPF topology in advance would also work. In the worst case, the attacker can also get partial information on the OSPF topology by using trace routes, etc. This way the attacker can construct LSAs that look very close to what has been originally advertised by the victim router, making it all the more difficult to suspect that such LSAs exist in the network.

Why providers still prefer IS-IS over OSPF when designing large flat topologies!

I was recently interacting with our pre-sales team for a large MPLS deployment and was reading the network design that was proposed. I saw that they had suggested IS-IS over OSPF as the IGP to use at the core. One of the reasons cited was the inherent security that IS-IS provides by running natively over the Layer 2. Another was that IS-IS is more modular and thus easier to extend as compared to OSPF. OSPF, its alleged is very rigid and required a complete protocol rewrite to support something as basic as IPv6! 🙂 Then there was this overload feature that IS-IS provides which can signal memory overload that does not exist in OSPF and finally a point about IS-IS showing superior scalability (faster convergence). In case you’re intrigued about the last point, as i clearly was, then it was explained that IS-IS uses just one Link State Packet (LSP) per level for exchanging the routing information. This LSP contains many TLVs, each of which represents a piece of routing information. OSPF on the other hand, needs to originate multiple LSAs, one for each type and as a consequence is a lot more chattier and hence not suitable for large flat networks.

I personally dont agree to any one of the reasons listed above and anyone who favors IS-IS over OSPF for the above reasons is patently mistaken. These are all extremely weak arguments and have mostly been overtaken by reality. Lets look at each one by one.

Security – While its true that one cant lob an IS-IS packet from a distance without a tunnel it was never really a compelling reason for some one to pick up IS-IS over OSPF. The same holds good for OSPF multicast packets as well which cannot be launched by some script kidde sitting miles away from his personal laptop. Both the protocols have been extended to support stronger algorithms (RFC 5310 for IS-IS and RFC5709 for OSPF) and have similar authentication mechanisms. I can say this with some degree of confidence as i have co-authored both these standards.

Modularity – While its somewhat easier to extend IS-IS in a backward compatible way this sort of thing doesnt happen much any more. Both protocols have been extended to support multiple instances, traffic engineering, multi-topology, graceful restart, etc. This isnt imo a showstopper for someone picking up OSPF as the IGP to use.

Overload Mechanism – IS-IS has the ability to set the Overload (OL) bit in its LSAs. This results in other routers in that area treating this router as a leaf router in their shortest path trees, which means that its only used for reaching the directly connected interfaces and is never placed on the transit path to reach other routers. So does this happen any more? No, it doesnt. This feature was required in the jurassic age when routers came with severely constrained memory, CPU power and the original intention of the OL mechanism is now mostly irrelevant. Most core routers today have enough memory and CPU that they will not get inundated by the IS-IS routes in any sane network design.

These days OL bit is used to prevent unintentional blackholing of packets in BGP transit networks. Due to the nature of these protocols, IS-IS and OSPF converge must faster than BGP. Thus there is a possibility that while the IGP has converged, IBGP is still learning the routes. In that case if other IBGP routers start sending traffic towards this IBGP router that has not yet completely converged it will start dropping traffic. This is because it isnt yet aware of the complete BGP routes. OL bit comes handy in such situations. When a new IBGP neighbor is added or a router restarts, the IS-IS OL bit is set. Since directly connected (including loopbacks) addresses on an “overloaded” router are considered by other routers, IBGP can be bought up and can begin exchanging routes. Other routers will not use this router for transit traffic and will route the packets out through an alternate path. Once BGP has converged, the OL bit is cleared and this router can begin forwarding transit traffic.

So how can we do this in OSPF since there is no OL bit in its LSAs?

Simple. We can set the metric of all transit links on an “overloaded” router to 0xffff in its Router LSAs. This will result in the router not being included as a transit node in the SPF tree. Stub links can still be advertised with their normal metrics so that they are reachable even when the router is “overloaded”. Thus this point against OSPF is also not valid.

Finally we come to the scalability and the convergence part. This one is slightly tricky and is not so easy. I wrote a few posts around 4 years back discussing this here and here. You might want to read these.

IMO one of the big reasons why most big providers use (or have used) IS-IS is because way back in 90s Cisco OSPF implementation was a disaster. The first big ISPs (UUnet, MCI) came to them and said “we want to build big infrastructures, should we use OSPF?” and Cisco basically said “No, thats not a good idea, use IS-IS instead”. Dave Katz in Cisco had recently rewritten Cisco’s IS-IS implementation as a side effect of implementing NetWare Link Services Protocol – NLSP (basically IS-IS for Novel IPX) so Cisco was quite confident of its IS-IS implementation. The operators thus picked up IS-IS and continue using it even today as there is really no real difference between IS-IS and OSPF, so no motivation to move from one to the other.

IS-IS was also an advantage in the early days as a router vendor because it was an “open proprietary spec”. It was out there, and published, but unless you had some background in OSI you didn’t know much about it and the spec was scary and weird. This wasn’t on purpose, but it was handy.

It was also nice in the IETF because IS-IS was viewed, at least at the time, as the poor cousin of OSPF and so nobody really cared that much other than the handful of folks that were doing the work. This made the extension of IS-IS a lot easier and a lot less political than OSPF. In fact i have heard about a t-shirt which said “IS – IS = 0” that was distributed in one of the IETF meetings long time ago! Things however have changed and IS-IS is considered at par with OSPF today and both the working groups are quite active in the IETF.

There was one real technical advantage to IS-IS in common deployment scenarios of that day as well. Back then, it was popular to build full meshes of ATM or Frame Relay as the Layer 2 topology for large backbones, because of the perception that healing faults at L2 would happen faster and cleaner than letting the IP routing protocols take care of it (arguably true at the time). Full mesh topologies are the worst possible topologies for standard flooding protocols (IS-IS and OSPF both) and the cost of topology changes was huge. However, IS-IS lent itself to the “mesh group” hack by which you could manually prune the flooding topology to be a subset of the links. OSPF doesn’t easily allow this because of details about the flooding model it uses. Cisco apparently did implement a hack to get around this problem, but its probably more gross than the IS-IS “mesh groups” hack!

Another reason i believe people prefer IS-IS over OSPF is the belief that you can design large networks by building a single large Level 1 (L1) area without any hierarchies in IS-IS and still be able to manage – something that would be difficult with OSPF. There are issues with inter-area traffic engineering and such and most people would like to keep their network as a single area if the routing protocol can manage it.

I used to believe that operators can design big networks without hierarchies in IS-IS since all IP prefixes (i.e. network interfaces, routes aka reachabilities in ISO-speak) are considered as leaf nodes in the SPF for IS-IS. Thus a full SPF will not be triggered for an interface or a route flap in case of IS-IS. OSPF otoh, would go ballistic running SPF each time any IP information changes. The only time we dont run a full SPF in when a Type 5 LSA information changes, but thats hardly an optimization. Compared to this, the only time we run a full SPF in IS-IS is when an actual node goes down (which OSPF would also anyways do).

I was recently having a discussion with Dave Katz from Juniper on this and i realized that this really is an implementation choice. “The graph theory”, he very aptly pointed out, “is the same in both cases!”. The IS-IS spec makes it easier to put an IS-IS reachability as leaf nodes as all routers are identified by a different set of TLVs. This information while its available in OSPF is slightly tricky as the node information is mixed with the link information. Thus while even a naive IS-IS implementation may be able to optimize SPF, it would require a good understanding of the spec to get it right in OSPF.

You could get the exact same optimization in OSPF as IS-IS if you realize that OSPF calculates the routes to the *router IDs* and not the addresses. The distinction between nodes and destinations is syntactically (and semantically) quite clear in OSPF as well. The spec considers the Router IDs which i concede look like IP addresses, something that most people miss.

Actual addresses and prefixes are quite distinct, even in OSPF. So as long as you can keep track of what’s an address and what’s an ID, it’s not that hard, for what it’s worth. The bigger problem is that only a handful of people really understand *why* things in the OSPF spec are done the way they are, and there are less and less of those folks because hardly anybody *needs* to understand it.

But having said all that, the cost of an SPF is so small on the scale of things that it’s not really the issue (which is also why I am not a big fan of partial-SPF optimizations: “See how great it works when you have around O(50K) nodes and there is this one little node that goes down!” is sort of silly because lots of other things would break before a network ever got that big.)

Part of the SPF fear was I believe because Cisco’s original SPF implementation in OSPF was horribly inefficient (and everyone was using slow processors back then) and IOS was a non-preemptive, single threaded environment, and so an SPF (or any slow process) would block other things (like sending and receiving Hellos and other important bits) and would affect *everything*. I am btw sure that its changed now since i am aware of a couple of large Cisco deployments that are running OSPF in the core! Overall system state management is a *much* bigger problem these days than the algorithmic efficiency of these protocols, particularly as we build larger and more distributed environments that require message passing internally.

Also what could have pushed providers back then to IS-IS was the deployment guidelines that Cisco used to publish (including the number of routes in an area) back then which were absurdly small. I am sure, its changed now.

There’s no technical reason why very large flat topologies can’t be supported by a good implementation of either protocol, but ISPs need to be conservative and suspicious of their vendors in order to survive. 😉 I guess that nobody wants to be the first to deploy a large flat OSPF topology; best practices tend to be sticky. However, there is no reason why you cant do it with OSPF today.

I suspect that, at this point, ISPs choose based on culture and familiarity and comfort rather than real technical differences. The perception still exists that while IS-IS can support large flat networks, OSPF cant. However, as i said its just a perception and is not really true any more.

2010 in review

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads This blog is doing awesome!.

Crunchy numbers

A Boeing 747-400 passenger jet can hold 416 passengers. This blog was viewed about 19,000 times in 2010. That’s about 46 full 747s.

In 2010, there were 2 new posts, growing the total archive of this blog to 33 posts.

The busiest day of the year was September 8th with 183 views. The most popular post that day was Designated Router (or DIS) concepts .. .

Where did they come from?

Some visitors came searching, mostly for spf algorithm, ospf graceful restart, shortest path first, ospf vs isis, and ospf flooding.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

Designated Router (or DIS) concepts .. March 2007
1 comment

Shortest Path First (SPF) Algorithm Demystified .. March 2008
11 comments and 1 Like on WordPress.com,

Shortest Path First (SPF) Calculation in OSPF and IS-IS March 2008
4 comments

Graceful Restart Mechanisms in OSPF and IS-IS .. March 2007
2 comments

Database Exchange and Flooding in OSPF and IS-IS .. March 2007
2 comments

Shortest Path First (SPF) Calculation in OSPF and IS-IS

Both OSPF and IS-IS use the Shortest Path First (SPF) algorithm to calculate the best path to all known destinations based on the information in their link state database. It works by building the shortest path tree from a specific root node to all other nodes in the area/domain and thereby computing the best route to every known destination from that particular source/node. The shortest path tree thus constructed, consists of three main entities – the edges, the nodes and the leaves.

Each router in OSPF or an Intermediate System in case of IS-IS, is a node in the SPF tree. The links connecting these routers, the edges. The IP network associated with an IP interface, added into OSPF via the network command is a node, while the IP address associated with an interface thats added in IS-IS is a leaf. An IP prefix redistributed into OSPF or IS-IS from other routing protocols (say BGP) becomes a leaf in both the protocols. Inter-area routes are patently, the leaves.

If you consider the network as shown above, then OSPF would consider routers A, B, C and the network 10.1.1.0/24 as nodes. This is assuming that the interface associated with 10.1.1.0/24 has been added into OSPF. The only leaf in the graph would be the IP prefix 56.1.1.0/24 redistributed into A from some other protocol. IS-IS otoh, would consider routers A, B and C as nodes and networks 10.1.1.0/24 and 56.1.1.0/24 as leaves. This seemingly innocuous difference in representation of the SPF tree leads to some subtle differences between the SPF run in OSPF and IS-IS, which can interest a network engineer.

The nodes in the shortest path tree or the graph form the backbone or the skeleton of that tree. Any change there necessitates a recalculation of the SPF tree, while a change in a leaf of the SPF tree does not require a full recalculation. Removing and adding of leaves without recalculating the entire SPF tree is known as Partial SPF and is a feature of almost every implementation of OSPF and IS-IS that i am aware of. This implies that if the link connecting router C to 10.1.1.0/24 goes down, then a full SPF would be triggered in case of OSPF, and a partial SPF in case of IS-IS.

This shows that the general adage – “Avoid externals in OSPF” should be taken with a pinch of salt and it really depends upon your topology. I have seen networks where ISPs redistribute numerous routes that have a potential to change on a regular basis, as opposed to bringing them via the network command.

IS-IS

o IP routing is integrated into IS-IS by adding some new TLVs which carry IP reachability information in the LSPs. All IP networks are considered externals, and they always end up as leaf nodes in the shortest path tree when IS-IS does a SPF run. All node information, neccessary for SPF calculation is advertised in its IS Neighbors or IS Reachability TLVs. This unambiguously separates the prefix information from the topology information which makes Partial Route Calculation (PRC) easily applicable. Thus IS-IS performs only the less CPU intensive PRC when network events do not affect the basic topology but only affect the IP prefixes.

o Used narrow (6 bits wide) metrics which helped in some SPF optimization. However such small bits proved insufficient for providing flexibility in designing IS-IS networks and other applications using IS-IS routing (MPLS-TE). “IS-IS extensions for Traffic Engineering” introduced new TLVs which defined wider metrics to be used for IS-IS thus taking away this optimization. But then CPU are fast these days and there arent many very big networks anyways!

o SPF for a given level is computed in a single phase by taking all IS-IS LSP’s TLV’s together.

OSPFv2

o Is built around links, and any IP prefix change in an area will trigger a full SPF. It advertises IP information in Router and Network LSAs. The routers thus, advertise both the IP prefix information (or the connected subnet information) and topology information in the same LSAs. This implies that if an IP address attached to an interface changes, OSPF routers would have to originate a Router LSA or a Network LSA, which btw also carries the topology information. This would trigger a full SPF on all routers in that area, since the same LSAs are flooded to convey topological change information. This can be an issue with an access router or the one sitting at the edge, since many stub links can change regularly.

o Only changes in interarea, external and NSSA routes result in partial SPF calculation (since type 3, 4, 5 and 7 LSAs only advertise IP prefix information) and thus IS-IS’s PRC is more pervasive than OSPF’s partial SPF. This difference allows IS-IS to be more tolerant of larger single area domains whereas OSPF forces hierarchical designs for relatively smaller networks. However with the route leaking from L2 to L1 incorporated into IS-IS the apparent motivation for keeping large single area domains too goes away.

o SPF is calculated in three phases. The first is the calculation of intra-area routes by building the shortest path tree for each attached area. The second phase calculates the inter-area routes by examining the summary LSAs and the last one examines the AS-External-LSAs to calculate the routes to the external destinations.

o OSPFv3 has been made smarter. It removes the IP prefix advertisement function from the Router and the Network LSAs, and puts it in the new Intra-Area Prefix LSA. This means that Router and Network LSAs now truly represent only the router’s node information for SPF and woudl get flooded only if information pertinent to the SPF algorithm changes, i.e., there is atopological change event. If an IP prefix changes, or the state of a stub link changes, that information is flooded in an Intra-Area Prefix LSA which does not trigger an SPF run. Thus by separating the IP information from the topology information, we have made PRC more applicable in OSPFv3 as compared to OSPF2.

I recently wrote a post that discusses this further.

Graceful Restart Mechanisms in OSPF and IS-IS ..

If the control and forwarding functions in a router can be separated independently, it is possible to maintain a router’s data forwarding capability intact while the router’s control software is restarted/reloaded. This functionality is termed as “graceful restart” or “nonstop forwarding”. The router’s control software (the routing protocols and the signalling protocols) can stop and can restart for myriad reasons – SW error crashing the protocol task, a switchover to the redundant control card, or a planned shutdown as part of the operational maintenance – the list is endless. The idea behind graceful restart is to continue forwarding packets based on the snapshot of FIB just before the router restarted. The restarting router is assumed to be capable of preserving the FIB and some amount of control information (like cryptographic sequence number in case of OSPF).

OSPF

o Restarting OSPF router originates a Grace LSA (link local Opaque LSA) specifying the ‘grace period’, thereby indicating to its neighbors the time, in seconds, that the neighbors (the helpers) should continue to consider this router as fully adjacent. The helping neighbors enter into a state known as helping mode during this period. The onus falls on the helpers to detect a topological change during the grace period and acting accordingly.

o In case of a planned restart, OSPF issues a Grace LSA to its neighbors on each restarting interface and sets the value 1, which is Software restart, in the Graceful Restart Reason TLV. In case of an unplanned outage, the router first issues a Grace LSA before sending out any HELLOs. Most implementations transmit the Grace LSAs multiple times, till an acknowledgement is heard from the neighboring routers.

o The helping router continues advertising the restarting router in its LSAs and other routers in the network never come to know of this event.

o Using standard OSPF procedures the helping routers establish adjacencies with the restarting router and synchronize their LSDBs. During the grace period, the restarting router receives its own self generated pre-restart LSAs. It accepts them as valid, and does not originate type 1 through 5 and type 7 LSAs, even after it transitions to a FULL state. The restarting router can run the SPF, but its not yet allowed to update the FIB.

o Once the restarting router and its helpers have synchronized their databases within the grace period, the former flushes its grace LSAs to signal successful completion of the gracweful restart procedure. The restarting router now reoriginates its router LSAs on all attached areas and the network LSAs on the segments, where its the DR. It now schedules a full SPF, calculates the routes, and updates the FIB.

o The restarting router had marked all the routes in FIB as stale before sending out the Grace LSAs. After graceful restart is over and it has recalculated the routes, it deletes all the routes marked as stale in the FIB. It can now reoriginate summary LSAs, type 7 LSAs and AS External LSAs as appropriate.

o When the helpers receive the flushes Grace LSAs, they exit the helper mode and revert back to normal OSPF procedures.

o OSPF automatically reverts back to standard OSPF restart from graceful restart if topological changes are detected or if one or more of the restarting router’s neighbors do not support graceful restart.

o More details here.

IS-IS

o Restarting router does not re-compute its own routes until it has achieved database synchronization with its neighbors.

o Uses restart TLV (type 211) in its IIH to obtain the graceful restart functionality. Grace period is decided as the minimum of the Remaining times of received IIHs containing a restart TLV with RA bit set. Upon receiving this IIH, the helping routers would flood the complete sets of CSNPs onto the link and set the SRM flag on all its LSPs towards the restarting router.

o The restarting router can now clear its restart timer, since it has some helping routers that can provide a full set of link state information through the normal transmission and retransmission process.

o During grace period, restarting router does not transmit self-originated LSPs and self-LSPs are not purged or modified. These restrictions are necessary to prevent premature removal of an own LSP and hence churn in other routers.

o Restart mechanism in IS-IS allows to establish adjacency without cycling through the normal operation of adjacency state machine.

o If a timer on the restarting router expires before it recieves a full set of CSNPs from its helpers, the adjacency is reset, and a normal neighbor restart is attempted.

o Usual database synchronization is achieved in situations where the neighboring routers of the restarting router do not support the restart TLV.

o More details here.

Metrics Size in OSPF and IS-IS ..

Each interface in the link state protocols in given a metric or a cost, which is advertised with the link state information in the LSP or the LSA. The SPF algorithm uses this metric to calculate the cost and the nexthop to a destination. Metrics used are generally the inverse of bandwidth, thus a large bandwith capacity link would have a lower metric

IS-IS

o ISO10589 specified the metric to be 6 bits in size. Therefore, the metric value could only range from 0-63. This metric information was carried in Neighbor Reachbility TLV and the IP reachability TLV. Since it was only 6 bits wide, it was called the “narrow metric“. The maximum path metric MaxPathMetric supported is 1023. This in theory brought down the complexity of the SPF algorithm from O(nlog n) to O(n). But this isn’t a significant motivation any more since the CPUs are really fast these days. The metric size apparently was kept small to optimize search while doing the SPF. It also allowed two types of metrics – External and Internal.

o Soon , the “narrow” metric range was found to be too small for certain networks and new TLVs (Extended IP and Extended Neighbor Reachability) were introduced to carry larger metrics as part of the Traffic Engineering document. These were called Wide Metrics. The MaxLinkMetric value now is 0xFFFFFFand the MaxPathMetric, 0xFE00000.

The Extended IP reachability TLV allows for a 4 byte metric, while the Extended Neighbor reachability TLV allows for 3 bytes metric size. This is to enable the metric summarized across levels/domains to be as large as 0xFFFFFFFF while the link metric itself is no larger than 0xFFFFFE. If a metric value of 0xFFFFFF is used the prefix is not used in SPF computation.

o The original specification defined 4 kinds of narrow metrics – delay, default, error and expense. These were never implemented and all implementations only support the default metric. In ISO 10589, the most significant bit of the 8 bit metrics was the field S (Supported) which defined if the metric was to be considered or not. Confusingly, as most ISO documents are, an implementation was supposed to set this bit to 1 if it wanted to indicate that the metric was unsupported. Since only the default metric is used, the implementations must always set this bit to 0 when using the narrow metrics. Later RFC 2966 used this bit in the default metric to mark L1 routes that have been leaked from L1 to L2 and back down into L1 again.

Current implementations must generate the default metric when using narrow metrics and should ignore the other three metrics when using narrow metrics.

OSPFv2

o It allows a link to have a 2 byte metric field in the Router LSA which implies a maximum metric of 0xFFFF.

o The Summary, Summary ASBR, AS-External and NSSA LSAs have a 3 byte metric value. A cost of 0xFFFFFF (LSInfinity) is used to tell the destination described in the LSA is unreachable.

o AS-External and NSSA LSAs allow two metric types, Type 1 and Type 2 which are equivalent to IS-IS Internal and External metrics. The type 1 considers the cost to the ASBR in addition to the advertised cost of the route while the latter uses just the advertised cost while calculating the routes during the SPF.

o The scheme thus allows for links to be configured with a metric no larger than 0xFFFF, while allowing cost of destinations injected across areas/levels to be as large as 0xFFFFFE.

OSPFv3

o It allows similar metric size for the Router LSA as in OSPFv2.

o It allows similar metric sizes for Intra Area Prefix LSA, Inter Area Prefix LSA, AS-External LSA and NSSA LSA as in OSPFv2. The value and significance of LS Infinity is valid here too.

Security and Authentication Issues in OSPF and IS-IS ..

Both protocols have a field indicating the “type” of authentication. There are however differences in the two protocols. In IS-IS, the data associated with the authentication is of variable length. In OSPF it is fixed at 64 bits. 64 bits is sufficient for a password scheme, but would not suffice for a public key signature scheme, which would need a field several hundreds of bits long. A way to circumvent this is by appending a “message” digest at the end of the OSPF packet.

In OSPF there is a single password per link. A router is configured with a password for each link to which it is attached. It transmits that password when it transmits OSPF messages on that link. It expects all OSPF messages it receives on that link to have that password. In IS-IS, a router is configured with a transmit password on a link, which is the password it uses when it transmits IS-IS messages, as well as a set of acceptable receive passwords.

On a P2P link a password scheme in which the receive and transmit passwords are different offers some security. If the passwords are the same, the intruder need only wait for the other router to transmit first, and the intruder will find out the password. Even with two passwords, an intruder can, with effort, discover the passwords.

The reason IS-IS configures routers with a set of acceptable receive passwords, rather than a single receive password, is so that a link, such as a LAN, can be migrated from one password to another without disrupting the network. Since OSPF has single password per link, it is not possible to change the password in an operational network. The routers would all have to be brought down and locally reconfigured.

One of the issues brought by the IS-IS proponents is apparently the big advantage that IS-IS has over OSPF from a security point of view as the former’s packets cannot be routed beyond the immediate next hop or can never be sourced by non-border routers. This can allegedly prevent a variety of potential DoS attacks as anyone can launch OSPF packet bombs in the others network. This apparent vulnerability to DoS attacks is because OSPF rides over IP rather than directly running over the link layer.

Since all OSPF packets can be authenticated using MD5, all spurious OSPF packets can be dropped. But there can be times when MD5 can itself be a part of a problem because it takes significant CPU to check signatures and discard the packets. This is partly true but it is to be noted however that even if OSPF encapsulation is changed to L2, we would still have to support IP encapsulation for virtual links, so we would still have to do MD5. Computing the MD5 used to be a problem in the past when CPUs were not that powerful. However, this is a non issue today, and most of the authentication can be done in the Hardware itself.

Moreover the system administrator can filter on the edges of the network to pry away all the OSPF messages coming from the edges. This will of course be done in addition to cryptography.

There are issues in the HMAC-MD5 scheme as there have recently been reports about attacks on the collision resistance properties of MD5. MD5CRK, was a distributed computing project to break the MD5 hash algorithm in a short period of time. The project closed down with the publication of the paper.

It was discovered that collisions can be found in MD5 algorithm in less than 24 hours, making MD5 very insecure. Further research has verified this result and shown other ways to find collisions in MD5 hashes. We thus need to move away from MD5 towards more complex and difficult to break hash algorithms.

It is because of this that both IS-IS and OSPF now support HMAC-SHA based authentication schemes for the respective protocols. HMAC-SHA support for IS-IS is defined in RFC5310, and for OSPF are been defined in RFC5709.

OSPFv3 has recently been extended to support a similar mechanism as OSPFv2 for authenticating its control packets. The new standard is defined in RFC 6506.

Virtual Links and Partitioned Areas ..

IS-IS

– IS-IS allows a Level-1 Area which is partitioned to be automatically repaired, by electing Partition Designated Level 2 routers and having a virtual link between them. The mechanism is not often implemented and requires an explicit tunnelling mechanism.

– Used in ISO IS-IS for connecting partitions of Level 1 Area over the Level 2 backbone.

OSPF

– Used for connecting physically separate area zeroes (0.0.0.0) to maintain contiguity of the backbone.

– Used for connecting remote areas to the backbone through other areas if direct physical connectivity is not possible. This enables an OSPF packet to be sent from one part of an remote isolated site to the main OSPF network.

– For Virtual links to work, OSPF accepts packets which are have originated more than one hop away. This can lead to security concerns if the packets at the edge of the domain are not properly filtered.

Database Exchange and Flooding in OSPF and IS-IS ..

Initial Database Exchange

For the SPF algorithm to work properly, all routers in the area should have the same database information on which the SPF algorithm works. The process of synchronization includes the “Initial Database Exchange” which is done when the adjacency is coming up and the asynchronous flooding when the Adjacencies are up.

OSPF

– A master-slave relation is established to do the database exchange. Besides the MTU is exchanged in the database description packets before any database exchange starts.

– The database exchange begins once the adjacency state reaches Exstart. On a broadcast links, the DR and BDR form adjacencies with all other routers on the network.

– Only one DB Description packet can be unacknowledged at a time that is, the window size is 1. Each DB Description packet from the master is acknowledged by the slave. The slave sends its own DB Description packet with similar identifiers as the masters.  This is important because if you miss even one DD packet, then the routers have to start all over again.

– DB description packets containing the summary of LSA’s at each end are exchanged. Only when the entire summary is received by the neighbour can it tell which instance of the LSA is not there in the senders database.

– An adjacency in OSPF is declared FULL/UP, when the entire database exchange is completed.

– OSPF does not allow routers to resynchronize their link state database in the steady state. It is only done during the initial database synchronization or when network topology changes. However, there are techniques to do that. One such way is described in “OSPF Out-of-band LSDB resynchronization”

IS-IS

– The MTU check is done at the hello exchange time itself.

– CSNP’s are sent by the DIS on a broadcast link. On a point-to-point link both the neighbours exchange CSNP’s with each other.

– On point-to-point link all the LSP’s SRM flag is also set for the circuit, to indicate the LSP’s have to be sent over the circuit.

– The CNSP’s are sent to reduce the actual flooding of all the LSP’s between the neighbours.

– Multiple CSNP’s can be sent together. CSNP’s unlike DB Descriptions in OSPF are not acknowledged.

– As the CSNP’s have a range of LSP-ID’s, and contain all the LSP’s in the database falling in that range. A neigbour on receiving a CSNP can know which LSP’s in the neighbour are newer, which older and which are absent. Based on this the neighbour can send newer LSP’s to the neighbour.

– Link state database is continuously refreshed and synchronized because of the periodic CSNPs that are announced.

Asynchronous Flooding

Whenever any information in an the database changes, the information is to be exchanged with all other routers in the network. This is done by the flooding process:

OSPF

– Uses reliable flooding mechanism for all link types.

– Changed LSA’s are packed in LS Update packets and send over adjacencies to the neighbour, which unpacks the LSA’s. LS Acknowledgement packets are sent by the receiver, which informs the sender that the receiver has received the LSA.

– The sender retransmits the LSA’s after the re-transmission interval if it does not get acknowledgements for them.

– On a broadcast link LSUpdate packets are sent only to all-DR routers multicast address. The DR floods the LSUpdate packets to All-SPF-Routers.

– Whenever a new DR/BDR is elected, it has to form adjacencies with all other routers in the network.

– There is no difference in the asynchronous flooding procedures between OSPFv2 and OSPFv3.

IS-IS

– LSP’s are flooded as is across the area. They are not packed inside any other packet.

– On broadcast links flooding is not done reliably. A changed LSP is flooded to all IS-IS routers, however no retransmissions occur.

– The reliability in database exchange on a broadcast link is achieved by periodic database exchange. This is done as CSNP’s are sent periodically by the DIS, which initiates the entire database exchange process all over again.

– As the DIS sends periodic CSNP, nothing different needs to be done when a new DIS is exchanged.

– On a point-to-point link flooding is done reliably. LSP’s are flooded to the neighbour and if CSNP entry for the LSP is not received in a particular time interval, the LSP is re-flooded to that neighbour.

Packet Alignment and Extensibility Issues in OSPF and IS-IS

IS-IS

o Does not require any particular alignment of the PDU fields.

o Uses Type-Length-Value (TLV) encoded packets to advertise the routing information. The TLVs that are not supported/recognized are ignored by other IS-IS routers.

o LSPs are flooded intact with unrecognized TLV information making it very extensible. IPv6, GMPLS, etc. support is provided by simply adding a few more TLVs.

o TLVs can be nested as sub-TLVs providing even more flexibility for future extensions. Though the base spec does not use them but the newer drafts and RFCs have started using them (TE extensions, etc).

OSPFv2

o Uses fixed format packets with all fields aligned at 32-bit boundaries for faster processing of the OSPF packets (doesn’t really matter anymore as the CPUs are really fast these days!). This was primarily done because OSPF was meant to be an IPv4 only protocol.

o The downside of the above is that the packet formats are not at all extensible. We had to come out with OSPFv3 when we wanted to provide support for IPv6.

o It uses Link State Advertisements (LSAs) for advertising the routing information and the original specification called for dropping any unrecognized LSA type.

o LSAs of type 9, 10 and 11 have been introduced for advertising other application specific information and enough vendors now support this so that they are likely to get from one side of the network to the other.

o Since the unrecognized LSA types are not flooded to neighbors it makes it very difficult to extend OSPF. This in turn means that all the OSPF routers must be upgraded network-wide to make the new extensions work.

o The new RFCs (Traffic Engineering, GMPLS extensions, etc) written for OSPF now support TLV encoding.

OSPFv3

o Exhibits implicit opaque LSA behaviour i.e. unrecognized LSA types are flooded to the neighbors making it more extensible that OSPFv2

o Designed in a way which makes it easily extensible to any other layer 3 protocol suite.

MTU Limitations ..

The MTU of a sub-network is the largest size packet or frame, specified in octets that can be sent over it. Both OSPF and IS-IS require communicating routers to have matching MTU sizes in order to form adjacencies. This is needed so that routers will not advertise packets larger than a neighbor can receive and process. However, each protocol uses a different mechanism to check against MTU mismatch. For this discussion the term MTU is used for a links Maximum Receive Unit (MRU) too.

IS-IS

– IS-IS works over the link layer, which does not provide for fragmentation and reassembly.

– Hello’s are sent padded to MTU size till an adjacency comes up. If there is an MTU mismatch, the side having the lesser MTU would drop the bigger than MTU hello. This would not allow adjacencies to be formed between interfaces having different MTU’s.

– The hello MTU match is an insufficient condition for IS-IS as LSP’s are flooded as is and not packed into any other packets. For the LSP’s to be successfully synchronized across the subdomain, all LSP’s need to be of a size lesser than the smallest link MTU in the subdomain, else the flooding of the LSP on the link will fail resulting in inconsistent routing tables.

– Mis-configuration of the maximum packet size that a router sends out can cause problems across the subdomain as there is no way to check the value between routers that are not adjacent.

OSPF

– OSPF works over IP, so the fragmentation and reassembly of any OSPF packet is taken care by the IP layer. However for some link technologies where MTU is configurable but not negotiated, we can have packet black-holes whenever packets larger than the receiving sides MTU are sent.

– The MTU is exchanged in the database description packets. If the value of MTU received in the first DB description packet is greater than that can be accepted on an interface, the packet is rejected and the adjacency is not formed. Retransmissions of DB description packets occur because the packets are never acknowledged. The adjacency therefore gets stuck in EXstart state.

– As LS Update’s are assembled in each router, the MTU of another link does not affect the size of the LS Update packet.

– As the MTU match is done at the database exchange state after the DR election has been completed, in case the DR itself cannot form adjacencies with the rest of the routers, it can cause the network to become a stub.

Area ID Change Functionality ..

Changing area-id for an area is useful for link state routing protocols in order to merge two areas into one or to split an area into several areas.

IS-IS

– An area address is a variable length quantity.

– An area can have multiple area addresses. Neighboring IS’s will not form an adjacency unless they have a single area address in common. This is quite useful for IP networks that are transitioning from one area address to another, merging two areas into one or even to split an area into several pieces.

– Seamless transition of area addresses for an area is easier in IS-IS, e.g. initially an area can have area adress A, then the set {A, B} and when the new area address B is recognized by all the routers in the area, old area address A can be removed.

OSPF

– In OSPF each area has a single ID, a 4-byte quantity.

– OSPF does not have the ability to merge and split areas dynamically as IS-IS has, though partitioned backbone can be repaired by using virtual link. But it should be ensured that the area through which virtual link is configured is having full routing information, i.e. it should not be a stub area.

– Area-id can not be changed dynamically in case of OSPF.

Convergence and Scalability Issues in OSPF and IS-IS

This seems to be the favorite question that every newbie has! There is no unequivocal answer to this and it all depends upon the kind of network and the topology. Having said this, lets try to see how the two IGPs can be compared.

IS-IS

This protocol is limited by the maximum number of LSPs that each IS-IS router can issue. This is 256 as its LSP ID is 1 octet long. The total number of IP prefixes carried by IS-IS can be easily computed and it comes to O(31000). However, RFC 3786 describes mechanisms to relax the limit on the number of LSP fragments, thereby increasing the number of IP prefixes that can be carried within IS-IS.

I have however, never seen any network carrying more than O(30K) IP prefixes inside IS-IS. Do let me know if you’re aware of networks where you see IS-IS carrying more than 30K routes.

I say this because this is a reasonable number for any sane IS-IS deployment and it will not run out of space unless someone actually injects the entire (or even partial) BGP feed into the IGP. In that case we will run out of space at about 20% of the way into redistribution and not be able to advertise the rest. It is for this reason that this practice has now been deprecated and the RFC 1745 which lays down the rules for BGP- OSPF interaction, has been moved to the HISTORICAL status.

There are 8 bits to define a pseudonode number in the LSPID which means that a router can be a Designated Intermediate System (DIS) for only 256 LANs. Additionally there is also a limitation on the number of routers that can be advertised in pseudonode LSP of the DIS. Dont worry – RFC 3786 fixes this!

RFC 3373 OTOH proposes a new TLV thats carried in the IIH PDUs that can increase the number of point-to-point adjacencies from 256 on a single router.

The “Remaining lifetime” field which gives the number of seconds before LSP is considered expired is 16 bits wide.

This gives the life time of the LSP as 2^16/60/60 Hrs = 18.7 Hrs

Thus the LSP issued by a router needs to be refreshed after every 18.7 Hrs. So youre not going to see a lot of IS-IS control packets being regenerated in a stable topology.

OSPF

In theory, OSPF topology is limited by the number of links that can be advertised in the Router LSA as each router gets only one Router LSA and it cant be bigger than 64K which is the biggest an IP packet can be. The same constraint applies to the Network LSA also.

Each link in the router can take up at most 24 bytes. Thus, number of links which can be supported is given by (64 * 1024) / 24 = 2370

However, if we take the minimum link size per link (12 bytes) then the maximum is about 2 * 2370 = O(5000) links

To be more specific, we can have O(2300) p2p and p2mp links (not considering virtual links, etc) and O(5000) broadcast/NMBA links described in OSPF’s Router LSA and its Network LSA.

Thus each Router LSA can carry some 5000 links information in it. It is hard to imagine a router having 5000 neighbors but there are already routers with 400 neighbors in some ISPs, and it may not take long to reach the order of the magnitude limited by OSPF.

The Network LSAs are generated by the designated router (DR) for each broadcast network it is connected to. To have scaling problems it should have 2730 * 6 times neighbors on that interface. This is even less probable and hence there are no scalability problems with OSPF per se.

All other LSAs apart from Type 1 and Type 2 hold single prefixes. Because there is no limit to the number of such LSAs, a large number of inter-area and externals can be generated depending upon the memory resources of the router.

Each LSA has an LS Age field which is counted upwards starting from zero. Its life is an architectural constant which says one hour. When an LSA’s LS age field reaches MaxAge, it is reflooded in an attempt to flush the LSA from the routing domain. One hour seems like a long time but if one originates 50,000 LSAs then OSPF will be refreshing on an average of just 36ms!

Total number of LSAs to be refreshed = 50,000

Time by which all the LSAs must be refreshed = LSRefreshTime = 30mins = 1800 secs

Rate at which the LSAs need to be refreshed = 1800/50000 = 36ms

However, if the refreshes are perfectly spread out across time and perfectly batched, the actual update transmission rate may be on the order of one packet per second.

There is however a “do-not-age” LSA which in theory can be pressed into service and which never gets aged. However, such LSAs will be eventually purged from the LS database if they become stale after being held for at least 60 minutes and the originator not reachable for the same period. Moreover it is not backward compatible and if one deploys that in the network today with some routers not supporting this then the network can really get weird. So there isn’t really much that can be done using these unless the whole network is changed!

Theoretically, both the routing protocols are scalable and there should not be any issues with either one of these if implemented properly. Both have similar stability and convergence properties. Practically, providers must go with what their vendors suggest since the vendors are best aware of how each protocol has been implemented on their platforms.

I discuss more of this here.

Database Granularity in OSPF vs IS-IS ..

This post compares how the two link state protocols hold their routing information in their databases as this affects their behavior in how they flood/distribute the change of routing information and the internal implementation complexity.

OSPF

o Organization of Routing Information

OSPF encodes the routing information into small chunks, which it calls Link State Advertisement (LSA). Each LSA has its own 20-byte header in order to be identified uniquely. This header is called the LSA Header. There is no limitation on the size of a LSA, though the actual LSA size is limited by IP packet size limitation: 65,535 bytes minus the LSA Header size and IP packet header size. The database access in OSPF is per LSA basis.

In OSPF routing, the information within an area is described by type 1 and type 2 LSAs (known as Router-LSA and Network-LSA respectively). These LSAs can become big depending upon the number of adjacencies to be advertised and prefixes to be carried inside an area. In other words, the routing information with respect to a single node (either router or network node) is encoded inside a single LSA. On the other hand, each inter-area or external prefix is advertised in a separate LSA (AS-External LSA).

An OSPFv2 router may originate only one Router-LSA for itself, while in OSPFv3, a router is allowed to originate multiple Router-LSAs. A router may originate a Network-LSA for each IP subnet on which the router acts as a designated router (DR). A router may originate one LSA for each inter-area and external prefix, with no limitations on the number of LSAs that it may originate.

Consequences

Originating a new and a unique LSA for each inter-area route and an external prefix implies that there is a LSA Header overhead involved while the information is kept in the database or is flooded to the neighbors. There is thus some extra memory and bandwidth consumed in total.

o Carrying Routing Information

LSAs are carried in Link State Update packets (called LS Updates or LSUs). Each LS Update packet has its own header, consists of a 24 byte OSPF protocol header, and a 4-bytes field indicating the number of LSAs contained in the packet. Thus multiple LSAs can be packed into a single LS Update packet. Some implementations may not do this as its considered difficult achieving this during flooding.

Consequences

In the face of network changes, OSPF floods only the updated LSAs. Therefore, even if an implementation does not pack multiple LSAs into a single LS Update packet (and so bandwidth is consumed by LS Update header for each update of a single LSA), the bandwidth consumption for each network change can be considered adequately small.

IS-IS

o Organization of the Routing Information

In IS-IS, protocol packets are called Protocol Data Units or PDUs. IS-IS encodes the link state information into the set of TLVs and packs these TLVs into one or more Link State PDUs (LSPs). The size limit of a LSP is configurable. The Routing database consists of these PDUs and the access to the database is per PDU basis. The original IS-IS specification places an upper bound on the number of LSPs a router can originate to 255. There are however techniques which enable a router to originate more than 255 LSPs, by using multiple system-id’s for itself.

Consequences

Since routing information in IS-IS for each router is packed in fewer LSPs, the memory consumed for bookkeeping of the routing data within the database is less and is more efficient.

o Carrying Routing Information

Each LSP is flooded independently, without being modified all the way from the originator through the routers till the very end. This results in all the routers having the same LSPs as that originated by the first router.

Consequences

Since LSPs are not modified in any way and are not allowed to be fragmented, in order to be flooded successfully over all links existing in the IS-IS network, great care must be ensured when configuring the size limit of LSP that routers can originate and receive.

If the size limit of the LSP is set without taking into account the minimum value of the MTUs throughout the network, or if the size limit of LSPs conflict among some the routers in the network, the database synchronization may not be achieved, and this can result in routing loops and/or blackholes.

When a change occurs to a LSP, the whole LSP needs to be flooded, and therefore the bandwidth usage can be non-optimal. There is however a solution which exists in theory. If an implementation finds some of the entities to be flapping, then they may be packed into smaller LSPs or may be isolated from the other stable entities. This way one needs to only advertise the unstable LSP/LSPs. I have not btw come across any implementation that does that. Leave a comment if you know one that does this!

Database granularity also affects when two routers need to synchronize their databases. In OSPF, because of its high database granularity there are a lot of items which it needs to synchronize and that process is somewhat complicated with a lot of DBD packets being exchanged back and forth. This gets worse if the router trying to sync is being inundated with a lot of other data traffic also. This is not much of an issue these days as any router worth its salt would prioritize the OSPF control packets.

This is however much simpler in case of IS-IS and there isn’t any finite state machine that the neighbors need to go through to synchronize their databases. It just uses it regular flooding mechanism (a couple of CSNPs describe their entire topology information) to exchange its entire database. You plug in the new IS-IS router and before you realize the router is already sync’ed up with all the other IS-IS routers in the network!

Flushing LSAs (and LSPs) in OSPF and IS-IS ..

An LSA/LSP is flushed (purged) when the contents carried by the LSA/LSP are no longer valid. In OSPF when an LSA is flushed the age is set to MaxAge and the LSA is flooded. In IS-IS when an LSP is purged (flushed) the header alone is flooded with the Remaining Lifetime set to 0, and the value of checksum set to 0.

OSPF only allows self originated LSA to be flushed, IS-IS spec allows in certain cases for non-self originated the LSP to be purged (though new implementations don’t support this and the update RFC has changed it) which can lead to problems.

In OSPF a flushed LSA is not removed unless the LSA is not on any of the retransmit lists and none of the adjacencies on the router are in state Exchange or loading. This ensures that an LSA that an LSA is flooded to all its neighbors before it is removed from the domain. In IS-IS an LSP purged is kept for ZeroAge lifetime if the LSP purged is a self originated LSP and the LSP is kept for MaxAge if the LSP is non self-originated before the LSP is deleted.

When purging an IS-IS LSP the header and authentication data is kept while purging (certain OSPF implementations do the same). However for those LSP’s that don’t support authentication, because the checksum is set to 0 for purged LSP’s, the integrity of the contents cannot be verified. In OSPF the entire content of the LSA is intact while flushing leading to unnecessary data sending.

Checks on HELLOs for OSPF and IS-IS during Adjacency Formation ..

The HELLOs (or IIHs in IS-IS parlance) are responsible bringing up the adjacencies between the two (or multiple) routers. Forming adjacencies is an integral part of all link state routing protocols as all protocol packets other than HELLOs are flooded only over the adjacencies. The rules for formation of such adjacencies however differ between IS-IS, OSPF v2 and OSPF v3.

IS-IS

Besides the basic checks to verify the integrity of the packet, IS-IS does a few checks before forming any adjacency upon receiving the IIHs.

o It allows multiple area addresses to be configured on a router. During the IIH exchange the adjacency is formed only if at least one of the area address matches. The advantage of having multiple areas is explained in the further posts. NOTE that Level 2 only adjacencies would be formed even if the area addresses are not matching.

o To prevent the LSPs and CSNPs from being dropped due to different values for originatingLSPBufferSize and ReceiveLSPBufferSize, all IIHs are padded till the maximum MTU when the adjacency comes up. This check verifies consistent settings between the adjacent routers. This is however not a sufficient check.

o Adjacencies are formed without regard to interface addressing or asymmetric HOLD timer values. Values of IIH interval are not sent in IIH packets. While the IS-IS protocol provides sufficient routing information for relaying packets between adjacent routers, many implementations nonetheless require ARP support to do this. These implementations typically refuse to form an adjacency unless the neighbor interface IP address is on the local interface’s IP subnet.

o IS-IS can carry addressing information of different protocols in its TLV’s. However, the protocol supported field must be sent in Dual and IP-Only routers. RFC1195 specifies no checks for the protocol supported field for adjacency formation. It places topology restrictions on multi-protocol networks. In networks that conform to these restrictions, neighboring routers will always have a protocol in common. Therefore, it does not state whether adjacency formation should take protocols supported into account. However, many implementations, do not form an adjacency with a neighbor unless they share at least one protocol in common.

o Not matching Hold Timer values has advantages wherein the administrator can set different Hold times for different routers. This helps in cases where the going down of a DIS or some router needs to be detected faster. For such routers the hold timer can be set to a lower value.

OSPFv2

The checks for formation of adjacencies are stricter in OSPFv2 as compare to that of IS-IS.

o The area-id of the received packet should always match the incoming interface (with the exception of virtual links). Area type is strictly checked by checking the E-bit (not set for non-default areas) and the N- bit (not-set for non-NSSA areas).

o The values of the Hello interval, the Router Dead Interval and network mask received in Hellos are matched with those on the configured interface. Any mismatch in the values causes the Hello packet to be dropped and hence prevents formation of adjacencies. The disadvantage of this approach is that Hello Interval and Router Dead Interval changes need to be done within the Router Dead Interval, to prevent breaking adjacencies. The advantage is we would not form adjacency in case there is a router that has been mis-configured with a large value and which could cause problems later. The network mask check however does not apply to point to point links. That allows the two ends of a Point-to-Point link to have different addresses.

o MTU check is not done in the hellos. It is done in the during the Database (DB) Exchange process.

OSPFv3

Most of the checks for OSPFv3 are similar to that of OSPFv2.

o OSPFv3 runs on a per link basis instead of a per subnet basis. The check for a network mask is thus not done.

o Instance ID field (non-existent in OSPFv2) on the link is matched with the incoming ID in Hellos. The adjacency is formed only if the Instance ID matches. This allows multiple instances of OSPFv3 to run on a single link.

Areas and Hierarchy in OSPF and IS-IS ..

This is required primarily for scalability issues wherein instabilities inside one small section of the network are hidden from the rest of the network. This also helps in reducing the size of the routing tables, etc. Both the protocols establish a two level hierarchy among the areas.

IS-IS

– Divides the whole routing domain into small areas and uses logical hierarchy based on routing levels called Level 1 and Level 2.

– Level 1 routing is within the area and L2 is between the areas.

– Original spec called for Level 1 routers to know only the topology inside their area and they were unaware of routers/destinations outside of their area. They simply forward all their traffic for outside their area to the nearest Level 2 router.

– Level 2 routers knew only the Level 2 topology and didn’t know any topology inside the area. This forced strict hierarchal routing between the areas where all inter-area data traffic originating from one area followed a default route to the Level 2 sub-domain, where it was forwarded by L2 routing to the destination area.

– This has now changed and RFC 2966 allows leaking L2 information inside L1 for more optimal routing.

– There was some work done in IS-IS for multi-level hierarchies but it wasn’t all that useful and was dropped in between. The idea was that if the networks use IDRP as well along with IS-IS then the 2 levels may not be enough.

– IS-IS routers are associated with a single area and the whole router then belongs to that particular area.

– Area boundaries intersect on links .

– can be extended to support higher levels of hierarchy based on the way routes are leaked in between the levels by setting the up/down bit, when routes are propagated down the hierarchy.

OSPF

– Divides the routing domain into regular areas and a backbone area that is designated as area 0.0.0.0 and all packets going from one area to the other must traverse through this backbone.

– The spec calls for the backbone to be contiguous and to be connected to all the areas through an ABR. There is however a provision to work with disconnected physically disparate backbone areas using virtual links

– Can be attached to multiple areas as its designed around links and uses a links based addressing scheme. It’s the links which are assigned to the areas and not the routers themselves.

Designated Router (or DIS) concepts ..

The DR concept is used by both IS-IS and OSPF on the broadcast media to limit the amount of link state information exchanged between the routers on such media. It helps to reduce the number of adjacencies formed on broadcast media to O(n) instead of O(n^2), where n is the number of nodes.

First things first. There are no routers in IS-IS; there are only Intermediate Systems. So the leader and the evangelist for a LAN in IS-IS is called a DIS – Designated Intermediate System.

IS-IS

– DIS election is deterministic.
– No concept of backup DIS.
– A new DIS is elected when the current goes down.

OSPF

– DR election is non-deterministic.
– Elects DR and BDR to conduct flooding on a LAN.
– All routers on the LAN are only synchronized with the DR and BDR.
– DRship is sticky.

(i) What is meant by a deterministic/non-deterministic election?

In IS-IS, deterministic DIS election makes the possibility of predicting the router that will be elected as DIS from the same set of routers. The router advertising the numerically highest priority wins, with numerically highest MAC address breaking the tie. In IS-IS, DIS can be pre-empted at any time by a different IS-IS node (or a router in plain-speak) with higher priority coming alive.

In OSPF, the DR election is sticky meaning that after a router has been elected, no other router can take over the position unless the original DR goes down. When a router comes up, it accepts the DR regardless of its own priority if a DR is already there. Otherwise the router itself becomes DR if it has the highest priority on the network (a LAN to be precise).

The above scheme makes it harder to predict the identity of the DR, but ensures that DR changes less often. The rationale behind this sticky nature of DRship in OSPF is that it is disruptive to have DR changes as DR keeps track of which routers have acknowledged which link state information and it would require a lot of time and protocol messages for another router to take over in case the DR went down.

Both the sticky and deterministic mechanisms of DR/DIS elections in OSPF and IS-IS can be modified to provide the functionality of the other with some simple modifications in the implementations.

(ii) Do we need a backup Designated router or an Intermediate System?

A backup DIS is redundant in IS-IS because all the routers are synchronized with each other and also because the shorter Hello interval used by the DIS allows for faster detection of failures and subsequent replacement of the DIS.

The presence of BDR in OSPF makes the replacement of the DR transparent in case the DR goes down. All routers on the LAN are only adjacent and synchronized with DR and BDR; and backup DR is fully synchronized with the DR. Forming adjacencies with only the DR/BDR is done to reduce the complexity of data exchange and minimize flooding.

Differences in IS-IS and OSPF encapsulation ..

IS-IS is cool and runs directly over the data link alongside IP. On Ethernet, IS-IS packets are always 802.3 frames, with LSAP value 0xFEFE while IP packets are either Ethernet II frames or SNAP frames identified with the protocol number 0x800. OSPF runs over IP as protocol number 89.

IS-IS runs directly over layer 2 and hence

– cannot support virtual links unless some explicit tunneling is implemented.

– packets are intentionally kept small so that they don’t require hop-by-hop fragmentation.

– uses ATM/SNAP encapsulation on ATM but there are hacks to make it use VcMux encapsulation.

– some OSs that support IP networking have been implemented to differentiate Layer 3 packets in kernel. Such OSs require a lot of kernel modifications to support IS-IS for IP routing. Ditto goes for any HW thats clever and tries to identify L3 control packets in the ASICs.

– can never be routed beyond the immediate next hop and hence shielded from IP spoofing and similar Denial of Service attacks.

– need to provide code points of access for each data link protocol types (Frame Relay, Ethernet, ATM, PPP, etc).

– doesnt need to rely on network layer protocols (like ARP) to communicate with the neighboring systems. Some implementations however, do rely on ARP or static routing to
communicate with the neighbors on LAN.

OSPF runs over IP and hence

– can support virtual links.

– can use IP fragmentation services.

– can use VcMux encapsulation on ATM.

– if an OS already supports IP, no it requires no changes to support OSPF.

– can be routed to a destination multiple hops away and thus vulnerable to DoS attacks and IP spoofing

– transmitted with additional IP header information, thereby increasing some packet overhead. I woudnt fret over this coz there arent many control packets running helter-skelter in a network anyways!

(i) IP Fragmentation

LSPs in IS-IS, unlike as in OSPF, are not regenerated hop-by-hop and so they must be small enough that they are guaranteed to be able to cross *any* media in the network and the value of the maxsized LSP should thus not be greater than the minimum link MTU size in the area.

If a router has more than maxsized LSP bytes of information to advertise into IS-IS, then this originating router must fragment its LSP before flooding.

In past, one area of concern regarding the scalability of the link state routing protocols was the way they would flood and it is believed that preventing fragmentation during flooding is the reason why IS-IS fragments only at the originating router.

OSPF does not provide any explicit fragmentation/reassembly support. When fragmentation is necessary, IP fragmentation/reassembly is used. OSPF protocol packets have been designed so that large protocol packets can be generally be split into several smaller protocol packets.

(ii) ATM Encapsulation

OSPF can run over ATM using VcMux encapsulation (which essentially assumes that all the packets carried are IP) while IS-IS requires LLC/SNAP encapsulation where ATM layer can distinguish between multiple Layer 3 protocols over the same VC. The disadvantage of using the LLC/SNAP encapsulation is that it has some additional bytes for the LLC-SNAP header which results in a packet size > 40 bytes.

Thus a simple TCP ACK message of 40 bytes along with the LLC-SNAP header adds enough bytes so that a single TCP ACK won’t fit into one ATM cell. Much bandwidth is thus wasted because now each TCP ACK requires 2 ATM cells.

An IETF draft proposes a workaround to this issue in which both IS-IS and IP packets can be sent over an ATM VC using Vc Mux encapsulation by reading into the first byte of the L3 header to distinguish between IP and ISO family packets, such as IS-IS, CLNS and ES-IS. However this did not gain popularity because of the demise of ATM cores in the largest ISPs.

The first two fields in the IP header are the 4-bit version number and the 4-bit header length. The value of the first byte is normally 0x45. If there are IP header options attached to the IP header, the first byte can be between 0x46 and 0x4F. The first byte in an IS-IS packet is always 0x83. Thus by looking at the first byte of an incoming packet, the receiver can separate IP and IS-IS packets. Because of this feature one does not need to depend on the ATM layer anymore to help with the de-multiplexing. Routers an now send and receive both IS-IS and IP packets using Vc Mux encapsulation and thus avoid the ATM cell tax.

IS-IS and OSPF Interface Types Supported ..

OSPF models networks as
– Broadcast links
– Point to Point (P2P)
– Point to Multi-Point (P2MP)
– Non-Broadcast multi-access Networks (NBMA)

IS-IS models networks as
– P2P
– Broadcast
– Unnumbered Broadcast

The key differences are the way OSPF provides support for NBMA networks and inherent protocol support for unnumbered broadcast by IS-IS

(i) Support for NBMA Networks

IS-IS has no direct support for connecting ISs over a NBMA network and it must be modeled as a LAN or treated as a set of P2P links. Modeling it as the latter involves a lot of configuration and if full connectivity is not configured, multiple hops might be required for traversing the NBMA cloud.

Experience with ATM LAN emulation has proven un-scalable and insufficiently reliable because of the single point where replication takes place to emulate multicast.

The best alternative for IS-IS is thus to treat each PVC as a point-to-point link. All PVC failures are handled by the protocol since each PVC is visible to the protocol.

IS-IS mesh groups may be used to address the scaling issues which may result from redundant flooding in the highly meshed environments.

In OSPF there is a “NBMA mode” in the original specification which makes the protocol aware that it is on a NBMA network.

Neighbours are discovered initially through configuration which is restricted to the ones eligible for the DR election. To make administration easier and to reduce the HELLO traffic, most of the other routers attached to the NBMA subnet are assigned a router priority of zero. It thus involves quite a bit of administration overhead and is prone to mis-configuration. Also the network will malfunction if one of the nodes loses its link to the DR.

In this mode, each node in the NBMA must have a PVC to the DR and BDR. Since adjacencies between non-DR nodes is not mandated, the order of the number of adjacencies is O(2n), rather than O(n^2) as required when running OSPF without NBMA mode.

NBMA networks are thus only as robust and reliable as the underlying data-link service. If for example, a PVC fails or is mis-configured or if an SVC cannot be established, due to capacity or policy reasons, routing over NBMA subnet will fail. And, unfortunately, often the reason for the failure will not be immediately obvious to the network operator.

The P2MP can be applied to rectify these problems, although at some loss of efficiency.

(ii) Point-to-Multipoint model

This model can be used on any data link technology that the NBMA model can be used on. In addition, the P2MP model doesn’t require all the participating routers to be able to communicate directly to model a partial PVC mesh as a single P2MP networks. Dropping the full mesh requirement also allows the modeling of more exotic data link technologies, such as packet radio, as P2MP networks.

So if your favorite OS can’t support virtual interfaces or if there’s too much overhead involved in generating separate sub interfaces to each of the 500 ATM circuits then P2MP is good and can be your life saver!

However, when operating a full mesh Frame Relay or ATM network in P2MP mode, the work involved in neighbor maintenance, flooding, and database representation increases as O(n^2), where n is the number of OSPF routers attached to the subnet, instead of O(n)behavior that can be achieved with the original NBMA model.

(iii) Unnumbered broadcast

IS-IS supports unnumbered broadcast interfaces; however, most implementations do not! The protocol provides all necessary routing information without the aid of ARP, but doing this requires that each FIB entry contain a next-hop (circuit, SNAP address) pair for each path to a destination, and many routers are designed with FIB entries that contain only next-hop IP addresses instead, to reduce the size of the FIB and perhaps as a simplification.

For this reason, many implementations won’t interoperate with an unnumbered broadcast interface, and won’t interoperate with an implementation that doesn’t support ARP.