Hub and Spoke Multi-Point LSPs for Scalable VPLS Architecture

Multiprotocol Label Switching (MPLS) and Generalized MPLS (GMPLS) provide a mechanism to set up point-to-multipoint (P2MP) LSPs which carry traffic from one ingress point (the root node) to several egress points (the leaf nodes), thus enabling multicast forwarding in an MPLS domain. However, there is no provision to provide a co-routed path back from the egress points (the leaves) to the ingress node (the root). The only way to do this is by configuring Unidirectional point-to-point LSPs from the leaves back to the root node. This entails configuring each leaf node with an LSP back to the root which could be a configuration and management nightmare if the number of leaves are large. Second, it can also not guarantee a co-routed path back from the leaf to the node, as the process of setting up the Unidirectional P2P path is independent from setting up the P2MP path.

This post introduces the concept of a hub-and-spoke multipoint (HSMP) LSP that allows traffic from the root to the leaves via a P2MP LSP and back from the leaves to the root via a unidirectional co-routed LSP. The proposed technique targets one-to-many applications that require reverse one-to-one traffic flow (thus many one-to-one in the reverse direction).

Consider the figure shown below.

PE1 is the ingress router for the HSMP LSP. The egress routers (leaves) are PE2, PE3 and PE4. As can be seen from the figure, PE1 creates a single copy of each packet arriving from the data source. This packet carries the MPLS label value L1. P1 is an ordinary Label Switch Router (LSR) that swaps the incoming label L1 with L2. Lets now focus on the packet forwarding process on the node P2.

For each packet belonging to the HSMP LSP, P2 makes three copies (just like how its done for a P2MP LSP), each of which is sent to PE2, PE3 and PE4 respectively. Packets arrive at P2 with MPLS label L2. As shown in the above figure, P2’s ILM contains an entry for label L2 saying that one copy of the packet should be sent out on interface if1 with label L3, another copy on interface if2 with label L4 and a last copy on interface if3 with label L5. Since the LSR P2 is replicating the MPLS packets its called the branch node.

An obvious advantage of this scheme is the bandwidth optimization. If we had used unidirectional P2P LSPs instead of an HSMP LSP, PE1 would have sent three copies of each packet that it had received from the data source and thus congesting the links PE1-P1 and P1-P2.

What sets an HSMP LSP apart from a regular P2MP LSP is the ability in the former to set up a path from the leaves back to the root. P2MP LSPs are unidirectional, so no traffic can flow from the leaf node routers to the ingress head end router along the P2MP LSP. In HSMP LSP, the leaf nodes can also send unidirectional traffic back to the root. This is shown in the figure below.

Because of the mechanisms defined in HSMP LSP, the branch node P2 advertises the same upstream label L1 for a given HSMP LSP to the nodes PE2, PE3 and PE4. It programs its ILM table as shown above, where it simply swaps L1 with L2 for all incoming MPLS packets and sends those towards P1. This way HSMP LSP is also able to provide a path back from the leaf nodes to the root node.

In the last post i had discussed issues that exist in VPLS. Lets see how HSMP LSP can solve them. I am using the same topology as was used there to demonstrate how HSMP LSP helps.

The figure 3 above shows the same VPLS service that we had discussed in the earlier post.

PE1 knows through some out-of-band mechanism (could be via BGP, Radius, manual configs, etc) that PE2, PE3 and PE4 are the egress nodes that belong to the same VPLS domain. PE1 now needs to establish an HSMP LSP (can be trivially extended to support a pseudowire) to PE2, PE3 and PE4. Figure shows 3 HSMP LSPs that will be required in this arrangement. The red HSMP LSP has PE1 as the root node and PE2, PE3 and PE4 as the leaf nodes. The green HSMP LSP has been initiated by PE2 and has PE2 as the root, and PE1, PE3 and PE4 as the leaf nodes. The blue HSMP LSP has been initiated by PE3 and has PE1, PE2 and PE4 as the leaf nodes. There is another HSMP LSP thats required – the one initiated by PE4 for the VPLS service to function. It has been omitted from the figure for the sake of clarity.

Thus all PE nodes in a VPLS service need to initiate an HSMP LSP (or a HSMP pseudowire) that terminates at the other PE routers.

In VPLS all BUM (broadcast, unknown unicast and multicast) traffic is flooded to all PE nodes. Its only the learnt traffic thats sent unicast by one PE to the other PE.

As explained earlier, a single copy sent by PE1 over an HSMP LSP will reach PE2, PE3 and PE4 (due to its P2MP component). Also the PE routers PE2, PE3 and PE4 can use this HSMP LSP (terminating at them) to send unicast traffic back to PE1.

Thus PE1 sends all BUM traffic on the HSMP LSP it initiates and all learnt unicast over the HSMP LSP that terminates on it.

Going back to figure 3, we can see that PE1 can use the red HSMP LSP to send all BUM traffic. This way it only sends one copy, and all the PE routers receive this packet. If PE1 wants to send learnt unicast traffic back to PE2, it uses the green HSMP LSP that terminates on it. PE1 can use this to send traffic back to PE2, which is the root node for this HSMP LSP. Similarly, PE1 uses the blue HSMP LSP whenever it wants to send learnt unicast traffic back to PE3.

Lets look at LSPs that PE2 uses. Whenever it wants to send BUM traffic, it uses the green HSMP LSP that it has originated. Any packet sent over that LSP is received by all the leaf nodes (which in this case happen to be PE1, PE3 and PE4). Like PE1, if it has to send learnt unicast traffic back to PE1, it uses the red HSMP LSP that was originated by PE1 and terminates at PE2.

Thus for a VPLS to fully function, all PE nodes must establish an HSMP LSP with all the other participating PE routers. It can use the optimized HSMP LSP that it originates for the BUM traffic and the HSMP LSP that other PEs originate for unicast communication.

The above table compares a VPLS service using HSMP LSPs with a regular VPLS service or Hierarchical VPLS (H-VPLS) service. Clearly, the former wins against the regular VPLS and H-VPLS on all counts. This may also perhaps be an answer to Juniper’s claim that H-VPLS is not scalable. Operators now need not implement H-VPLS; they can instead go in for VPLS services implemented using HSMP LSPs.

This post only briefly explains the idea behind HSMP LSPs. Its explained in detail here in this draft.

Advertisement

Shortest Path First (SPF) Algorithm Demystified ..

For the sake of brevity i would refer an Intermediate System (IS) in IS-IS parlance as a router in the following post – thus a router in this post could mean either an OSPF speaker or an IS-IS speaker or some other routing element from your favorite link state protocol. For the SPF algorithm to work, it would require *all* routers in the network to know about all the other routers in the network and the links connecting them. How a link state routing protocol encodes this information and ensures that its disseminated properly is left to that protocol. OSPF encodes this information in Link State Advertisements (LSAs) and floods it reliably, while IS-IS encodes this in a Link State Packet (LSP) that it originates.

Once each router knows about all the other routers and the links connecting them, it runs the Dijkstra Shortest Path First algorithm to determine the shortest path from itself to all the other routers in the network. Since each router has a similar copy of the link state database and each runs the same algorithm, they end up constructing the same view of the network and packets get routed consistently at each hop.

So, how does SPF algorithm work in OSPF and IS-IS.

Imagine a simple network as shown in the figure below.

Network Diagram

Once each router has flooded its link state information in the network, all routers know about all the other routers and the links connecting them. The link state database on each router looks like the following:

[A, B, 3], [A, C, 6], [B, A, 3], [B, D, 3], [B, E, 5], [C, A, 6], [C, D, 9], [D, C, 9], [D, B, 3], [D, E, 3] , [E, B, 5] and [E, D, 3]

Each triple should be read as {originating router, router its connected to, the cost of the link connecting the two routers}

So what does Router A do with this information and how is this used in SPF?

While running SPF, each router maintains two lists – the first is a list of nodes for which the shortest path has been determined and we are sure that no path shorter than the one we have computed can exist. This list is called the PATH (or PATHS) list. The second is the list of paths through the routers that may or may not be the shortest to a destination. This list is called the TENTative list, or simply the TENT list From now on, TENT would refer to the TENT list and PATH, to the PATH list.

Each element in the list is a triplet of the kind {endpoint router that we’re trying to reach, total distance from the calculating router, next-hop to reach the endpoint router}

Each router runs the following algorithm to compute the shortest path to each node:

Step I: Put “self” on the PATH with a distance of 0 and a next hop of self. The router running the SPF refers to itself as either “self” or the root node, because this node is the root of the shortest-path tree.

Step II: Take the node (call it the PATH node) just placed on the PATH list and examine its list of neighbors. Add each neighbor to the TENT with a next hop of the PATH node, unless that neighbor is already in the TENT or PATH list with a lower cost.

Call the node just added to the TENT as the TENT node. Set the cost to reach the TENT node equal to the cost to get from the root node to the PATH node plus the cost to get from the PATH node to the TENT node.

If the node just added to the TENT already exists in the TENT, but with a higher cost, replace the higher-cost node with the node currently under consideration.

Step III: Find the lowest cost neighbor in the TENT and move that neighbor to the PATH, and repeat Step 2. Stop only when TENT becomes empty.

Lets follow the sequence that Router A goes through for building its SPF tree.

1st Iteration of the SPF run

Step I: Put “self” on the PATH with a distance of 0 and a next hop of self. After this step the PATH and TENT look as follows:

PATH – {A, 0, A}

TENT – { }

Step II: Take the node (call it the PATH node) just placed on the PATH list and examine its list of neighbors. Patently, A is the PATH node. Examine its list of neighbors ({A, B, 3}, {A, C, 6}). OSPF does this by looking at the LSAs advertised by Router A, while IS-IS does this by looking at the neighbors TLV found in Router A’s LSP. When an IS-IS node is placed in PATHS, all IP prefixes advertised by it are installed in the IS-IS Routing Information Base (RIB) with the corresponding metric and next hop.

The Step II further says – “Add each neighbor to the TENT with a next hop of the PATH node, unless that neighbor is already in the TENT or PATH list with a lower cost.” A’s neighbors are B and C, and since neither of them is in the PATH or TENT, we add both of them to the TENT.

PATH – {A, 0, A}

TENT – {B, 3, A}, {C, 6, A}

Step II says – “Call the node just added to the TENT as the TENT node. Set the cost to reach the TENT node equal to the cost to get from the root node to the PATH node plus the cost to get from the PATH node to the TENT node.”

Lets pick up the TENT node B. The cost to reach B would be the cost to reach from root node to PATH node + cost from PATH node to TENT node. In the first iteration of SPF, both the root node and PATH node is A. Thus total cost to reach B is cost to reach from A (PATH node) to B (TENT node) which is 3.

Step II says – “If the node just added to the TENT already exists in the TENT, but with a higher cost, replace the higher-cost node with the node currently under consideration.”

B isnt in the TENT, so skip this.

Step III says – “Find the lowest cost neighbor in the TENT and move that neighbor to the PATH, and repeat“. The lowest cost neighbor is B (with cost 3). Move this to PATH. We go back to Step II since TENT isnt yet empty.

2nd Iteration of the SPF run

PATH – {A, 0, A} {B, 3, A}

TENT – {C, 6, A}

We have thus added the neighbor B in PATH, since we know that there cannot be any other shorter path to reach it. And this is, as you will note, consistent with our definition of PATH wherein we had earlier stated that nodes can only be placed there once we are sure that there cannot be any shorter path to reach them from the root node.

We begin our 2nd iteration and go to Step II which says – “Take the node (call it the PATH node) just placed on the PATH list and examine its list of neighbors“. B is the PATH node and we examine its neighbors (A, E and D). Step II further says – “Add each neighbor to the TENT with a next hop of the PATH node, unless that neighbor is already in the TENT or PATH list with a lower cost.”

Since A is already in PATH with ignore it and only add E and D to TENT.

PATH – {A, 0, A} {B, 3, A}

TENT – {C, 6, A} {D, 3, B}, {E, 5, B}

Step II says – “Call the node just added to the TENT as the TENT node. Set the cost to reach the TENT node equal to the cost to get from the root node to the PATH node plus the cost to get from the PATH node to the TENT node.

Aah .. this means that cost against D and E would not be 3 and 5, as what i have shown, but would instead be 3 (cost from A to B) +3 (cost from B to D) = 6 and 3 (cost from A to B) +5 (B to E) = 8

Thus the PATH and TENT look as follows:

PATH – {A, 0, A} {B, 3, A}

TENT – {C, 6, A} {D, 6, B}, {E, 8, B}

The rest of the Step II does not apply here since D and E dont exist in the TENT.

Come to Step III which says “Find the lowest cost neighbor in the TENT and move that neighbor to the PATH, and repeat Step 2″.

Lowest cost neighbor can either be C or D so pick on up randomly. It can mathematically be proven that we would end up with the same SPF tree irrespective of which equal cost neighbor is picked up from the TENT first. In our case, lets pick up C.

It is thus moved to the PATH

PATH – {A, 0, A} {B, 3, A} {C, 6, A}

TENT – {D, 6, B}, {E, 8, B}

3rd Iteration of the SPF run

Go back to Step II which says “Take the node (call it the PATH node) just placed on the PATH list and examine its list of neighbors. Add each neighbor to the TENT with a next hop of the PATH node, unless that neighbor is already in the TENT or PATH list with a lower cost

C’s neighbors are A and D. Since A is already in the PATH only D is added in the TENT.

PATH – {A, 0, A} {B, 3, A} {C, 6, A}

TENT – {D, 6, B}, {E, 8, B} {D, 9, C}

As per Step II we now need to fix the cost to reach D from the root node. This would be cost to reach from A to C (6) + cost from C to D (9) = 15

PATH and TENT now:

PATH – {A, 0, A} {B, 3, A} {C, 6, A}

TENT – {D, 6, B}, {E, 8, B} {D, 15, C}

Step II further says – “If the node just added to the TENT already exists in the TENT, but with a higher cost, replace the higher-cost node with the node currently under consideration.

node D already exists in the TENT (via B with cost 6) and since its with a lesser cost, we remove the node that we had just added from the TENT. This is because a lower cost path to reach node D already exists in the TENT.

PATH – {A, 0, A} {B, 3, A} {C, 6, A}

TENT – {D, 6, B}, {E, 8, B}

We come to Step III which says “Find the lowest cost neighbor in the TENT and move that neighbor to the PATH, and repeat Step 2″.

Lowest cost neighbor is D – which means we move D now to the PATH and go to Step II, since the TENT isnt yet empty.

4th Iteration of the SPF run

PATH – {A, 0, A} {B, 3, A} {C, 6, A} {D, 6, B}

TENT – {E, 8, B}

Step II says “Take the node (call it the PATH node) just placed on the PATH list and examine its list of neighbors. Add each neighbor to the TENT with a next hop of the PATH node, unless that neighbor is already in the TENT or PATH list with a lower cost.”

To examine D’s neighbors we look at the link state information it advertised. It advertised the following information:

[D, C, 9], [D, B, 3], [D, E, 3]

This means that D is says that its connected to C, B and E. We ignore its connection to B and C, since they are already in PATH. We thus only add neighbor E in the TENT.

PATH – {A, 0, A} {B, 3, A} {C, 6, A} {D, 6, B}

TENT – {E, 8, B} {E, 3, D}

Continuing with Step II which further says – “Call the node just added to the TENT as the TENT node. Set the cost to reach the TENT node equal to the cost to get from the root node to the PATH node plus the cost to get from the PATH node to the TENT node.”

This means that we need to adjust the cost of the triple {E, 3, D} that we just added to the TENT. The cost to reach E via D would thus be the cost to reach D from A (which is the root node) + the cost to reach E from D. This comes out to be 6 + 3 = 9.

TENT thus looks like this – {E, 8, B} {E, 9, D}

Step II further says – “If the node just added to the TENT already exists in the TENT, but with a higher cost, replace the higher-cost node with the node currently under consideration”

We just added node E in the TENT and a route to E already exists in the TENT, and its with a lower cost. This means that we remove the route that we had just added.

So PATH and TENT at this point look as follows:

PATH – {A, 0, A} {B, 3, A} {C, 6, A} {D, 6, B}

TENT – {E, 8, B}

We go to Step III which says – “Find the lowest cost neighbor in the TENT and move that neighbor to the PATH, and repeat Step 2″.

The lowest cost neighbor in TENT right now is E. We move this to PATH.

So PATH and TENT at this point look as follows:

PATH – {A, 0, A} {B, 3, A} {C, 6, A} {D, 6, B} {E, 8, B}

TENT – { }

Step III further states that we continue if and only if something remains in the TENT. TENT is now empty, which means that we have computed the shortest paths to all the nodes that A was aware of.

This is marks the end of the SPF algorithm run and the SPF tree that it has computed looks as follows

Network Topology after the SPF Run

How many IP Prefixes can be advertised by an IS-IS router?

The maximum size of an LSP is 1492 bytes.

Available space = 1492 – 27 (Header) = 1465 bytes for TLVs.

Thus an IS-IS router has theoretically up to 256*1465 of space to pack IP reachability TLVs.

The following calculation enables us to determine the number of IP prefixes that can be advertised in an LSP.

The following constraints are to be considered in the calculation:

The maximum size (maxLSPsize) of an LSP is 1492 bytes.
The LSP header (lspHeadersize) is 27 bytes.
The maximum length of a TLV (maxTLVlength) is 255 bytes.

Each TLV 128 consists of type (1 byte), length (1 byte), and IP prefixes (n x 12 bytes) up to total of 255 bytes. The maximum number of fragments of an LSP (maxLSPfragments) is 256.

The number of fragments is determined from the 1-byte LSP Number field in the LSP identifier.

The first fragment contains other TLVs, and the remaining 255 fragments are packed with only TLV 128.

The actual calculation is as follows:

The total space available for TLVs in an LSP is TLVSpace = maxLSPsize – lspHeadersize = 1492 – 27 = 1465 bytes

The number of TLVs that can fit into TLVSpace is 1465/255 = 5.7, approximately 6

Assuming a 1-byte Type field and 1-byte Length field, overhead for 6 TLVs is 6 x 2 = 12 bytes.

Actual space available for prefixes is 1465 – 12 bytes overhead = 1453 bytes

Number of prefixes, each 12 bytes (address + subnet mask + metric) that can fit into TLVSpace is 1453/12 = 121.08 (approximately 121 IP prefixes per LSP)

Considering that few other TLVs can be generated by the router, the number of IP prefixes that can be supported per IS-IS router is 256 fragments, each containing 121 prefixes, for a total of 30,976 prefixes.

Convergence and Scalability Issues in OSPF and IS-IS

This seems to be the favorite question that every newbie has! There is no unequivocal answer to this and it all depends upon the kind of network and the topology. Having said this, lets try to see how the two IGPs can be compared.

IS-IS

This protocol is limited by the maximum number of LSPs that each IS-IS router can issue. This is 256 as its LSP ID is 1 octet long. The total number of IP prefixes carried by IS-IS can be easily computed and it comes to O(31000). However, RFC 3786 describes mechanisms to relax the limit on the number of LSP fragments, thereby increasing the number of IP prefixes that can be carried within IS-IS.

I have however, never seen any network carrying more than O(30K) IP prefixes inside IS-IS. Do let me know if you’re aware of networks where you see IS-IS carrying more than 30K routes.

I say this because this is a reasonable number for any sane IS-IS deployment and it will not run out of space unless someone actually injects the entire (or even partial) BGP feed into the IGP. In that case we will run out of space at about 20% of the way into redistribution and not be able to advertise the rest. It is for this reason that this practice has now been deprecated and the RFC 1745 which lays down the rules for BGP- OSPF interaction, has been moved to the HISTORICAL status.

There are 8 bits to define a pseudonode number in the LSPID which means that a router can be a Designated Intermediate System (DIS) for only 256 LANs. Additionally there is also a limitation on the number of routers that can be advertised in pseudonode LSP of the DIS. Dont worry – RFC 3786 fixes this!

RFC 3373 OTOH proposes a new TLV thats carried in the IIH PDUs that can increase the number of point-to-point adjacencies from 256 on a single router.

The “Remaining lifetime” field which gives the number of seconds before LSP is considered expired is 16 bits wide.

This gives the life time of the LSP as 2^16/60/60 Hrs = 18.7 Hrs

Thus the LSP issued by a router needs to be refreshed after every 18.7 Hrs. So youre not going to see a lot of IS-IS control packets being regenerated in a stable topology.

OSPF

In theory, OSPF topology is limited by the number of links that can be advertised in the Router LSA as each router gets only one Router LSA and it cant be bigger than 64K which is the biggest an IP packet can be. The same constraint applies to the Network LSA also.

Each link in the router can take up at most 24 bytes. Thus, number of links which can be supported is given by (64 * 1024) / 24 = 2370

However, if we take the minimum link size per link (12 bytes) then the maximum is about 2 * 2370 = O(5000) links

To be more specific, we can have O(2300) p2p and p2mp links (not considering virtual links, etc) and O(5000) broadcast/NMBA links described in OSPF’s Router LSA and its Network LSA.

Thus each Router LSA can carry some 5000 links information in it. It is hard to imagine a router having 5000 neighbors but there are already routers with 400 neighbors in some ISPs, and it may not take long to reach the order of the magnitude limited by OSPF.

The Network LSAs are generated by the designated router (DR) for each broadcast network it is connected to. To have scaling problems it should have 2730 * 6 times neighbors on that interface. This is even less probable and hence there are no scalability problems with OSPF per se.

All other LSAs apart from Type 1 and Type 2 hold single prefixes. Because there is no limit to the number of such LSAs, a large number of inter-area and externals can be generated depending upon the memory resources of the router.

Each LSA has an LS Age field which is counted upwards starting from zero. Its life is an architectural constant which says one hour. When an LSA’s LS age field reaches MaxAge, it is reflooded in an attempt to flush the LSA from the routing domain. One hour seems like a long time but if one originates 50,000 LSAs then OSPF will be refreshing on an average of just 36ms!

Total number of LSAs to be refreshed = 50,000

Time by which all the LSAs must be refreshed = LSRefreshTime = 30mins = 1800 secs

Rate at which the LSAs need to be refreshed = 1800/50000 = 36ms

However, if the refreshes are perfectly spread out across time and perfectly batched, the actual update transmission rate may be on the order of one packet per second.

There is however a “do-not-age” LSA which in theory can be pressed into service and which never gets aged. However, such LSAs will be eventually purged from the LS database if they become stale after being held for at least 60 minutes and the originator not reachable for the same period. Moreover it is not backward compatible and if one deploys that in the network today with some routers not supporting this then the network can really get weird. So there isn’t really much that can be done using these unless the whole network is changed!

Theoretically, both the routing protocols are scalable and there should not be any issues with either one of these if implemented properly. Both have similar stability and convergence properties. Practically, providers must go with what their vendors suggest since the vendors are best aware of how each protocol has been implemented on their platforms.

I discuss more of this here.

Database Granularity in OSPF vs IS-IS ..

This post compares how the two link state protocols hold their routing information in their databases as this affects their behavior in how they flood/distribute the change of routing information and the internal implementation complexity.

OSPF

o Organization of Routing Information

OSPF encodes the routing information into small chunks, which it calls Link State Advertisement (LSA). Each LSA has its own 20-byte header in order to be identified uniquely. This header is called the LSA Header. There is no limitation on the size of a LSA, though the actual LSA size is limited by IP packet size limitation: 65,535 bytes minus the LSA Header size and IP packet header size. The database access in OSPF is per LSA basis.

In OSPF routing, the information within an area is described by type 1 and type 2 LSAs (known as Router-LSA and Network-LSA respectively). These LSAs can become big depending upon the number of adjacencies to be advertised and prefixes to be carried inside an area. In other words, the routing information with respect to a single node (either router or network node) is encoded inside a single LSA. On the other hand, each inter-area or external prefix is advertised in a separate LSA (AS-External LSA).

An OSPFv2 router may originate only one Router-LSA for itself, while in OSPFv3, a router is allowed to originate multiple Router-LSAs. A router may originate a Network-LSA for each IP subnet on which the router acts as a designated router (DR). A router may originate one LSA for each inter-area and external prefix, with no limitations on the number of LSAs that it may originate.

Consequences

Originating a new and a unique LSA for each inter-area route and an external prefix implies that there is a LSA Header overhead involved while the information is kept in the database or is flooded to the neighbors. There is thus some extra memory and bandwidth consumed in total.

o Carrying Routing Information

LSAs are carried in Link State Update packets (called LS Updates or LSUs). Each LS Update packet has its own header, consists of a 24 byte OSPF protocol header, and a 4-bytes field indicating the number of LSAs contained in the packet. Thus multiple LSAs can be packed into a single LS Update packet. Some implementations may not do this as its considered difficult achieving this during flooding.

Consequences

In the face of network changes, OSPF floods only the updated LSAs. Therefore, even if an implementation does not pack multiple LSAs into a single LS Update packet (and so bandwidth is consumed by LS Update header for each update of a single LSA), the bandwidth consumption for each network change can be considered adequately small.

IS-IS

o Organization of the Routing Information

In IS-IS, protocol packets are called Protocol Data Units or PDUs. IS-IS encodes the link state information into the set of TLVs and packs these TLVs into one or more Link State PDUs (LSPs). The size limit of a LSP is configurable. The Routing database consists of these PDUs and the access to the database is per PDU basis. The original IS-IS specification places an upper bound on the number of LSPs a router can originate to 255. There are however techniques which enable a router to originate more than 255 LSPs, by using multiple system-id’s for itself.

Consequences

Since routing information in IS-IS for each router is packed in fewer LSPs, the memory consumed for bookkeeping of the routing data within the database is less and is more efficient.

o Carrying Routing Information

Each LSP is flooded independently, without being modified all the way from the originator through the routers till the very end. This results in all the routers having the same LSPs as that originated by the first router.

Consequences

Since LSPs are not modified in any way and are not allowed to be fragmented, in order to be flooded successfully over all links existing in the IS-IS network, great care must be ensured when configuring the size limit of LSP that routers can originate and receive.

If the size limit of the LSP is set without taking into account the minimum value of the MTUs throughout the network, or if the size limit of LSPs conflict among some the routers in the network, the database synchronization may not be achieved, and this can result in routing loops and/or blackholes.

When a change occurs to a LSP, the whole LSP needs to be flooded, and therefore the bandwidth usage can be non-optimal. There is however a solution which exists in theory. If an implementation finds some of the entities to be flapping, then they may be packed into smaller LSPs or may be isolated from the other stable entities. This way one needs to only advertise the unstable LSP/LSPs. I have not btw come across any implementation that does that. Leave a comment if you know one that does this!

Database granularity also affects when two routers need to synchronize their databases. In OSPF, because of its high database granularity there are a lot of items which it needs to synchronize and that process is somewhat complicated with a lot of DBD packets being exchanged back and forth. This gets worse if the router trying to sync is being inundated with a lot of other data traffic also. This is not much of an issue these days as any router worth its salt would prioritize the OSPF control packets.

This is however much simpler in case of IS-IS and there isn’t any finite state machine that the neighbors need to go through to synchronize their databases. It just uses it regular flooding mechanism (a couple of CSNPs describe their entire topology information) to exchange its entire database. You plug in the new IS-IS router and before you realize the router is already sync’ed up with all the other IS-IS routers in the network!

Checks on HELLOs for OSPF and IS-IS during Adjacency Formation ..

The HELLOs (or IIHs in IS-IS parlance)  are responsible bringing up the adjacencies between the two (or multiple) routers. Forming adjacencies is an integral part of all link state routing protocols as all protocol packets other than HELLOs are flooded only over the adjacencies. The rules for formation of such adjacencies however differ between IS-IS, OSPF v2 and OSPF v3.

IS-IS

Besides the basic checks to verify the integrity of the packet, IS-IS does a few checks before forming any adjacency upon receiving the IIHs.

o It allows multiple area addresses to be configured on a router. During the IIH exchange the adjacency is formed only if at least one of the area address matches. The advantage of having multiple areas is explained in the further posts. NOTE that Level 2 only adjacencies would be formed even if the area addresses are not matching.

o To prevent the LSPs and CSNPs from being dropped due to different values for originatingLSPBufferSize and ReceiveLSPBufferSize, all IIHs are padded till the maximum MTU when the adjacency comes up. This check verifies consistent settings between the adjacent routers. This is however not a sufficient check.

o Adjacencies are formed without regard to interface addressing or asymmetric HOLD timer values. Values of IIH interval are not sent in IIH packets. While the IS-IS protocol provides sufficient routing information for relaying packets between adjacent routers, many implementations nonetheless require ARP support to do this. These implementations typically refuse to form an adjacency unless the neighbor interface IP address is on the local interface’s IP subnet.

o IS-IS can carry addressing information of different protocols in its TLV’s. However, the protocol supported field must be sent in Dual and IP-Only routers. RFC1195 specifies no checks for the protocol supported field for adjacency formation. It places topology restrictions on multi-protocol networks. In networks that conform to these restrictions, neighboring routers will always have a protocol in common. Therefore, it does not state whether adjacency formation should take protocols supported into account. However, many implementations, do not form an adjacency with a neighbor unless they share at least one protocol in common.

o Not matching Hold Timer values has advantages wherein the administrator can set different Hold times for different routers. This helps in cases where the going down of a DIS or some router needs to be detected faster. For such routers the hold timer can be set to a lower value.

OSPFv2

The checks for formation of adjacencies are stricter in OSPFv2 as compare to that of IS-IS.

o The area-id of the received packet should always match the incoming interface (with the exception of virtual links). Area type is strictly checked by checking the E-bit (not set for non-default areas) and the N- bit (not-set for non-NSSA areas).

o The values of the Hello interval, the Router Dead Interval and network mask received in Hellos are matched with those on the configured interface. Any mismatch in the values causes the Hello packet to be dropped and hence prevents formation of adjacencies. The disadvantage of this approach is that Hello Interval and Router Dead Interval changes need to be done within the Router Dead Interval, to prevent breaking adjacencies. The advantage is we would not form adjacency in case there is a router that has been mis-configured with a large value and which could cause problems later. The network mask check however does not apply to point to point links. That allows the two ends of a Point-to-Point link to have different addresses.

o MTU check is not done in the hellos. It is done in the during the Database (DB) Exchange process.

OSPFv3

Most of the checks for OSPFv3 are similar to that of OSPFv2.

o OSPFv3 runs on a per link basis instead of a per subnet basis. The check for a network mask is thus not done.

o Instance ID field (non-existent in OSPFv2) on the link is matched with the incoming ID in Hellos. The adjacency is formed only if the Instance ID matches. This allows multiple instances of OSPFv3 to run on a single link.