Internet – Routing Freak!

Internet, only a few weeks ago, had seen Pakistan Telecom Authority (PTA) hijacking the IP prefixes announced by Youtube, as protest against some videos that had been put up there, knocking millions off, all around the globe from accessing Youtube. I wrote about this here. This time its an ISP from USA and Europe, AboveNet (AS 6461) thats hijacked prefix announced and owned by Africa Online (AS 36915).

AboveNet inexplicably started announcing reachability to one of the prefixes (194.9.82.0/24) owned by Africa Online. It took AboveNet more than 22 hours since the problem was first reported, to fix it. Wonder what took them so long! As a result of this prefix hijack, potentially millions of users in Kenya or Africa, all behind 194.9.82.0/24, lost connectivity to the Internet. In isolation 194.9.82.0/24 is not a huge space, but add a couple of NATs and the number of users easily swells to millions. What this means for you and me, who are not being served by Africa Online is, that we lose connectivity to all the websites being hosted behind this IP address block. Imagine what it would do to the Internet if emergency services, banks, google were being hosted there!

Lets see why users in Kenya would lose total connectivity to the Internet:

A user accesses google.com with a (NATed or otherwise) source IP address 194.9.82.x. Google graciously responds, and the IP packet carries the destination IP 194.9.82.x. Because AboveNet has announced reachability to this IP address block, all traffic destined to 194.9.82.x comes to AboveNet where it gets royally dumped, while the user sitting in Kenya (or Africa) is still hopelessly waiting for the packet to arrive.

So, why are the service providers all over the world preferring the route announcement from AboveNet over the one originated from Africa Online?

Well, thats unfortunately how Internet, and my favorite routing protocol – BGP, works!

In BGP, the route advertisement from the provider which has a better vantage point on the Internet, usually wins.

In this particular case both AboveNet(AS 6461) and Africa Online (AS 36915) announced the route to 194.9.82.0/24 . AboveNet, operating from US, sits much closer to the core as compared to Africa Online, and is thus better connected to the other networks than the latter. The AS_PATH length thus seen by the other service providers for the route advertised by AboveNet is much shorter than the one advertised by Africa Online. As a result of this, other BGP speakers pick up the route advertised by AboveNet against the route advertised by Africa Online.

The figure below, constructed using BGPlay from RIPE NCC, shows a snapshot of the routing activity for 194.9.82.0/24 during the period when it was hijacked. The colored lines indicate the path different ASes would take to reach this prefix. Clearly most of the ASes believed AboveNet to be a better path for 194.9.82.0/24.

This is how BGP works and mind you, this isnt broken.

Whats broken is our inability to verify the claim of a service provider when it announces ownership of an address block in BGP. Restrictive route filtering can be applied where the providers only accept the specific prefixes allocated to the customers or where the upstream accepts only specific prefixes allocated to the ISPs, but this is too cumbersome and rarely works. As the matter stands today, there isn’t any clean way to know if the reachability announced by your friendly peer is genuine or whether the provider has a feasible path to the destinations advertised. This needs to be fixed and there is work going on towards this direction in the SIDR WG of IETF.

Can the service providers do something when they learn that an IP prefix has been hijacked by some AS? The answer, fortunately, to this is an unequivocal Yes.

A service provider can override BGP’s decision process in selecting the route advertisement with the shortest AS_PATH length by (i) manipulating the BGP path attribute like LOCAL_PREF since its checked before the AS_PATH length or (ii) decreasing the weight of the offending peer you learn the hijacked route from (this would only work for routers connected directly to AboveNet) (iii) Use regular expressions to filter all or the specific hijacked route advertisement from AS 6461 (AboveNet) so that the announcement from Africa Online wins. The legitimate route is now propagated to other parts of the world.

Each time an ISP inadvertently hijacks someone else’s address block it risks to lose some amount of credibility in the service provider world. Their names are splashed on the mails/PPTs in NANOG and IETF whenever there’s a discussion on interdomain security or on the blogs all over the world.

Fortunately for AboveNet, their hijacking didn’t throw millions off popular websites like Youtube, Google, Yahoo! etc. That would have attracted a LOT more attention than what this event did. When PTA had hijacked YouTube, it was all over the news and there were columns running in Wall Street Journal and New York Times about how tenuous the Internet architecture is. Also what unfortunately went in favor of AboveNet was that the affected users were not in US/Europe/Japan, but were in a relatively silent African subcontinent.

The Internet in mid 80s and 90s was envisaged to work on the IP destination based routing/forwarding paradigm. This meant that the routing protocols would establish the best paths based on some scalar metric like the hop count, or link costs, and all traffic would follow that path. This would work since all IP routing and forwarding was based on the IP address carried in the IP packets. With all due respect and credit to the Internet’s forefathers, the vision and the design did work, till a few years back, when the network operators started realizing that though the IP architecture was indeed scalable, it lacked the finesse to optimally utilize the network resources (particularly in the backbones).

The inadequate utilization of the network resources can be illustrated with the classic “fish problem“. It derives its name from the network (Fig 1) resembling a fish, with A being the head and G and H, the tail of a fish.

All traffic emanating (or passing through) from the tail, (G or H) towards the head (A) can take either of the two paths (F-D-C-B or F-E-B) based on how the IP routing tables are programmed by the routing protocols. The latter decide on the best path by considering the link costs advertised by each router in the network. In this example the total cost of path G-F-D-C-B-A is (10+5+5+5+10) 35, while the cost of the path G-F-E-B-A is 30. This means that all traffic from G to A (or G to B) would follow the path through router E (as shown by the red arrow), and the F-D-C-B path would remain unused, since the total cost associated with it is higher than the one through router E.

This leads to an extremely unbalanced traffic distribution, where the link F-E-B can get heavily overloaded, and at worst, congested, while F-D-C-B always remains idle. This problem arises because of the way IP routing paradigm works. Lets us see why:

o IP routing is destination based, so packets are only routed based on the destination IP address in the packet. Routing protocols typically install one next-hop for each IP address (except in case of equal cost routes, which we can ignore for the time being) or a range of IP addresses (subnet masks) thus all packets sharing the same destination address would all get routed to the same next-hop. This means that if F installs a route 100/8 with next-hop as E, then all IP traffic falling under 100/8 coming to F, would get routed to E. This can lead to unbalanced traffic distribution and create unnecessary congestion hot spots.

o Routers make a local decision, based on what they think is the most optimal path from their perspective, when selecting a path. Since all routers run the same SPF algorithm, with the same lin state database, they all come up with the same shortest path, which very soon turns congested, while the non-shortest path remains idle, and unused. This implies that to optimize the network utilization the routers must factor in some other things before chosing the path. One thing that comes instantly to mind is the total bandwidth available on each link when computing the path. If routers can somehow keep track of the available bandwidth available on each link, then it can distribute the traffic in a manner which can optimize the network resource utilization.

Coming back to our network, we find that all traffic from tail to head flows through the router E, leaving D and C idle. So, what can be done to fix this?

The operator can manipulate the link costs on path F-D-C-B, in a manner as shown below, to get the traffic to flow over it.

This clearly works, since cost of the path F-D-C-B (9) is now lower than path F-E-B (10). But hang on. What we’ve only achieved is moving the entire traffic from path F-E-B to F-D-C-B! The traffic would soon start congesting the latter link, while leaving the former unused. We have really achieved nothing, but have only moved the problem elsewhere. Clearly, this wouldn’t work. So, what else can be done?

Well, not much. A clever network operator can play around with the link costs on paths F-D-C-B and F-E-B such that both become equal, as shown in the figure below.

This would surely alleviate the problem as the two paths would now be equally used. However, this scheme of manually adjusting the link costs is not scalable and only works for small networks. Imagine replacing C, D and E with hundreds of routers, with a subset of them being connected to each other. The precarious scheme of adjusting the link costs would become too complex and too fragile to work. A single link (or a router) failure or a cost change would bring down the entire scheme of distributing the traffic across two paths.

The only scalable solution for the fish problem is by going beyond the realms of traditional IP routing and by providing mechanisms to explicitly manage the traffic inside the network. This new paradigm of routing is called constraint based routing, which essentially strives to compute the best path without violating any of the constraints imposed on the network, and at the same time, being optimal with respect to some scalar metric (hop count, links costs, etc). Once such a path is computed, it establishes and maintains forwarding state along such a path.

This differs from the existing IP routing paradigm which only tries to optimize (by minimizing), a particular scalar metric when computing the best path to a destination. Thus RIP optimizes the number of hops and OSPF/IS-IS, the total path cost, where total path cost is the sum total of individual cost of all the links along the path.

I would discuss more on constraint based routing in my subsequent posts.

Tag: Internet

AboveNet Hijacks Africa Online!