Routing Through the Forest of Trees


Some friends shared a Reddit post the other day that made me both shake my head and ponder the state of the networking industry. Here is the locked post for your viewing pleasure. It was locked because the comments were going to devolve into a mess eventually. The person making the comment seems to be honest and sincere in their approach to “layer 3 going away”. The post generated a lot of amusement from the networking side of IT about how this person doesn’t understand the basics but I think there’s a deeper issue going on.

Trails To Nowhere

Our visibility of the state of the network below the application interface is very general in today’s world. That’s because things “just work”, to borrow an overused phrase. Aside from the occasional troubleshooting exercise to find out why packets destined for Azure or AWS are failing along the way, when is the last time you had to get really creative in finding a routing issue in someone else’s equipment? We spend more time now trying to figure out how to make our own networks operate efficiently and less time worrying about what happens to the packets when they leave our organization. Provided, of course, that the users don’t start complaining about latency or service outages.

That means that visibility of the network functions below the interface of the application doesn’t really exist. As pointed out in the post, applications have security infrastructure that communicates with other applications and everything is nicely taken care of. Kind of like ordering packages from your favorite online store. The app places the order with a storefront and things arrive at your house. You don’t have to worry about picking the best shipping method or trying to find a storefront with availability or any of the older ways that we had to deal with weirdness.

That doesn’t mean that the processes that enable that kind of service are going away though. Optimizing transport networks is a skill that is highly specialized but isn’t a solved issue. You’ve probably heard by now that UPS trucks avoid left turns whenever possible to optimize safety and efficiency. The kind of route planning that needs to be done in order to eliminate as many left turns as possible from the route is massive. It’s on the order of a very highly specialized routing protocol. What OSPF and BGP are doing is akin to removing the “left turns” from the network. They find the best path for packets and keep up-to-date as the information changes. That doesn’t mean the network is going away. It means we’re finding the most efficient route through it for a given set of circumstances. If a shipping company decides tomorrow that they can no longer guarantee overnight delivery or even two-day shipping that would change the nature of the applications and services that offer that kind of service drastically. The network still matters.
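To make the "removing left turns" idea a little more concrete, here is a minimal sketch of the shortest-path calculation that a link-state protocol like OSPF runs (Dijkstra's algorithm). BGP picks paths differently, so this only illustrates the OSPF half of the comparison. The four-router topology, the link costs, and the function names are all made up for illustration.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (cost, path) for the cheapest path, or (inf, []) if unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry, a cheaper route was already settled
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk backwards from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def without_link(graph, a, b):
    """Return a copy of the topology with the a<->b link removed (a failure)."""
    return {
        node: {n: w for n, w in neighbors.items() if {node, n} != {a, b}}
        for node, neighbors in graph.items()
    }


# Hypothetical topology: router name -> {neighbor: link cost}
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}

print(shortest_path(links, "A", "D"))
print(shortest_path(without_link(links, "B", "C"), "A", "D"))
```

With every link up, the cheapest route from A to D goes through B and C. Take the B to C link away and the same calculation settles on A, B, D instead. The network didn't go away. The best route through it changed.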

OSI Has to Die

The other thing that jumped out at me about the post was the title, which refers to Layer 3 of the OSI model as a routing function. The timing was fortuitous because I had just finished reading Robert Graham’s excellent treatise on getting rid of the OSI model and I couldn’t agree more with him. Confining routing and addressing functions to a single layer of an obsolete model gives people the wrong ideas. At the very least it encourages them to form bad opinions about those ideas.

Let’s look at the post as an example. Taking a stance like “we don’t need layer three because applications will connect to each other” is bad. So is “We don’t need layer two because all devices can just broadcast for the destination”. It’s wrong to say those things but if you don’t know why it’s wrong then it doesn’t sound so bad. Why spend time standing up routing protocols if applications can just find their endpoints? Why bother putting higher order addresses on devices when the nature of Ethernet means things can just be found easily with a broadcast or neighbor discovery transmission? Except you know that’s wrong if you understand how remote networks operate and why having a broadcast domain of millions of devices would be chaos.
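One way to see why those higher order addresses matter is to look at the decision a host makes before it sends anything. The sketch below is a simplified version of that check, assuming a single interface and a single default gateway. The function name and the addresses are invented for the example, and a real host consults a full routing table rather than one prefix.

```python
import ipaddress


def delivery_decision(local_interface, default_gateway, destination):
    """Decide whether a destination is reached on-link or via the gateway.

    local_interface: the host's address with prefix, e.g. "192.168.1.10/24"
    default_gateway: the router used for everything off the local network
    destination:     the address the application wants to reach
    """
    iface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        # Same subnet: resolve the neighbor locally and deliver directly.
        return ("local", str(dest))
    # Different network: hand the packet to a router and let it find the path.
    return ("routed", str(ipaddress.ip_address(default_gateway)))


print(delivery_decision("192.168.1.10/24", "192.168.1.1", "192.168.1.77"))
print(delivery_decision("192.168.1.10/24", "192.168.1.1", "203.0.113.25"))
```

The first destination sits on the same subnet, so a broadcast or neighbor discovery lookup is enough. The second one doesn't, and no amount of local broadcasting will find it. Something has to route it.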

Graham has some very compelling points about relegating the OSI model to history and teaching how networks really operate. It helps people understand that there are multiple networks that exist at one time to get traffic to where it belongs. While we may see the Internet and Ethernet LAN as a single network, they have different purposes. One is for local traffic delivery and the other is for remote traffic delivery. The closest analog for certain generations is the phone system. There was a time when you had local calls and long distance calls that required different dialing instructions. You still have it today but it’s less noticeable thanks to mobile devices not requiring long distance dialing instructions.

It might be more appropriate to think of the local/remote dichotomy like a private branch exchange (PBX) phone network. Phones inside the PBX have locally significant extensions that have no meaning outside of the system. Likewise, remote traffic can only enter the system through entry points created by administrators, like a main dial-in number that terminates on an extension or direct inward dial (DID) numbers that have significance outside the system. Extensions only matter for the local users and have no way to communicate outside without addressing rules. Outside addresses have no way of communicating into the local system without creating rules that allow it to happen. It’s a much better metaphor than the OSI model.


Tom’s Take

I don’t blame our intrepid poster for misunderstanding the way network addresses operate. I blame IT for obfuscating it because it doesn’t matter anymore to application developers. Sure, we’ve finally hit the point where the network has merged into a single entity with almost no distinction between remote WAN and local LAN. But we’ve also created a system where people forget the dependencies of the system at lower levels. You can’t encode signals without a destination and you can’t determine the right destination without knowing where it’s supposed to be. That’s true whether you’re running a simple app in an RFC 1918 private space or in the public cloud. Forgetting that little detail means you could end up lost in a forest, unable to route yourself out of it again.
