Complexity is the enemy of understanding. Think about how much time you spend in your day trying to simplify things. Complexity is the reason why things like Reddit’s Explain Like I’m Five exist. We strive in our daily lives to find ways to simplify the way things are done. Well, except in networking.
Building On Shifting Sands
Networking hasn’t always been a super complex thing. Back when bridges tied together two sections of Ethernet, networking was fairly simple. We’ve spent years trying to make the network do bigger and better things faster with less input. Routing protocols have become more complicated. Network topologies grow and become harder to understand. Protocols do magical things with very little documentation beyond “Pure Freaking Magic”.
Part of this comes from applications. I’ve made my feelings on application development clear. Ivan Pepelnjak had some great comments on this post as well from Steve Chalmers and Derick Winkworth (@CloudToad). I especially like this one:
Derick is right. The application developers have forced us to make networking do more and more faster with less requirement for humans to do the work to meet crazy continuous improvement and continuous development goalposts. Networking, when built properly, is a static object like the electrical grid or a plumbing system. Application developers want it to move and change and breathe with their needs when they need to spin up 10,000 containers for three minutes to run a test or increase bandwidth 100x to support a rollout of a video streaming app or a sticker-based IM program designed to run during a sports championship.
We’ve risen to meet this challenge with what we’ve had to work with. In part, it’s because we don’t like being the scapegoat for every problem in the data center. We tire of sitting next to the storage admins and complaining about the breakneck pace of IT changes. We have embraced software enhancements and tried to find ways to automate, orchestrate, and accelerate. Which is great in theory. But in reality, we’re just covering over the problem.
The solution to our software networking issues seems simple on the surface. Want to automate? Add a layer to abstract away the complexity. Want to build an orchestration system on top of that? Easy to do with another layer of abstraction to tie automation systems together. Want to make it all go faster? Abstract away!
“All problems in computer science can be solved with another layer of indirection.”
This is a quote from Butler Lampson often attributed to David Wheeler. It’s absolutely true. Developers, engineers, and systems builders keep adding layers of abstraction and indirection on top of complex system and proclaiming that everything is now easier because it looks simple. But what happens why the abstraction breaks down?
Automobiles are perfect example of this. Not too many years ago, automobiles were relatively simple things. Sure, internal combustion engines aren’t toys. But most mechanics could disassemble the engine and fix most issues with a wrench and some knowledge. Today’s cars have computers, diagnostics systems, and require lots of lots of dedicated tools to even diagnose the problem, let alone fix it. We’ve traded simplicity and ease of repairability the appearance of “simple” which conceals a huge amount of complexity under the surface.
To refer back to the Lampson/Wheeler quote, the completion of it is, “Except, of course, for the problem of too many indirections.” Even forty years ago it was understood that too many layers of abstraction would eventually lead to problems. We are quickly reaching this point in networking today. With all the reliance on complex tools providing an overwhelming amount of data about every point of the network, we find ourselves forced to use dashboards and data lakes to keep up with the rapid pace of changes dictated to the network by systems integrations being driven by developer desires and not sound network systems thinking.
Networking professionals can’t keep up. Just as other systems now must be maintained by algorithms to keep pace, so too does the network find itself being run by software instead of augmented by it. Even if people wanted to make a change they would be unable to do so because validating those changes manually would cause issues or interactions that could create havoc later on.
So how do we fix the issues? Can we just scrap it all and start over? Sadly, the answer here is a resounding “no”. We have to keep moving the network forward to match pace with the rest of IT. But we can do our part to cut down on the amount of complexity and abstraction being created in the process. Documentation is as critical as ever. Engineers and architects need to make sure to write down all the changes they make as well as their proposed designs for adding services and creating new features. Developers writing for the network need to document their APIs and their programs liberally so that troubleshooting and extension are easily accomplished instead of just guessing about what something is or isn’t supposed to be doing.
When the time comes to build something new, instead of trying to plaster over it with an abstraction, we need to break things down into their basic components and understand what we’re trying to accomplish. We need to augment existing systems instead of building new ones on top of the old to make things look easy. When we can extend existing ideas or augment them in such as way as to coexist then we can worry less about hiding problems and more about solving them.
Abstraction has a place, just like NAT. It’s when things spiral out of control and hide the very problems we’re trying to fix that it becomes an abomination. Rather than piling things on the top of the issue and trying to hide it away until the inevitable day when everything comes crashing down, we should instead do the opposite. Don’t hide it, expose it instead. Understand the complexity and solve the problem with simplicity. Yes, the solution itself may require some hard thinking and some pretty elegant programming. But in the end that means that you will really understand things and solve the complexity conundrum.