I don’t know if you’ve had a chance to see this Reddit thread yet, but it’s a funny one:
Short non-clickbait summary: We deployed SD-WAN and turned off OSPF. We now have a /16 route for the internal network and a default route to the Internet where a lot of our workloads were moved into the cloud.
Bravo for this networking team for simplifying their network to this point. All other considerations aside, does this kind of future really bode well for SD-WAN?
Now You See Me
As pointed out in the thread above, the network team didn’t really get rid of their dynamic routing protocols. The SD-WAN boxes that they put in place are still running BGP or some other kind of setup under the hood. It’s just invisible to the user. That’s nothing new. Six years ago, Ivan Pepelnjak found out Juniper QFabric was running BGP behind the scenes too.
Hiding the networking infrastructure from the end user is nothing new. It’s a trick that has been used for years to allow infrastructures to be tuned and configured in such a way as to deliver maximum performance without letting anyone tinker with the secret sauce under the hood. You’ve been using it for years whether you realize it or not. Have MPLS? Core BGP routing is “hidden” from you. SD-WAN? Routing protocols are running between those boxes. Moved a bunch of workloads to AWS/Azure/GCE? You can better believe there is some routing protocol running under that stack.
Making things complex for the sake of making them hard to work on is foolish. We’ve spent decades and millions of dollars trying to make things easy. If you don’t believe me, look at the Apple iPhone. That device is a marvel at hiding all the complexity underneath. But, it also makes it really hard to troubleshoot when things go wrong.
Building On Shoulders
SD-WAN is doing great things for networking. I can remember years ago the thought of turning up a multi-site IPSec VPN configuration was enough to give me hives, let alone trying to actually do it. Today, companies like Viptela, VeloCloud, and Silver Peak make it easy to do. They’re innovating on top of the stack instead of inside it.
So much discussion in the community happens around building pieces of the stack. We spend time and effort making a better message protocol for routing information exchange. Or we build a piece of the HTTP stack that should be used in a bigger platform. We geek out about technical pieces because that’s where our energy feels the most useful.
When someone collects those stack pieces and tries to make them “easy”, we shout that company down and say that they’re hiding complexity and making the administrators and engineers “forget” how to do the real work. We spend more time focusing on what’s hidden and not enough on what’s being accomplished with the pieces. If you are the person that developed the fuel injection system in a car, are you going to sit there and tell Ford and Chevrolet than bundling it into a automotive platform is wrong?
So, while the end goal of any project like the one undertaken above is simplification or reducing problems because of less complex troubleshooting it is not a silver bullet. Hiding complexity doesn’t make it magically go away. Removing all your routing protocols in favor of a /16 doesn’t mean your routing networking runs any better. It means that your going to have to spend more time trying to figure out what went wrong when something does break.
Ask yourself this question: Would you rather spend more time building out the network and understand every nook and cranny of it or would you rather learn it on the fly when you’re trying to figure out why something isn’t working the way that it should? The odds are very good that you’re going to put the same amount of time into the network either way. Do you want to front load that time? Or back load it?
The Reddit thread is funny. Because half the people are dumping on the poster for his decision and the rest are trying to understand the benefits. It surely was created in such a way as to get views. And that worked admirably. But I also think there’s an important lesson to learn there. Simplicity for the sake of being simple isn’t enough. You have to replace that simplicity with due diligence. Because the alternative is a lot more time spent doing things you don’t want to do when you really don’t want to be doing them.