SD-WAN and Technical Debt


Back during Networking Field Day 22, I was having a fun conversation with Phil Gervasi (@Network_Phil) and Carl Fugate (@CarlFugate) about SD-WAN and innovation. I mentioned that it was fascinating to see how SD-WAN companies kept innovating but that bigger, more established companies that had bought into SD-WAN seemed to be having issues catching up. As our conversation continued I realized that technical debt plays a huge role in startup culture in all factors, not just with SD-WAN. However, SD-WAN is a great example of technical debt to talk about here.

Any Color You Want In Black

Big companies have investments in supply chains. They have products that are designed in a certain way because it’s the least expensive way to develop the project or it involves using technology developed by the company that gives them a competitive advantage. Think about something like the Cisco Nexus 9000-series switches that launched with Cisco ACI. Every one of them came with the Insieme ASIC that was built to accelerate the policy component of ACI. Whether or not you wanted to use ACI or Insieme in your deployment, you were getting the ASIC in the switch.

Policies like this lead to unintentional constraints in development. Think back five years to Cisco’s IWAN solution. It was very much the precursor to SD-WAN. It was a collection of technologies like Performance Routing (PfR), Application Visibility Control (AVC), Policy Based Routing (PBR), and Network Based Application Recognition (NBAR). If that alphabet soup of acronyms makes you break in hives, you’re not alone. Cisco IWAN was a platform very much market by potential and complexity.

Let’s step back and ask ourselves an important question: “Why?” Why was IWAN so complicated? Why was IWAN hard to deploy? Why did IWAN fail to capture a lot of market share and ride the wave that eventually became SD-WAN? Looking back, a lot of the choices that were made that eventually doomed IWAN can come down to existing technical debt. Cisco is a company that makes design decisions based on what they’ve been doing for a while.

I’m sure that the design criteria for IWAN came down to two points:

  1. It needs to run on IOS.
  2. It needs to be an ISR router.

That doesn’t sound like much. But imagine the constraints you run into with just those two limitations. You have a hardware platform that may not be suited for the kind of work you want to do. Maybe you want to take advantage of x86 chipset acceleration. Too bad. You have to run what’s in the ISR. Which means it could be underpowered. Or incapable of doing things like crypto acceleration for VPNs, which is important for building a mesh of encrypted tunnels. Or maybe you need some flexibility to build a better detection platform for applications. Except you have to use IOS. Which uses NBAR. And anything you write to extend NBAR has to run on their platforms going forward. Which means you need to account for every possible permutation of hardware that IOS runs on. Which is problematic at best.

See how technical debt can creep in from the most simplistic of sources? All we wanted to do was build a platform to connect WANs together easily. Now we’re mired in a years-old hardware choice and an aging software platform that can’t help us do what needs to be done. Is it any wonder why IWAN didn’t succeed in the original form? Or why so many people involved with the first generation of SD-WAN startups were involved with IWAN, even if just tangentially?

Debt-Free Development

Now, let’s look at a startup like CloudGenix, who was a presenter at Networking Field Day 22 and was recently acquired by Palo Alto Networks. They started off on a different path when they founded the startup. They knew what they wanted to accomplish. They had a vision for what would later be called SD-WAN. But instead of shoehorning it into an existing platform, they had the freedom to build what they wanted.

No need to keep the ISR platform? Great. That means you can build on x86 hardware to make your software more universally deployable on a variety of boxes. Speaking of boxes, using commercial off-the-shelf (COTS) equipment means you can buy some very small devices to run the software. You don’t need a system designed to use ATM modules or T1 connections. If all you little system is ever going to use is Ethernet there’s no reason to include expansion at all. Maybe USB for something like a 4G/LTE modem. But those USB ports are baked into the board already.

A little side note here that came from Olivier Huynh Van of Gluware. You know the USB capabilities on a Cisco ISR? Yeah, the ISR chipset didn’t support USB natively. And it’s almost impossible to find USB that isn’t baked into an x86 board today. So Cisco had to add it to the ISR in a way that wasn’t 100% spec-supported. It’s essentially emulated in the OS. Which is why not every USB drive works in an ISR. Take that for what’s it’s worth.

Back to CloudGenix. Okay, so you have a platform you can build on. And you can build software that can run on any x86 device with Ethernet ports and USB devices. That means your software doesn’t need to do complicated things. It also means there are a lot of methods already out there for programming network operating systems for x86 hardware, such as Intel’s Data Plane Development Kit (DPDK). However CloudGenix chose to build their OS, they didn’t need to build everything completely from scratch. Even if they chose to do it there are still a ton of resources out there to help them get started. Which means you don’t have to restart your development every time you need to add a feature.

Also, the focus on building the functions you want into an OS you can bend to your needs means you don’t need to rely on other teams to build pieces of it. You can build your own GUI. You can make it look however you want. You can also make it operate in a manner that is easiest for your customer base. You don’t need to include every knob or button or bell and whistle. You can expose or hide functions as you wish. Don’t want customers to have tons of control over VPN creation or certificate authentication? You don’t need to worry about the GUI team exposing it without your permission. Simple and easy.

One other benefit of developing on platforms without technical debt? It’s easy to port your software from physical to virtual. CloudGenix was already successful in porting their software to run from physical hardware to the cloud thanks to CloudBlades. Could you imagine trying to get the original Cisco IWAN running in a cloud package for AWS or Azure? If those hives aren’t going crazy right now I’m sure you must have nerves or steel.


Tom’s Take

Technical debt is no joke. Every decision you make has consequences. And they may not be apparent for this generation of products. People you may never meet may have to live with your decisions as they try to build their vision. Sometimes you can work with those constraints. But more often than not brilliant people are going to jump ship and do it on their own. Not everyone is going to succeed. But for those that have the vision and drive and turn out something that works the rewards are legion. And that’s more than enough to pay off any debts, technical or not.

2 thoughts on “SD-WAN and Technical Debt

  1. Pingback: SD-WAN and Technical Debt - Tech Field Day

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s