About networkingnerd

Tom Hollingsworth, CCIE #29213, is a former network engineer and current organizer for Tech Field Day. Tom has been in the IT industry since 2002, and has been a nerd since he first drew breath.

SD-WAN and Technical Debt

Back during Networking Field Day 22, I was having a fun conversation with Phil Gervasi (@Network_Phil) and Carl Fugate (@CarlFugate) about SD-WAN and innovation. I mentioned that it was fascinating to see how SD-WAN companies kept innovating while bigger, more established companies that had bought into SD-WAN seemed to have trouble catching up. As our conversation continued, I realized that technical debt plays a huge role in startup culture across the board, not just with SD-WAN. However, SD-WAN is a great example of technical debt to talk about here.

Any Color You Want In Black

Big companies have investments in supply chains. They have products that are designed in a certain way because it’s the least expensive way to develop the project or it involves using technology developed by the company that gives them a competitive advantage. Think about something like the Cisco Nexus 9000-series switches that launched with Cisco ACI. Every one of them came with the Insieme ASIC that was built to accelerate the policy component of ACI. Whether or not you wanted to use ACI or Insieme in your deployment, you were getting the ASIC in the switch.

Policies like this lead to unintentional constraints in development. Think back five years to Cisco’s IWAN solution. It was very much the precursor to SD-WAN. It was a collection of technologies like Performance Routing (PfR), Application Visibility Control (AVC), Policy Based Routing (PBR), and Network Based Application Recognition (NBAR). If that alphabet soup of acronyms makes you break out in hives, you’re not alone. Cisco IWAN was a platform very much marked by potential and complexity.

Let’s step back and ask ourselves an important question: “Why?” Why was IWAN so complicated? Why was IWAN hard to deploy? Why did IWAN fail to capture a lot of market share and ride the wave that eventually became SD-WAN? Looking back, a lot of the choices that were made that eventually doomed IWAN can come down to existing technical debt. Cisco is a company that makes design decisions based on what they’ve been doing for a while.

I’m sure that the design criteria for IWAN came down to two points:

  1. It needs to run on IOS.
  2. It needs to be an ISR router.

That doesn’t sound like much. But imagine the constraints you run into with just those two limitations. You have a hardware platform that may not be suited for the kind of work you want to do. Maybe you want to take advantage of x86 chipset acceleration. Too bad. You have to run what’s in the ISR. Which means it could be underpowered. Or incapable of doing things like crypto acceleration for VPNs, which is important for building a mesh of encrypted tunnels. Or maybe you need some flexibility to build a better detection platform for applications. Except you have to use IOS. Which uses NBAR. And anything you write to extend NBAR has to run on their platforms going forward. Which means you need to account for every possible permutation of hardware that IOS runs on. Which is problematic at best.
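To be clear, the matching part of application recognition is conceptually simple. Here is a toy sketch in Python; the ports and byte prefixes are invented for illustration, and real NBAR matches far richer patterns in optimized native code. The hard part was never writing a classifier like this, it was making one run on every platform IOS supports:

```python
# Toy illustration of signature-based application recognition, the kind of
# job NBAR does in IOS. All signatures and packets below are made up for
# this example.

SIGNATURES = {
    "http":  {"port": 80,  "prefix": b"GET "},
    "https": {"port": 443, "prefix": b"\x16\x03"},   # TLS handshake bytes
    "dns":   {"port": 53,  "prefix": None},          # port match only
}

def classify(dst_port, payload):
    """Return the first application whose signature matches the packet."""
    for app, sig in SIGNATURES.items():
        if dst_port != sig["port"]:
            continue
        if sig["prefix"] is None or payload.startswith(sig["prefix"]):
            return app
    return "unknown"

print(classify(80, b"GET /index.html HTTP/1.1"))   # http
print(classify(8080, b"GET /"))                    # unknown
```

Easy on a platform you control end to end. Much harder when every new signature has to be validated against a decade of shipping hardware.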

See how technical debt can creep in from the most simplistic of sources? All we wanted to do was build a platform to connect WANs together easily. Now we’re mired in a years-old hardware choice and an aging software platform that can’t help us do what needs to be done. Is it any wonder why IWAN didn’t succeed in its original form? Or why so many people involved with the first generation of SD-WAN startups were involved with IWAN, even if just tangentially?

Debt-Free Development

Now, let’s look at a startup like CloudGenix, who was a presenter at Networking Field Day 22 and was recently acquired by Palo Alto Networks. They started off on a different path when they founded the startup. They knew what they wanted to accomplish. They had a vision for what would later be called SD-WAN. But instead of shoehorning it into an existing platform, they had the freedom to build what they wanted.

No need to keep the ISR platform? Great. That means you can build on x86 hardware to make your software more universally deployable on a variety of boxes. Speaking of boxes, using commercial off-the-shelf (COTS) equipment means you can buy some very small devices to run the software. You don’t need a system designed to use ATM modules or T1 connections. If all your little system is ever going to use is Ethernet, there’s no reason to include expansion at all. Maybe USB for something like a 4G/LTE modem. But those USB ports are baked into the board already.

A little side note here that came from Olivier Huynh Van of Gluware. You know the USB capabilities on a Cisco ISR? Yeah, the ISR chipset didn’t support USB natively. And it’s almost impossible to find USB that isn’t baked into an x86 board today. So Cisco had to add it to the ISR in a way that wasn’t 100% spec-supported. It’s essentially emulated in the OS. Which is why not every USB drive works in an ISR. Take that for what it’s worth.

Back to CloudGenix. Okay, so you have a platform you can build on. And you can build software that can run on any x86 device with Ethernet ports and USB devices. That means your software doesn’t need to do complicated things. It also means there are a lot of methods already out there for programming network operating systems for x86 hardware, such as Intel’s Data Plane Development Kit (DPDK). However CloudGenix chose to build their OS, they didn’t need to build everything completely from scratch. Even if they had chosen to, there are still a ton of resources out there to help them get started. Which means you don’t have to restart your development every time you need to add a feature.
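At its core, DPDK’s programming model is a poll-mode loop that pulls bursts of packets off a NIC queue and processes them in user space instead of waiting on interrupts. Here is a conceptual sketch in Python, with a deque standing in for the NIC’s RX ring; the real API is C, with calls like rte_eth_rx_burst(), and everything here is a toy:

```python
from collections import deque

# Conceptual sketch of a DPDK-style poll-mode data plane: the loop
# continuously polls an RX ring and processes packets in bursts rather
# than taking an interrupt per packet. The deque stands in for the ring.

BURST_SIZE = 4

def rx_burst(ring, max_pkts):
    """Pull up to max_pkts packets off the ring, like rte_eth_rx_burst()."""
    burst = []
    while ring and len(burst) < max_pkts:
        burst.append(ring.popleft())
    return burst

def run_dataplane(ring):
    forwarded = []
    while ring:                             # a real loop would spin forever
        for pkt in rx_burst(ring, BURST_SIZE):
            forwarded.append(pkt.upper())   # stand-in for real processing
    return forwarded

ring = deque(["pkt1", "pkt2", "pkt3", "pkt4", "pkt5"])
print(run_dataplane(ring))
```

The point is that this whole model, and mature tooling around it, already existed for x86. A startup gets to start here instead of at the ASIC.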

Also, the focus on building the functions you want into an OS you can bend to your needs means you don’t need to rely on other teams to build pieces of it. You can build your own GUI. You can make it look however you want. You can also make it operate in a manner that is easiest for your customer base. You don’t need to include every knob or button or bell and whistle. You can expose or hide functions as you wish. Don’t want customers to have tons of control over VPN creation or certificate authentication? You don’t need to worry about the GUI team exposing it without your permission. Simple and easy.

One other benefit of developing on platforms without technical debt? It’s easy to port your software from physical to virtual. CloudGenix was already successful in porting their software to run from physical hardware to the cloud thanks to CloudBlades. Could you imagine trying to get the original Cisco IWAN running in a cloud package for AWS or Azure? If those hives aren’t going crazy right now, I’m sure you must have nerves of steel.


Tom’s Take

Technical debt is no joke. Every decision you make has consequences. And they may not be apparent for this generation of products. People you may never meet may have to live with your decisions as they try to build their vision. Sometimes you can work with those constraints. But more often than not brilliant people are going to jump ship and do it on their own. Not everyone is going to succeed. But for those that have the vision and drive and turn out something that works the rewards are legion. And that’s more than enough to pay off any debts, technical or not.

The Bane of Backwards Compatibility

I’m a huge fan of video games. I love playing them, especially on my old consoles from my formative years. The original Nintendo consoles were my childhood friends as much as anything else. By the time I graduated from high school, everyone had started moving toward the Sony Playstation. I didn’t end up buying into that ecosystem as I started college. Instead, I just waited for my brother to pick up a new console and give me his old one.

This meant I was always behind the curve on getting to play the latest games. I was fine with that, since the games I wanted to play were on the old console. The new one didn’t have anything that interested me. And by the time the games that I wanted to play did come out it wouldn’t be long until my brother got a new one anyway. But one thing I kept hearing was that the Playstation was backwards compatible with the old generation of games. I could buy a current console and play most of the older games on it. I wondered how they managed to pull that off since Nintendo never did.

When I was older, I did some research into how Sony built backwards compatibility into those newer consoles. I always assumed it was some kind of translation engine or enhanced capabilities. Instead, I found out it was something much less complicated. The PS2 reused the PS1’s CPU as its I/O processor, which gave it native backwards compatibility. For the PS3, they essentially built the guts of a PS2 into the main board. It was about as elegant as you could get. However, later in the life of those consoles, system redesigns made them less compatible. Turns out that it isn’t easy to keep backwards compatibility when you redesign things to remove the extra hardware you added.

Bringing It Back To The Old School

Cool story, but what does it have to do with enterprise technology? Well, the odds are good that you’re about to fight a backwards compatibility nightmare on two fronts. The first is with WPA3, the newest security protocol from the Wi-Fi Alliance. WPA3 fixes a lot of holes that were present in the ancient WPA2 and includes options to protect public traffic and secure systems from race conditions and key exchange exploits. You’d think it was designed to be more secure and would take a long time to break, right? Well, you’d be wrong. That’s because WPA3 was exploited last year thanks to a vulnerability in the WPA3-Transition mode designed to enhance backwards compatibility.

WPA3-Transition Mode is designed to keep people from needing to upgrade their wireless cards and client software in one fell swoop. It can configure a WPA3 SSID with the ability for WPA2 clients to connect to it without all the new enhanced requirements. Practically, it means you don’t have to run two separate SSIDs for all your devices as you move from older to newer. But practical doesn’t cover the fact that security vulnerabilities exist in the transition mechanism. Enterprising attackers can exploit the weaknesses in the transition setup to crack your security.

It’s not unlike the old vulnerabilities in WPA when it used TKIP. TKIP was found to have vulnerabilities that attackers could exploit. People were advised to upgrade to WPA-AES as soon as possible to prevent this. But if you allowed older, non-AES-capable clients to connect to your SSIDs for compatibility reasons, you invalidated all that extra security. Because the network had to fall back to TKIP to connect the TKIP clients. And because the newer clients were happy to use TKIP instead of AES, you were stuck using a vulnerable mode. The only real solution was to have a WPA-AES SSID for your newer, secure clients and leave a WPA-TKIP SSID active for the clients that had to use it until they could be upgraded.
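The mixed-mode trap is easy to model: the effective security of an SSID collapses to the weakest cipher you leave enabled for compatibility. A toy sketch, with strength rankings that are illustrative rather than from any spec:

```python
# Toy model of cipher selection on a mixed-mode SSID: the network is only
# as strong as the weakest cipher you leave enabled. The numeric rankings
# are invented for illustration.

CIPHER_STRENGTH = {"TKIP": 1, "CCMP-AES": 2, "GCMP-256": 3}

def effective_security(enabled_ciphers):
    """The SSID falls back to the weakest member of the enabled set."""
    return min(enabled_ciphers, key=CIPHER_STRENGTH.get)

print(effective_security({"CCMP-AES"}))            # CCMP-AES
print(effective_security({"CCMP-AES", "TKIP"}))    # TKIP
```

Enable TKIP for one legacy client and the whole SSID is, for practical purposes, a TKIP network.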

4Gs for the Price of 5

The second major area where we’re going to see issues with backwards compatibility is with 5G networking. We’re hearing about the move to using 5G everywhere. We’ve no doubt heard by now that 5G is going to replace enterprise wireless or change the way we connect to things. Honestly, I’m surprised no one has tried to claim that 5G can make waffles and coffee yet. But 5G is rife with the same backwards compatibility issues present in enterprise wireless too.

5G is an evolution of the 4G standards. Phones issued today are going to have 4G and 5G radios, and the base stations are going to mix the radio types to ensure those phones can connect. Just like any new technology, they’re going to maximize the connectivity of the existing infrastructure and hope that it’s enough to keep things running as they build out the new setup. But by running devices with two radios, or letting older devices hang onto a stronger legacy signal, you’re setting your new protocol up to inherit the vulnerabilities of the old versions. It’s already projected that governments are going to take advantage of this for a variety of purposes.

We find ourselves in the same boat as we do with WPA3. Because we have to ensure maximum compatibility, we make sacrifices. We keep two different versions running at the same time, which increases complexity. We even mark a lot of necessary security upgrades as optional in order to keep people from refusing to implement them or falling behind because they don’t understand them.¹
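The downgrade problem can be sketched as a simple negotiation: the session settles on the best version both sides claim to support, so one legacy endpoint, or one attacker lying about its capabilities, drags the connection back to the old protocol. A hypothetical model:

```python
# Toy model of version negotiation between a base station and a device
# with both radios. The session uses the highest mutually supported
# version, so a single legacy participant forces a downgrade.

def negotiate(station_versions, device_versions):
    """Pick the best version both sides support, or None if no overlap."""
    common = set(station_versions) & set(device_versions)
    return max(common) if common else None

print(negotiate({"4G", "5G"}, {"4G", "5G"}))   # 5G
print(negotiate({"4G", "5G"}, {"4G"}))         # 4G, the legacy radio wins
```

Nothing in that logic checks whether the downgrade was honest, which is exactly the gap transition mechanisms keep handing to attackers.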

The biggest failing for me is that we’re pushing for backwards compatibility and performance over security. We’re not willing to make the hard choices to reduce functionality in order to save our privacy and security. We want things to be backwards compatible so we can buy one device today and have it work on everything. We’ll just make the next one more secure. Or the one after that. Until we realize that we’re still running old 802.11 data rates in our newest protocols because no one bothered to remove them. We have to make hard choices sometimes and sacrifice some compatibility in order to ensure that we’re safe and secure with the newer technology.


Tom’s Take

Backwards compatibility is like the worst kind of nostalgia. I want the old thing but I want it on a new thing that runs faster. I want the glowing warmth of my youth but with the convenience of modern technology. It’s like buying an old sports car. Sure, you get all the look and feel of an old powerful engine. You also lose the safety features of the new body along with the comforts you’ve become accustomed to. You have to make a hard choice. Do you keep the old car original and lose out on what you like to get what you want? Or do you create some kind of hybrid that has exactly what you want and need but isn’t what you started with? It’s a tough choice to make. In the world of technology, there’s no right answer. But we need to remember that every compromise we make for performance can lead to compromises in security.


  1. I’m looking at you, OWE

Fast Friday Thoughts on Where We Are

It’s been a crazy week. I know the curse is “May you live in interesting times,” but I’m more than ready for things to be less interesting for a while. It’s going to take some time to adjust to things. From a networking perspective, I have a few things that have sprung up.

  • Video conferencing is now a big thing. Strangely, Cisco couldn’t make video the new phone. But when people are stuck at home now we need to do video again? I get that people have a need to see each other face-to-face. But having worked from home for almost seven years at this point, I can tell you video isn’t a necessity. It’s a nice option, but you can get a lot accomplished with voice calls and regular emails.
  • Alongside this is the fact that the push to put more video out there is causing applications to reach their breaking points. Zoom, which is faring the best of all of them so far, had some issues on Thursday morning. Tripling the amount of traffic going out and making it very sensitive to delay and jitter is going to expose a lot of flaws in the system.
  • I applaud all of the companies in the last week that have chosen to step out and offer resources to help people work better from home. I also hope that employees and managers use them after this is over to help enable more remote work. Just remember that flexibility has a cost axis as well. Those VPNs and security services and CASBs aren’t going to be free forever. If it makes sense, use it. Otherwise, find something that does.
  • Remember that this is a stressful time for everyone. I work from home all the time. And this week I have been totally exhausted. Try to find a way to keep your sanity. Step outside for air. Take a short break. Look for ways to keep yourself healthy. It’s going to take time for people to adjust to this. It’s going to take time even if you know how to work remotely too.

Tom’s Take

I’m not sure where this is all headed. We’re all still figuring it out. Things won’t look the same six months from now no matter what. But keep working where you can and improving what you do. The value in this shift comes from empowering us to do what we can. If that means cutting back on Netflix during working hours or spending some extra time learning a new skill make it happen and grow as much as you can. We’re going to need that.

I Hate Excellent Questions

I was listening to a recent episode of the Packet Pushers Podcast about SD-WAN and some other stuff. At one point, my good friend Greg Ferro (@EtherealMind) asked the guest something, and the guest replied with, “That’s an excellent question!” Greg replied with, “Of course it was. I only ask excellent questions.” I was walking and laughed out loud harder than I’ve laughed in a long time.

This was also a common theme during Networking Field Day. Everyone was asking “great” or “excellent” questions. I chuckled and told the delegates that it was a canned response that most presenters give today. But then I wondered why all our questions are excellent. And why I hated that response so much.

Can You Define “Excellent”?

The first reason why I think people tend to counter with “excellent” praise is because they are stalling for an answer. It’s a time-honored tradition from spelling bees: when you don’t know how to spell the word, you ask for the definition to buy a few more seconds to figure out if this is one of those “i before e” words or not. I get asking for the definition of a word with foreign origins. But a simple word? It’s such a recognizable trope that we incorporated some of the fun into a video we did a few years ago at Aruba Atmosphere:

Watching my friends “stall” while they’re trying to figure out how to spell a made up word still cracks me up.

More importantly, in technology this response is designed to help the engineer or tech person spend a few critical seconds formulating their response and matching it to the question that was asked. Even just a second of memorized, practiced response repetition means you can think about how to answer the question without leaving silence.

We live in a world today where silence is bad. We’re so used to hearing noise and other kinds of filler that anything regarded as contemplation or thinking is negative. Instead, we must always be talking and making an audible effort to answer things. Even if it means repeating the same phrases over and over again. It’s bad enough when it’s a pause word. It’s really bad when it’s the same word at the beginning of a sentence for almost an hour. “That’s an excellent question” is quickly becoming the response equivalent of “um” in the vocabulary.

High Praise, Indeed

The other reason why I think people are quick to praise “excellent” questions comes from a bit of social trickery. Sadly, too many sales opportunities descend into an antagonistic relationship where salespeople feel they have to use every trick in the book to separate people from their money. They use tactics designed to inflate egos and make people feel more important so they feel like they’re making a good decision.

Think about the suspect phrasing here. It’s not a “good” question. Or even a “great” question. It’s almost always an “excellent” question. And I’d argue that the more likely a person is to sell you something, the more likely that person is to remark that all your questions are excellent.

This kind of puffery can be infuriating to people. It’s not unlike the standard “have you lost weight?” opening when you see someone for the first time in a long time. It’s verbal garbage. You don’t believe it. They don’t believe it. It’s rare that people even acknowledge it. And yet, we find ourselves repeating it over and over again. “That’s an excellent question” is ego stroking at its finest.

And the worst part? You’re not praising the person! You’re praising their question. You’re really saying that the words they used were good enough to merit praise. It’s not even that you are praising the person as much as their output. If you really, really, really feel the need to do this, think about doing it in a way that calls out the person asking the question instead:

  • Wow, you’re really paying attention here!
  • Did you read ahead?
  • You’re really getting this.
  • I’m very impressed with your grasp of this topic.

See how each of these responses is designed to work with the person in mind and not just the question? Sure, there’s a bit more ego stroking here than with a simple “excellent” question. But if you’re just trying to flatter the person and you don’t even care about the quality of the question, why not just sell out all the way? If the point of the response is to make a person feel good about themselves then just go all out.


Tom’s Take

I’m not likely to change the world overnight. Lord knows I’ve lost the battle against GIF and on-premises enough already and those are grammatically correct. The “excellent” question thing is a quirk of speech that isn’t going to just disappear because we bring it to light. People are still going to stall or try to boost the questioner’s ego. They’re still going to fill silence or make people full of themselves. Instead of falling back on the tropes of bygone eras, be a different person next time. Instead of the knee-jerk reaction of excellence, take a moment to think and praise the person asking the question. Then give a solid answer that they need to hear. You’ll find it a lot more effective. In fact, I’d venture to say it’s an excellent strategy.

There Are No More Green Fields

I’ve looked at quite a few pieces of technology in the past few years. Some have addressed massive issues that I had when I was a practicing network engineer. Others have shown me new ways to do things I never thought possible. But one category of technology still baffles me to this day: The technology that assumes greenfield deployment.

For those not familiar, “greenfield” is a term that refers to a project that is built on a site completely from scratch. It originally comes from a day when the project in question was a factory or other capital improvement that was literally being built in a field with green grass growing on top. The alternative to that project was one where something was being built in a location where there was existing infrastructure or other form of site pollution. And, of course because everyone in humanity never gets older than twelve, this is called a “brownfield” site.

Getting back to the technology side of things, let’s talk about greenfield deployments. When was the last time you walked into a building and found zero technology of any kind? Odds are good that’s not the case. Sure, there are some SMBs that have minimal technology. There are a lot of organizations that have just the basics. But the days of walking into a completely empty building and rolling out new PCs, phones, and software loads are gone. So too are the days of zero wireless coverage, no existing networking equipment, and no server hardware.

No matter how big your organization is right now, there is some solution that can get you connected quickly. The number of times that I’ve heard of the office “IT Person” going to a big box store and buying a consumer-grade router to get a couple of MacBooks on the Internet is more than you might think. The need for office phone systems has been supplanted with mobile phones thanks to unlimited minutes and apps that run just about everything now. The infrastructure in the office is now just a wireless router and a subscription to an application suite. If you’re really enterprising you might even have a server or two running in AWS.

What Can Brown Do For Me

The world is brown now. There are no green fields left. Technology has invaded every part of our life. 45% of the world’s population carries a smartphone in their pocket and the number is climbing quickly. Everyone has access to some form of computing device that runs software, whether it’s a phone, PC, laptop, or device in a public place like a library. The Internet is ubiquitous with mobile device data plans and free Wi-Fi springing up in every coffee shop and retail location you can see.

Why then would a company assume there are greenfield deployment opportunities left out there? If you know that companies are going to have some kind of existing infrastructure, why would you build a product that assumes otherwise? I understand that when you’re building something no one has ever seen before, the likelihood of having to replace existing technology is low. But you are still going to need to integrate that exciting new tech with something else, aren’t you?

Building Blocks

Organizations have a mentality of building in phases. We need new capacity in this location so we build it out. Maybe it’s a rack or a pod or a building. The basic idea is the same. We need to add a component so we add it on like it was a Lego brick component. That kind of mentality is helped along by systems that can be deployed quickly in a turnkey fashion. It’s how technology operates today.

But that same turnkey system can become a pariah of technology if it doesn’t interoperate well with the other technology on-site. Build a network fabric that doesn’t play well with others? Your pod deployment is probably going to be a one-off. Build a storage solution that doesn’t interface well with virtual servers? There might not be any additions to that storage unit. Build a backup tool that doesn’t work with cloud storage or volumes? Guess what won’t be getting backed up any time soon?

Developing in a vacuum speeds time to market for sure. But it also tells your customers that you don’t really have much of a plan aside from “we hope you only buy our gear”. Imagine if there was a tire company that released a tire that could only work on a couple of new cars that were just released and not on any other cars on the market. Unless those tires were $2,500 each, that company would likely go out of business very quickly. Sure, it’s easy to build a high performance tire that only works with those two cars. But what if the people that own those two cars don’t want that tire? Or they don’t want to pay that price for it?

The alternative is to take the extra time and effort to realize that brownfield deployments are the norm now. You can’t hope to build something and not realize people are going to integrate it into their existing infrastructure. It’s reasonable to assume that an enterprise solution is going to replace consumer-grade equipment. It’s also fair to think that a complete solution may or may not replace an existing competing solution. But don’t assume that your technology is going to be deployed somewhere that doesn’t have any technology. Learn how those devices work and figure out how to interface with them. Make it easy for people to manage both the solutions or you may find yourself missing out on a sale.


Tom’s Take

Tom Watson is famous for having said, “There’s a world market for maybe 5 computers.” Of course, he said that almost 80 years ago when computers were in their infancy and the size of a garage. Today, we have computers everywhere. Yet we still see companies that think there’s a market for something that they built that isn’t completely revolutionary. Even with cutting edge technology like AR/VR or ultra mobile computers you still see existing technology as an interface point. It’s time to stop thinking that the world is a verdant field of green just waiting for the right solution to come along. Instead, think of the world as a pile of Lego houses waiting for your solution to be placed right beside it.

Denial of Services as a Service

Hacking isn’t new. If you follow the 2600 Magazine culture or know the names Mitnick or Draper, you know that hacking has been a part of systems as long as there have been systems. What has changed in recent years is the malicious aspect of what’s going on in the acts themselves. The pioneers of hacking culture were focused on short term gains or personal exploitation. It was more about proving you could break into a system and getting the side benefit of free phone calls or an untraceable mobile device. Today’s hacking cultures are driven by massive amounts of theft and exploitation of resources to a degree that would make any traditional hacker blush.

It’s much like the difference between petty street crime and “organized” crime. With a patron and a purpose, the organizers of the individual members can coordinate to accomplish a bigger goal than was ever thought possible by the person on the street. Just like a wolf pack or jackals, you can take down a much bigger target with some coordination. I talked a little bit about how the targets were going to start changing almost seven years ago and how we needed to start figuring out how to protect soft targets like critical infrastructure. What I didn’t count on was how effectively people would create systems that can cripple us without total destruction.

Deny, Deny, Deny

During RSA Conference this year, I had a chance to speak briefly with Tom Kellermann of Carbon Black. He’s a great guy and I loved the chance to chat with him about some of the crazy stuff that Carbon Black has been seeing in the wild. He gave me a peek at their 2020 Cybersecurity Report and some of the new findings they’ve been seeing. A couple of things jumped out at me during our discussion though.

The first is that the bad actors pushing attacks toward critical infrastructure have realized that denying that infrastructure to users is just as good as destroying it. Why should I take the time to craft a beautiful and elegant piece of malware like Stuxnet to take down a key SCADA system when I can just use a cryptolocker to infect all the control boxes for a tenth of the cost? And, if the target does pay up to get things unlocked, just leave them there in a state of shutdown!

A recent episode of the Risky Business podcast highlights this to great effect. A natural gas processing plant system was infected and needed to be cleaned. However, when gas is flowing through the pipelines you can’t just shut off one site to have it cleaned. You have to do a full system shutdown! That meant knocking the entire facility offline for two days to restore one site’s systems. That’s just the tip of the iceberg.

Imagine if you could manage to shut down a hospital like the accidental spanning tree meltdown at Beth Israel Deaconess Medical Center in 2002. Now, imagine a cryptolocker or a wiper that could shut down all the hospitals in California during a virus outbreak. Or maybe one that could infect and wipe out the control systems for all the dams providing power for the Tennessee Valley Authority. Getting worried yet? Because the kinds of people that are targeting these installations don’t care about $5,000 worth of Bitcoin to unlock stuff. They care about causing damage. They want stuff knocked offline. Or someone that organizes them does. And the end goal is the same: chaos. It doesn’t matter if the system is out because of the malware or down for days or weeks to clean it. The people looking to benefit from the chaos win no matter what.

Money, Money, Money

The biggest key to this kind of attack is the same as it always has been. If you want to know where the problems are coming from, follow the money. In the past, it was following the money to the people that are getting paid to do the attacks. Today, it’s more about following the money to the people that make the money from these kinds of attacks. It’s not enough to get Bitcoin or some other amount of peanuts in an untraceable wallet. If you can do something that manipulates global futures markets or causes fluctuations in commodity prices on the order of hundreds of thousands or even millions of dollars you suddenly don’t care about whether or not some company’s insurance is going to pay out to unlock their HR files.

Think about it in the simplest terms. If I could pay someone to shut down half the refineries in the US for a month to spike oil prices for my own ends, would that be worth paying a few thousand dollars to a hacking team to pull off? Better yet, if that same hacking team was now under my "protection" from retaliation from the target, do you think they'd continue to work for me to ensure that they couldn't be caught in the future? Sure, go ahead and freelance when you want. Just don't attack my targets and be on-call when I need something big done. It's not unlike all those crazy spy movies where a government agency keeps a Rolodex full of assassins on tap just in case.


Tom’s Take

The thought of what would happen in a modern unrestricted war scares me. Even 4-5 years from now would be a massive problem. We don’t secure things from the kind of determined attackers that can cause mass chaos. Let’s just shut down all the autonomous cars or connected vehicles in NY and CA. Let’s crash all the hospital MRI machines or shut down all the power plants in the US for a day or four. That’s the kind of coordination that can really upset the balance of power in a conflict. And we’re already starting to see that kind of impact with freelance groups. You don’t need a war to deny access to a service. Sometimes you just need to hire it out to the right people for the right price.

What Is Closed-Loop Automation?

During Networking Field Day 22 last week, a lot of the questions that were directed at the presenters had to do with their automation systems. One term kept coming up that I was embarrassed to admit that I'd never heard of. Closed-loop automation is the end goal for these systems. But what is closed-loop automation? And why is it so important? I decided to do a little research and find out.

Open Up

To understand closed-loop systems, you have to understand open-loop systems first. Thankfully, those are really simple. Open-loop systems are those where the output isn’t directly affected by the control actions of the system. It’s a system where you’re going to get the output no matter how you control it. The easiest example is a clothes dryer. There are a multitude of settings that you can choose for a clothes dryer, including the timing of the cycle. But no matter what, the dryer will stop at the end of the cycle. There’s no sensor in a basic clothes dryer that senses the moisture level of the clothes and acts accordingly.

Open-loop systems are stable and consistent. Every time you turn on the dryer, it will run until it finishes. There’s no variable in the system that will change that. Aside from system failure, it’s going to run exactly 30 minutes every time it’s set to that cycle. It’s also not going to run unless you set the cycle. As my family will tell you, putting clothes in the dryer and not setting it will not result in magic happening.
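The dryer example above can be sketched in a few lines of code. This is just an illustration (the function name and setup are mine, not from any real appliance): the output depends only on the chosen setting, and no feedback from the clothes ever enters the picture.

```python
# Hypothetical sketch of an open-loop system, modeled on the dryer example:
# the run time depends only on the selected cycle, never on moisture feedback.
def run_dryer(cycle_minutes: int) -> int:
    """Run for exactly the selected time; the clothes are never measured."""
    elapsed = 0
    while elapsed < cycle_minutes:
        elapsed += 1  # tumble for one more minute, no matter what
    return elapsed    # always equals cycle_minutes, wet clothes or not
```

Note there's no sensor anywhere in that loop. Whatever state the clothes are in, the cycle runs to completion and stops.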

Close It Off

In contrast, closed-loop systems have outputs that are dependent upon the control function of the system. If the control function requires that something in the system change to achieve the desired output, it will change that thing to get there.

The most classic example of a closed-loop system is the HVAC system in your house. The control function is the thermostat. If you want the temperature in your house to be 70 degrees Fahrenheit (21 degrees C), you set the thermostat and let the system take care of things. If the temperature falls below the required setting, the heating unit will turn on and bring the temperature up to the required level before shutting off. In the summertime, rising above the temperature setting will cause the air conditioning compressor to kick on and cool things down.
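The thermostat logic above is easy to sketch as a single control step. This is a simplified illustration (the deadband value and names are my own assumptions, not how any particular thermostat is built): the control function compares the measured temperature against the setpoint and drives the heat or the A/C accordingly.

```python
# Hypothetical single step of a closed-loop thermostat: the measured output
# (temperature) feeds back into the control decision.
def control_step(measured_f: float, setpoint_f: float, deadband: float = 1.0) -> str:
    """Return the action the HVAC system should take this cycle."""
    if measured_f < setpoint_f - deadband:
        return "heat_on"   # too cold: heating brings the temperature up
    if measured_f > setpoint_f + deadband:
        return "cool_on"   # too warm: the compressor kicks on
    return "idle"          # within the deadband: do nothing

print(control_step(66.0, 70.0))  # heat_on
print(control_step(74.0, 70.0))  # cool_on
```

Run that step on a timer with a fresh temperature reading each cycle and you have the feedback loop: the output of the system changes the next control decision.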

Closed-loop systems are great because you set them and forget them. Unlike the dryer example above, I can set my thermostat and it will run even if I forget to go turn on the heater/AC. But they're also more complicated to troubleshoot and figure out. As someone with very little practical knowledge of the operation of HVAC, it's rough to figure out if it's the thermostat or the unit or some other relay somewhere that's causing your house to be too warm or too cold.

Closed-loop systems can also take more inputs given the right control settings. Using the same A/C example, I upgraded my thermostat from a basic model to one from Ecobee. Once I got it installed, I had a lot of extra control over what I could do with it. For example, I could now have the settings in the house run based on time-of-day instead of just one basic setting all the time. If I wanted it colder at night I could tell the system to look at the time and change the setting until it was sunrise. I could also tell it to look for me to be home (using geolocation) and raise and lower the temperature if my geotoken, in this case my phone, wasn't in the area. The possibilities are endless because the system is driven by those inputs.
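Those extra inputs boil down to choosing a setpoint from more than one signal. Here's a rough sketch of that idea (the specific times and temperatures are made up for illustration, not Ecobee's actual logic): the schedule and the geolocation token both feed into what the control loop is trying to achieve.

```python
from datetime import time

# Hypothetical sketch of a multi-input setpoint: time-of-day and a
# geolocation token (is my phone in the area?) both drive the target.
def choose_setpoint(now: time, phone_in_area: bool) -> float:
    if not phone_in_area:
        return 60.0                           # away: let the house drift to save energy
    if now >= time(22, 0) or now < time(6, 30):
        return 66.0                           # night: run it colder until sunrise
    return 70.0                               # home during the day: normal comfort

print(choose_setpoint(time(23, 15), True))    # 66.0
```

The control loop itself doesn't change; it still chases a setpoint. The extra inputs just decide which setpoint it should be chasing right now.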

Automatic for the Non-people

Let’s extend the idea of closed-loop systems to network automation. Now, you can make a system (the network) behave a certain way based on inputs to the control functions. This is a massive change from the steady-state that we’ve worked years to achieve. The system can now react to changes in state or inputs. Massive file transfer activity being done between two branch locations? Closed-loop automation can reprogram the edge SD-WAN gateways to implement QoS policies based on the traffic types to preserve bandwidth for voice calls or critical application traffic. When the transfers are done the system can clean up the policies.
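The SD-WAN example above follows the same control-step pattern. This sketch is purely illustrative (the function, policy names, and threshold are my own, not any vendor's API): the system watches bulk transfer traffic, applies a voice-protecting QoS policy when it crosses a threshold, and cleans the policy up when the transfer is done.

```python
# Hypothetical closed-loop reaction at an SD-WAN edge: traffic telemetry is
# the feedback, and the QoS policy is the control action. All names invented.
def react(bulk_mbps: float, policy_active: bool, threshold: float = 100.0):
    """Return (action, policy_active) for this control cycle."""
    if bulk_mbps > threshold and not policy_active:
        return "apply_voice_priority_qos", True    # carve out bandwidth for calls
    if bulk_mbps <= threshold and policy_active:
        return "remove_voice_priority_qos", False  # clean up when the transfer ends
    return "no_change", policy_active              # steady state: leave things alone
```

Just like the thermostat, each cycle compares the measured state to the desired one and only acts on the difference.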

Because closed-loop automation can do a wide variety of actions based on inputs, data becomes super valuable. The information your system is providing as feedback can create more stable results. Open-loop systems are super stable because they are incapable of change. They also run every time someone tells them to run. They require intervention. Closed-loop systems are capable of running without the need for people based solely on the data you get from the system. But they also have issues because bad data or inputs can cause the system to react in strange ways. For example, if the thermostat in a house is placed in direct sunlight or has an error that causes it to think the house is 90 degrees, the A/C compressor may kick on even if the house temperature is far, far below that. Data has to be correct for the system to work as intended.
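One common defense against the bad-input problem is a sanity check in front of the control function. This is a simple sketch of the idea (the plausible range is an assumption I picked for illustration): a reading far outside what's physically believable is more likely a broken sensor than reality, so the loop rejects it and holds state rather than reacting.

```python
# Hypothetical input sanity check: reject readings outside a plausible indoor
# range so the control loop doesn't act on obviously bad sensor data.
def sane_reading(measured_f: float, low: float = 40.0, high: float = 95.0):
    """Return the reading if plausible, else None so the loop can hold state."""
    return measured_f if low <= measured_f <= high else None

print(sane_reading(72.0))    # 72.0 — plausible, pass it to the control function
print(sane_reading(150.0))   # None — reject it and keep the last known state
```

It won't catch every failure (a thermostat baking in direct sunlight can still report a believable-but-wrong number), but it keeps the most absurd inputs from driving the system.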


Tom’s Take

The promise of closed-loop automation is exciting. The ability for the network to run without our help is music to my ears. But it also means we have to be more diligent about keeping the control functions of the system working properly with the correct data inputs. It also means we need to monitor the control system outputs to head off problems before they can impact the reliability of the system. I can’t wait to see how we continue to close the loop and create better, more responsive systems in the future.