Complexity Isn’t Always Bad

I was reading a great post this week from Gian Paolo Boarina (@GP_Ifconfig) about complexity in networking. He raises some great points about the overall complexity of systems and how we can never really reduce it, just move or hide it. And it made me think about complexity in general. Why are we against complex systems?

Confusion and Delay

Complexity is difficult. The more complicated we make something the more likely we are to have issues with it. Reducing complexity makes everything easier, or at least appears to do so. My favorite non-tech example of this is the carburetor of an internal combustion engine.

Carburetors are wonderful devices that are necessary for the operation of the engine. And they are very complicated indeed. A minor mistake in configuring the spray pattern of the jets or the alignment of them can cause your engine to fail to work at all. However, when you spend the time to learn how to work with one properly, you can make the engine perform even above the normal specifications.

Carburetors have been largely replaced in modern engines by computerized fuel injectors. These systems accomplish the same goal of injecting the fuel-air mixture into the engine. However, they are completely controlled by a computer system instead of being mechanically configured. It’s a great leap forward for people that aren’t mechanics or gear heads. The system either works or it doesn’t. There’s no configuration parameters. Of course, if it doesn’t work there’s also very little that you as a non-mechanic can do to rectify the situation. As Gian Paolo points out, the complexity in the system has just been moved from the carburetor to the computer system running it.

But why is that a bad thing? If the standard user is never supposed to fiddle with the system why is moving the complexity a bad thing? It could be argued that removing complications from the operation and diagnostics of the system are good, but only if you ever intend untrained people to ever work on the system. A non-mechanic might never be able to fix a fuel injector system, but a trained person should be able to fix it quickly. Here, the complexity isn’t a barrier to the people who have been trained properly to anticipate it.

Complexity is only a problem for people who don’t understand it. Whether it’s a routing protocol or a file system, complex things are going to exist no matter what we do. Understanding them doesn’t have to be the job of everyone that uses the system.

A Tangled Web

I remember briefly working with Novell’s original identity management system back when it was still called DirXML. It was horribly complicated. It required a number of XML drivers importing information into eDirectory, which itself had quirks. And that identity repository fed multiple systems via XML rules to populate those data structures. It was a complicated nightmare to end all nightmares.

Except when it worked. When the system did the job properly, it looked like magic. Information entered for a new employee in the HR system automatically created an Active Directory user account in a different system, provisioned an email account in a third different system, and even created a time card entry in a fourth totally different system. The complexity under the hood churned its way through to provide usability to the people that relied on the system. Could they have manually entered all of that information? Sure. But having it automatically happen was a huge time saver for them. And when you apply it to a school where those actions needed to be repeated dozens of times for new students you can see how it would save a significant amount of time.

Here, complexity is the reason the system exists. You didn’t have the capability to feed those individual systems at the time because of the lack of API support or various other reasons. You had to find a way to force feed the information to a system that wasn’t expecting to get it any other way. Complexity here was required. And it worked. Until it didn’t.

Troubleshooting the XML issues in the system and keeping it running with new updates and broken links consumed a huge amount of time for the people I knew that were good at using it. So much time, in fact, that a couple of them made a business out of remotely supported DirXML for customers that utilized it and either didn’t know how to use it or didn’t have the specific knowledge necessary to make it work the way they wanted. Here, the complexity wasn’t only a necessity of the system, but it was a driver to create a new support level for it.

Ultimately, DirXML went away as it was consumed by the Novell Identity Manager. And now, the idea of these systems not having an API is silly. We focus our efforts more on the programming of the API and not on the extra complexity of the layers on top of it. But even those API interactions can be complex. So, we’ve essentially traded one complexity for another. We have simplified some aspects of the complexity while introducing others. We’ve also standardized things without necessarily making them an easier to do.


Tom’s Take

Complexity is bad when we don’t understand it. Trying to explain lagrange points and orbital dynamics is a huge pain when you aren’t talking to rocket scientists. However, most people that understand the complexities of the college football playoff system are more than happy to explain it to you in depth simply because they “get it”. Complexity isn’t always the enemy. If the people working on the system understand it enough to get the reasons why it needs to be complex to fulfill a job requirement then it’s not a bad thing. Instead of trying to move or reduce complexity, we should instead to try to ensure that we don’t add any additional complexity to the system. That’s how you keep the complexity snowball from rolling you over.

Predictions As A Service

It’s getting close to the end of the year and it’s time once again for the yearly December flood of posts that will be predicting what’s coming in 2018. Long time readers of my blog know that I don’t do these kinds of posts. My New Year’s Day posts are almost always introspective in nature and forward looking from my own personal perspective. But I also get asked quite a bit to contribute to other posts about the future. And I wanted to tell you why I think the prediction business is a house of cards built on quicksand.

The Layups

It’s far too tempting in the prediction business to play it safe. Absent a ton of research, it’s just easier to play it safe with some not-so-bold predictions. For instance, here’s what I could say about 2018 right now:

  • Whitebox switching will grow in revenue.
  • Software will continue to transform networking.
  • Cisco is going to buy companies.

Those are 100% true. Even without having spent one day in 2018. They’re also things I didn’t need to tell you at all. You already knew them. They’re almost common sense at this point. If I needed to point out that Cisco is going to buy at least two companies next year, you are either very new to networking or you’ve been studying for your CCIE lab and haven’t seen the sun in eight months.

Safe predictions have a great success rate. But they say nothing. However, they are used quite a bit for the lovely marketing fodder we see everywhere. In three months, you could see presentation from an SD-WAN vendor that says, “Industry analyst Tom Hollingsworth predicts that 2018 is going to be a big year for software networking.” It’s absolutely true. But I didn’t say SD-WAN. I didn’t name any specific vendors. So that prediction could be used by anyone for any purpose and I’d still be able to say in December 2018 that I was 100% right.

Playing it safe is the most useless kind of prediction there is. Because all you’re doing is reinforcing the status quo and offering up soundbites to people that like it that way.

Out On A Limb

The other kind of prediction that people love to get into is the crazy, far out bold prediction. These are the ones that get people gasping and following your every word to see if it pays off. But these predictions are prone to failure and distraction.

Let’s run another example. Here are four bold sample predictions for 2018:

  • Cisco will buy Arista.
  • VMware will cease to be a separate brand inside Dell.
  • Hackers will release a tool to compromise iPhones remotely.
  • HPE will go out of business.

Those predictions are more exciting! They name companies like Cisco and VMware and Apple. They have very bold statements like huge purchases or going out of business. But guess what? They’re completely made up. I have no insight or research that tells me anything even close to those being true or not.

However, those bold predictions just sit out there and fester. People point to them and say, “Tom thinks Cisco will buy Arista in 2018!” And no one will every call me on the carpet if I’m wrong. If Cisco does end up buying Arista in 2020 or later, people will just say I was ahead of my time. If it never comes to pass, people will forget and just focus on my next bold prediction of VMware buying Cisco. It’s a hype train with no end.

And on the off chance that I do nail a big one, people are going to think I have the inside track. My little predictions will be more important. And if I hit half of my bold ones, I would probably start getting job offers from analyst firms and such. These people are the prediction masters extraordinaire. If they aren’t telling you something you already know, they’re pitching you something that have no idea about.

Apple has a cottage industry built around crazy predictions. Just look back to August to see how many crazy ideas were out there about the iPhone X. Fingerprint sensor under the glass? 3D rear camera? Even crazier stuff? All reported on as pseudo-fact and eaten up by the readers of “news” sites. Even the people who do a great job of prediction based on solid research missed a few key details in the final launch. it just goes to show that no one is 100% accurate in bold predictions.


Tom’s Take

I still do predictions for other people. Sometimes I try to make tongue-in-cheek ones for fun. Other times I try to be serious and do a little research. But I also think that figuring out what’s coming 5 years from now is a waste of my time. I’d rather try to figure out how to use what I have today and build that toward the future. I’d rather be a happy iPhone user than the people that predicted that Apple’s move into the mobile market would fail miserably. Because that’s a headline you’ll never live down.

I’d like to thank my friends at Network Collective for inspiring this post. Make sure you check out their video podcast!

An Opinion On Offense Against NAT

It’s been a long time since I’ve gotten to rant against Network Address Translation (NAT). At first, I had hoped that was because IPv6 transitions were happening and people were adopting it rapidly enough that NAT would eventually slide into the past of SAN and DOS. Alas, it appears that IPv6 adoption is getting better but still not great.

Geoff Huston, on the other hand, seems to think that NAT is a good thing. In a recent article, he took up the shield to defend NAT against those that believe it is an abomination. He rightfully pointed out that NAT has extended the life of the modern Internet and also correctly pointed out that the slow pace of IPv6 deployment was due in part to the lack of urgency of address depletion. Even with companies like Microsoft buying large sections of IP address space to fuel Azure, we’re still not quite at the point of the game when IP addresses are hard to come by.

So, with Mr. Huston taking up the shield, let me find my +5 Sword of NAT Slaying and try to point out a couple of issues in his defense.

Relationship Status: NAT’s…Complicated

The first point that Mr. Huston brings up in his article is that the modern Internet doesn’t resemble the one build by DARPA in the 70s and 80s. That’s very true. As more devices are added to the infrastructure, the simple packet switching concept goes away. We need to add hierarchy to the system to handle the millions of devices we have now. And if we add a couple billion more we’re going to need even more structure.

Mr. Huston’s argument for NAT says that it creates a layer of abstraction that allows devices to be more mobile and not be tied to a specific address in one spot. That is especially important for things like mobile phones, which move between networks frequently. But instead of NAT providing a simple way to do this, NAT is increasing the complexity of the network by this abstraction.

When a device “roams” to a new network, whether it be cellular, wireless, wired, or otherwise, it is going to get a new address. If that address needs to be NATed for some reason, it’s going to create a new entry in a NAT state table somewhere. Any device behind a NAT that needs to talk to another device somewhere is going to create twice as many device entries as needed. Tracking those state tables is complicated. It takes memory and CPU power to do this. There is no ASIC that allows a device to do high-speed NATing. It has to be done by a general purpose CPU.

Adding to the complexity of NAT is the state that we’re in today when we overload addresses to get connectivity. It’s not just a matter of creating a singular one-to-one NAT. That type of translation isn’t what most people think of as NAT. Instead, they think of Port Address Translation (PAT), which allows hundreds or thousands of devices to share the same IP address. How many thousands? Well, as it turns out about 65,000 give or take. You can only PAT devices if you have free ports to PAT them on. And there are only 65,636 ports available. So you hit a hard limit there.

Mr. Huston talks in his article about extending the number of bits that can be used for NAT to increase the number of hosts that can be successfully NATed. That’s going to explode the tables of the NATing device and cause traffic to slow considerably if there are hundreds of thousands of IP translations going on. Mr. Huston argues that since the Internet is full of “middle boxes” anyway that are doing packet inspection and getting in the way of true end-to-end communications that we should utilize them and provide more space for NAT to occur instead of implementing IPv6 as an addressing space.

I’ll be the first to admit that chopping the IPv6 address space right in the middle to allow MAC addresses to auto-configure might not have been the best decision. But, in the 90s when we didn’t have DHCP it was a great idea in theory. And yes, assigning a /48 to a network does waste quite a bit of IP space. However, it does a great job of shrinking the size of the routing table, since that network can be summarized a lot better than having a bunch of /64 host routes floating around. This “waste” echoes the argument for and against using a /64 for a point-to-point link. If you’re worried about wasting several thousand addresses out of a potential billion then there might be other solutions you should look at instead.

Say My Name

One of the points that gets buried in the article that might shed some light on this defense of NAT is Mr. Huston’s championing for Named Data Networking. The concept of NDN is that everything on the Internet should stop being referred to as an address and instead should be tagged with a name. Then, when you want to look for a specific thing, you send a packet with that name and the Internet routes your packet to the thing you want. You then setup a communication between you and the data source. Sounds simple, right?

If you’re following along at home, this also sounds suspiciously like object storage. Instead of a piece of data living on a LUN or other SAN construct, we make every piece of data an object of a larger system and index them for easy retrieval. This idea works wonders for cloud providers, where object storage provides an overlay that hides the underlying infrastructure.

NDN is a great idea in theory. According to the Wikipedia article, address space is unbounded because you just keep coming up with new names for things. And since you’re using a name and not an address, you don’t have to NAT anything. That last point kind of blows up Mr. Huston’s defense of NAT in favor of NDN, right?

One question I have makes me go back to the object storage model and how it relates to NDN. In an object store, every piece of data has an Object ID, usually a UUID of 32 bits or 64 bits. We do this because, as it turns out, computers are horrible at finding names for things. We need to convert those names into numbers because computers still only understand zeros and ones at their most basic level. So, if we’re going to convert those names to some kind of numeric form anyway, why should we completely get rid of addresses? I mean, if we can find a huge address space that allows us to enumerate resources like an object store, we could duplicate a lot of NDN today, right? And, for the sake of argument, what if that huge address space was already based on hexadecimal?

Hello, Is It Me URLooking For?

To put this in a slightly different perspective, let’s look at the situation with phone numbers. In the US, we’ve had an explosion of mobile phones and other devices that have forced us to extend the number of area codes that we use to refer to groups of phone numbers. These area codes are usually geographically specific. We add more area codes to contain numbers that are being added. Sometimes these are specific to one city, like 212 is for New York. Other times they can cover a whole state or a portion of a state, like 580 does for Oklahoma.

It would be a whole lot easier for us to just refer to people by name instead of adding new numbers, right? I mean, we already do that in our mobile phones. We have a contact that has a phone number and an email address. If we want to contact John Smith, we look up the John Smith we want and choose our contact preference. We can call, email, or send a message through text or other communications method.

What address we use depends on our communication method. Calls use a phone number. If you’re on an iPhone like me, you can text via phone or AppleID (email address). You can also set up a video call the same way. Each of these methods of contact uses a different address for the name.

With Named Data Networking, are we going to have different addresses for each resource? If we’re doing away with addresses, how are we going to name things? Is there a name registry? Are we going to be allowed to name things whatever we want? Think about all the names of videos on Youtube if you want an idea of the nightmare that might be. And if you add some kind of rigid structure in the mix, you’re going to have to contain a database of names somewhere. As we’ve found with DNS, having a repository of information in a central place would make an awfully tempting target. Not to mention causing issues if it ever goes offline for some reason.


Tom’s Take

I don’t think there’s anything that could be said to defend NAT in my eyes. It’s the duct tape temporary solution that never seems to go away completely. Even with depletion and IPv6 adoption, NAT is still getting people riled up and ready to say that it’s the best option in a world of imperfect solutions. However, I think that IPv6 is the best way going forward. With more room to grow and the opportunity to create unique IDs for objects in your network. Even if we end up going down the road of Named Data Networking, I don’t think NAT is the solution you want to go with in the long run. Drive a sword through the heart of NAT and let it die.

VMware and VeloCloud: A Hedge Against Hyperconvergence?

VMware announced on Thursday that they are buying VeloCloud. This was a big move in the market that immediately set off a huge discussion about the implications. I had originally thought AT&T would buy VeloCloud based on their relationship in the past, but the acquistion of Vyatta from Brocade over the summer should have been a hint that wasn’t going to happen. Instead, VMware swooped in and picked up the company for an undisclosed amount.

The conversations have been going wild so far. Everyone wants to know how this is going to affect the relationship with Cisco, especially given that Cisco put money into VeloCloud in both 2016 and 2017. Given the acquisition of Viptela by Cisco earlier this year it’s easy to see that these two companies might find themselves competing for marketshare in the SD-WAN space. However, I think that this is actually a different play from VMware. One that’s striking back at hyperconverged vendors.

Adding The Value

If you look at the marketing coming out of hyperconvergence vendors right now, you’ll see there’s a lot of discussion around platform. Fast storage, small footprints, and the ability to deploy anywhere. Hyperconverged solutions are also starting to focus on the hot new trends in compute, like containers. Along the way this means that traditional workloads that run on VMware ESX hypervisors aren’t getting the spotlight they once did.

In fact, the leading hyperconvergence vendor Nutanix has been aggressively selling their own hypervisor, Acropolis as a competitor to VMware. They tout new features and easy configuration as the major reason to use Acropolis over ESX. The push by Nutanix is to get their customers off of ESX and on to Acropolis to get a share of the VMware budget that companies are currently paying.

For VMware, it’s a tough sell to keep their customers on ESX. There’s a very big ecosystem of software out there that runs on ESX, but if you can replicate a large portion of it natively like Acropolis and other hypervisors do there’s not much of a reason to stick with ESX. And if the VMware solution is more expensive over time you will find yourself choosing the cheaper alternative when the negotiations come up for renewal.

For VMware NSX, it’s an even harder road. Most of the organizations that I’ve seen deploying hyperconverged solutions are not huge enterprises with massive centralized data centers. Instead, they are the kind small-to-medium businesses that need some functions but are very budget conscious. They’re also very geographically diverse, with smaller branch offices taking the place of a few massive headquarters locations. While NSX has some advantages for these companies, it’s not the best fit for them. NSX works optimally in a data center with high-speed links and a well-built underlay network.

vWAN with VeloCloud

So how is VeloCloud going to play into this? VeloCloud already has a lot of advantages that made them a great complement to VMware’s model. They have built-in multi tenancy. Their service delivery is virtualized. They were already looking to move toward service providers as their primary market, but network services and managed service providers. This sounds like their interests are aligning quite well with VMware already.

The key advantage for VMware with VeloCloud is how it will allow NSX to extend into the branch. Remember how I said that NSX loves an environment with a stable underlay? That’s what VeloCloud can deliver. A stable, encrypted VPN underlay. An underlay that can be managed from one central location, or in the future, perhaps even a vCenter plugin. That gives VeloCloud a huge advantage to build the underlay to get connectivity between branches.

Now, with an underlay built out, NSX can be pushed down into the branch. Branches can now use all the great features of NSX like analytics, some of which will be bolstered by VeloCloud, as well as microsegmentation and other heretofore unseen features in the branch. The large headquarters data center is now available in a smaller remote size for branches. That’s a huge advantage for organizations that need those features in places that don’t have data centers.

And the pitch against using other hypervisors with your hyperconverged solution? NSX works best with ESX. Now, you can argue that there is real value in keeping ESX on your remote branches is not costs or features that you may one day hope to use if your WAN connection gets upgraded to ludicrous speed. Instead, VeloCloud can be deployed between your HQ or main office and your remote site to bring those NSX functions down into your environment over a secure tunnel.

While this does compete a bit with Cisco from a delivery standpoint, it still doesn’t affect them with complete overlap. In this scenario, VeloCloud is a service delivery platform for NSX and not a piece of hardware at the edge. Absent VeloCloud, this kind of setup could still be replicated with a Cisco Viptela box running the underlay and NSX riding on top in the overlay. But I think that the market that VMware is going after is going to be building this from the ground up with VMware solutions from the start.


Tom’s Take

Not every issues is “Us vs. Them”. I get that VMware and Cisco seem to be spending more time moving closer together on the networking side of things. SD-WAN is a technology that was inevitably going to bring Cisco into conflict with someone. The third generation of SD-WAN vendors are really companies that didn’t have a proper offering buying up all the first generation startups. Viptela and VeloCloud are now off the market and they’ll soon be integral parts of their respective parent’s strategies going forward. Whether VeloCloud is focused on enabling cloud connectivity for VMware or retaking the branch from the hyperconverged vendors is going to play out in the next few months. But instead of focusing on conflict with anyone else, VeloCloud should be judged by the value it brings to VMware in the near term.