Data Is Not The New Oil, It’s Nuclear Power

Big Data. I believe that one phrase could get millions in venture capital funding. I don’t even have to put a product with it. Just say it. And make no mistake about it: the rest of the world thinks so too. Data is “the new oil”. At least, according to some pundits. It’s a great headline-making analogy that describes how data is driving business and how controlling it can lead to an empire. But data isn’t really oil. It’s nuclear power.

Black Gold, Texas Tea

Crude oil is a popular resource. Prized for a variety of uses, it is traded and sold as a commodity and refined into plastics, gasoline, and other essential items of modern convenience. Oil creates empires and causes global commerce to hinge on every turn of the market. Living in a state that is a big oil producer, I see firsthand the impact that oil exploration and refining can have.

However, when compared to Big Data, oil isn’t the right metaphor. Much like oil, data needs to be refined before use. But oil can be refined into many distinct things, while data can only be turned into information. Oil burns up when consumed. Aside from some smoke and a small amount of residue, oil all but disappears after it expends the energy trapped within. Data doesn’t disappear after being turned into information.

In fact, the biggest issue that I have with the entire “Data as Oil” argument is that oil doesn’t stick around. We don’t see massive pools of oil on the side of the road from spills. We don’t hear about massive issues with oil disposal or securing our spent oil to prevent theft. People that treat data like oil are only looking at the refined product as the final form. They tend to forget that the raw form of data sticks around after the transformation. Most people will tell you that’s a good thing because you can run analytics and machine learning against static datasets and continue to derive value from them. But that doesn’t make me think of oil at all.

Welcome To The Nuclear Age

In fact, Big Data reminds me most of a nuclear power plant. Much like oil, the initial form of radioactive material isn’t very useful. It radiates and creates a small amount of heat, but not enough to run the steam generators in a power plant. Instead, you must bombard the uranium-235 pellets with neutrons to start a fission reaction. Once you have a sustained, controllable reaction, the amount of generated heat rises and creates the resource you need to power the rest of your machinery.

Much like data, nuclear fission reactions don’t do much without the proper infrastructure to harness them. Even after you transform your data into information you need to parse, categorize, and analyze it. The byproduct of the transformation is the critical part of the whole process.

Much like nuclear fuel rods, data stays in place for years. It continues to produce the resource after being modified and transformed. It sits around in the hopes that it can be useful until the day that it no longer serves its purpose. Data that is past its useful shelf life goes into a data warehouse where it will eventually be forgotten. Spent nuclear fuel rods are also eventually removed and placed somewhere where they can’t affect other things. Maybe they’re buried deep underground. Or shot into space. Or placed under enough concrete that they will never be found again.

The danger in data and in nuclear power is not what happens when everything goes right. Instead, it’s what happens when everything goes wrong. With nuclear power, wrong is a chain reaction meltdown. Or wrong could be improper disposal of waste. It could be a disaster at the plant or even a theft of fissile nuclear material from the plant. The fuel rods themselves are simultaneously our source of power and the source of our potential disaster.

Likewise, data that is just sitting around and stored improperly can lead to huge disasters. We’re only four years removed from Target’s huge data breach. And how many more are waiting out there to happen? It seems the time between data leaks is shrinking as more and more bad actors are finding ways to steal, manipulate, and appropriate data for their own ends. And, much like nuclear fuel rods, the methods of protecting the data are few compared to other fuel sources.

Data isn’t something that can easily be hidden or compacted. It needs to be readable to be useful. It needs to be fast to be useful. All the things that make it easy to use also make it easy to exploit. And once we’re done with it, the way that it is stored in perpetuity only increases the likelihood of it being used improperly. Unless we’re willing to bury it under metaphorical concrete, we’re in for a bad future if we forget how to handle spent data.


Tom’s Take

Data as Oil is a stupid metaphor. It’s meant to impress upon finance CEOs and Wall Street wonks how important it is for data to be taken seriously. Data as Oil is something a data scientist would say to get a job. By drawing a bad comparison, you make data seem like a commodity to be traded and used as collateral for empire building. It’s not. It’s a ticking bomb of disastrous proportions when not handled correctly. Rather than coming up with a pithy metaphor for cable news consumption and page views, let’s treat data with the respect it deserves and make sure we plan for how we’re going to deal with something that won’t burn up into smoke whenever it’s convenient.

Should We Build A Better BGP?

One story that seems to have flown under the radar this week with the Net Neutrality discussion being so dominant was the little hiccup with BGP on Wednesday. According to reports, AS39523 was able to redirect traffic from some major sites like Facebook, Google, and Microsoft through its network. Since the ISP in question is located inside Russia, there’s been quite a lot of conversation about the purpose of this misconfiguration. Is it simply an accident? Or is it a nefarious plot? Regardless of the intent, the fact that in 2017 massive portions of Internet traffic can still be rerouted this easily has many people worried.

Routing by Suggestion

BGP is the foundation of the modern Internet. It’s how routes are exchanged between every autonomous system (AS) and how traffic destined for your favorite cloud service or cat picture hosting provider gets to where it’s supposed to be going. BGP is the glue that makes the Internet work.

But BGP, for all of the greatness that it provides, is still very fallible. It’s prone to misconfiguration. Look no further than the Level 3 outage last month. Or the outage that Google caused in Japan in August. And those are just the top searches from Google. There have been a myriad of problems over the course of the past couple of decades. Some are benign. Some are more malicious. And in almost every case they were preventable.

BGP runs on the idea that people configuring it know what they’re doing. Much like RIP, the suggestion of a better route is enough to make BGP change the way that traffic flows between systems. You don’t have to be an evil mad genius to see this in action. Anyone that’s ever made a typo in their BGP border router configuration will tell you that if you make your system look like an attractive candidate for being a transit network, BGP is more than happy to pump a tidal wave of traffic through your network without regard for the consequences.

But why does it do that? Why does BGP act so stupid sometimes in comparison to OSPF and EIGRP? Well, take a look at the BGP path selection mechanism. CCIEs can probably recite this by heart. Things like Local Preference, Weight, and AS_PATH govern how BGP will install routes and change transit paths. Notice that these are all set by the user. There are no automatic conditions outside of the route’s origin. Unlike OSPF and EIGRP, there is no consideration for bandwidth or link delay. Why?
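To make that concrete, here is a toy best-path comparison in Python. The Candidate structure and attribute values are my own illustration, not any vendor's implementation, but the ordering mirrors the familiar decision steps: Weight, then Local Preference, then AS_PATH length, then origin. Notice that nothing about bandwidth, delay, or link quality appears anywhere in the comparison.

    from dataclasses import dataclass

    ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

    @dataclass
    class Candidate:
        weight: int       # locally significant, set by the operator
        local_pref: int   # set by the operator, shared inside the AS
        as_path: tuple    # sequence of AS numbers
        origin: str       # "igp", "egp", or "incomplete"

    def preference_key(route):
        # Higher Weight and Local Preference win, then shorter AS_PATH,
        # then lower origin code. All operator-controlled knobs.
        return (-route.weight, -route.local_pref, len(route.as_path),
                ORIGIN_RANK[route.origin])

    def best_path(candidates):
        return min(candidates, key=preference_key)

    # Two paths to the same prefix: a short, honest one and a longer one that
    # someone has "improved" with a higher Local Preference. The longer path wins.
    honest = Candidate(weight=0, local_pref=100, as_path=(64500, 64510), origin="igp")
    cranked = Candidate(weight=0, local_pref=500, as_path=(64999, 64510, 64500), origin="igp")
    print(best_path([honest, cranked]).as_path)   # (64999, 64510, 64500)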

Well, the old Internet wasn’t incredibly reliable from the WAN side. You couldn’t guarantee that the path to the next AS was the “best” path. It may be an old serial link. It could have a lot of delay in the transit path. It could also be the only method of getting your traffic to the Internet. Rather than letting the routing protocol make arbitrary decisions about link quality the designers of BGP left it up to the person making the configuration. You can configure BGP to do whatever you want. And it will do what you tell it to do. And if you’ve ever taken the CCIE lab you know that you can make BGP do some very interesting things when you’re faced with a challenge.

BGP assumes a minimum level of competency from the people using it. The protocol doesn’t have any built-in checks to avoid doing stupid things beyond the basics of not installing incorrect routes in the routing table. If you suddenly start announcing someone else’s prefixes with better metrics, then the global BGP network is going to think you’re the better path to that AS and swing traffic your way. That may not be what you want. Given that most BGP outages or misconfigurations of this type only last a couple of hours until the mistake is discovered, it’s safe to say that fat fingers cause big BGP problems.

Buttoning Down BGP

How do we fix this? Well, aside from making sure that anyone touching BGP knows exactly what they’re doing? Not much. Some Regional Internet Registries (RIRs) require you to preregister new prefixes with them before they can be brought online. As mentioned in this Reddit thread, RIPE is pretty good about that. But some ISPs, especially ones in the US that work with ARIN, are less strict about that. And in some cases, they don’t even bring the preregistered prefixes online at the correct time. That can cause headaches when trying to figure out why your networks aren’t being announced even though your config is right.

Another person pointed out the Mutually Agreed Norms for Routing Security (MANRS). These look like some very good common sense things that we need to be doing to ensure that routing protocols are secure from hijacks and other issues. But, MANRS is still a manual setup that relies on the people implementing it to know what they’re doing.

Lastly, another option would be the Resource Public Key Infrastructure (RPKI) service that’s offered by ARIN. This service allows people that own IP address space to specify which autonomous systems can originate their prefixes. In theory, this is an awesome idea that gives a lot of weight to trusting that only specific ASes are allowed to announce prefixes. In practice, it requires the use of PKI cryptographic infrastructure on your edge routers. And anyone that’s ever configured PKI on even simple devices knows how big of a pain that can be. Mixing PKI and BGP may be enough to drive people back to sniffing glue.
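To show what route-origin validation boils down to, here is a rough Python sketch. The ROA entries and helper names are invented for illustration; in a real deployment the signed ROAs come from the RPKI repositories and the validation state is fed to the router, but the core check is this simple prefix-and-origin match.

    import ipaddress

    # Each ROA: (authorized prefix, max prefix length, authorized origin AS)
    ROAS = [
        (ipaddress.ip_network("203.0.113.0/24"), 24, 64500),
        (ipaddress.ip_network("2001:db8::/32"), 48, 64500),
    ]

    def validate(prefix_str, origin_as):
        prefix = ipaddress.ip_network(prefix_str)
        covered = False
        for roa_prefix, max_len, roa_as in ROAS:
            if prefix.version == roa_prefix.version and prefix.subnet_of(roa_prefix):
                covered = True
                if prefix.prefixlen <= max_len and origin_as == roa_as:
                    return "valid"
        return "invalid" if covered else "not-found"

    print(validate("203.0.113.0/24", 64500))   # valid
    print(validate("203.0.113.0/24", 64999))   # invalid: wrong origin AS
    print(validate("198.51.100.0/24", 64999))  # not-found: no ROA covers it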


Tom’s Take

BGP works. It’s relatively simple and gets the job done. But it is far too trusting. It assumes that the people running the Internet are nerdy pioneers embarking on a journey of discovery and knowledge sharing. It doesn’t believe for one minute that bad people could be trying to hijack traffic. Or that some operator fresh from getting his CCNP is going to reroute Facebook traffic through a Cisco 2524 router in Iowa. BGP needs to get better. Or we need to make some changes to ensure that even if BGP still believes the Internet is a utopia, someone is behind it to make sure those rose-colored glasses don’t cause it to walk into a bus.

Network Visibility with Barefoot Deep Insight

As you may have heard this week, Barefoot Networks is back in the news with the release of their newest product, Barefoot Deep Insight. Choosing to go down the road of naming a thing after what it actually does, Barefoot has created a solution to finding out why network packets are behaving the way they are.

Observer Problem

It’s no secret that modern network monitoring is coming out of the Dark Ages. ping, traceroute, and SNMP aren’t exactly the best tools for getting any kind of real information about what’s going on. They were designed for a different time with much less packet flow. Even NetFlow can’t keep up with modern networks running at multi-gigabit speeds. And even if it could, it’s still missing in-flight data about network paths and packet delays.

Imagine standing outside of the Holland Tunnel. You know that a car entered at a specific time. And you see the car exit. But you don’t know what happened to the car in between. If the car takes 5 minutes to traverse the tunnel you have no way of knowing if that’s normal or not. Likewise, if a car is delayed and takes 7-8 minutes to exit you can’t tell what caused the delay. Without being able to see the car at various points along the journey you are essentially guessing about the state of the transit network at any given time.

Trying to solve this problem in a network can be difficult. That’s because the OS running on the devices doesn’t generally lend itself to easy monitoring. The old days of SNMP proved that time and time again. Today’s networks are getting a bit better with regard to APIs and the like. You could even go all the way up the food chain and buy something like Cisco Tetration if you absolutely needed that much visibility.

Embedding Reporting

Barefoot solves this problem by using their P4 language in concert with the Tofino chipset to provide visibility into packets as they traverse the network. P4 gives Tofino the flexibility to build monitoring onto the data plane processing of a packet. Rather than bolting the monitoring on after the fact, you can now put it right alongside the packet flow and collect information as it happens.

The other key is that the real work is done by the Deep Insight Analytics software running outside of the switch. The analytics platform takes the data collected from the Tofino switches and starts processing it. It creates baselines of traffic patterns and starts looking for anomalies in the data. This is why Deep Insight claims to be able to detect microbursts: the monitoring platform can analyze the data being fed to it and provide the operator with insights.
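As a hedged illustration of the kind of math such a platform could be doing, here is a small Python sketch that turns per-hop timestamps into per-hop latency and flags anything that blows past a baseline. The record layout, baselines, and thresholds are my own assumptions, not Barefoot's actual telemetry format.

    from statistics import mean, stdev

    # One packet's path: (switch_id, ingress timestamp, egress timestamp) in microseconds.
    path_records = [
        ("leaf1",  1000, 1004),
        ("spine2", 1010, 1092),   # suspiciously long queueing delay
        ("leaf3",  1100, 1103),
    ]

    # Recent per-hop latency history used as a baseline, also in microseconds.
    baseline_us = {"leaf1": [3, 4, 5], "spine2": [6, 5, 7], "leaf3": [3, 3, 4]}

    def flag_anomalies(records, baselines, sigma=3.0):
        alerts = []
        for switch, ingress, egress in records:
            latency = egress - ingress
            history = baselines[switch]
            threshold = mean(history) + sigma * stdev(history)
            if latency > threshold:
                alerts.append((switch, latency, threshold))
        return alerts

    print(flag_anomalies(path_records, baseline_us))
    # [('spine2', 82, 9.0)] -- one hop held the packet far longer than normal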

It’s important to note that this is info only. The insights gathered from Deep Insight are for informational purposes. This is where the skill of the network professional comes into play. By gaining perspective from the software into what could be causing issues like microbursts, you gain the ability to take your skills and fix those issues. Perhaps it’s a misconfigured ECMP pair. Maybe it’s a dead or dying cable in a link. Armed with the data from the platform, you can work your networking magic to make it right.

Barefoot says that Deep Insight builds on itself via machine learning. While machine learning seems to be one of the buzzwords du jour, it could be hoped that a platform that can analyze the states of packets can start to build an idea of what’s causing them to behave in certain ways. While not mentioned in the press release, it could also be inferred that there are ways to upload the data from your system to a larger set of servers. Then you can have more analytics applied to the datasets and more insights extracted.


Tom’s Take

The Deep Insight platform is what I was hoping to see from Barefoot after I saw them earlier this year at Networking Field Day 14. They are taking the flexibility of the Tofino chip and the extensibility of P4 and combining them to build new and exciting things that run right alongside the data plane on the switches. This means that they can provide the kinds of tools that companies are willing to pay quite a bit for and do it in a way that is 100% capable of being audited and extended by brilliant programmers. I hope that Deep Insight takes off and sees wide adoption for Barefoot customers. That will be the biggest endorsement of what they’re doing and give them a long runway to building more in the future.

Does Juniper Need To Be Purchased?

You probably saw the news this week that Nokia was looking to purchase Juniper Networks. You also saw pretty quickly that the news was denied, emphatically. It was a curious few hours when the network world was buzzing about the potential to see Juniper snapped up into a somewhat larger organization. There was also talk of product overlap and other kinds of less exciting but very necessary discussions during mergers like this. Which leads me to a great thought exercise: Does Juniper Need To Be Purchased?

Sins of The Father

More than any other networking company I know of, Juniper has paid the price for trying to break out of their mold. When you think Juniper, most networking professionals will tell you about their core routing capabilities. They’ll tell you how Juniper has a great line of carrier and enterprise switches. And, if by some chance, you find yourself talking to a security person, you’ll probably hear a lot about the SRX Firewall line. Forward thinking people may even tell you about their automation ideas and their charge into the world of software defined things.

Would you hear about their groundbreaking work with Puppet from 2013? How about their wireless portfolio from 2012? Would anyone even say anything about Junosphere and their modeling environments from years past? Odds are good you wouldn’t. The Puppet work is probably bundled in somewhere, but the person driving it in that video is on to greener pastures at this point. The wireless story is no longer a story, but a footnote. And the list could go on longer than that.

When Cisco makes a misstep, we see it buried, written off, and eventually become the butt of really inside jokes between groups of engineers that worked with the product during the short life it had on this planet. Sometimes it’s a hardware mistake. Other times it’s software architecture missteps. But in almost every case, those problems are anecdotes you tell as you watch the 800lb gorilla of networking squash their competitors.

With Juniper, it feels different. Every failed opportunity is just short of disaster. Every misstep feels like it lands on a land mine. Every advance not expanded upon is the “one that got away”. Yet we see it time and time again. If a company like Cisco pushed the envelope the way we see Juniper pushing it we would laud them with praise and tell the world that they are on the verge of greatness all over again.

Crimes Of The Family

Why then does Juniper look like a juicy acquisition target? Why are they slowly being supplanted by Arista as the favored challenger to the Cisco Empire? How is it that we find Juniper in the crosshairs of everyone, fighting to stay alive?

As it turns out, wars are expensive. And when you’re gearing up to fight Cisco you need all the capital you can get. That forces you to make alliances that may not be the best for you in the long run. And in the case of Juniper, it brought in some of the people that thought they could get in on the ground floor of a company that was ready to take on the 800lb gorilla and win.

Sadly, those “friends” tend to be the kind that desert you when you need them the most. When Juniper was fighting tooth and nail to build their offerings up to compete against Cisco, the investors were looking for easy gains and ways to make money. And when those investors realized that toppling empires takes more than two quarters, they got antsy. Some bailed. Those needed to go. But the ones that stayed caused more harm than good.

I’ve written before about Juniper’s issues with Elliott Capital Management, but it bears repeating here. Elliott is an activist investor in the same vein as Carl Icahn. They take a substantial position in a company and then immediately start demanding changes to raise the stock price. If they don’t get their way, they release paper after paper decrying the situation to the market until the stock price is depressed enough to get the company to listen to Elliott. Once Elliott’s demands are met, they exit their position. They get a small profit and move on to do it all over again, leaving behind a shell of a company wondering what happened.

Elliott has done this to Juniper again and again. Pulse VPN. Trapeze. They’ve demanded executive changes and forced Juniper to abandon good projects with long-term payoffs because they won’t bounce the stock price higher this quarter. And worse yet, if you look back over the last five years you can find story after story in the financial press about Juniper being up for sale or being a potential acquisition target. Five. Years. When’s the last time you heard about Cisco being a potential target for a buyout? Hell, even Arista doesn’t get shopped around as much as Juniper.


Tom’s Take

I think these symptoms all share the same root issue. Juniper is a great technology company that does some exciting and innovative things. But, much like a beautiful potted plant in my house, they are reaching the maximum size they can grow to without making a move. A plant can only grow as big as its container. If you leave it in a small one, it will only ever be small. You can transfer it to something larger, but you risk harm or death. And it will never grow if nothing changes. Juniper has the minds and the capability to grow. And maybe with the eyes of the Wall Street buzzards looking elsewhere for a while, they can build a practice that gives them the capability to challenge in the areas they are good at, not just being the answer for everything Cisco is doing.

Complexity Isn’t Always Bad

I was reading a great post this week from Gian Paolo Boarina (@GP_Ifconfig) about complexity in networking. He raises some great points about the overall complexity of systems and how we can never really reduce it, just move or hide it. And it made me think about complexity in general. Why are we against complex systems?

Confusion and Delay

Complexity is difficult. The more complicated we make something the more likely we are to have issues with it. Reducing complexity makes everything easier, or at least appears to do so. My favorite non-tech example of this is the carburetor of an internal combustion engine.

Carburetors are wonderful devices that are necessary for the operation of the engine. And they are very complicated indeed. A minor mistake in configuring the spray pattern of the jets or the alignment of them can cause your engine to fail to work at all. However, when you spend the time to learn how to work with one properly, you can make the engine perform even above the normal specifications.

Carburetors have been largely replaced in modern engines by computerized fuel injection. These systems accomplish the same goal of getting the fuel-air mixture into the engine. However, they are completely controlled by a computer system instead of being mechanically configured. It’s a great leap forward for people that aren’t mechanics or gearheads. The system either works or it doesn’t. There are no configuration parameters. Of course, if it doesn’t work, there’s also very little that you as a non-mechanic can do to rectify the situation. As Gian Paolo points out, the complexity in the system has just been moved from the carburetor to the computer system running it.

But why is that a bad thing? If the standard user is never supposed to fiddle with the system, why is moving the complexity a bad thing? It could be argued that removing complications from the operation and diagnostics of the system only matters if you intend for untrained people to work on the system. A non-mechanic might never be able to fix a fuel injection system, but a trained person should be able to fix it quickly. Here, the complexity isn’t a barrier to the people who have been trained properly to anticipate it.

Complexity is only a problem for people who don’t understand it. Whether it’s a routing protocol or a file system, complex things are going to exist no matter what we do. Understanding them doesn’t have to be the job of everyone that uses the system.

A Tangled Web

I remember briefly working with Novell’s original identity management system back when it was still called DirXML. It was horribly complicated. It required a number of XML drivers importing information into eDirectory, which itself had quirks. And that identity repository fed multiple systems via XML rules to populate those data structures. It was a complicated nightmare to end all nightmares.

Except when it worked. When the system did the job properly, it looked like magic. Information entered for a new employee in the HR system automatically created an Active Directory user account in a different system, provisioned an email account in a third different system, and even created a time card entry in a fourth totally different system. The complexity under the hood churned its way through to provide usability to the people that relied on the system. Could they have manually entered all of that information? Sure. But having it automatically happen was a huge time saver for them. And when you apply it to a school where those actions needed to be repeated dozens of times for new students you can see how it would save a significant amount of time.
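For a sense of what that fan-out looked like, here is a stripped-down Python sketch: one HR event drives several downstream provisioning steps. The handler names and payload are made up for illustration; the real product did this with eDirectory and XML driver rules rather than a few functions.

    def create_directory_account(person):
        username = (person["first"][0] + person["last"]).lower()
        print("directory: created account", username)

    def provision_mailbox(person):
        address = (person["first"] + "." + person["last"]).lower() + "@example.edu"
        print("mail: provisioned", address)

    def create_timecard(person):
        print("timecard: opened entry for employee", person["employee_id"])

    HANDLERS = [create_directory_account, provision_mailbox, create_timecard]

    def on_new_hire(person):
        # One source event from HR feeds every downstream system.
        for handler in HANDLERS:
            handler(person)

    on_new_hire({"first": "Pat", "last": "Doe", "employee_id": "1138"})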

Here, complexity is the reason the system exists. You didn’t have the capability to feed those individual systems at the time because of the lack of API support or various other reasons. You had to find a way to force feed the information to a system that wasn’t expecting to get it any other way. Complexity here was required. And it worked. Until it didn’t.

Troubleshooting the XML issues in the system and keeping it running with new updates and broken links consumed a huge amount of time for the people I knew that were good at using it. So much time, in fact, that a couple of them made a business out of remotely supporting DirXML for customers that utilized it and either didn’t know how to use it or didn’t have the specific knowledge necessary to make it work the way they wanted. Here, the complexity wasn’t only a necessity of the system; it was a driver to create a new support level for it.

Ultimately, DirXML went away as it was consumed by Novell Identity Manager. And now, the idea of these systems not having an API is silly. We focus our efforts more on the programming of the API and not on the extra complexity of the layers on top of it. But even those API interactions can be complex. So, we’ve essentially traded one complexity for another. We have simplified some aspects of the complexity while introducing others. We’ve also standardized things without necessarily making them any easier to do.


Tom’s Take

Complexity is bad when we don’t understand it. Trying to explain Lagrange points and orbital dynamics is a huge pain when you aren’t talking to rocket scientists. However, most people that understand the complexities of the college football playoff system are more than happy to explain it to you in depth simply because they “get it”. Complexity isn’t always the enemy. If the people working on the system understand it enough to get the reasons why it needs to be complex to fulfill a job requirement, then it’s not a bad thing. Instead of trying to move or reduce complexity, we should instead try to ensure that we don’t add any additional complexity to the system. That’s how you keep the complexity snowball from rolling you over.

Predictions As A Service

It’s getting close to the end of the year and it’s time once again for the yearly December flood of posts that will be predicting what’s coming in 2018. Long time readers of my blog know that I don’t do these kinds of posts. My New Year’s Day posts are almost always introspective in nature and forward looking from my own personal perspective. But I also get asked quite a bit to contribute to other posts about the future. And I wanted to tell you why I think the prediction business is a house of cards built on quicksand.

The Layups

It’s far too tempting in the prediction business to play it safe. Absent a ton of research, it’s just easier to trot out some not-so-bold predictions. For instance, here’s what I could say about 2018 right now:

  • Whitebox switching will grow in revenue.
  • Software will continue to transform networking.
  • Cisco is going to buy companies.

Those are 100% true. Even without having spent one day in 2018. They’re also things I didn’t need to tell you at all. You already knew them. They’re almost common sense at this point. If I needed to point out that Cisco is going to buy at least two companies next year, you are either very new to networking or you’ve been studying for your CCIE lab and haven’t seen the sun in eight months.

Safe predictions have a great success rate. But they say nothing. However, they are used quite a bit for the lovely marketing fodder we see everywhere. In three months, you could see a presentation from an SD-WAN vendor that says, “Industry analyst Tom Hollingsworth predicts that 2018 is going to be a big year for software networking.” It’s absolutely true. But I didn’t say SD-WAN. I didn’t name any specific vendors. So that prediction could be used by anyone for any purpose and I’d still be able to say in December 2018 that I was 100% right.

Playing it safe is the most useless kind of prediction there is. Because all you’re doing is reinforcing the status quo and offering up soundbites to people that like it that way.

Out On A Limb

The other kind of prediction that people love to get into is the crazy, far out bold prediction. These are the ones that get people gasping and following your every word to see if it pays off. But these predictions are prone to failure and distraction.

Let’s run another example. Here are four bold sample predictions for 2018:

  • Cisco will buy Arista.
  • VMware will cease to be a separate brand inside Dell.
  • Hackers will release a tool to compromise iPhones remotely.
  • HPE will go out of business.

Those predictions are more exciting! They name companies like Cisco and VMware and Apple. They have very bold statements like huge purchases or going out of business. But guess what? They’re completely made up. I have no insight or research that tells me anything even close to those being true or not.

However, those bold predictions just sit out there and fester. People point to them and say, “Tom thinks Cisco will buy Arista in 2018!” And no one will ever call me on the carpet if I’m wrong. If Cisco does end up buying Arista in 2020 or later, people will just say I was ahead of my time. If it never comes to pass, people will forget and just focus on my next bold prediction of VMware buying Cisco. It’s a hype train with no end.

And on the off chance that I do nail a big one, people are going to think I have the inside track. My little predictions will be more important. And if I hit half of my bold ones, I would probably start getting job offers from analyst firms and such. These people are the prediction masters extraordinaire. If they aren’t telling you something you already know, they’re pitching you something they have no idea about.

Apple has a cottage industry built around crazy predictions. Just look back to August to see how many crazy ideas were out there about the iPhone X. Fingerprint sensor under the glass? 3D rear camera? Even crazier stuff? All reported on as pseudo-fact and eaten up by the readers of “news” sites. Even the people who do a great job of prediction based on solid research missed a few key details in the final launch. It just goes to show that no one is 100% accurate in bold predictions.


Tom’s Take

I still do predictions for other people. Sometimes I try to make tongue-in-cheek ones for fun. Other times I try to be serious and do a little research. But I also think that figuring out what’s coming 5 years from now is a waste of my time. I’d rather try to figure out how to use what I have today and build that toward the future. I’d rather be a happy iPhone user than the people that predicted that Apple’s move into the mobile market would fail miserably. Because that’s a headline you’ll never live down.

I’d like to thank my friends at Network Collective for inspiring this post. Make sure you check out their video podcast!

An Opinion On Offense Against NAT

It’s been a long time since I’ve gotten to rant against Network Address Translation (NAT). At first, I had hoped that was because IPv6 transitions were happening and people were adopting it rapidly enough that NAT would eventually slide into the past alongside SAN and DOS. Alas, it appears that IPv6 adoption is getting better but still not great.

Geoff Huston, on the other hand, seems to think that NAT is a good thing. In a recent article, he took up the shield to defend NAT against those that believe it is an abomination. He rightfully pointed out that NAT has extended the life of the modern Internet and also correctly pointed out that the slow pace of IPv6 deployment was due in part to the lack of urgency of address depletion. Even with companies like Microsoft buying large sections of IP address space to fuel Azure, we’re still not quite at the point of the game when IP addresses are hard to come by.

So, with Mr. Huston taking up the shield, let me find my +5 Sword of NAT Slaying and try to point out a couple of issues in his defense.

Relationship Status: NAT’s…Complicated

The first point that Mr. Huston brings up in his article is that the modern Internet doesn’t resemble the one built by DARPA in the 70s and 80s. That’s very true. As more devices are added to the infrastructure, the simple packet switching concept goes away. We need to add hierarchy to the system to handle the millions of devices we have now. And if we add a couple billion more, we’re going to need even more structure.

Mr. Huston’s argument for NAT says that it creates a layer of abstraction that allows devices to be more mobile and not be tied to a specific address in one spot. That is especially important for things like mobile phones, which move between networks frequently. But instead of NAT providing a simple way to do this, NAT is increasing the complexity of the network by this abstraction.

When a device “roams” to a new network, whether it be cellular, wireless, wired, or otherwise, it is going to get a new address. If that address needs to be NATed for some reason, it’s going to create a new entry in a NAT state table somewhere. Any device behind a NAT that needs to talk to another device somewhere is going to create twice as many device entries as needed. Tracking those state tables is complicated. It takes memory and CPU power to do this. There is no ASIC that allows a device to do high-speed NATing. It has to be done by a general purpose CPU.

Adding to the complexity of NAT is the state that we’re in today when we overload addresses to get connectivity. It’s not just a matter of creating a singular one-to-one NAT. That type of translation isn’t what most people think of as NAT. Instead, they think of Port Address Translation (PAT), which allows hundreds or thousands of devices to share the same IP address. How many thousands? Well, as it turns out, about 65,000, give or take. You can only PAT devices if you have free ports to PAT them on. And there are only 65,535 usable ports. So you hit a hard limit there.
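Here is a toy Python sketch of that ceiling. The table below hands out one public port per inside connection and simply runs dry when the pool is gone; it is purely illustrative, not how any particular NAT device manages its pool.

    class PatTable:
        def __init__(self, public_ip, first_port=1024, last_port=65535):
            self.public_ip = public_ip
            self.free_ports = list(range(first_port, last_port + 1))
            self.translations = {}   # (inside_ip, inside_port) -> public port

        def translate(self, inside_ip, inside_port):
            key = (inside_ip, inside_port)
            if key in self.translations:
                return self.public_ip, self.translations[key]
            if not self.free_ports:
                raise RuntimeError("PAT port pool exhausted for " + self.public_ip)
            port = self.free_ports.pop()
            self.translations[key] = port
            return self.public_ip, port

    nat = PatTable("198.51.100.1")
    print(nat.translate("10.0.0.5", 49152))   # ('198.51.100.1', 65535)
    print(len(nat.free_ports))                # 64511 translations left, then you're done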

Mr. Huston talks in his article about extending the number of bits that can be used for NAT to increase the number of hosts that can be successfully NATed. That’s going to explode the tables of the NATing device and cause traffic to slow considerably if there are hundreds of thousands of IP translations going on. Mr. Huston argues that since the Internet is full of “middle boxes” anyway that are doing packet inspection and getting in the way of true end-to-end communications that we should utilize them and provide more space for NAT to occur instead of implementing IPv6 as an addressing space.

I’ll be the first to admit that chopping the IPv6 address space right in the middle to allow MAC addresses to auto-configure might not have been the best decision. But, in the 90s, when we didn’t have DHCP for IPv6, it was a great idea in theory. And yes, assigning a /48 to a network does waste quite a bit of IP space. However, it does a great job of shrinking the size of the routing table, since that network can be summarized a lot better than having a bunch of /64 host routes floating around. This “waste” echoes the argument for and against using a /64 for a point-to-point link. If you’re worried about wasting several thousand addresses out of an astronomically large pool, there might be other solutions you should look at instead.

Say My Name

One of the points that gets buried in the article that might shed some light on this defense of NAT is Mr. Huston’s championing of Named Data Networking. The concept of NDN is that everything on the Internet should stop being referred to by an address and instead should be tagged with a name. Then, when you want to look for a specific thing, you send a packet with that name and the Internet routes your packet to the thing you want. You then set up a communication between you and the data source. Sounds simple, right?

If you’re following along at home, this also sounds suspiciously like object storage. Instead of a piece of data living on a LUN or other SAN construct, we make every piece of data an object of a larger system and index them for easy retrieval. This idea works wonders for cloud providers, where object storage provides an overlay that hides the underlying infrastructure.

NDN is a great idea in theory. According to the Wikipedia article, address space is unbounded because you just keep coming up with new names for things. And since you’re using a name and not an address, you don’t have to NAT anything. That last point kind of blows up Mr. Huston’s defense of NAT in favor of NDN, right?

One question I have makes me go back to the object storage model and how it relates to NDN. In an object store, every piece of data has an Object ID, usually a fixed-length identifier like a UUID. We do this because, as it turns out, computers are horrible at finding names for things. We need to convert those names into numbers because computers still only understand zeros and ones at their most basic level. So, if we’re going to convert those names to some kind of numeric form anyway, why should we completely get rid of addresses? I mean, if we can find a huge address space that allows us to enumerate resources like an object store, we could duplicate a lot of NDN today, right? And, for the sake of argument, what if that huge address space was already based on hexadecimal?
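As a purely illustrative sketch of that thought, here are a few lines of Python that hash a content name into a 128-bit identifier, the same width as an IPv6 address. This is my own toy mapping, not how NDN or any real resolver works, but it shows that once names become numbers, an existing hexadecimal address space is already big enough to enumerate named objects.

    import hashlib
    import ipaddress

    def name_to_id(name):
        # Take the first 128 bits of a SHA-256 digest and format them as IPv6.
        digest = hashlib.sha256(name.encode("utf-8")).digest()[:16]
        return ipaddress.IPv6Address(digest)

    print(name_to_id("/movies/cat-video/chunk/42"))
    print(name_to_id("/sensors/tulsa/temperature/latest"))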

Hello, Is It Me URLooking For?

To put this in a slightly different perspective, let’s look at the situation with phone numbers. In the US, we’ve had an explosion of mobile phones and other devices that have forced us to extend the number of area codes that we use to refer to groups of phone numbers. These area codes are usually geographically specific. We add more area codes to contain numbers that are being added. Sometimes these are specific to one city, like 212 is for New York. Other times they can cover a whole state or a portion of a state, like 580 does for Oklahoma.

It would be a whole lot easier for us to just refer to people by name instead of adding new numbers, right? I mean, we already do that in our mobile phones. We have a contact that has a phone number and an email address. If we want to contact John Smith, we look up the John Smith we want and choose our contact preference. We can call, email, or send a message through text or other communications method.

What address we use depends on our communication method. Calls use a phone number. If you’re on an iPhone like me, you can text via phone or AppleID (email address). You can also set up a video call the same way. Each of these methods of contact uses a different address for the name.

With Named Data Networking, are we going to have different addresses for each resource? If we’re doing away with addresses, how are we going to name things? Is there a name registry? Are we going to be allowed to name things whatever we want? Think about all the names of videos on YouTube if you want an idea of the nightmare that might be. And if you add some kind of rigid structure to the mix, you’re going to have to maintain a database of names somewhere. As we’ve found with DNS, having a repository of information in a central place makes an awfully tempting target. Not to mention causing issues if it ever goes offline for some reason.


Tom’s Take

I don’t think there’s anything that could be said to defend NAT in my eyes. It’s the duct tape temporary solution that never seems to go away completely. Even with depletion and IPv6 adoption, NAT is still getting people riled up and ready to say that it’s the best option in a world of imperfect solutions. However, I think that IPv6 is the best way going forward, with more room to grow and the opportunity to create unique IDs for objects in your network. Even if we end up going down the road of Named Data Networking, I don’t think NAT is the solution you want to go with in the long run. Drive a sword through the heart of NAT and let it die.

VMware and VeloCloud: A Hedge Against Hyperconvergence?

VMware announced on Thursday that they are buying VeloCloud. This was a big move in the market that immediately set off a huge discussion about the implications. I had originally thought AT&T would buy VeloCloud based on their relationship in the past, but the acquisition of Vyatta from Brocade over the summer should have been a hint that wasn’t going to happen. Instead, VMware swooped in and picked up the company for an undisclosed amount.

The conversations have been going wild so far. Everyone wants to know how this is going to affect the relationship with Cisco, especially given that Cisco put money into VeloCloud in both 2016 and 2017. Given the acquisition of Viptela by Cisco earlier this year it’s easy to see that these two companies might find themselves competing for marketshare in the SD-WAN space. However, I think that this is actually a different play from VMware. One that’s striking back at hyperconverged vendors.

Adding The Value

If you look at the marketing coming out of hyperconvergence vendors right now, you’ll see there’s a lot of discussion around platform. Fast storage, small footprints, and the ability to deploy anywhere. Hyperconverged solutions are also starting to focus on the hot new trends in compute, like containers. Along the way this means that traditional workloads that run on VMware ESX hypervisors aren’t getting the spotlight they once did.

In fact, the leading hyperconvergence vendor Nutanix has been aggressively selling their own hypervisor, Acropolis, as a competitor to VMware. They tout new features and easy configuration as the major reasons to use Acropolis over ESX. The push by Nutanix is to get their customers off of ESX and on to Acropolis to get a share of the VMware budget that companies are currently paying.

For VMware, it’s a tough sell to keep their customers on ESX. There’s a very big ecosystem of software out there that runs on ESX, but if you can replicate a large portion of it natively like Acropolis and other hypervisors do there’s not much of a reason to stick with ESX. And if the VMware solution is more expensive over time you will find yourself choosing the cheaper alternative when the negotiations come up for renewal.

For VMware NSX, it’s an even harder road. Most of the organizations that I’ve seen deploying hyperconverged solutions are not huge enterprises with massive centralized data centers. Instead, they are the kind of small-to-medium businesses that need some functions but are very budget conscious. They’re also very geographically diverse, with smaller branch offices taking the place of a few massive headquarters locations. While NSX has some advantages for these companies, it’s not the best fit for them. NSX works optimally in a data center with high-speed links and a well-built underlay network.

vWAN with VeloCloud

So how is VeloCloud going to play into this? VeloCloud already has a lot of advantages that make them a great complement to VMware’s model. They have built-in multi-tenancy. Their service delivery is virtualized. They were already looking to move toward service providers as their primary market, both network service providers and managed service providers. This sounds like their interests were aligning quite well with VMware already.

The key advantage for VMware with VeloCloud is how it will allow NSX to extend into the branch. Remember how I said that NSX loves an environment with a stable underlay? That’s what VeloCloud can deliver. A stable, encrypted VPN underlay. An underlay that can be managed from one central location, or in the future, perhaps even a vCenter plugin. That gives VeloCloud a huge advantage to build the underlay to get connectivity between branches.

Now, with an underlay built out, NSX can be pushed down into the branch. Branches can now use all the great features of NSX like analytics, some of which will be bolstered by VeloCloud, as well as microsegmentation and other heretofore unseen features in the branch. The large headquarters data center is now available in a smaller remote size for branches. That’s a huge advantage for organizations that need those features in places that don’t have data centers.

And the pitch against using other hypervisors with your hyperconverged solution? NSX works best with ESX. The real value in keeping ESX in your remote branches isn’t cost savings or features that you may one day hope to use if your WAN connection gets upgraded to ludicrous speed. Instead, VeloCloud can be deployed between your HQ or main office and your remote site to bring those NSX functions down into your environment over a secure tunnel.

While this does compete a bit with Cisco from a delivery standpoint, it still doesn’t affect them with complete overlap. In this scenario, VeloCloud is a service delivery platform for NSX and not a piece of hardware at the edge. Absent VeloCloud, this kind of setup could still be replicated with a Cisco Viptela box running the underlay and NSX riding on top in the overlay. But I think that the market that VMware is going after is going to be building this from the ground up with VMware solutions from the start.


Tom’s Take

Not every issue is “Us vs. Them”. I get that VMware and Cisco seem to be spending more time moving closer together on the networking side of things. SD-WAN is a technology that was inevitably going to bring Cisco into conflict with someone. The third generation of SD-WAN players is really made up of companies that didn’t have a proper offering buying up all the first generation startups. Viptela and VeloCloud are now off the market and they’ll soon be integral parts of their respective parents’ strategies going forward. Whether VeloCloud is focused on enabling cloud connectivity for VMware or retaking the branch from the hyperconverged vendors is going to play out in the next few months. But instead of focusing on conflict with anyone else, VeloCloud should be judged by the value it brings to VMware in the near term.

Devaluing Data Exposures

I had a great time this week recording the first episode of a new series with my co-worker Rich Stroffolino. The Gestalt IT Rundown is hopefully the start of some fun news stories with a hint of snark and humor thrown in.

One of the things I discussed in this episode was my belief that no data is truly secure any more. Thanks to recent attacks like WannaCry and Bad Rabbit and the rise of other state-sponsored hacking and malware attacks, I’m totally behind the idea that soon everyone will know everything about me and there’s nothing that anyone can do about it.

Just Pick Up The Phone

Personal data is important. Some pieces of personal data are sacrificed for the greater good. Anyone who is in IT or works in an area where they deal with spam emails and robocalls has probably paused for a moment before putting contact information down on a form. I have an old Hotmail address I use to catch spam if I’m relatively certain that something looks shady. I give out my home phone number freely because I never answer it. These pieces of personal data have been sacrificed in order to provide me a modicum of privacy.

But what about other things that we guard jealously? How about our mobile phone numbers? When I worked for a VAR, that was the single most secretive piece of information I owned. No one, aside from my coworkers, had my mobile number. In part, it’s because I wanted to make sure that it got used properly. But also because I knew that as soon as one person at the customer site had it, soon everyone would. I would be spending my time answering phone calls instead of working on tickets.

That’s the world we live in today. So many pieces of information about us are being stored. Our Social Security Number, which has truthfully been misappropriated as an identification number. US Driver’s Licenses, which are also used as identification. Passport numbers, credit ratings, mother’s maiden name (which is very handy for opening accounts in your name). The list could be a blog post in and of itself. But why is all of this data being stored?

Data Is The New Oil

The first time I heard someone in a keynote use the phrase “big data is the new oil”, I almost puked. Not because it’s a platitude that underscores the value of data. I lost it because I know what people do with vital resources like oil, gold, and diamonds. They hoard them. Stockpiling the resources until they can be refined. Until every ounce of value can be extracted. Then the shell is discarded until it becomes a hazard.

Don’t believe me? I live in a state that is legally required to run radio and television advertisements telling children not to play around old oilfield equipment that hasn’t been operational in decades. It’s cheaper for them to buy commercials than it is to clean up their mess. And that precious resource? It’s old news. Companies that extract resources just move on to the next easy source instead of cleaning up their leftovers.

Why does that matter to you? Think about all the pieces of data that are stored somewhere that could possibly leak out about you. Phone numbers, date of birth, names of children or spouses. And those are the easy ones. Imagine how many places your SSN is currently stored. Now, imagine half of those companies go out of business in the next three years. What happens to your data then? You can better believe that it’s not going to get destroyed or encrypted in such a way as to prevent exposure. It’s going to lie fallow on some forgotten server until someone finds it and plunders it. Your only real hope is that it was being stored on a cloud provider that destroys the storage buckets after the bill isn’t paid for six months.

Devaluing Data

How do we fix all this? Can this be fixed? Well, it might be able to be done, but it’s not going to be fun, cheap, or easy. It all starts by making discrete data less valuable. An SSN is worthless without a name attached to it, for instance. If all I have are 9 random numbers with no context I can’t tell what they’re supposed to be. The value only comes when those 9 numbers can be matched to a name.

We’ve got to stop using the SSN as a unique identifier for a person. It was never designed for that purpose. In fact, storing SSNs at all is a really bad idea. Users should be assigned a new, random ID number when creating an account or filling out a form. The SSN shouldn’t be stored unless absolutely necessary. And when it is, it should be treated like a nuclear launch code. It should take special authority to query it, and the database that holds it shouldn’t be directly attached to anything else.
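A minimal sketch of that idea, assuming a simple key-value store standing in for a hardened vault: the application only ever sees a random surrogate ID, and the SSN lives in a separate table keyed by that ID. Names and storage here are illustrative; a real vault would sit on isolated infrastructure with its own access controls and auditing.

    import secrets

    vault = {}       # surrogate_id -> SSN; imagine this on an isolated, audited system
    directory = {}   # surrogate_id -> everything the rest of the application may see

    def enroll(name, ssn):
        surrogate_id = secrets.token_hex(16)   # random, carries no meaning on its own
        vault[surrogate_id] = ssn
        directory[surrogate_id] = {"name": name}
        return surrogate_id

    uid = enroll("Jane Doe", "078-05-1120")
    print(directory[uid])    # {'name': 'Jane Doe'} -- no SSN anywhere near it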

Critical data should be stored in a vault that can only be accessed in certain ways and never exposed. A prime example is the trusted enclave in an iPhone. This enclave, when used for TouchID or FaceID, stores your fingerprints and your face map. Pretty important stuff, yes? However, even as biometric ID systems become more prevalent, there isn’t any way to extract that data from the enclave. It’s stored in such a way that it can only be queried in a specific manner, with a yes/no result returned from the query. If you stole my iPhone tomorrow, there’s no way for you to reconstruct my fingerprints from it. That’s the template we need to use going forward to protect our data.
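The same query-only pattern can be sketched in a few lines of Python: store a salted, slow hash of the secret and answer only yes or no, so nothing kept on disk can be turned back into the original. This mimics the idea, not Apple's actual enclave implementation.

    import hashlib, hmac, os

    class MatchOnlyStore:
        def __init__(self, secret):
            self._salt = os.urandom(16)
            self._digest = hashlib.pbkdf2_hmac("sha256", secret, self._salt, 200_000)

        def matches(self, candidate):
            trial = hashlib.pbkdf2_hmac("sha256", candidate, self._salt, 200_000)
            return hmac.compare_digest(trial, self._digest)

    store = MatchOnlyStore(b"fingerprint-template-bytes")
    print(store.matches(b"fingerprint-template-bytes"))   # True
    print(store.matches(b"someone else's finger"))        # False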


Tom’s Take

I’m getting tired of being told that my data is being spread to the four winds thanks to it lying around waiting to be used for both legitimate and nefarious purposes. We can’t build fences high enough around critical data to keep it from being broken into. We can’t keep people out, so we need to start making the data less valuable. Instead of keeping it all together where it can be reconstructed into something of immense value, we need to make it hard to get all the pieces together at any one time. That means it’s going to be tough for us to build systems that put it all together too. But wouldn’t you rather spend your time solving a fun problem like that rather than making phone calls telling people your SSN got exposed on the open market?

Scotty Isn’t DevOps

I was listening to the most recent episode of our Gestalt IT On-Premise IT Roundtable where Stephen Foskett mentioned one of our first episodes where we discussed whether or not DevOps was a disaster, or as I put it, a “dumpster fire”. Take a listen here:

Around 13 minutes in, I have an exchange with Nigel Poulton where I mention that the ultimate operations guy is Chief Engineer Montgomery Scott of the USS Enterprise. Nigel countered that Scotty was the epitome of the DevOps mentality because his crazy ideas are what kept the Enterprise going. In this post, I hope to show that not only was Scott not a DevOps person, he should be considered the antithesis of DevOps.

Engineering As Operations

In the fictional biography of Mr. Scott, all he ever wanted to do was be an engineer. He begrudgingly took promotions but found ways to get back to the engine room on the Enterprise. He liked working on starships. He hated building them. His time working on the transwarp drive of the USS Excelsior proved that in the third Star Trek film.

Scotty wasn’t developing new ideas to implement on the Enterprise. He didn’t spend his time figuring out how to make the warp engines run at increased efficiency. He didn’t experiment with the shields or the phasers. Most of his “miraculous” moments didn’t come from deploying new features to the Enterprise. Instead, they were the fruits of his ability to streamline operations to combat unforeseen circumstances.

In The Apple, Scott was forced to figure out a way to get the antimatter system back online after it was drained by an unseen force. Everything he did in the episode was focused on restoring functions to the Enterprise. This wasn’t the result of a failed upgrade or a continuous deployment scenario. The operation of his ship was impacted. In Is There in Truth No Beauty?, Mr. Scott even challenges the designer of the Enterprise’s engines that he can’t handle them as well as Scotty. Mr. Scott was boasting that he was better at operations than a developer. Plain and simple.

In the first Star Trek movie, Admiral Kirk is pushing Scotty to get the Enterprise ready to depart in hours after an eighteen-month refit. Scotty keeps pushing back that they need more time to work out the new systems and go on a shakedown cruise. Does that sound like a person that wants to do CI/CD to a starship? Or does it sound more like the caution of an operations person wanting to make sure patches are deployed in a controlled way? Every time someone in the series or movies suggested doing major upgrades or redesigns to the Enterprise, Scotty always warned against doing it in the field unless absolutely necessary.

Montgomery Scott isn’t the King of DevOps. He’s a poster child for simple operations. Keep the systems running. Deal with problems as they arise. Make changes only if necessary. And don’t monkey with the systems! These are the tried-and-true refrains of a person that knows that his expertise isn’t in building things but in making them run.

Engineering as DevOps

That’s not to say that Star Trek doesn’t have DevOps engineers. The Enterprise-D had two of the best examples of DevOps that I’ve ever seen – Geordi LaForge and Data. These two operations officers spent most of their time trying new things with the Enterprise. And more than a few crises arose because of their development aspirations.

LaForge and Data were constantly experimenting on the Enterprise in an attempt to make it run better. Given that the mission of the Enterprise-D did not have the same five-year limit as the original, they were expected to keep the technology on the Enterprise more current in space. However, their experiments often led to problems. Destabilizing the warp core, causing shield harmonics failures, and even infecting the Enterprise’s computer with viruses were somewhat commonplace during Geordi’s tenure as Chief Engineer.

Commander Data was also rather fond of finding out about new technology that was being developed and trying to integrate it into the Enterprise’s systems. Many times, he mentioned finding out about something being developed at the Daystrom Institute and wanting to see if it would work for them. Which leads me to think that the Daystrom Institute is the Star Trek version of Stack Overflow – copy some things you think will make everything better and hope it doesn’t blow up because you didn’t understand it.

Even if it was a plot convenience device, it felt like the Enterprise was often caught in the middle of applying a patch or an upgrade right when the action started. An exploding star or an enemy vessel always waited until just the right moment to put the Enterprise in harm’s way. Even Starfleet seemed to decide the Enterprise was the only vessel that could help after the DevOps team took the warp core offline to make it run 0.1% faster.

Perhaps instead of pushing forward with an aggressive DevOps mentality for the flagship of the Federation, Geordi and Data would have done better to take lessons from Mr. Scott: wait for appropriate windows to make changes and upgrades, and quit tinkering with their ship so often that it felt like it was being held together by duct tape and hope.


Tom’s Take

Despite being fictional characters, Scotty, Geordi, and Data all represent different aspects of the technology we look at today. Scotty is the tried-and-true operations person. Geordi and Data are leading the charge to keep the technology fresh. Each of them has their strong points, but it’s hard to overlook Scotty as being a bastion of simple operations mentalities. Even when they all met together in Relics, Scotty was thinking more about making things work and less on making them fast or pretty or efficient. I think the push to the DevOps mentality would do well to take a seat and listen to the venerable chief engineer of the original Enterprise.