Increasing Entropy with Crypto4A

Posted on April 26, 2019 by networkingnerd

Have you ever thought about the increasing disorder in your life? Sure, it may seem like things are constantly getting crazier every time you turn around, but did you know that entropy is always increasing in the universe? It’s a Law of Thermodynamics!

The idea that organized systems want to fall into disorder isn’t too strange when you think about it. Maintaining order takes a lot of effort and disorder is pretty easy to accomplish by just giving up. Anyone with a teenager knows that the amount of disorder that can be accomplished in a bedroom is pretty impressive.

One place where we don’t actually see a lot of disorder is in the computing realm. Computers are based on the idea that there is order and rationality in everything that we do. This is so prevalent that finding a way to be random is actually pretty hard. Computer programmers have tried a number of ways to come up with random number generators that take a variety of inputs into the formula and come up with something that looks sufficiently random. For most people just wanting the system to guess a number between 1 and 100 it’s not too bad. But when it comes to really, really large numbers like the ones used in cryptography, those pseudorandom numbers aren’t good enough.

This All Looks So Familiar…

One of the reasons for this comes down to good old fashioned efficiency. In the old days computer programmers could rely on people to generate pseudorandom input. By sampling mouse clicks or the delay between keyboard keystrokes you could easily come up with a number that looks nice and random. However, we’ve taken people out of the loop now. Thanks to the cloud and automation and any one of a number of new ways to reduce human input we’ve managed to remove mouse clicks and keystrokes.

That’s fine for running scripts and programs. It’s even good for building things at a huge scale. But it’s really bad when you need something that looks relatively random. And it’s really, really bad when your program relies on that randomness to keep you secure. Kind of like key generation in Public Key Infrastructure (PKI).

A group of security researchers working for the National Institute of Standards and Technology (NIST) found out a few years ago that public keys were starting to collide at greater rates than random chance. The study, conducted in 2012, found that 5% of HTTPS and 10% of SSH public keys were duplicates. A collision in a hashing algorithm is when two inputs produce the same output, which renders that hashing function broken. In PKI, having two different inputs output the same public key is really bad, because it could lead to key collisions that impact a variety of services.

What caused it? As it turns out, lack of orderly disorder. Because automation and non-human interaction have led to other pseudorandom inputs being used in key generation it appeared to the researchers that the same inputs were being used all over the place. That meant that a lot of the public keys that were being generated were being done in such a way as to make collisions more likely. When you look at how many things are relying on automated sources to generate keys it can be quite scary. Think about a smart lightbulb or other IoT device that’s trying to generate pseudorandom input from a CPU that’s just big enough to turn things on. Now imagine that CPU multiplied by the number of smart lightbulbs out there. Not a pleasant thought, is it?
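To make the collision problem concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' methodology: the "key" is just a hash of generator output, Python's `random` module stands in for a deterministic software RNG (it is not a cryptographic generator), and the device counts and entropy sizes are made up. The point is only that when every device seeds from a tiny pool of possible values, identical keys are unavoidable.

```python
import hashlib
import random
from collections import Counter


def device_key(seed: int) -> str:
    """Derive a stand-in 'key' from a seed, the way a deterministic generator would."""
    rng = random.Random(seed)  # same seed -> same output, every time
    material = rng.getrandbits(256).to_bytes(32, "big")
    return hashlib.sha256(material).hexdigest()


def count_duplicate_keys(num_devices: int, entropy_bits: int, trial_seed: int = 0) -> int:
    """Count devices whose key is shared with at least one other device."""
    # Models the weak boot-time entropy source each device draws its seed from.
    boot_entropy = random.Random(trial_seed)
    keys = [device_key(boot_entropy.getrandbits(entropy_bits)) for _ in range(num_devices)]
    counts = Counter(keys)
    return sum(n for n in counts.values() if n > 1)


if __name__ == "__main__":
    for bits in (8, 16, 64):
        dupes = count_duplicate_keys(num_devices=1000, entropy_bits=bits)
        print(f"{bits:>2} bits of seed entropy: {dupes} of 1000 devices share a key")
```

With only 8 bits of seed entropy there are just 256 possible keys, so most of the 1000 simulated devices have to share one. Widening the seed is what makes the duplicates go away.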

Disorder In The Court

This fascinating discussion came from an interview I had with Bruno Couillard, the President and CTO of Crypto4A. Crypto4A is a company that provides Entropy-as-a-Service. What exactly does that mean?

Crypto4A has an appliance they call QAOS. QAOS is designed to give you the best possible disorder that you can get. It does this the old fashioned way. Instead of trying to use software as a Random Number Generator (RNG) QAOS instead uses hardware sources to generate entropy for their RNG. This includes a quantum RNG, which produces high quality disorder that’s difficult to fake any other way.

QAOS is designed to feed software with entropy to generate randomness sufficient to prevent PKI public key collisions. The software developers can follow the NIST guidelines on EaaS to have the program call an entropy source. QAOS, acting as that entropy source, will seed the RNG on the target system with good randomness and allow it to generate good keys. This could also be configured in the kernel of the OS to call a system like QAOS on boot and start the seed value with a good amount of random entropy in the case of old programs that can’t be modified to call anything other than a system-based RNG source like /dev/random.
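The general idea of seeding a local generator from an outside source can be sketched in a few lines of Python. This is not the QAOS interface or the NIST EaaS protocol, neither of which is described here. `fetch_remote_entropy` is a hypothetical placeholder, and hashing the two inputs together is just one common way to make sure the final seed depends on both.

```python
import hashlib
import os


def fetch_remote_entropy(num_bytes: int) -> bytes:
    """Hypothetical stand-in for a call to an entropy service.

    A real client would make an authenticated network request here. This
    placeholder reads the local OS generator so the sketch runs offline.
    """
    return os.urandom(num_bytes)


def mixed_seed(local: bytes, remote: bytes) -> bytes:
    """Hash both inputs together so the seed depends on every byte of each."""
    h = hashlib.sha256()
    # Length-prefix each input so (b"ab", b"c") and (b"a", b"bc") hash differently.
    for part in (local, remote):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


if __name__ == "__main__":
    seed = mixed_seed(os.urandom(32), fetch_remote_entropy(32))
    print(f"32-byte seed for the local generator: {seed.hex()}")
```

Mixing rather than replacing means a weak local source doesn't hurt the result as long as the remote bytes are good, and the reverse holds too.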

Tom’s Take

The NIST guidelines around EaaS are constantly evolving, but the idea that companies are already racing to fill the void that has been created by insufficient randomness in cryptography is telling. When you think about the number of devices that are going to be using PKI for secure communications, the need for something like Crypto4A QAOS is pretty clear. If we are going to rely on automated systems to run our daily lives, we need to have the resources in place to ensure they have a solid foundation of randomness to build on.

The Confluence of SD-WAN and Microsegmentation

Posted on April 18, 2019 by networkingnerd

If you had to pick two really hot topics in the networking space right now, you’d be hard-pressed to find two more discussed than SD-WAN and microsegmentation. SD-WAN is the former “king of the hill” in network engineering. I can remember having more conversations about SD-WAN in the last couple of years than anything else. But as the SD-WAN market has started to consolidate and iterate, a new challenger has arrived. Microsegmentation is the word of the day.

However, I think that SD-WAN and microsegmentation are quickly heading toward a merger of ideas and solutions. There are a lot of commonalities between the two technologies that make a lot of sense running together.

SD-WAN isn’t just about packet switching and routing any longer. That’s because networking people have quickly learned that packet-by-packet processing of traffic is inefficient. All of our older network analysis devices could only see things one IP packet at a time. But the new wave of devices think in terms of flows. They can analyze a stream of packets to figure out what’s going on. And what generates those flows?

Applications.

The key to the new wave of SD-WAN technology isn’t some kind of magic method of nailing up VPNs between branch offices. It’s not about adding new connectivity types. Instead, it’s about application identification. App identification is how SD-WAN does QoS now. The move to using app markers means a more holistic method of treating application traffic properly.
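The shift from packets to flows is easy to picture in code. The sketch below groups packet records by the usual five-tuple and then attaches an application label. The `Packet` type, the port table, and the port-based labeling rule are all assumptions made for illustration. Real SD-WAN products identify applications with far richer signals than a destination port.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int


# Hypothetical port-to-application table, for illustration only.
APP_BY_PORT = {443: "web", 5060: "voice-signaling", 22: "ssh"}


def group_into_flows(packets):
    """Aggregate per-packet records into per-flow packet and byte counts."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size
    return dict(flows)


def label_flow(flow_key):
    """Guess the application from the destination port (a deliberately naive rule)."""
    return APP_BY_PORT.get(flow_key[3], "unknown")
```

Once traffic is keyed by flow instead of by packet, policy can be written against the label ("treat voice-signaling this way") rather than against individual headers.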

SD-WAN has significant value in application handling. I recently chatted with Kumar Ramachandran of CloudGenix and he echoed that part of the reason why they’ve been seeing growth and recently received a Series C funding round was because of what they’re doing with applications. The battle of MPLS versus broadband has already been fought. The value isn’t going to come from edge boxes unless there is software that can help differentiate the solutions.

Segmenting Your Traffic

So, what does this have to do with microsegmentation? If you’ve been following that market, you already know that the answer is the application. Microsegmentation doesn’t work on a packet-by-packet basis either. It needs to see all the traffic flows from an application to figure out what is needed and what isn’t. Platforms that do this kind of work are big on figuring out which protocols should be talking to which hosts and shutting everything else down to secure that communication.
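The "allow what the application needs, shut everything else down" model boils down to a default-deny allowlist. Here is a minimal Python sketch of that idea. The workload labels, ports, and the `Rule` type are hypothetical and don't reflect any particular vendor's policy format.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    src: str       # source workload label
    dst: str       # destination workload label
    protocol: str
    port: int


# Hypothetical policy, as if learned from observing an application's normal flows.
ALLOWED = {
    Rule("web", "app", "tcp", 8443),
    Rule("app", "db", "tcp", 5432),
}


def is_permitted(src: str, dst: str, protocol: str, port: int, policy=ALLOWED) -> bool:
    """Default deny: a flow passes only if it matches an explicit rule."""
    return Rule(src, dst, protocol, port) in policy
```

In this example the web tier can reach the app tier and the app tier can reach the database, but a web server talking straight to the database is dropped because nothing explicitly allows it.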

Microsegmentation is growing in the cloud world for sure. I’ve seen and talked to people from companies like Guardicore, Illumio, ShieldX, and Edgewise in recent months. Each of them has a slightly different approach to doing microsegmentation. But they all look at the same basic approach from the start. The application is the basic building block of their technology.

With the growth of microsegmentation in the cloud market to help ensure traffic flows between hosts and sites is secured, it’s a no-brainer that the next big SD-WAN platform needs to add this functionality to their solution. I say this because it’s not that big of a leap to take the existing SD-WAN application analytics software that optimizes traffic flows over links and change it to restrict traffic flow with policy support.

For SD-WAN vendors, it’s another hedge against the inexorable march of traffic into the cloud. There are only so many Direct Connect analogs that you can build before Amazon decides to put you out of business. But, if you can integrate the security aspect of application analytics into your platform you can make your solution very sticky. Because that functionality is critical to meeting audit goals and ensuring compliance. And you’re going to wish you had it when the auditors come calling.

Tom’s Take

I don’t think the current generation of SD-WAN providers are quite ready to implement microsegmentation in their platforms. But I really wouldn’t be surprised to see it in the next revision of solutions. I also wonder if that means that some of the companies that have already purchased SD-WAN companies are going to look at that functionality. Perhaps it will be VMware building NSX microsegmentation on top of VeloCloud. Or maybe Cisco will include some of their microsegmentation from ACI in Viptela. They’re going to need to look at that strongly because once companies that are still on their own figure it out they’re going to be the go-to solution for companies looking to provide a good, secure migration path to the cloud. And all those roads lead to an SD-WAN device with microsegmentation capabilities.

802.11ax Is NOT A Wireless Switch

Posted on April 10, 2019 by networkingnerd

802.11ax is fast approaching. Though not 100% ratified by the IEEE, the spec is at the point where most manufacturers and vendors are going to support what’s current as the “final” version for now. While the spec for what marketing people like to call Wi-Fi 6 is not likely to change, that doesn’t mean that the ramp up to get people to buy it is showing any signs of starting off slow. One of the biggest problems I see right now is the decision by some major AP manufacturers to call 802.11ax a “wireless switch”.

Complex Duplex

In case you had any doubts, 802.11ax is NOT a switch.¹ But the answer to why that is takes some explanation. It all starts with the network. More specifically, with Ethernet.

Ethernet is a broadcast medium. Packets are launched into the network and it is hoped that the packet finds the destination. All nodes on the network listen and, if the packet isn’t destined for them they discard it. This is the nature of the broadcast. If multiple stations try to talk at once, the packets collide and no one hears anything. That’s why Ethernet developed a collision detection system called CSMA/CD.

Switches solved this problem by segmenting the collision domain to a single port. Now, the only communications between the stations would be in the event that the switch couldn’t find the proper port to forward a packet. In every other case, the switch finds where the packet is meant to be sent and forwards it to that location. It prevents collisions by ensuring that no two stations can transmit at any one time except to the switch in the middle. This also allows communications to be full-duplex, meaning the stations can send and receive at the same time.

Wireless is a different medium. The AP still speaks Ethernet, and there is a bridge between the Ethernet interface and the radios on the other side. But the radio interfaces work differently than Ethernet. Firstly, they are half-duplex only. That means that they have to send traffic or listen to receive traffic but they can’t do both at the same time. Wireless also uses a different version of collision detection called CSMA/CA, where the last A stands for “avoidance”. Because of the half-duplex nature of wireless, clients have a complex process to make sure the frequency is clear before transmitting. They have to check whether or not other wireless clients are talking and if the ambient RF is within the proper thresholds. After all those checks are confirmed, then the client transmits.
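The "avoidance" part is easier to see with a toy model. The Python sketch below is a heavy simplification of 802.11 contention, not the real state machine: every station redraws a random backoff each round, there is no carrier sensing or frozen countdown, and the window limits of 15 and 1023 slots are commonly cited values used here as assumptions.

```python
import random

CW_MIN = 15     # initial contention window, in slots (assumed typical value)
CW_MAX = 1023   # upper bound on the contention window (assumed typical value)


def contend(windows: dict, rng: random.Random):
    """Run one contention round.

    Each station draws a backoff in [0, its window]. The station with the
    lowest draw transmits. If several tie for lowest, their frames collide and
    each doubles its window (up to CW_MAX) before the next round.
    Returns (winner or None, updated windows).
    """
    draws = {station: rng.randint(0, cw) for station, cw in windows.items()}
    lowest = min(draws.values())
    first = [s for s, d in draws.items() if d == lowest]
    updated = dict(windows)
    if len(first) == 1:
        updated[first[0]] = CW_MIN  # a successful send resets the window
        return first[0], updated
    for station in first:  # collision: widen the window and try again later
        updated[station] = min(2 * (updated[station] + 1) - 1, CW_MAX)
    return None, updated
```

Even in this stripped-down form the key property shows up: nothing physically stops two stations from picking the same slot, so the protocol can only make collisions less likely, not impossible.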

Because of the half-duplex wireless connection and the need for stations to have permission to send, some people have said that wireless is a lot like an Ethernet hub, which is pretty accurate. All stations and APs exist in the contention (collision) domain. Aside from the contention algorithm, there’s nothing to stop the stations from talking all at once. And for the entire life of 802.11 so far, it’s worked. 802.11ac started to introduce more features designed to let APs send frames to multiple stations at the same time. That’s what’s called Multi-User Multiple-Input, Multiple-Output (MU-MIMO). In theory, it could allow for full-duplex transmissions by allowing a client to send on one antenna and receive on another, but utilizing client radios in this way has impacts on other things.

Switching It Up

Enter 802.11ax. The Wi-Fi 6 feature that has most people excited is Orthogonal Frequency-Division Multiple Access (OFDMA). Simply put, OFDMA allows the clients and APs to not use the entire transmission channel for sending data. It can be sliced up into sub-channels that can be used for low-bandwidth applications to reserve time to talk to the AP. Combined with enhanced MU-MIMO support in 802.11ax, the idea is that clients can talk directly to the AP and allocate a specific sub-channel resource unit all to themselves.
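As a rough illustration of the slicing, the Python sketch below picks a resource unit (RU) size for a 20 MHz channel based on how many clients need their own slice. The tone counts and RU limits in the table are the commonly cited figures and should be checked against the standard. The "everyone gets the same size" rule is an assumption made for simplicity, since real AP schedulers can mix RU sizes.

```python
# Resource-unit (RU) options for a 20 MHz channel, as (tones per RU, max RUs).
# Commonly cited figures, treated here as assumptions.
RU_OPTIONS_20MHZ = [(242, 1), (106, 2), (52, 4), (26, 9)]


def pick_ru_size(num_clients: int):
    """Return the widest RU size that still gives every client its own RU."""
    if num_clients < 1:
        raise ValueError("need at least one client")
    for tones, max_rus in RU_OPTIONS_20MHZ:
        if num_clients <= max_rus:
            return tones
    return None  # more clients than RUs: some must wait for a later transmission
```

One client gets the whole channel, while five clients each get a narrow 26-tone slice. Either way the slices are carved out of the same shared channel, which is the point the switch analogy tends to gloss over.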

To the marketing people in the room, this sounds just like a switch. Reserved channels, single station access, right? Except it is still not a switch. The AP is still a bridge between two media types for one thing, but more importantly the transmission medium still hasn’t magically become full-duplex. Stations may get around this with some kind of trickery, but they still need to wait for the all-clear to send data. Remember that all stations and APs still hear all the transmissions. It’s still a broadcast medium at the most basic. No amount of software configuration is going to fix that. And for the networking people in the room that might be saying “so what?”, remember when Cisco tried to sell us on the idea that StackWise was capable of 40Gbps of throughput because it could send in both directions on the StackWise ring at once? Remember when you started screaming “THAT’S NOT HOW BANDWIDTH WORKS!!!” That’s what this is, basically. Smoke and mirrors and ignoring the underlying physical layer constraints.

In fact, if you read the above resources, you’re going to find a lot of caveats at the end about support for protocols coming up and not being in the first version of the spec. That’s exactly what happened with 802.11ac. The promise of “gigabit Wi-Fi” took a couple of years and the MU-MIMO enhancements everyone was trumpeting never fully materialized. Just like all technology, the really good stuff was deferred to the next release.

To make sure that both sides are heard, it is rightly pointed out by wireless professionals like Sam Clements (@Samuel_Clements) that 802.11ax is the most “switch-like” so far, with multiple dynamic collision domains. However, in the immortal words of Tyler Durden, “Sticking feathers up your butt doesn’t make you a chicken.” The switch moniker is still a marketing construct and doesn’t hold any water in reality aside from a comparison to a somewhat similar technology. The operation of wireless APs may be hub-like or switch-like, but these devices are not either of those types of devices.

CPU Bound and Determined

The other issue that I see that prevents this from becoming a switch is the CPU on the AP becoming a point of contention. In a traditional Ethernet switch, the forwarding hardware is a specialized ASIC that is optimized to forward packets super fast. It does this with some trickery, including cut through for packets and trusting the incoming CRC. When packets bounce up to the CPU to be process-switched, it bogs the entire system down terribly. That’s why most networking texts will tell you to avoid process switching at all costs.

Now apply those lessons to wireless. All this protocol enhancement is now causing the CPU to have to do extra duty to work on time-slicing and sub-channel optimization. And remember that those CPUs are operating on 18-28 watts of power right now. Maybe the newer APs will get over 30 watts with new PoE options, but that means those CPUs are still going to be eating a lot of power to process all this extra software work. Even adding dedicated processing power to the AP isn’t going to fix things in the long run. That might be one of the reasons why Cisco has been pushing enhanced PoE in the run-up to their big 802.11ax launch at the end of April.

Tom’s Take

Let me say it again for the cheap seats: 802.11ax is NOT a wireless switch! The physical layer technology that 802.11 is built on won’t be switchable any time soon. 802.11ax has given us a lot of enhancements in the protocol and there is a lot to be excited about, like OFDMA, BSS coloring, and TWT. But, like the decision to over-simplify the marketing name, the idea of calling it a wireless switch just to give people a frame of reference so they buy more of them is just silly. It’s disingenuous and sounds more like a snake oil salesman than honest technology marketing. Rather than trying to trick the users with cute sounding terms, how about we keep the discussion honest and discuss the pros and cons of the technology?

Special thanks to my friends in the wireless space for proofreading this post and correcting my errors in technology:

¹ The title was kind of a spoiler.

OpenConfig and Wi-Fi – The Winning Combo

Posted on April 5, 2019 by networkingnerd

Wireless isn’t easy by any stretch of the imagination. Most people fixate on the spectrum analysis part of the equation when they think about how hard wireless is. But there are many other moving parts in the whole architecture that make it difficult to manage and maintain. Not the least of which is how the devices talk to each other.

This week at Aruba Atmosphere 2019, I had the opportunity to moderate a panel of wireless and security experts for Mobility Field Day Exclusive. It was a fun discussion, as you can see from the above video. As the moderator, I didn’t really get a chance to explain my thoughts on OpenConfig, but I figured now would be a great time to jump in with some color on my side of the conversation.

Yin and YANG

One of the most exciting ideas behind OpenConfig for wireless people should be the common YANG data models. This means that you can use NETCONF to have a common programming language against specific YANG models. That means no more fumbling around to remember esoteric commands. You just tell the system what you want it to do and the rest is easy.

As outlined in the video, this has a huge impact on trying to keep configurations standard across different types of APs. Imagine the nightmare of trying to configure power settings or radio thresholds with 3 or more AP manufacturers in your building. Now, imagine being able to do it across your building or dozens of others with a few simple commands and some programming know-how? Doesn’t seem quite as daunting now, does it? It’s easy because you’re speaking the same language across all those APs.
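Here is a small Python sketch of what "speaking the same language" buys you. It only builds the payloads and doesn't open a NETCONF session. The key names are illustrative placeholders, not fields from a published OpenConfig model, and the hostnames are made up.

```python
import json

# Illustrative settings only: these key names are NOT taken from a published
# OpenConfig model. They stand in for whatever the shared YANG model defines.
DESIRED_RADIO = {"radio-id": 0, "channel": 36, "transmit-power-dbm": 14}


def build_payloads(ap_hostnames, radio_settings):
    """Produce one identical, model-shaped payload per AP, regardless of vendor."""
    body = json.dumps({"radio": radio_settings}, sort_keys=True)
    return {host: body for host in ap_hostnames}


if __name__ == "__main__":
    fleet = ["ap-lobby.example.net", "ap-floor2.example.net", "ap-warehouse.example.net"]
    for host, payload in build_payloads(fleet, DESIRED_RADIO).items():
        print(host, payload)
```

The loop never branches on manufacturer. With a common data model, the per-vendor translation that used to live in your scripts becomes the device's problem.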

So, what if you don’t care, like Richard McIntosh (@802TopHat) points out? What if your vendor doesn’t support OpenConfig? Well, that’s fine. Not everyone has to support it today. But if you work on building a model system and setting up the automation and API interfaces, are you just going to throw it out the window during your refresh cycle because the new APs that you’re buying don’t support OpenConfig? Or is the need for OpenConfig going to be a huge driver for you and part of the selection process?

Companies are motivated by their customers. If you tell them that they need to develop OpenConfig for their devices, they will do it because they run the risk of losing sales. And if the industry moves toward adopting a standard northbound API, what happens to those that get left out in the cold after a few missed refresh cycles? I bet they’ll quickly realize the lost opportunities more than cover the development costs of supporting OpenConfig.

Telemetry Short-Cuts

The other big piece of OpenConfig and wireless is telemetry. SNMP-based monitoring doesn’t work well in today’s wired networks and it’s downright broken in wireless. There are too many variables out there in the average deployment to be able to account for them with anything other than telemetry. Many vendors are starting to adopt the idea of streaming the data directly to collectors via a subscription model. OpenConfig makes this easy with the ability to subscribe to the data you want using OpenConfig models.
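The difference between polling and subscribing can be shown with a tiny in-memory model. This Python sketch is not a real streaming telemetry client: `TelemetryBus` and the path strings are hypothetical. It only illustrates that the collector states which paths it cares about once, and then the device pushes matching updates as they happen.

```python
from collections import defaultdict


class TelemetryBus:
    """Tiny in-memory stand-in for a streaming telemetry collector."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # path prefix -> callbacks

    def subscribe(self, path_prefix: str, callback) -> None:
        """Register interest in every path that starts with path_prefix."""
        self._subscribers[path_prefix].append(callback)

    def publish(self, path: str, value) -> int:
        """Deliver an update to each matching subscriber; return how many got it."""
        delivered = 0
        for prefix, callbacks in self._subscribers.items():
            if path.startswith(prefix):
                for cb in callbacks:
                    cb(path, value)
                    delivered += 1
        return delivered


if __name__ == "__main__":
    bus = TelemetryBus()
    bus.subscribe("/ap/radio0", lambda p, v: print(f"update {p} = {v}"))
    bus.publish("/ap/radio0/channel-utilization", 42)  # delivered
    bus.publish("/ap/radio1/channel-utilization", 7)   # nobody subscribed
```

Nothing is fetched on a timer here. Data that no one subscribed to never leaves the device, and data that someone did subscribe to arrives without waiting for the next poll.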

From a manufacturer perspective, this is a huge chance to get into telemetry and offer it as a differentiator. If you’re not tied to using an archaic platform with proprietary data models you can embrace OpenConfig and deliver a modern telemetry infrastructure that users will want to adopt. And if the radio performance is the same between all of the offerings, telemetry could be the piece that tips the scales in your favor.

Single-Vendor Isn’t So Single

I remember doing a deployment for a wireless system once that was “state of the art” when we put it in. I had my challenges and made everything work well and the customer was happy. Until a month later when the supporting vendor announced they were buying a competing company and using that company as their offering going forward. My customer was upset, naturally, but so was I. I spent a lot of time working out how to build and deploy that system and now it was on the scrap heap.

It’s even worse when you keep buying from single vendors and suddenly find that the new products don’t quite conform to the same syntax or capabilities. Maybe the new model of router or AP has a new board that is 95% compatible with the old one, except for that one command you use all the time.

OpenConfig can change that. Because the capabilities of the device have to be outlined you can easily figure out if there are any missing parts and pieces. You can also be sure that your provisioning scripts and programs aren’t going to break or cause problems because a critical command was deprecated. And since you can keep the models around for old hardware as well as new you aren’t going to be duplicating efforts everywhere trying to keep things moving between the platforms.
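Checking for those missing parts and pieces can be as simple as a set comparison. The Python sketch below assumes each device can report a list of the models it supports. The model names and device names are hypothetical and used only to illustrate the comparison.

```python
# Hypothetical model names, used only to illustrate the comparison.
REQUIRED_MODELS = {"radio-config", "ssid-config", "telemetry-streams"}


def missing_capabilities(required: set, advertised: set) -> set:
    """Return the models a script depends on that the device does not advertise."""
    return required - advertised


def audit_fleet(fleet: dict, required: set = REQUIRED_MODELS) -> dict:
    """Map each device that lacks something to the set of models it is missing."""
    gaps = {}
    for name, advertised in fleet.items():
        missing = missing_capabilities(required, set(advertised))
        if missing:
            gaps[name] = missing
    return gaps


if __name__ == "__main__":
    fleet = {
        "old-ap": ["radio-config", "ssid-config", "telemetry-streams"],
        "new-ap": ["radio-config", "ssid-config"],
    }
    print(audit_fleet(fleet))
```

Running a check like this before a refresh tells you which provisioning scripts are about to break, instead of finding out when a command fails in production.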

Tom’s Take

OpenConfig is a great idea for any system that has distributed components. Even if it never takes off in Wi-Fi, it will force the manufacturers to do a bit better job of documenting their capabilities and making them easy to consume via API. And anything that exposes more functionality to be consumed by automation and programmability is a good thing indeed.

Fast Friday – Aruba Atmosphere 2019

Posted on March 29, 2019 by networkingnerd

A couple of quick thoughts that I’m having ahead of Aruba Atmosphere next week in Las Vegas, NV. Tech Field Day has a lot going on and you don’t want to miss a minute of the action for sure, especially on Wednesday at 3:15pm PST. In the meantime:

IoT is really starting to move down-market. Rather than being focused on enabling large machines with front-end devices to act as gateways we’re starting to see more and more IoT devices either come with integrated connective technology or interface with systems that do. Building control systems aren’t just for large corporations any more. You can automate an office on the cheap today. Just remember that any device that can talk can also listen. Security posture is going to be huge.
I remember some of the discussions that we had during the heady early days of SDN and how unimpressed wireless and mobility people were when they figured out how the controllers and dumb edge devices really worked. Most wireless pros have been there and done that already. However, recently there has been a lot of movement in the OpenConfig community around wireless devices. And that really has the wireless folks excited. Because the promise of SDN for them has never been about control, but instead about compatibility. The real key isn’t building another controller but instead making all the APs and controllers work better together.
Another great thing I’m looking forward to seeing at Atmosphere is Aruba HER. It’s an event focused on building stronger communities and increasing diversity for all. You can read a bit more about what will be going on there in this post from Claire Chaplais. Make sure to check out Zoë Rose keynoting the event as well! She’s got a very powerful story to tell. She gave us all a bit of it at Security Field Day last December in this Ignite Talk.

Tom’s Take

Make sure you stay tuned for all the things we’re going to be discussing during the event. We’re going to be using the event hashtag #ATM19 but also using #MFDx as a way to let you know about the great stuff we will have going on!

The Blogging Mirror

Posted on March 21, 2019 by networkingnerd

Writing isn’t always the easiest thing in the world to do. Coming up with topics is hard, but so too is making those topics into a blog post. I find myself getting briefings on a variety of subjects all the time, especially when it comes to networking. But translating those briefings into blog posts isn’t always straightforward. When I find myself stuck and ready to throw in the towel I find it easy to think about things backwards.

A World Of Pure Imagination

When people plan blog posts, they often think about things in a top-down manner. They come up with a catchy title, then an amusing anecdote to open the post. Then they hit the main idea, find a couple of supporting arguments, and then finally they write a conclusion that ties it all together. Sound like a winning formula?

Except when it isn’t. How about when the title doesn’t reflect the content of the post? Or the anecdote or lead in doesn’t quite fit with the overall tone? How about when the blog starts meandering away from the main idea halfway through with a totally separate argument? Or when the conclusion is actually the place where the lede is buried like the Ark of the Covenant?

All of these things are artifacts of the creative process. We often brainstorm great ideas halfway through the process and it derails our train of thought. That leads us down tangents we never intended to go down and create posts that aren’t thematic or even readable in some cases.

It happens all the time. In fact, even in writing this post I thought of a catchy title for a subject heading and had to move it when I was done because the heading didn’t fit the content of the section that followed. It’s okay to have the freedom to change that as soon as you see it. Provided you have a plan for the rest of the post. And that’s where the key here comes into play.

Strike That, Reverse It

I find the easiest way to plan a blog post is to actually write it in reverse. Instead of thinking about things from a top-down method, I start off by thinking about things bottom up. Literally.

Start From The End – It’s easiest to write the conclusion of your post first. After all, you’re just restating what you’ve been arguing or demonstrating in the post, right? So start with that. Use it as the main idea of your writing. Always refer back to it. If what you’ve typed doesn’t fit the tone of the conclusion, you either need to support it or cut it.
Support Your Conclusion – Now that you know what you’re going to be talking about, figure out how to support it. That means figuring out how to break your argument into paragraphs and logical sections. Note that even though you’re trying to optimize for reading on screens today, you still need to follow basic structure. Paragraphs have multiple sentences that support the main idea. Once you have two or three of those arguments, you’ve got support for your conclusion.
State The Topic – After you build your support for your conclusion then you can write the topic. After all, you just spent a lot of time spelling it all out. This paragraph at the top is where you state the purpose or theme of the post. Don’t worry about getting into too much detail here. That’s what the support is for. Your readers will get the idea by the time they get to the conclusion, which serves to wrap it all together.
Build Your Anecdote – If you are the type of writer that likes to open with an anecdote, much like a cold open in a drama, this is where you write it. Now that you’ve basically outlined the whole post you can tie your anecdote into the rest of the narrative. You don’t have to worry about building your discussion to support the really cool story. Because you’re adding the story at the end of the creative process you can guarantee that it’s going to fit.
Title Card – Now that you’ve written the post you can title it. This keeps you from making a title that doesn’t fit the narrative. It also allows the title to make a bit more sense in context. Either because you called the post something cute and catchy or because you made the most SEO optimized title in history to reap those sweet, sweet Google searches.

Tom’s Take

As you can see, posts are easier to write in reverse. When you think about things the opposite way from the restrictive methods of writing you’re much more free to express your creativity while also keeping yourself on track to make sure everything makes sense. Some people thrive in the realm of structure and can easily crank out a post from the top down. But when you find yourself stuck because you can’t tie everything together the right way try looking in a blogging mirror. The results will end up the same, but backwards might just be the way forward.

QoS Is Dead. Long Live QoS!

Posted on March 14, 2019 by networkingnerd

Ah, good old Quality of Service. How often have we spent our time as networking professionals trying to discern the archaic texts of Szigeti to learn how to make you work? QoS is something that seemed so necessary to our networks years ago that we would spend hours upon hours trying to learn the best way to implement it for voice or bulk data traffic or some other reason. That was, until a funny thing happened. Until QoS was useless to us.

Rest In Peace and Queues

QoS didn’t die overnight. It didn’t wake up one morning without a home to go to. Instead, we slowly devalued and destroyed it over a period of years. We did it by focusing on the things that QoS was made for and then marginalizing them. Remember voice traffic?

We spent years installing voice over IP (VoIP) systems in our networks. And each of those systems needed QoS to function. We took our expertise in the arcane arts of queuing and applied it to the most finicky protocols we could find. And it worked. Our mystic knowledge made voice better! Our calls wouldn’t drop. Our packets arrived when they should. And the world was a happy place.

That is, until voice became pointless. When people started using mobile devices more and more instead of their desk phones, QoS wasn’t as important. When the steady generation of delay-sensitive packets moved back to LTE instead of IP it wasn’t as critical to ensure that FTP and other protocols in the LAN didn’t interfere with it. Even when people started using QoS on their mobile devices the marking was totally inconsistent. George Stefanick (@WirelesssGuru) found that Wi-Fi calling was doing some weird packet marking anyway:

@jsnyder81 @revolutionwifi @networkingnerd pic.twitter.com/zdsq2XMALu

— Orthos (@wirelesssguru) October 9, 2015

So, without a huge packet generation issue, QoS was relegated to some weird traffic shaping roles. Maybe it was video prioritization in places where people cared about video? Or perhaps it was creating a scavenger class for traffic in order to get rid of unwanted applications like BitTorrent. But overall QoS languished as an oddity as more and more enterprises saw their collaboration traffic moving to be dominated by mobile devices that didn’t need the old dark magic of QoS.

QoupS de Gras

The real end of QoS came about thanks to the cloud. While we spent all of our time trying to find ways to optimize applications running on our local enterprise networks, developers were busy optimizing applications to run somewhere else. The ideas were sound enough in principle. By moving applications to the cloud we could continually improve them and push features faster. By having all the bits off the local network we could scale massively. We could even collaborate together in real time from anywhere in the world!

But applications that live in the cloud live outside our control. QoS was always bounded by the borders of our own networks. Once a packet was launched into the great beyond of the Internet we couldn’t control what happened to it. ISPs weren’t bound to honor our packet markings without an SLA. In fact, in most cases the ISP would remark all our packets anyway just to ensure they didn’t mess with the ISP’s ideas of traffic shaping. And even those were rudimentary at best given how well QoS plays with MPLS in the real world.
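For anyone who hasn’t touched marking in a while, here is a minimal sketch of what a “packet marking” actually is: a six-bit DSCP value tucked into the upper bits of the old IP ToS byte. The helper names below are made up for illustration, and whether the marking survives past your own edge is exactly the problem described above. The operating system, your first-hop switch, or the ISP may rewrite it.

```python
import socket

# Well-known DSCP code points (decimal values).
DSCP_EF = 46   # Expedited Forwarding, the classic voice marking
DSCP_CS1 = 8   # Class Selector 1, historically used for scavenger traffic
DSCP_BE = 0    # Best effort / default


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper six bits of the ToS byte.

    The lower two bits belong to ECN and are left at zero here.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in six bits (0-63)")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Hypothetical helper: a UDP socket that asks the OS to mark its packets.

    IP_TOS is honored on Linux and most Unix-like systems; other platforms
    may silently ignore it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    # EF (46) shifted left two bits is 184, or 0xb8 on the wire.
    print(hex(dscp_to_tos(DSCP_EF)))
```

The marking is only a request. Nothing in the code can force a device in someone else’s network to honor it, which is why QoS stops at the border.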

But cloud-based applications don’t worry about quality of service. They scale as large as you want. And nothing short of a massive cloud outage will make them unavailable. Sure, there may be some slowness here and there but that’s nothing more than you’d expect to receive running a heavy application over your local LAN. The real genius of the cloud shift is that it forced developers to slim down applications and make them more responsive in places where they could be made to be more interactive. Now, applications felt snappier when they ran in remote locations. And if you’ve ever tried to use old versions of Outlook across slow links you know how critical that responsiveness can be.

The End is The Beginning

So, with cloud-based applications here to stay and collaboration all about mobile apps now, we can finally carve the tombstone for QoS, right? Well, not quite.

As it turns out, we are still using lots and lots of QoS today in SD-WAN networks. We’re just not calling it that. Instead, we’ve upgraded the term to something more snappy, like “Application Visibility”. Under the hood, it’s not much different than the QoS that we’ve done for years. We’re still picking out the applications and figuring out how to optimize their traffic patterns to make them more responsive.

The key with the new wave of SD-WAN is that we’re marrying QoS to conditional routing. Now, instead of being at the mercy of the ISP link to the Internet we can do something else. We can push bulk traffic across slow cheap links and ensure that our critical business applications have all the space they want on the fast expensive ones instead. We can push our out-of-band traffic out of an attached 4G/LTE modem. We can even push our traffic across the Internet to a gateway closer to the SaaS provider with better performance. That last bit is an especially delicious piece of irony, since it basically serves the same purpose as Tail-end Hop Off did back in the voice days.
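To make “marrying QoS to conditional routing” a little more concrete, here is a rough sketch of the decision an SD-WAN edge makes for each application class. This is not any vendor’s algorithm. The link names, thresholds, and policy table are all invented for illustration, and real products measure latency and loss continuously instead of reading them from a static list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    name: str
    latency_ms: float
    loss_pct: float
    cost_per_gb: float
    up: bool = True


@dataclass(frozen=True)
class AppPolicy:
    max_latency_ms: float
    max_loss_pct: float
    prefer_cheap: bool


# Hypothetical policy table: the numbers are placeholders, not recommendations.
POLICIES = {
    "voice": AppPolicy(max_latency_ms=150, max_loss_pct=1.0, prefer_cheap=False),
    "saas": AppPolicy(max_latency_ms=250, max_loss_pct=2.0, prefer_cheap=False),
    "bulk": AppPolicy(max_latency_ms=float("inf"), max_loss_pct=5.0, prefer_cheap=True),
}


def pick_link(app_class: str, links: List[Link]) -> Link:
    """Choose an egress link for an application class.

    Links that are down or that miss the class's latency/loss limits are
    filtered out. If nothing qualifies, any link that is up will do.
    """
    policy = POLICIES[app_class]
    candidates = [
        link for link in links
        if link.up
        and link.latency_ms <= policy.max_latency_ms
        and link.loss_pct <= policy.max_loss_pct
    ]
    if not candidates:
        candidates = [link for link in links if link.up]
    if not candidates:
        raise RuntimeError("no usable links")
    if policy.prefer_cheap:
        # Bulk traffic goes out the cheapest acceptable link.
        return min(candidates, key=lambda l: (l.cost_per_gb, l.latency_ms))
    # Critical traffic goes out the best-performing acceptable link.
    return min(candidates, key=lambda l: (l.latency_ms, l.loss_pct))


if __name__ == "__main__":
    links = [
        Link("mpls", latency_ms=20, loss_pct=0.1, cost_per_gb=10.0),
        Link("broadband", latency_ms=45, loss_pct=0.5, cost_per_gb=1.0),
        Link("lte", latency_ms=80, loss_pct=1.5, cost_per_gb=8.0),
    ]
    for app in ("voice", "saas", "bulk"):
        print(app, "->", pick_link(app, links).name)
```

The classification step still looks a lot like old-school QoS. What’s new is that the outcome is a choice of path and not just a queue on a single link.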

And how does all this magical new QoS work on the Internet outside our control? That’s the real magic. It’s all tunnels! Yes, in order to make sure that we get our traffic where it needs to be in SD-WAN we simply prioritize it going out of the router and wrap it all in a tunnel to the next device. Everything moves along the Internet and the hop-by-hop treatment really doesn’t care in the long run. We’re instead optimizing transit through our network based on other factors besides DSCP markings. Sure, when the traffic arrives on the other side it can be optimized based on those values. However, in the real world the only thing that most users really care about is how fast they can get their application to perform on their local machine. And if SD-WAN can point them to the fastest SaaS gateway, they’ll be happy people.

Tom’s Take

QoS suffered the same fate as Ska music and NCIS. It never really went away even when people stopped caring about it as much as they did when it was the hot new thing on the block. Instead, the need for QoS disappeared when our traffic usage moved away from the usage it was designed to augment. Sure, SD-WAN has brought it back in a new form, QoS 2.0 if you will, but the need for what we used to spend hours of time doing with ancient tomes of knowledge is long gone. We should have a quiet service for QoS and acknowledge all that it has done for us. And then get ready to invite it back to the party in the form that it will take in the cloud future of tomorrow.

Silo 2: On-Premise with DevOps

Posted on March 7, 2019 by networkingnerd

I had a great time stirring up the hornet’s nest with the last post about DevOps, so I figured that I’d write another one with some updated ideas and clarifications. And maybe kick the nest a little harder this time.

Grounding the Rules

First, we need to start out with a couple of clarifications. I stated that the mantra of DevOps was “Move Fast, Break Things.” As has been rightly pointed out, this was a quote from Mark Zuckerberg about Facebook. However, as has been pointed out by quite a few people, “The use of basic principles to enable business requirements to get to production deployments with appropriate coordination among all business players, including line of business, developers, classic operations, security, networking, storage and other functional groups involved in service delivery” is a bit more of a definition than a motto.

What exactly is DevOps then? Well, as I have been educated, it’s a principle. It’s an idea. A premise, if you will. An ideal to strive for. So, to say that someone is on a DevOps team is wrong. There is no such thing as a classic DevOps team. DevOps is instead something that many other teams do in addition to their other jobs.

That being said, go ask someone what their job is in an organization. I’m willing to bet that a lot of people will tell you they’re on the “DevOps Team”. I know this because someone did a report, which I wrote about here, and it includes responses from the “DevOps” team. Which, according to the classic definition, is wrong. Right?

Well, almost. See, this is where this tweet of mine comes into play:

Modifying a quote for an upcoming blog post:

“In DevOps theory, there’s no difference between theory and practice. In DevOps practice, there is.”

Feedback so far suggests the *pure* DevOps ideas don’t always get implemented in their purest form.

— Tom Hollingsworth (@NetworkingNerd) March 1, 2019

“Pure” DevOps is hard to manage. It involves organizational shifts. It pisses people off because it’s hard to track metrics. You can’t track a person that does some traditional stuff and some of that new Dev-Op stuff. Where does that part of their job end up on a report? Putting someone in a team or a silo is almost as much for the purposes of managing that person as it is for them to do their job. If I put you in a silo, I know what you do. Or, at the very least, I can assign you tasks and responsibilities that you should be doing and grade you on those. If your “silo” is a principle and not a team, it’s crazy to grade the effectiveness of how you integrated with the developers to deliver services effectively. It can be tracked, but not as easily as a checkbox.

Likewise, people fear change. So, instead of putting their people into roles that cross functional barriers and reorganize the workflows, they instead just take the young people that are talking about the “new way” of doing things and put them in a team together. They slap DevOps on the door and it’s done. We do DevOps now. Or, worse yet, they take the old infrastructure teams, move a few people off of them into a new team, and tell them to figure out what to do while they’re repainting the team name on the door. This has rightly been called “DevOps Washing” by a lot of people.

But what happens when that team starts Devving the Ops? Do they look at the enshrined principles of The Holy Book of DevOps and start trying to change organizational culture a little bit at a time to get the happy ending from The Phoenix Project? Do they eliminate the Brents of the world and give the security teams peace of mind?

Or, do they carve out their own little fiefdoms and start behaving like an integrated team with responsibilities and politics? Do they do things like deploy new projects to the cloud with little support from other teams, with the idea that they now “own” that workflow and can control how it’s used and how their team is viewed? If you read the article above with the report from Veriflow, you’ll find that a lot of organizations are seeing this second behavior.

Just as much as people fear proper change, they also get greedy in their new roles and want to be important to the business. And taking ownership of all the new initiatives, like cloud development, is a great way to be important. And, as much as The Phoenix Project preaches that security should be integrated into the DevOps workflow, you still have half the 330 respondents to the above survey saying there is an increase in security threats to their new initiatives in public cloud.

Redefining DevOps

In a way, this “definition” of DevOps is like the title of this post. I’m sure more than a few of you bristled at the use of on-premise. Because, in today’s IT landscape we’re fighting a losing battle against a premise. When you refer to something as happening in a location, you say “on-premises”. If you say “on-premise”, you should be referring to an idea or concept. And yet, so many people in Silicon Valley say “on-premise” when referring to “on site” or “on location”. It’s grammatically wrong. But it sounds hip. It’s not the classical definition of the word and yet that word is slowly being redefined to mean what people are using it to mean. It literally happened with “literally”.

For those railing against the DevOps Washing that’s going on, ask yourself this question: Why? If the pure principles of DevOps are so much better and easier, why is everyone just slapping DevOps on existing teams or reforming other people into teams and running with the DevOps idea instead of following the rules as laid down by the sacred DevOps texts?

It could be that all organizations that are doing it this way are wrong. But are there more organizations doing it the proper way? Or is the lazy way more prevalent? I don’t know the answer, but given the number of products I see aimed at “the DevOps team” or the number of people that have given me feedback about how their organization’s DevOps teams display the same behaviors I talked about in my other blog post, I’d say there are more bad apples than purists out there.

So, what does this all mean for DevOps? Are we going to go on pointing and laughing at the DevOps-In-Name-Only crowd? Are we going to silently moan about how Real DevOps doesn’t happen and that we need to stay pure to the ideals? Or are we going to step back and realize that, just like every other technology or organizational shift that has ever occurred, nothing really gets implemented in its purest form? Instead of complaining that those not doing it the “proper” way are wrong, let’s examine why things get done the way they do and figure out how to fix it.

If businesses are implementing DevOps teams to execute the things they need done, find out why it has to be a dedicated team. Maybe they’re doing it wrong, or maybe they’ve stumbled across something that wasn’t included in the strictest definitions of DevOps. If people are giving work to those teams to accomplish and excluding other functional teams at the same time, don’t just wag your finger at them and tell them that’s not the “right way”. Find out what enabled that team to violate the ideas in the first place. Maybe the DevOps Team is responsible for all cloud deployments. Maybe they want some control over things instead of just a nebulous connection to an ideal.

Tom’s Take

DevOps in theory is a great thing. DevOps as presented in The Phoenix Project is a marvelous idea. But we all know that when theory meets reality, what we get is something different than we expected. It’s not unlike von Moltke’s famous quote, “No plan survives first contact with the enemy.” In theory, DevOps is pure and works like it should. But we’re seeing practice differing greatly from theory. The results are usually the same but the paths are radically different. And for the purists out there, if you don’t want DevOps to suffer the same fate as on-premise, you need to start asking yourself the same hard questions we are supposed to ask organizations as they start to deploy these ideas.

DevOps is a Silo

Posted on February 28, 2019 by networkingnerd

Silos are bad. We keep hearing how IT is too tribal and broken up into teams that only care about their swim lanes. The storage team doesn’t care about the network. The server teams don’t care about the storage team. The network team is a bunch of jerks that don’t like anyone. It’s a vicious cycle of mistrust and playground cliques.

Except for DevOps. The savior has finally arrived! DevOps is the silo-busting mentality that will allow us all to get with the program and get everything done right this time. The DevOps mentality doesn’t reinforce teams or silos. It focuses on the only pure thing left in the world – committing code. The way of the CI/CD warrior. But what if I told you that DevOps was just another silo?

Team Players

Before the pitchforks and torches come out, let’s examine why IT has been so tribal for so long. The silo mentality came about when we started getting more specialized with regards to infrastructure. Think about the original compute resources – mainframes. There weren’t any silos with mainframes because everyone pretty much had to know what they were doing with every part of the system. Everything was connected to the mainframe. The mainframe itself was the silo.

When we busted the mainframe apart and started down the road of client/server computing the hardware started becoming more specialized. Instead of one giant machine we had lots of little special machines everywhere. The more we deconstructed the mainframe, the more we needed to focus. The direct-attached storage became NAS and eventually SAN. The computer got bigger and bigger and eventually morphed into a virtualized hypervisor. The network exists to connect everything to the rest of the world, and as technology wore on the network became the transport for the infrastructure to talk to everything else.

Silos exist because you had to have specialized knowledge to operate your specialized infrastructure. Sure, there could be some cross training at lower levels of administration. But once you got into really complex topics like disk geometry optimization or route redistribution, the ability for a layperson to understand what was going on was shot. Each silo exists to reinforce their own infrastructure. Each silo has their norms and their schedules. The storage team will never lose data. The network always has to be available.

Even as these silos got crammed together and subsumed into new job roles, the ideas behind them stayed consistent. Some of the storage admin’s job roles combined with the virtualization team to be some kind of a hybrid. The networking team has been pushed to adopt more agile development methodologies like automation and orchestration. Through it all, the silos were coming down as people pushed the teams to embrace more software focused on the infrastructure. That is, until DevOps burst onto the scene.

OpSilo

The DevOps tribe has a mantra: “Move Fast. Break Things. Ship. Ship. SHIP!” Maybe not those exact words but something very similar. DevOps didn’t come from mainframes. It didn’t even come from the early days of client/server. DevOps grew out of a time when everything was blown apart and on the verge of being moved into the cloud. These new DevOperators didn’t think about infrastructure as a team or a tribe. Instead, it was an impediment to shipping code.

When you work in software, moving fast and breaking things works. Because you’re pushing the limits of what you can do. You’re focused on features. You want new shiny things. Stability can wait as long as the next code commit is right around the corner. Who cares about what you’ve been doing.

In order to have the best experience with Software X, please turn on Automatic Updates so we can push the code as fast as our commits will allow.

Sound familiar? Who cares about disk geometry or route reflectors. Make my stuff work! Your infrastructure supports all my awesome code. I write the stuff that pays your salary. This place would be out of business if it wasn’t for me!

Granted that’s a little extreme, but the mentality is the same. Infrastructure exists to be consumed. IT is there to support the mission of Moving Fast, Breaking Things, and Shipping. It’s almost like a tribal behavior. Everyone has the same objective – ALL THE COMMITS!

Move fast and break things is the exact opposite of the storage and networking teams. You really don’t want to be screaming along at 800 mph when deploying a new SAN or trying to get iBGP stood up. You want careful. Calm. Collected. You’re working with a whole system that’s built on a house of cards. Unlike DevOps, breaking a thing in a SAN or on the edge of a network could impact the entire system, not just one chat module.

That’s why networking and storage admins are so methodical. I harken back to some of my days in network engineering. When the network was running slow or the storage array was taxed, it took time to get data back. People were irritated but they got used to the idea of slowness. But if those systems ever went down, it was all-hands-on-deck panic! Contrast that with the mentality of the DevOps tribe. Who cares if it’s kind of broken right now? We need to ship the next feature or patch.

DevOps isn’t a silo buster. It’s just a different kind of tribal silo. The DevOps folks all have similar mentalities and view infrastructure in the same way. Cloud appeals to them because it minimizes infrastructure and gives them the tools they need to focus on developing. Cloud sprawl can easily happen when planning doesn’t occur. When specialized groups get together and talk about what they need, there is a reduction in consumed resources. Storage admins know how to get the most out of what they have. They don’t just spin up another bucket and keep deploying.

Tom’s Take

If you treat DevOps like a siloed tribe you’ll find their behavior is much easier to predict and work with. Don’t look at them as a cross-functional solution to all your problems. Even if you deploy all your assets to the cloud you’re going to need specialized teams to manage them once the infrastructure grows too big to manage by movement. Specialization isn’t the result of bad planning or tribalism. Instead, those specialized teams developed because of the need for deeper understanding. Just like DevOps developed out of a need to understand rapid deployment and fast-moving consumption of infrastructure. In time, the next “solution” to the DevOps problem will come along and we’ll find as well that it’s just another siloed team.

Atmosic and the Power of RF?

Posted on February 21, 2019 by networkingnerd

I recently talked to a company doing some very interesting things in the mobility space and I thought I’d take a stab at writing about them. Most of my mobility posts are about access points or controller software or me just complaining in general about the state of Wi-Fi 6. But this idea had me a little intrigued. And confused.

Bluetooth Moon Rising

Atmosic is a company that is focusing on low-power chips, especially for IoT applications. Most of their team came from Atheros, which you may recall powers a ton of the reference architectures used in wireless APs from the many, many manufacturers that don’t make their own chips. Their team has the chops to make good wireless stuff, one would think.

Atmosic wants to make IoT devices that use Bluetooth Low Energy (BLE). So far, this is sounding pretty good to me. I’ve seen a lot of crazy awesome ideas for BLE, like location tracking indoors or on-demand digital signage. Sure, there are some tracking issues that go along with that but it’s mostly okay. BLE is what the industry has decided to standardize on for a ton of IoT functionality.

How does Atmosic want to change things in the BLE space? Well, those Atheros chipset guys started out by building a chip that uses 5-10 times less power than before. That’s a staggering number when you think about it. BLE beacons already don’t use a ton of power. They’re designed to be used in concert with APs or with standalone, battery-powered devices. The BLE beacons I’ve seen from Aruba are about the size of the AirPods case. And that battery can last for a couple of years.

If Atmosic really did build a chip that can power those beacons with even 5x less power usage, you’re looking at a huge increase in the lifespan of a beacon! Imagine being able to deploy these things everywhere and have them run for a decade? You could literally cover a stadium or a hotel with them for next to nothing. Even if you included the chip in a new AP, which Atmosic is partnering to do, you could effectively run the BLE side of things for free from a power budget perspective.
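The back-of-the-napkin math behind that lifespan claim is simple enough to sketch. Every number below is an assumption picked for illustration: a coin cell of roughly 220 mAh, and a baseline average draw chosen so the beacon lasts about two years. None of it comes from Atmosic or Aruba, and the sketch ignores battery self-discharge, which matters a lot over a decade.

```python
HOURS_PER_YEAR = 24 * 365


def battery_life_years(capacity_mah: float, avg_current_ua: float) -> float:
    """Ideal battery life: capacity divided by average current draw."""
    capacity_uah = capacity_mah * 1000.0
    return capacity_uah / avg_current_ua / HOURS_PER_YEAR


def draw_after_radio_savings(radio_ua: float, other_ua: float, factor: float) -> float:
    """Average draw when only the radio chip gets `factor` times more efficient."""
    return radio_ua / factor + other_ua


if __name__ == "__main__":
    capacity_mah = 220.0   # assumed coin-cell capacity
    radio_ua = 10.0        # assumed average draw of the BLE chip
    other_ua = 2.5         # assumed draw of everything else on the board

    baseline = battery_life_years(capacity_mah, radio_ua + other_ua)
    radio_only = battery_life_years(
        capacity_mah, draw_after_radio_savings(radio_ua, other_ua, 5.0)
    )
    whole_device = battery_life_years(capacity_mah, (radio_ua + other_ua) / 5.0)

    print(f"baseline:               {baseline:.1f} years")
    print(f"5x savings, radio only: {radio_only:.1f} years")
    print(f"5x savings, everything: {whole_device:.1f} years")
```

With these made-up inputs the three cases work out to roughly 2, 5.6, and 10 years. The decade only shows up if the whole device, and not just the radio, gets five times more efficient.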

This is something that is pretty big news. So why did I suddenly see things start sliding off the rails?

Unlimited Power!

The next part of the Atmosic pitch came when they told me about the other part of their trinity of power savings. On their technology page, they tout the above mentioned chipset along with the special on-demand wake feature that allows the chips to be put into a deep sleep mode that will only awaken when it receives a special packet designed to rouse the chip like a custom alarm clock.

That third thing, though. Power harvesting. Now we’re starting to get into the real weeds of Wi-Fi stuff. Essentially, Atmosic is saying they can power their low-power BLE beacon indefinitely by harvesting power from RF in the air. Yeah, that’s right. They’re literally pulling nanoamps of power from remote power sources. Evidently, their power system is more reliable because they use known sources like 900MHz for coverage as opposed to just trying to pull the power from whatever happens to be around.

At this point, you’re probably saying one of two things:

This is crap and it will never work.
This is the most amazing thing ever!

Right now I tend to fall on the side of the first one. Why? Because if they really did invent a way to pull power from thin air, someone really should cut them a check because they need to be building bigger, badder everythings! Imagine being able to power whatever you wanted without clunky batteries or power cords. It would be a revolution!

Sadly, the reality is that the Atmosic trinity pretty much requires all three parts to be so revolutionary. I talked to a couple of my friends in the wireless industry about this and Jonathan Davis (@SubNetwork) was about as skeptical as I was. Since he’s a real math wizard, he figured out that the amount of power being pulled in by the Atmosic chips through the air has to be pretty tiny. Like below nanoamps. And that’s not enough to run an active BLE beacon.

Building a Lower Powered Mousetrap

That’s where the whole system comes into play. It takes a very low power chip with a custom wake sensor (read: Passive Beacon) in order to be able to run on the kinds of power that you can draw from RF waves. And this is where the utility of the whole thing starts breaking down for me. Sure, you could do something crazy like put this on a piece of paper and “hide” it in the service tag of a piece of equipment like a laptop. Then you have a BLE that can track that device even if it’s powered down. But you still need a way to excite the BLE chip and make it wake up. And, at this point, if you’re doing passive Bluetooth, is the solution really any better than a passive RFID tag that has the same lifespan? And is a lot cheaper?

The other issue that I have with this solution is the proposed longevity. Forever is the tag on the Atmosic site. For. Ev. Er. Sounds like a great idea in theory, right? Deploy a device in your network that can run forever on free energy and you never have to replace the batteries. Okay, that’s great. How old is your iPhone? Your laptop? Better yet, how old is the oldest piece of enterprise tech that you have on your person right now? I’d wager it can’t be more than 6 years old at this point.

Enterprises get chided for having old technology all the time. Maybe that laptop is 6 years old and still running. Perhaps those servers should have been decommissioned a refresh cycle ago. Compared to the mayfly lifespan of an iPhone, your average piece of enterprise tech is pretty long in the tooth. But not all Enterprise tech is that outdated. Take a look at wireless access points, for example. If you are running the oldest 802.11ac access point made it’s still just barely five years old, the standard having been ratified in December 2013. Most enterprises have already refreshed their 11ac Wave 1 APs. If they haven’t, they’re just holding off long enough for 802.11ax to maybe get certified this year so they can push out hot new hardware.

So, with 5-6 years as the standard for “old” technology in the enterprise, what on earth are we going to do with beacons that are a decade old? With the low-power chipset you’re already looking at a 5-7 year lifespan on current battery technology if it really does deliver 5x power savings. Even current BLE beacons are designed with a short lifespan for a reason. Technology changes very fast. If you try and keep that device stuck to the wall or a laptop for too long, it’s going to be out of sync with the rest of the tech around it.

Imagine trying to hook up a Bluetooth 2.x device to a current iPhone. It will work because the standards are there but it’s going to be painful because newer devices offer so much more functionality. Trying to keep devices around forever for the sake of doing it isn’t practical. And if you’re going to try and counter the argument by saying IoT devices can be around for quite a while you’re not going to win there either. Most IoT devices that are embedded for long term use wouldn’t use wireless or Bluetooth in the first place. They would be hardwired to cut down on potential points of failure. Sure, you might include something like this in the system, but it’s going to be powered enough already to not need to harvest power through the air.

Tom’s Take

I think the Atmosic people have the right idea for a baseline but their stretch goal is a bit lofty and sci-fi for my tastes. Sure, the idea of being able to harvest unlimited power from RF to run devices without batteries for years is great in theory. But technology demands for both enterprise tech and consumer/enterprise IoT devices are going to drive people to use the lowest common denominator of simplicity. I think that Atmosic has a lot of upside with these new super efficient chips. But I doubt we’re going to see anyone sucking power out of thin air any time soon.

The Networking Nerd

Networking With A Side of Snark

