About networkingnerd

Tom Hollingsworth, CCIE #29213, is a former network engineer and current organizer for Tech Field Day. Tom has been in the IT industry since 2002, and has been a nerd since he first drew breath.

Resource Contention In IT – Time Is Never Enough

I’m at Future:NET this week and there’s a lot of talk about the future of what networking is going to look like from the perspective of vendors like Apstra, Veriflow, and Forward Networks. There’s also a great deal of discussion from customers and end users as well. One of the things that I think is being missed in all the talk about resources.

Time Is Not On Your Side

Many of the presenters, like Truman Boyes of Bloomberg and Peyton Maynard-Koran of EA, discussed the idea of building boxes from existing components instead of buying them from established networking vendors like Cisco and Arista. The argument does hold some valid ideas. If you can get your hardware from someone like EdgeCore or Accton and get your software from someone else like Pluribus Networks or Pica8 it looks like a slam dunk. You get 90% to 95% of a solution that you could get from Cisco with much less cost to you overall.

Companies like Facebook and Google have really pioneered this solution. Facebook’s OCP movement is really helping networking professionals understand the development that goes into building their own switches. Facebook’s commitment is also helping reduce the price of the components when an eager person wants to go build an OCP switch from parts they find at Radio Shack or from Amazon.

But, for Facebook, the development of a switch like this or the development of a platform is a sunk cost. Because the important resource to Facebook isn’t time. Facebook has teams of engineers sitting around developing things. For them, the time the least important resource. Time is something they have in abundance. Why is that? Because their development is entirely focused on their product. Google can afford to have 500 people working on a product with an IT focus like Google Reader or Google Wave because that’s what Google hires people to do.

Contrast that with the typical IT department at an enterprise. Even with thousands of users in Marketing, Management, and Finance there are usually only a handful of IT professionals. And those people have to cover storage, compute, networking, wireless, and software. The focus of the average law firm is not using IT resources to create a product. The focus of the business is leveraging IT to provide a service. A finance firm doesn’t have the time resources to commit to developing in-house solutions or creating IT hardware from components and freely available software.

Money, Money, Money.

Let’s look at the other side of the coin. Facebook and Google have oodles and oodles of time to build and develop things. They can get their developers to work together to build the hardware and software to integrate at a deep level. And because they understand it at that level, they can easily debug it instead of asking someone to solve their problem. What Facebook and Google don’t have is money.

To large firms like these, money is more important than time. When you have to purchase networking or storage equipment by the thousands or tens of thousands of units money becomes a huge issue. If you can save a few dollars per switch that can translate to huge savings in the long run. Even Facebook is doing this with OCP. By creating a demand for specific components for these devices, they can drive costs down across the board and save money for them. For Facebook, money is what is tracked for creating their infrastructure. The more that is saved, the more they can do with it.

In the enterprise, money isn’t quite as important as time. Money is important to businesses for sure. You don’t keep the lights on if you aren’t making money. But because IT supports the business and isn’t the entire business, money can be more easily allocated to projects from budgets. There are pools of money that can be used to purchase office furniture, catering services, or IT hardware. These resources can be reallocated efficiently like Facebook allocates time for projects. If the storage array needs to be upgraded or the wireless needs to be refreshed there can be discussions about how to accomplish it. Maybe the CEO doesn’t get a new desk this quarter. Or maybe there needs to be a few new sales discussions to create capital. But money isn’t as valuable as time. If you think I’m crazy try to get 10 minutes on a CEO’s calendar. Versus getting him to sign off on a purchase.

That’s the real value of cloud computing for IT professionals. They aren’t paying for scale or for availability. They’re really paying for time. They’re paying for a process that reduces the amount of time that they spend configuring low level tasks that are menial and time consuming. Building systems takes time. Automation reduces the time it takes. Process reduces it even further. So organizations looking to move to the cloud are essentially trading one resource, money, for a more important resource, time. Likewise, the large cloud providers that are building these systems are trading their resource, time, for a more valuable resource, money.


Tom’s Take

I don’t believe that smaller enterprises will ever truly embrace the idea of building their own OCP switches running custom Linux distros and custom built routing processes. Because to them, time is way too important. Time to focus on the business. Time to focus on supporting the way that the money is made. Time to do more. Likewise, I expect that large enterprises and providers like Facebook will continue to push the envelope of development and create new solutions. Because they have the time to play and test and build. And use those skills to make money. But I never see a world where those two places meet. Because resource contention is different between these two groups and it causes different outcomes. And the value of those resources are unlikely to change without massive disruption.

Advertisements

The Complexity Of Choice

Russ White had an interesting post this week about the illusion of choices and how herd mentality is driving everything from cell phones to network engineering and design. I understand where Russ is coming from with his points, but I also think that Russ has some underlying assumptions in his article that ignore some of the complexity that we don’t always get to see in the world. Especially when it comes to the herd.

Collapse Into Now

Russ talks about needing to get a new mobile phone. He talks about how there are only really two choices left in the marketplace and how he really doesn’t want either of them. While I applaud Russ and his decision to stand up for his principals, there are more than two choices. He could easily purchase a used Windows mobile phone from eBay. He could choose to run a Palm Tree 650 or a Motorola RAZR from 2005. He could even choose not to carry a phone.

You’re probably saying, “That’s not a fair comparison. He needs feature X on his phone, so he can’t use phone Y.”

And you would be right! So right, in fact, that you’ve already missed one of the complexities behind making choices. We create artificial barriers to reduce the complexity of options because we have needs to meet. We eliminate chaos when making decisions by creating order to limit our available pool of resources.

Let’s return to the mobile phone argument for a moment. Obviously, not having a phone is not an option. But what does Russ need on his phone that pushed him to the iPhone? I’m assuming he needs more than just voice calling capability. That means he can’t use a Jitterbug even though it’s a very serviceable phone for the user base that wants that specific function set. I’m also sure he needs a web browser capable of supporting modern web designs. Perhaps it’s a real keyboard and not T9 predictive text for everything you could want to type. That eliminates the RAZR from contention.

What we’re left with when we develop criteria to limit choices is not two things we feel very ambivalent about. It’s actually the resulting set of operations on chaos to provide order. Android and iPhone don’t strongly resemble each other because of coincidence. It’s because the resulting set of decision making by consumers has led them to this point. Windows Mobile also suspiciously looks like Android/iOS for the same reason. If the mobile OS on Russ’s phone looked more like Windows Mobile 5.0 instead, I’m sure his reasons for not choosing iOS or Android would have been much stronger.

Automatic For The People

So, why is it in networking and mobile phones and even extra value meals that we find our choices “eliminated” so frequently? How is it that we have the illusion of choice without many real choices at all? As it turns out, the illusion is there not to reinforce our propensity to make choices, but instead to narrow our list of possibly choices to something greater than the set of Everything In The World.

Let’s take a hamburger for instance. If you want a hamburger, you will probably go to a place that makes them. You’ve already made a lot of choices in just that one decision, but let’s move past the basic choices. What is your hamburger going to look like? One meat patty? Three? Will it have a special kind of sauce? Will it be round? Square? Four inches thick? These are all unconscious choices that we make. Sometimes, we don’t make these choices by eliminating every option. Instead, we make them by choosing a location and analyzing the options available there. If I pick McDonald’s as my hamburger location because of another factor, like location or time, then I’ve artificially limited my choices to what’s available at McDonald’s at that moment. I can’t get a square hamburger with extra bacon. Not because it’s unfair that McDonald’s doesn’t offer that option. But because I made my choice about available options before I ever got to that point. The available set of my choices will include round meat patties and American cheese. If I want something different, I need to pick a different restaurant, not demand McDonald’s give me more choices.

Artificially limiting choices for people making decisions sometimes isn’t about resource availability. It’s about product creation. It’s about assembly and complexity in making devices. It’s about what people make important in their decision making process. Let’s take a pretty well-known example:

On the left is the way cell phones were designed before 2007. Every one of them looks interesting in some way. Some had keyboards. Some had flip options. Some had pink cases or external antennas. Now, look at the cluster on the right. They all look identical. They all look like Apple Mobile Phone Device that debuted in 2007. Why? Because customers were making a choice based on form factor that informed the rest of their choices. In 2008, if you wanted a mobile device that was a black rectangle with a multi-touch screen and a software keyboard, your options were pretty limited. Now, I challenge you to find a phone that doesn’t have those options. It’s like trying to find a car without power windows. They do exist, but you have to make some very specific choices to find one.


Tom’s Take

Some choices are going to be made for us before we ever make our first decision. We can’t buy networking switches at McDonald’s. We can’t eat our mobile phones. But what we can do to give ourselves more choices is realize that the trade off for that is expending energy. We must do more to find alternatives that meet our requirements. We have to “think outside the box”, whether that means finding a used device we really like somewhere or rolling our own Linux distribution from Slackware instead of taking the default Ubuntu installation. It means that we’re going to have to make an effort to include more choices instead of making choices that automatically exclude certain options. And after you’ve done that for more than a few things, you will realize that the illusion of few choices isn’t really an illusion, but a mask that helps you preserve your energy for other things.

Automating Documentation

Tedium is the enemy of productivity. The fastest way for a task to not be done is to make it long, boring, and somewhat complicated. People who feel that something is tedious or repetitive are the ones more likely to marginalize a task. And I think I speak for the entire industry when I say that there is no task more tedious and boring than documentation. So how can we fix it?

Tell Me What You Did

I’m not a huge fan of documentation. When I decide on a plan of action, I rarely write it down step-by-step unless I’m trying to train someone. Even then, it looks more like notes with keywords instead of a narrative to follow. It’s a habit that has been borne out of years of firefighting in networks and calls to “do it faster”. The essential items of a task are refined and reduced until all that remains is the work and none of the ancillary items, like documentation.

Based on my previous life as a network engineer, I can honestly say that I’m not alone in this either. My old company made lots of money doing network discovery engagements. Sometimes these came because the previous admins walked out the door with no documentation. Other times, it was simply because the network had changed so much since the last person made any notes that what was going on didn’t resemble anything like what they thought it was supposed to look like.

This happens everywhere. It doesn’t take many instances of an network or systems professional telling themselves, “Oh, I’ll write it down later…” for later to never come. Devices get added, settings get changed, and not one word is ever written down. That’s the kind of chaos that causes disorganization at best and outages at worst. And I doubt there’s any networking pro out there that hasn’t been affected by bad documentation at one time or another.

So, how do we fix documentation? It’s tedious for sure. Requiring it as part of the process just invites people to find ways around it. And good documentation takes time. Is there a way to combine the lack of time, lack of requirement, and repetition and make documentation something that is done again? I think there is. And it requires a little help from process.

Not Too Late To Automate

Automation is a big thing right now. SDN is driving it. Network complexity is practically requiring it. Yet networking professionals are having a hard time embracing it. Why?

In part, networking pros don’t like to spend hours solving a problem that can be done in minutes. If you don’t believe me, watch one of the old SNL Nick Burns sketches. Nick is more likely to tell you to move than tell you how to fix your problem. Likewise, if a network pro is spending four hours writing an automation script that is supposed to execute a change that can be made in 20 minutes, they’re not going to want to do it. It’s just the nature of the job and the desire of the network professional to make every minute count.

So, how can we drive adoption of automation? As it turns out, automating documentation can be a huge driver. Automation of tedious tasks is exactly the thing that scripting and automation was designed to solve. Instead of focusing on the automation of the task, like adding VLANs to a set of switches, focus on the ability of the system to create documentation on the fly from the change.

Let’s walk through an example. In order for documentation to matter, it has to answer the 5 Ws. How can we automate that?

Let’s start with Who. Automation can create documentation saying user Hollingsworth made a change through an automated process. That helps the accounting side of the house figure out the person making changes in the network. If that person is actually a script, the Who can be changed to reflect that it was an automated process called by a person related to a change ticket. That gives everyone the ability to track the changes back to a given problem. And it can all be pulled in without user intervention.

What is also an easy automation task. List the configuration being applied. At first, the system can simply list the configuration to be programmed. But for menial and repetitive tasks like VLAN additions you can program the system with a real description like “Adding VLANs to $Switch to support $ticket”. Those variables can be autopopulated based on the work to be done. Again, we reference a ticket number in order to prove that these changes are coming from somewhere.

When is also critical. Are these changes happening in a maintenance window? Or did someone check them in in the middle of the day because they won’t cause any problems? (SPOILER ALERT: They will) By required a timestamp for changes, you can track which professionals are being cavalier with their change management. You can also find out if someone is getting into the system after hours to cause problems or attempt to compromise things. Even if the cause of the change is “immediately” due to downtime or emergency, knowing why it had to be checked in right away is a clue to finding problems that recur in the network.

Where is a two-pronged reason. It’s important to check where the changes are going to be applied. Is it going to be done to all switches in the organization? Or just a set in a remote office. Sanity checking via documentation will keep you from bricking your entire organization in one fell swoop. Likewise, knowing where the change is being checked in from is important. Is a remote office trying to change config on HQ switches? Is a remote engineer dialed in making changes related to an open support case? Is someone from a foreign nation making changes via VPN at 4:30am local time? In every case, you’d really want to know what’s going on before those changes get made.

Why is the one that will trip up everything. If you don’t believe me, I’d like to give you the top two reasons why Windows Server 2003 is shut down and rebooted with the shutdown justification dialog box:

  1. a;lkdjfalkdflasdfkjadlf;kja;d
  2. JUST ****ing SHUT DOWN!!!!!

People don’t like justifying their decisions. Even when I worked for Gateway 2000 on their national help desk, our required call documentation was a bit spotty when it came to justification for changes. Why did you decide to FDISK and reload? Why are you going into the registry to fix the icon colors? Change justification is half of documentation. It gives people something to audit. It gives people a way to look at things and figure out why you started down the path of a particular reasoning for problem solving. It also provides context for you after the fact when you can’t figure out why you did it the way you did.


Tom’s Take

Automation isn’t going to take away your job. Automation is going to do the jobs you hate doing. It’s going to make your life easier to concentrate on the tasks that need to be done by freeing you from the tasks that should be done and aren’t. If we can make automation document our networks for just six months, I think you’ll find the value in programming things to work this way. I also think you’ll be happier with the level of detail on your network. And once you can prove the value of automating just one task to your teams, I’m sure they’ll see the value of increasing automation all around.

Virtual Reality and Skeuomorphism

Remember skeuomorphism? It’s the idea that the user interface of a program needs to resemble a physical a physical device to help people understand how to use it. Skeuomorphism is not just a software thing, however. Things like faux wooden panels on cars and molded clay rivets on pottery are great examples of physical skeuomorphism. However, most people will recall the way that Apple used skeuomorphism in the iOS when they hear the term.

Scott Forrestal was the genius behind the skeuomorphism in iOS for many years. Things like adding a fake leather header to the Contacts app, the wooden shelves in the iBooks library, and the green felt background in the Game Center app are the examples that stand out the most. Forrestal used skeuomorphism to help users understand how to use apps on the new platform. Users needed to be “trained” to touch the right tap targets or to feel more familiar with an app on sight.

Skeuomorphism worked quite well in iOS for many years. However, when Jonny Ive took over as the lead iOS developer, he started phasing out skeuomorphism starting in iOS 7. With the advent of flat design, people didn’t want fake leather and felt any longer. They wanted vibrant colors and integrated designs. As Apple (and others) felt that users had been “trained” well enough, the decision was made to overhaul the interface. However, skeuomorphism is poised to make a huge comeback.

Virtual Fake Reality

The place where skeuomorphism is about to become huge again is in the world of virtual reality (VR) and augmented reality (AR). VR apps aren’t just limited to games. As companies start experimenting with AR and VR, we’re starting to see things emerge that are changing the way we think about the use of these technologies. Whether it be something as simple as using the camera on your phone combined with AR to measure the length of a rug or using VR combined with a machinery diagram to teach someone how to replace a broken part without the need to send an expensive technician.

Look again at the video above of the AR measuring app. It’s very simple, but it also displays a use of skeuomorphism. Instead of making the virtual measuring tape a simple arrow with a counter to keep track of the distance, it’s a yellow box with numbers printed every inch. Just like the physical tape measure that it is displayed beside. It’s a training method used to help people become acclimated to a new idea by referencing a familiar object. Even though a counter with tenths of an inch would be more accurate, the developer chose to help the user with the visualization.

Let’s move this idea along further. Think of a more robust VR app that uses a combination of eye tracking and hand motions to give access to various apps. We can easily point to what we want with hand tracking or some kind of pointing device in our dominant hand. But what if we want to type? The system can be programmed to respond if the user places their hands palms down 4 inches apart. That’s easy to code. But how to do tell the user that they’re ready to type? The best way is to paint a virtual keyboard on the screen, complete with the user’s preferred key layout and language. It triggers the user to know that they can type in this area.

How about adjusting something like a volume level? Perhaps the app is coded to increase and reduce volume if the hand is held with fingers extended and the wrist rotated left or right. How would the system indicate this to the user? With a circular knob that can be grasped and manipulated. The ideas behind these applications for VR training are only limited by the designers.


Tom’s Take

VR is going to lean heavily on skeuomorphism for many years to come. It’s one thing to make a 2D user interface resemble an amplifier or a game table. But when it comes to recreating constructs in 3D space, you’re going to need to train users heavily to help them understand the concepts in use. Creating lookalike objects to allow users to interact in familiar ways will go a long way to helping them understand how VR works as well as helping the programmers behind the system build a user experience that eases VR adoption. Perhaps my kids or my grandkids will have VR and AR systems that are less skeuomorphic, but until then I’m more than happy to fiddle with virtual knobs if it means that VR adoption will grow more quickly.

It’s Probably Not The Wi-Fi

After finishing up Mobility Field Day last week, I got a chance to reflect on a lot of the information that was shared with the delegates. Much of the work in wireless now is focused on analytics. Companies like Cape Networks and Nyansa are trying to provide a holistic look at every part of the network infrastructure to help professionals figure out why their might be issues occurring for users. And over and over again, the resound cry that I heard was “It’s Not The Wi-Fi”

Building A Better Access Layer

Most of wireless is focused on the design of the physical layer. If you talk to any professional and ask them to show your their tool kit, they will likely pull out a whole array of mobile testing devices, USB network adapters, and diagramming software that would make AutoCAD jealous. All of these tools focus on the most important part of the equation for wireless professionals – the air. When the physical radio spectrum isn’t working users will complain about it. Wireless pros leap into action with their tools to figure out where the fault is. Either that, or they are very focused on providing the right design from the beginning with the tools validating that access point placement is correct and coverage overlap provides redundancy without interference.

These aren’t easy problems to solve. That’s why wireless folks get paid the big bucks to build it right or fix it after it was built wrong. Wired networkers don’t need to worry about microwave ovens or water pipes. Aside from the errant fluorescent light or overly aggressive pair of cable pliers, wired networks are generally free from the kinds of problems that can plague a wire-free access layer.

However, the better question that should be asked is how the users know it’s the wireless network that’s behind the faults? To the users, the system is in one of three states: perfect, horribly broken, or slow. I think we can all agree that the first state of perfection almost never actually exists in reality. It might exist shortly after installation when user load is low and actual application use is negligible. However, users are usually living in one of the latter states. Either the wireless is “slow” or it’s horribly broken. Why?

No-Service Station

As it turns out, thanks to some of the reporting from companies like Cape and Nyansa, it turns out that a large majority of the so-called wireless issues are in fact not wireless related at all. Those designs that wireless pros spend so much time fretting over are removed from the equation. Instead, the issues are with services.

Yes, those pesky network services. The ones like DNS or DHCP that seem invisible until they break. Or those services that we pay hefty sums to every month like Amazon or Microsoft Azure. The same issues that plague wired networking exist in the wireless world as well and seem to escape blame.

DNS is invisible to the majority of users. I’ve tried to explain it many times with middling to poor results. The idea that computers on the internet don’t understand words and must rely on services to translate them to numbers never seems to click. And when you add in the reliance on this system and how it can be knocked out with DDoS attacks or hijacking, it always comes back to being about the wireless.

It’s not hard to imagine why. The wireless is the first thing users see when they start having issues. It’s the new firewall. Or the new virus. Or the new popup. It’s a thing they can point to as the single source of problems. And if there is an issue at any point along the way, it must be the fault of the wireless. It can’t possibly be DNS or routing issues or a DDoS on AWS. Instead, the wireless is down.

And so wireless pros find themselves defending their designs and configurations without knowing that there is an issue somewhere else down the line. That’s why the analytics platforms of the future are so important. By giving wireless pros visibility into systems beyond the spectrum, they can reliably state that the wireless isn’t at fault. They can also engage other teams to find out why the DNS servers are down or why the default gateway for the branch office has been changed or is offline. That’s the kind of info that can turn a user away from blaming the wireless for all the problems and finding out what’s actually wrong.


Tom’s Take

If I had a nickel for every problem that was blamed on the wireless network or the firewall or some errant virus when that actually wasn’t the case, I could retire and buy my own evil overlord island next to Larry Ellison. Alas, these are issues that are never going to go away. Instead, the only real hope that we have is speeding the time to diagnose and resolve them by involving professionals that manage the systems that are actually down. And perhaps having some pictures of the monitoring systems goes a long way to tell users that they should make sure that the issue is indeed the wireless before proclaiming that it is. Because, to be honest, it probably isn’t the Wi-Fi.

The History of The Wireless Field Day AirCheck

Mobility Field Day 2 just wrapped up in San Jose. It’s always a little bittersweet to see the end of a successful event. However, one thing that does bring a bit of joy to the end of the week is the knowledge that one of the best and longest running traditions at the event continues. That tradition? The Wireless/Mobility Field Day AirCheck.

The Gift That Keeps Giving

The Wireless Field Day AirCheck story starts where all stories start. The beginning. At Wireless Field Day 1 in March of 2011, I was a delegate and fresh off my first Tech Field Day event just a month before. I knew some wireless stuff and was ready to learn a lot more about site surveys and other great things. Little did I know that I was about to get something completely awesome and unexpected.

As outlined in this post, Fluke Networks held a drawing at the end of their presentation for a first-generation AirCheck handheld wireless troubleshooting tool. I was thrilled to be the winner of this tool. I took it home and immediately put it to work around my office. I found it easy to use and it provided great information about wireless networks that I could use to make my life easier. I even loaned it out to some of my co-workers during troubleshooting calls and they immediately told me the wanted one of my own.

As the rest of 2011 rolled forward, I found uses for my AirCheck but I didn’t do as much wireless as a lot of the other people out there. I knew that someone else could probably get more out of having it than I did. So, I hatched a plan. I told Stephen Foskett that if I had the chance to come back to Wireless Field Day 2, I would gladly give my AirCheck away to another worthy delegate. I wanted to keep the tool in use with the best and brightest people in the community and help them see how awesome it was.

Sure enough, I was invited to Wireless Field Day 2 in January 2012. I arrived with my AirCheck and waited until the proper moment. During the welcome dinner, Matt Simmons and I found a way to randomly draw a number and award the special prize to Matthew Norwood. He was just as thrilled to get the AirCheck as I was. I sent my prize from Wireless Field Day 1 on its way to a new home, content that I would help someone get more wireless knowledge.

But the giving didn’t stop there. Even though I wasn’t a delegate for Wireless Field Day 3 or Wireless Field Day 4, the AirCheck kept coming back. Matthew gave it to Dan Cybulskie. Dan gave it to Scott Stapleton. The AirCheck headed down under for half of 2013. When Wireless Field Day 5 rolled around, I was now a staff member for Tech Field Day and working behind the scenes. I had forgotten about the AirCheck until a box arrived from Australia with Scott’s postmark on it. He mailed it back to the US to continue the tradition!

And so, the AirCheck passed along to a new set of hands every event. Blake Krone got it at Wireless Field Day 5. Then Jake Snyder, followed by Richard McIntosh and Scott McDermott. Even when we changed the name of the event to Mobility Field Day in 2016, the AirCheck passed along to Rowell Dionicio.

Changing Of The Guard

In the interim, the AirCheck product moved over to Netscout. They developed a new version, the G2, that was released after Mobility Field Day 1 in 2016. The word also got around to the Netscout folks that there was a magical G1 AirCheck that was passed along to successive Wireless/Mobility Field Day delegates as a way of keeping the learning active in the community.

Netscout was a presenter during Mobility Field Day 2 in 2017. Chris Hinz contacted me before the event and asked if we still gave away the AirCheck during the event. I assured him that we did. He said that a tradition like that should continue, even if the G1 AirCheck was getting a bit long in the tooth. He told me that he might be able to help us all out.

After the Netscout presentation at Mobility Field Day 2, Chris presented me with his special surprise: a brand new G2 AirCheck! Since we hadn’t given the old unit to its new recipient just yet, we decided that it was time to “retire” the old G1 and pass along the G2 to the next lucky contestant. Shaun Neal was the lucky delegate this time and took the new and improved G2 home with him Wednesday night. I was happy to see it go to him knowing that he’ll get to put it through its paces and learn from it. And then he will get to bring it back to the next Mobility Field Day for it to pass along to a new delegate and continue the chain of sharing.


Tom’s Take

When I gave away my G1 AirCheck all those years ago, I never expected it would turn into something so incredible. The sharing and exchange of tools and knowledge at both Wireless Field Day and Mobility Field Day help remind me of why I do this job with Stephen. The community is an awesome and amazing place sometimes. The new G2 AirCheck will have a long life helping delegates troubleshoot wireless issues.

The old G1 AirCheck, my AirCheck, is in my suitcase. It’s ready to start its retirement in my office, having earned thousands of frequent flyer miles as well as becoming a very important part of Tech Field Day lore. I couldn’t be happier to get it back at the end of its life knowing how much happiness it brought to people along the way.

Context From The People

Are you ready for the flood of context-based networking solutions? If not, it’s time to invest in sandbags. After the launch of Cisco’s Intuitive Network solution set at Cisco Live, the rest of the context solutions are coming out to play. Granted, some of them are like Apstra and have been doing this for a while. Others are going to be jumping on the bandwagon of providing a solution that helps with context. But why are we here and why now?

Creating Context

The truth is that we’ve had context in the network for decades now. It’s not a part number that we can order from a vendor. It’s not a command that we type into the CLI to activate. In fact, it’s nothing that you can see at all right now, unless there’s a mirror handy.

The context in networks has been provided by people for as far back as anyone can remember. You do it every day without consciously realizing it. You interpret error messages and disregard those that aren’t important. People know how to program VLANs correctly to segment traffic in certain ways. Security context, application context, and more are delivered by breathing, thinking humans.

We have a massive number of tools to help us create additional context and understand things that are beginning to get out of our control. But these tools are still reliant on a person providing the necessary context to operate. Take red light fatigue, for example. This is a situation in which humans are providing context to a situation being reported. For some cases, the red light means that a condition has passed a threshold and there is a corresponding trigger. However, when the context of that trigger is deemed unimportant, the context applied is “ignore that one”. So much context can be applied to a board full of red lights that we eventually become blind to them and miss a real problem when it pops up.

Humans are great at stretching the bounds of thought to understand why situations need context. Humans know when a link is congested enough to need to be configured for longer routing protocol hello timers. Or when a service policy needs to have a bigger exception for scavenger traffic due to incorrect packet markings. But, the question for the future is “How can humans scale?”

Scaling Context

With the rapid expansion of SDN and programmability, we are quickly seeing that mistakes and errors in context can cause massive issues in a short amount of time. Whether it be a botched upgrade or a script that nullifies interfaces, the system is now capable of making mistakes at a very rapid pace. This sounds like the perfect reason for humans to step into the loop and ensure that mistakes like this can’t happen quickly.

But can humans really scale as fast as needed to keep networks running efficiently? That’s the crux of the issue with modern infrastructure teams. The compute and storage group that is moving toward a DevOps-style frameworks wants to make quick decisions and let the system execute them. When those decisions involve the archaic networking team, the process slows down. Why does the network admin need to put this stuff in manually? Why are they reading through the change orders? Why can’t they just trust us and make it happen?!?

Humans are the current source of context for the network. But, much like many other areas of technology, that context needs to be transferred into a form that can scale. We need to teach the network how to handle exceptions and “think” about how to solve issues. We need to begin the process of training the system to replace us. That’s not because we hope that we will eventually be replaced by a shell script. It’s because we have more important things to apply our knowledge to that need to be solved.

Much like network admins have outgrown the need to manually input VLANs and memorize which ports MySQL needs to have opened on the firewall, so too must we start moving away from constantly checking in on the software running the system to ensure that things are running smoothly. Like the learning television programs that show us what an assembly line looks like when slowed down, we as humans need to let the system operate at the speed it can reach without slowing down to help us understand what it’s doing.


Tom’s Take

I long for the day when I don’t need to look at debug output or error messages and provide my opinion about what might be wrong. Networks with machine learning are still in their infancy. They take way too much processing power to determine issues. But they are getting better. More and more companies are going to start leveraging distributed intelligence to help make low-level decisions about operations. That means humans can start focusing on design matters more. And that’s the kind of context where we shine more than anything else.