When Redundancy Strikes

Networking and systems professionals preach the value of redundancy. When we tell people to buy something, we really mean “buy two”. And when we say to buy two, we really mean buy four of them. We try to create backup routes, redundant failover paths, and we keep things from being used in a way that creates a single point of disaster. But, what happens when something we’ve worked hard to set up causes us grief?

Built To Survive

The first problem I ran into was one I knew how to solve. I was installing a new Ubiquiti Security Gateway. I knew that as soon as I pulled my old edge router out that I was going to need to reset my cable modem in order to clear the ARP cache. That’s always a thing that needs to happen when you’re installing new equipment. Having done this many times, I knew the shortcut method was to unplug my cable modem for a minute and plug it back in.

What I didn’t know this time was that the little redundant gremlin living in my cable modem was going to give me fits. After fifteen minutes of not getting the system to come back up the way that I wanted, I decided to unplug my modem from the wall instead of the back of the unit. That meant the lights on the front were visible to me. And that’s when I saw that the lights never went out when the modem was unplugged.

Turns out that my modem has a battery pack installed since it’s a VoIP router for my home phone system as well. That battery pack was designed to run the phones in the house for a few minutes in a failover scenario. But it also meant that the modem wasn’t letting go of the cached ARP entries either. So, all my efforts to make my modem take the new firewall were being stymied by the battery designed to keep my phone system redundant in case of a power outage.

The second issue came when I went to turn up a new Ubiquiti access point. I disconnected the old Meraki AP in my office and started mounting the bracket for the new AP. I had already warned my daughter that the Internet was going to go down. I also thought I might have to reprogram her device to use the new SSID I was creating. Imagine my surprise when both my laptop and her iPad were working just fine while I was hooking the new AP up.

Turns out, both devices did exactly what they were supposed to do. They connected to the other Meraki AP in the house and used it while the old one was offline. Once the new Ubiquiti AP came up, I had to go upstairs and unplug the Meraki to fail everything back to the new AP. It took some more programming to get everything running the way that I wanted, but my wireless card had done the job it was supposed to do. It failed to the SSID it could see and kept on running until that SSID failed as well.

Finding Failure Fast

When you’re trying to troubleshoot around a problem, you need to make sure that you’re taking redundancy into account as well. I’ve faced a few problems in my life when trying to induce failure or remove a configuration issue was met with difficulty because of some other part of the network or system “replacing” my hard work with a backup copy. Or, I was trying to figure out why packets were flowing around a trouble spot or not being inspected by a security device only to find out that the path they were taking was through a redundant device somewhere else in the network.

Redundancy is a good thing. Until it causes issues. Or until it makes your network behave in such a way as to be unpredictable. Most of the time, this can all be mitigated by good documentation practices. Being able to figure out quickly where the redundant paths in a network are going is critical to diagnosing intermittent failures.

It’s not always as easy as pulling up a routing table either. If the entire core is down you could be seeing traffic routing happening at the edge with no way of knowing the redundant supervisors in the chassis are doing their job. You need to write everything down and know what hardware you’re dealing with. You need to document redundant power supplies, redundant management modules, and redundant switches so you can isolate problems and fix them without pulling your hair out.
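
One way to keep that documentation useful is to record the topology as data and ask it which paths exist between two points. The sketch below is only an illustration: the device names and links are made up, and it models cabling rather than live link state, so treat it as a starting point and not a discovery tool.

```python
from typing import Dict, List, Set

# Hypothetical topology for illustration: device name -> directly connected neighbors.
TOPOLOGY: Dict[str, Set[str]] = {
    "edge-fw": {"core-a", "core-b"},
    "core-a": {"edge-fw", "core-b", "access-1"},
    "core-b": {"edge-fw", "core-a", "access-1"},
    "access-1": {"core-a", "core-b"},
}


def all_paths(topology: Dict[str, Set[str]], src: str, dst: str) -> List[List[str]]:
    """Return every loop-free path from src to dst using a depth-first search."""
    paths: List[List[str]] = []

    def walk(node: str, visited: List[str]) -> None:
        if node == dst:
            paths.append(visited)
            return
        # Sort neighbors so the output order is stable from run to run.
        for neighbor in sorted(topology.get(node, ())):
            if neighbor not in visited:
                walk(neighbor, visited + [neighbor])

    walk(src, [src])
    return paths


if __name__ == "__main__":
    for path in all_paths(TOPOLOGY, "access-1", "edge-fw"):
        print(" -> ".join(path))
```

Any destination that shows up with more than one path is a place where traffic can quietly route around the device you think you are troubleshooting.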

Tom’s Take

I rarely got to work with redundant equipment when I was installing it through E-Rate. The government doesn’t believe in buying two things to do the job of one. So, when I did get the opportunity to work with redundant configurations I usually found myself trying to figure out why things were failing in a way I couldn’t predict. After a while, I realized that I needed to start making my own notes and doing some investigation before I actually started troubleshooting. And even then, like my cable modem’s battery, I ran into issues. Redundancy keeps you from shooting yourself in the foot. But it can also make you stab yourself in the eye in frustration.


Cisco Live CAE and Guest Keynote Announcements

As you may have heard by now, there have been a few exciting announcements from Cisco Live 2018 regarding the venue for the customer appreciation event and the closing keynote speakers.

Across The Universe

The first big announcement is the venue for the CAE. When you’re in Orlando, there are really only two options for the CAE. You either go to the House of the Mouse or you go to Universal Studios. The last two times that Cisco Live has gone to Orlando it has been to Universal. 2018 marks the third time!

Cisco is going big this year. They’ve rented the ENTIRE Universal Studios park. Not just the backlot. Not just the side parks. The WHOLE thing. You can get your fix on the Transformers ride, visit Harry Potter, or even partake of some of the other attractions as well. It’s a huge park with a lot of room for people to spread out and enjoy the scenery.

That’s not all. The wristband that gets you into the CAE also gets you access to Islands of Adventure before the full park opens! You can pregame the party by hanging out at Hogwarts, going to Jurassic Park, or joining your favorite superheroes for a picture or two for the kids. Access to Islands of Adventure isn’t exclusive, so you’ll be there with all the other tourists from around the world but it’s a great place to hang out before the party gets going!

Note that this year you will need the new Imagine pass or the Party Pass Add-on in order to access the CAE. There is no standalone social pass option or social add-on for conference passes.

Welcome To The Future

The closing keynote speakers have also been announced. Dr. Michio Kaku and Amy Webb will be on stage talking about the future of technology and how it will be impacting our society. Given the keynote that Rowan Trollope delivered during Cisco Live Barcelona, this comes as no surprise to me.

Cisco is very much trying to show that they are getting back on the leading edge of technology and driving innovation in the market. The problem with being the “800lb Gorilla” is that you’re also big and difficult to move. IBM faced the same problem before they shed their legacy and became a leaner, more future-focused company. Others that tried to follow in their footsteps were less successful and either split apart or got scooped up in mergers.

Cisco is going through a transition period after the departure of John Chambers. Chuck Robbins is turning the ship as quickly as possible, but there need to be more outward signs that things are being done to look toward a future where hardware isn’t as important as the innovation happening in software. By bringing in two of the most well known futurists in science and technology, Cisco is sending a signal to their audience of users and investors that the focus is going to be on emerging technology. This is a bit of a gamble for Cisco but it’s hoped that things pay off for them.

Note that there are also going to be other speakers in the Big Ideas Theater on the World of Solutions floor during the event. Access to the World of Solutions is restricted behind the new Imagine pass or full conference pass. There is no Social Pass option, and the party pass add-on does not grant access to the World of Solutions floor.

Tom’s Take

The Cisco Live CAE in Orlando is pretty much a known thing. It’s nice to see all of Universal this year with access to the new attractions at Islands of Adventure. People should be able to enjoy being outside in the Florida humidity instead of the blistering Las Vegas inferno. As well, the rides are going to be fun for a large number of the attendees.

It’s also good to see future-looking keynote speakers that are going to give their viewpoints on things that will impact our lives. With two speakers, I’m expecting another “interview” style closing keynote, which isn’t quite my favorite. But this is a step in the right direction. Here’s hoping that these additions to the event make Cisco Live a great show for those that will be attending.

Memcached DDoS – There’s Still Time to Save Your Mind

In case you haven’t heard, there’s a new vector for Distributed Denial of Service (DDoS) attacks out there right now and it’s pretty massive. The first mention I saw this week was from Cloudflare, where they detailed that they were seeing a huge influx of traffic from UDP port 11211. That’s the port used by memcached, a database caching system.

Surprisingly, or not, there were thousands of companies that had left UDP/11211 open to the entire Internet. And, by design, memcached responds to anyone that queries that port. Also, carefully crafted requests can trigger massive responses. In Cloudflare’s testing they were able to send a 15 byte packet and get a 134KB response. Because this protocol is UDP, memcached will also respond to forged packets, and that has made life miserable for Cloudflare and, now, Github, which got blasted with the largest DDoS attack on record.
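
To put those numbers in perspective, the amplification factor is just the response size divided by the request size. Here is a quick sketch of the arithmetic, assuming “134KB” means 134 × 1024 bytes (the figure could just as easily be 134,000 bytes, which changes the result only slightly):

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Ratio of bytes returned to bytes sent for a single request."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


# Figures quoted from Cloudflare's test; 1 KB = 1024 bytes is an assumption.
factor = amplification_factor(15, 134 * 1024)
print(f"Roughly {factor:,.0f}x amplification")
```

Either way you count the kilobytes, every byte an attacker sends comes back as roughly nine thousand bytes aimed at the victim.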

How can you fix this problem in your network? There are many steps you can take, whether you are a system admin or a network admin:

  • Go to Shodan and see if you’re affected. Just plug in your company’s IP address ranges and have it search for UDP 11211. If you pop up, you need to find out why memcached is exposed to the internet.
  • If memcached isn’t supposed to be publicly available, you need to block it at the edge. Don’t let anyone connect to UDP port 11211 on any device inside your network from outside of it. That sounds like a no-brainer, but you’d be surprised how many firewall rules aren’t carefully crafted in that way.
  • If you have to have memcached exposed, make sure you talk to that team and find out what their bandwidth requirements are for the application. If it’s something small-ish, create a policer or QoS policy that rate limits the memcached traffic so there’s no way it can exceed that amount. And if that amount is more than 100Mbit of traffic, you need to have an entirely different discussion with your developers.
  • From Cloudflare’s blog, you can disable UDP on memcached on startup by adding the -U 0 flag. Make sure you check with the team that uses it before you disable it, though, so you don’t break something.
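
If you’d rather check from the command line than from Shodan, the probe is simple enough to sketch yourself. The example below assumes the memcached UDP framing (an 8-byte header followed by the text command) and sends a single stats request, which happens to be exactly 15 bytes. The function names are my own, and you should only point this at addresses you are responsible for.

```python
import socket
import struct

MEMCACHED_PORT = 11211


def build_stats_probe(request_id: int = 0) -> bytes:
    """Build a memcached UDP 'stats' request: 8-byte frame header plus the command."""
    # Header fields, each 16 bits in network byte order:
    # request ID, sequence number, total datagrams in the message, reserved.
    header = struct.pack("!HHHH", request_id, 0, 1, 0)
    return header + b"stats\r\n"


def answers_memcached_udp(host: str, timeout: float = 2.0) -> bool:
    """Return True if host replies to a memcached 'stats' query over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_stats_probe(), (host, MEMCACHED_PORT))
            data, _ = sock.recvfrom(65535)
        except OSError:
            # Timeout, ICMP unreachable, or a name that does not resolve.
            return False
    # Skip the 8-byte frame header and look for a stats line in the reply.
    return data[8:].startswith(b"STAT ")
```

A False result only means nothing answered within the timeout. UDP is lossy and a filter in the path can drop the reply, so a quiet port is a good sign but not proof.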

Tom’s Take

Exposing unnecessary services to the Internet is asking for trouble. Given an infinite amount of time, a thousand monkeys on typewriters will create a Shakespearean play that details how to exploit that service for a massive DDoS attack. The nature of protocols to want to help make things easier doesn’t make our jobs easier. They respond to what they hear and deliver what they’re asked. We have to prevent bad actors from getting away with things in the network and at the system level because application developers rarely ask “may I” before turning on every feature to make users happy.

Make sure you check your memcached settings today and immunize yourself from this problem. If Github got blasted with 1.3Tbps of traffic this week there’s no telling who’s going to get hit next.

Wireless Doctors

Wireless is a complicated thing. Even when you try to distill it down to networking basics on the wired side of the access point, you still have a very hard problem to solve on the radio side. Even I’ve talked in the past about how wireless is now considered a “solved” problem. But, the more I interact with wireless professionals and the more I think about the problem, the more I realize the issue isn’t that IT departments think wireless is solved. It’s that they don’t appreciate the value of a specialist.

The Last Place Doctor

There’s an old joke that goes, “What do you call the person that graduated last in their medical school class? Doctor.” Professionals spend a lot of their time learning a tradecraft and practicing it to get better. And it’s not just doctors. So do plumbers, electricians, and teachers. Anyone that has ever tried to do any of these trades will tell you that the basics are capable of being figured out by the average non-professional, but the details are a huge leap.

You’d never assume that being able to put a Band-Aid on a scrape would qualify you to do brain surgery. Or that changing a lightbulb would mean you can rewire a house. Why is it then that most IT people think that knowing the radiation pattern of an access point antenna qualifies them to just hang them wherever they want with no regard for coverage or interference?

Specialists are an important part of society. They spend their time learning things so that they can do them better than anyone else. You’d never argue that a basketball player would make a good offensive lineman. That’s a physical difference and a difference in skillset that can’t translate between the two. So why do we do it in IT?

For wireless specifically, people think that it’s easy because it “just works”. Time and time again when I talk to my friends in the wireless community they tell me that it’s far too easy to put up wireless that works badly. And because it’s functional, people just go with it. Whether it’s a hotel or a public venue or a coffee shop, people are content to tolerate bad design and terrible implementation. Yet, when someone steps in to try and help them fix the problem there is hesitation on the part of the customer to make it happen.

The Specials

Why are customers hesitant to make their wireless work correctly with help from a specialist? The answer could be that people think wireless is so easy that paying someone to do it is too much of an expense. It could also be that when the wireless professional starts talking about the pieces that are “hard”, namely the radio design, antenna selection, and site survey, that people just tune out the jargon and think they are getting sold a bill of goods. Yet, when the doctor starts telling them about all the procedures that need to be done to get them healthy they won’t bat an eye.

Wireless professionals need to be treated just like any other specialized professional that provides a service to help people. It may not be brain surgery or arguing a case before the Supreme Court, but it’s a piece of specialized knowledge that they spend their time practicing to the point where they are very, very good at it.

Wireless professionals also need to make sure they justify their value when the conversation inevitably turns toward costing too much or not adding any value to a design or survey. Stand up for what you do! Tell the customer that your skills are crucial to make this deployment work properly. It’s always amazing to me that no one bats an eye at someone when they say they need time to figure out OSPF in a network but when a wireless professional says they need to do a site survey there is a huge discussion about it.

Customers too need to realize that their wireless deployments are easier to accomplish when the proper resources are allocated to make them happen quickly and efficiently. You can either pay a professional with years of experience to make it happen or you can grab a CWNA book and start learning the trade. But thinking that wireless is an easy problem does a disservice to both the wireless professional community and your users.

Tom’s Take

The more I talk to my wireless friends, the more I realize that wireless is hard. I spend a lot of time with the packets on the wire side and I understand how those things work. But I also don’t have to worry about trees, microwaves, Bluetooth, or any one of the hundred other problems that can interfere with an otherwise perfect deployment. The people that know more than me have learned over years how to do it right. And so, when someone asks me if I can do a big wireless installation for them I don’t have a problem going to one of my friends. Because I’d rather the wireless doctors do it right than me doing it halfway.

Making Alexa Tech Demos Useful

Technology always marches on. People want to see the latest gadgets doing amazing things, whether it be flying electric cars or telepathic eyeglasses. Our society is obsessed with the Jetsons and the look of the future. That’s why we’re developing so many devices to help us get there. But it’s time for IT to reconsider how they are using one of them for a purpose far from the original idea.

Speaking For The People

By all accounts, the Amazon Echo is a masterful device. It’s a smart speaker that connects to an Amazon service that offers you a wide variety of software programs, called skills, to enhance what you can do with it. I have several of these devices that were either given out as conference attendance gifts or obtained from other giveaways.

I find the Echo speaker a fascinating thing. It’s a good speaker. It can play music through my phone or other Bluetooth-connected devices. But, I don’t really use it for that purpose. Instead, I use the skills to do all kinds of other things. I play Jeopardy! frequently. I listen to news briefings and NPR on a regular basis. I get weather forecasts. My son uses the Echo to check simple fraction math when he’s doing homework. My daughter uses it to time her math facts practice.

It would appear that the power behind an Echo speaker lies not in the hardware, but in the software stack built on it. It’s so powerful that most people don’t even refer to the speaker as an “Echo”, but instead as “Alexa”, the default name used to activate the listening service. People ask Alexa all kinds of things. And Alexa provides answers or ways to get the answers. It’s so popular that modern IT organizations have started to get in on the action.

Alexa, Tell Me A Story

Enterprise IT vendors are starting to show off their programming skills by creating Alexa skills to integrate with their software. Ostensibly, this would be to showcase how the platform has a rich API that allows for a large amount of information to be queried all at once. Users could ask Alexa to give them a readout of what’s going on without having to log into the system at any given time. I’ve personally seen demos that ask Alexa to find out who is using all the network bandwidth, what is the status of the wireless network, and even details on protocols.

However, there is a huge downside to using Alexa for this purpose. Without specifically crafted questions, you get a readout that is like trying to drink from a monotone firehose. Alexa is just like any computer system in that it will dutifully read you whatever input is given to it. That’s fine if you want the kind of detail that you get in your average computer monitoring system. But, if you’re using a smart speaker to cut down on the amount of information you are processing, you probably don’t want the entire text of the system read out to you.

I always fall back on the idea of people trying to make small talk. When you ask someone how their day is going, you typically aren’t looking for a recitation of their entire schedule from start to finish with all the details they can pack in. You’re looking for a simple answer: good, okay, or not good. That’s the basic level of information that anyone wants about anything. More specific queries can drill down into other areas, but the initial conversation needs to be easy to parse in one or two sentences.

Another issue with using Alexa for technical demos is how the system parses IP addresses and DNS names. Alexa will dutifully read an IP address to you one digit at a time, including periods between octets. That can be annoying for addresses in the old Class C range with lots of 3-digit numbers. Also, you’d have to write them down to get any kind of coherence about which system was being discussed, which does kind of eliminate the usefulness of getting information from a speaker. With DNS names, Alexa will try to read the name of the system as if it were a real word. That can produce results that range from hilarious to downright unintelligible. It makes trying to understand these briefings much harder.

So, how can this be fixed? The answer is actually quite easy. Instead of making your Alexa skill read off every possible piece of information with a simple query, have it give a basic readout. Possible answers look like:

  • Things look good now
  • There are a couple of trouble spots to look at. Would you like to know more?
  • There are quite a few problems. I suggest logging in to learn more.

Each of these answers gives the user a chance to understand things. A “good” response means everything is good and you don’t need to know more. An “okay” or middle response says there are only a couple of issues that could be summarized here. A “bad” response tells the user that there is too much information to be easily digested in an audio briefing and that they should log into the system to see more. That gives the user the option of getting more compact information in a format that makes sense to them rather than listening to the speaker drone on for 5 minutes about all the errors in the system.
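
The logic behind that kind of readout is tiny. Here is a rough sketch of how a skill’s backend might pick the response. It is not Alexa Skills Kit code, and both the function name and the cutoff for “a couple” are assumptions you would tune for your own system.

```python
def spoken_summary(issue_count: int, few_threshold: int = 3) -> str:
    """Pick a one-sentence status readout based on how many issues are open."""
    if issue_count < 0:
        raise ValueError("issue_count cannot be negative")
    if issue_count == 0:
        return "Things look good now."
    if issue_count <= few_threshold:
        # Short enough to summarize out loud if the user asks for more.
        return ("There are a couple of trouble spots to look at. "
                "Would you like to know more?")
    # Too much for an audio briefing, so point the user at the real dashboard.
    return "There are quite a few problems. I suggest logging in to learn more."


if __name__ == "__main__":
    for count in (0, 2, 40):
        print(count, "->", spoken_summary(count))
```

The point of the design is that the speaker only ever starts with one sentence, and the detail stays behind a follow-up question or a login.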

Tom’s Take

Technology is a wonderful thing. Technology used for the proper purpose is even better. The Amazon Echo is a great tool that helps advance our understanding of what people listen to and how they use machine learning and AI to ask questions and get answers. But, ultimately the Echo is a consumer device built around consumer questions. It’s up to enterprise tech vendors to write skills that give us the chance to interact with the speaker, not just get an information dump first thing in the morning. Enterprise tech vendors need to understand that they are what makes Alexa’s briefing useful. Select the information users will be receiving and package it in such a way as to make it digestible.

The Winds of Change From January

Some quick thoughts on networking from my last couple of weeks at Networking Field Day 17 and Tech Field Day Extra at Cisco Live Europe:

  • Cisco is in the middle of turning a big ship away from hardware. All their innovation is coming in the software side of the house. Big announcements around network assurance. It’s not enough any more to do the things. Now you need to prove they were done and show your work. Context and Intent only work if you can quantitatively show that they were applied.
  • Containers are still a thing. Cisco has a new container platform. I also had the chance to chat with a startup called AppOrbit that’s doing some interesting things around containers but including storage and networking. They should be primed for some announcements soon, so stay tuned for that!
  • Automation is cool again. Well, maybe it never stopped being cool. But thanks to Extreme Networks and Juniper people are really hopping on the train to talk more about removing the limitations of the CLI and doing it with tools like Slack. Check out Lindsay Hill and Matt Oswalt showing this off to people in some finely crafted demos.
  • 2018 is the year that the CLI dies. Sure, we’ll go with that. Between Slack and Github and even Cisco’s push to drive ACI through literally everything we’re going to see more and more people configuring networks with a mouse instead of a keyboard. Which is a bit crazy when you think about it, but it’s not so far fetched as you might think compared to the way people are configuring AWS right now. I dare you to find the CLI for AWS’s switches in your control panel.
  • Lastly, change is inevitable. People reading through the above items may say to themselves that their job is going away. They may worry that they’re going to be an old fuddy duddy before they know it. If you never want to change, that’s fine. As Truman Boyes said this week: https://twitter.com/trumanboyes/status/961785937993846789 But if you want to really succeed and move along, you can’t be afraid to change. You need to pick up new skills and learn new things. Oceans and rivers don’t erode mountains because they are there. They wear them down because the mountains are incapable of moving and changing. Change is thrust upon them.

Tom’s Take

Go out and make a change this week. Do something different. Use a different treadmill for your workout. Visit a store you’ve never seen before. Place yourself in a different situation and see how you respond to it. Then come back to your desk and look at your work. Look at containers and automation with new eyes. I bet it will look a lot less scary and a lot more fun to you. Don’t be afraid of change. Embrace it and grow.


Is ACI Coming For The CLI?

I’m soon to depart from Cisco Live Barcelona. It’s been a long week of fun presentations. While I’m going to avoid using the words intent and context in this post, there is one thing I saw repeatedly that grabbed my attention. ACI is eating Cisco’s world. And it’s coming for something else very soon.

Devourer Of Interfaces

Application-Centric Infrastructure has been out for a while and it’s meeting with relative success in the data center. It’s going up against VMware NSX and winning in a fair number of deals. For every person that I talk to that can’t stand it I hear from someone gushing about it. ACI is making headway as the tip of the spear when it comes to Cisco’s software-based networking architecture.

Don’t believe me? Check out some of the sessions from Cisco Live this year. Especially the Software-Defined Access and DNA Assurance ones. You’re going to hear context and intent a lot, as those are the key words for this new strategy. You know what else you’re going to hear a lot?

Contract. Endpoint Group (EPG). Policy.

If you’re familiar with ACI, you know what those words mean. You see the parallels between the data center and the push in the campus to embrace SD-Access. If you know how to create a contract for an EPG in ACI, then doing it in DNA Center is just as easy.
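
If those terms are new to you, the relationship between them fits in a few lines. The sketch below is a conceptual toy, not the APIC or DNA Center API. It only assumes the basic idea that one EPG provides a contract, another consumes it, and traffic between groups is denied unless a shared contract permits it. Real contracts carry subjects and filters, which are flattened here into a set of TCP ports, and every name is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Contract:
    """Simplified contract: a name plus the TCP ports it permits."""
    name: str
    allowed_tcp_ports: FrozenSet[int]


@dataclass
class EndpointGroup:
    """Simplified EPG: a named group that provides and consumes contracts."""
    name: str
    provides: Set[str] = field(default_factory=set)  # contract names offered
    consumes: Set[str] = field(default_factory=set)  # contract names used


def is_allowed(consumer: EndpointGroup, provider: EndpointGroup,
               port: int, contracts: Dict[str, Contract]) -> bool:
    """Permit traffic only when a contract links consumer to provider on this port."""
    shared = consumer.consumes & provider.provides
    return any(port in contracts[name].allowed_tcp_ports for name in shared)


# Hypothetical policy: web servers may reach the database on TCP 3306 only.
CONTRACTS = {"web-to-db": Contract("web-to-db", frozenset({3306}))}
WEB = EndpointGroup("web", consumes={"web-to-db"})
DB = EndpointGroup("db", provides={"web-to-db"})

if __name__ == "__main__":
    print(is_allowed(WEB, DB, 3306, CONTRACTS))
    print(is_allowed(WEB, DB, 22, CONTRACTS))
```

The shape is the same whether the groups are servers in a data center or users on a campus LAN, which is the whole appeal of reusing the vocabulary.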

If you’ve never learned ACI before, you can dive right in with new DNA Center training and get started. And when you’ve finally figured out what you’re doing, you can not only use those skills to program your campus LAN. You can extend them into the data center network as well thanks to consistent terminology.

It’s almost like Cisco is trying to introduce a standard set of terms that can be used to describe consistent behaviors across groups of devices for the purpose of cross training engineers. Now, where have we seen that before?

Bye Bye, CLI

Oh yeah. And, while you’re at it, don’t forget that Arista “lost” a copyright case against Cisco for the CLI and didn’t get fined. Even without the legal ramifications, the Cisco-based CLI has been living on borrowed time for quite a while.

APIs and Python make programming networks easy. Provided you know Python, that is. That’s great for DevOps folks looking to pick up another couple of libraries and get those VLANs tamed. But it doesn’t help people that are looking to expand their skillset without learning an entirely new language. People scared by semicolons and strict syntax structure.

That’s the real reason Cisco is pushing the ACI terminology down into DNA Center and beyond. This is their strategy for finally getting rid of the CLI across their devices. Now, instead of dealing with question marks and telnet/SSH sessions, you’re going to orchestrate policies and EPGs from your central database. Everything falls into place after that.

Maybe DNA Center does some fancy Python stuff on the back end to handle older devices. Maybe there are even some crazy command interpreters literally force-feeding syntax to an ancient router. But the end goal is to get people into the tools used to orchestrate. And reaching that goal means that Cisco will have a central location from which to build. No more archaic terminal windows. No more console cables. Just the clean purity of the user interface built by Insieme and heavily influenced by Cisco UCS Director.

Tom’s Take

Nothing goes away because it’s too old. I still have a VCR in my house. I don’t even use it any longer. It sits in a closet for the day that my wife decides she wants to watch our wedding video. And then I spend an hour hooking it up. But, one of these days I’m going to take that tape and transfer it to our Plex server. The intent is still the same – my wife gets to watch videos. But I didn’t tell her not to use the VCR. Instead, I will give her a better way to accomplish her task. And on that day, I can retire that old VCR to the same pile as the CLI. Because I think the ACI-based terminology that Cisco is focusing on is the beginning of the end of the CLI as we know it.