About networkingnerd

Tom Hollingsworth, CCIE #29213, is a former network engineer and current organizer for Tech Field Day. Tom has been in the IT industry since 2002, and has been a nerd since he first drew breath.

The Winds of Change From January

Some quick thoughts on networking from my last couple of weeks at Networking Field Day 17 and Tech Field Day Extra at Cisco Live Europe:

  • Cisco is in the middle of turning a big ship away from hardware. All their innovation is coming in the software side of the house. Big announcements around network assurance. It’s not enough any more to do the things. Now you need to prove they were done and show your work. Context and Intent only work if you can quantitatively show that they were applied.
  • Containers are still a thing. Cisco has a new container platform. I also had the chance to chat with a startup called AppOrbit that’s doing some interesting things around containers but including storage and networking. They should be primed for some announcements soon, so stayed tuned for that!
  • Automation is cool again. Well, maybe it never stopped being cool. But thanks to Extreme Networks and Juniper people are really hopping on the train to talk more about removing the limitations of the CLI and doing it with tools like Slack. Check out Lindsay Hill and Matt Oswalt showing this off to people in some finely crafted demos.
  • 2018 is the year that the CLI dies. Sure, we’ll go with that. Between Slack and Github and even Cisco’s push to drive ACI through literally everything we’re going to see more and more people configuring networks with a mouse instead of a keyboard. Which is a bit crazy when you think about it, but it’s not so far fetched as you might think compared to the way people are configuring AWS right now. I dare you to find the CLI for AWS’s switches in your control panel.
  • Lastly, change is inevitable. People reading through the above items may say to themselves that their job is going to away. They may worry that they’re going to be an old fuddy duddy before they know it. If you never want to change, that’s fine. As Truman Boyes said this week: https://twitter.com/trumanboyes/status/961785937993846789 But if you want to really succeed and move along, you can’t be afraid to change. You need to pick up new skills and learn new things. Oceans and rivers don’t erode mountains because they are there. They wear them down because they are incapable of moving and changing. Change is thrust upon them.

Tom’s Take

Go out and make a change this week. Do something different. Use a different treadmill for your workout. Visit a store you’ve never seen before. Place yourself in a different situation and see how you respond to it. Then come back to your desk and look at your work. Look at containers and automation with new eyes. I bet it will look a lot less scary and lot more fun to you. Don’t be afraid of change. Embrace it and grow.

 

Advertisements

Is ACI Coming For The CLI?

I’m soon to depart from Cisco Live Barcelona. It’s been a long week of fun presentations. While I’m going to avoid using the words intent and context in this post, there is one thing I saw repeatedly that grabbed my attention. ACI is eating Cisco’s world. And it’s coming for something else very soon.

Devourer Of Interfaces

Application-Centric Infrastructure has been out for a while and it’s meeting with relative success in the data center. It’s going up against VMware NSX and winning in a fair number of deals. For every person that I talk to that can’t stand it I hear from someone gushing about it. ACI is making headway as the tip of the spear when it comes to Cisco’s software-based networking architecture.

Don’t believe me? Check out some of the sessions from Cisco Live this year. Especially the Software-Defined Access and DNA Assurance ones. You’re going to hear context and intent a lot, as those are the key words for this new strategy. You know what else you’re going to hear a lot?

Contract. Endpoint Group (EPG). Policy.

If you’re familiar with ACI, you know what those words mean. You see the parallels between the data center and the push in the campus to embrace SD-Access. If you know how to create a contract for an EPG in ACI, then doing it in DNA Center is just as easy.

If you’ve never learned ACI before, you can dive right in with new DNA Center training and get started. And when you finally figured out what you’re doing, you can not only use those skills to program your campus LAN. You can extend them into the data center network as well thanks to consistent terminology.

It’s almost like Cisco is trying to introduce a standard set of terms that can be used to describe consistent behaviors across groups of devices for the purpose of cross training engineers. Now, where have we seen that before?

Bye Bye, CLI

Oh yeah. And, while you’re at it, don’t forget that Arista “lost” a copyright case against Cisco for the CLI and didn’t get fined. Even without the legal ramifications, the Cisco-based CLI has been living on borrowed time for quite a while.

APIs and Python make programming networks easy. Provided you know Python, that is. That’s great for DevOps folks looking to pick up another couple of libraries and get those VLANs tamed. But it doesn’t help people that are looking to expand their skillset without leaning an entirely new language. People scared by semicolons and strict syntax structure.

That’s the real reason Cisco is pushing the ACI terminology down into DNA Center and beyond. This is their strategy for finally getting rid of the CLI across their devices. Now, instead of dealing with question marks and telnet/SSH sessions, you’re going to orchestrate policies and EPGs from your central database. Everything falls into place after that.

Maybe DNA Center does some fancy Python stuff on the back end to handle older devices. Maybe there’s even some crazy command interpreters literally force-feeding syntax to an ancient router. But the end goal is to get people into the tools used to orchestrate. And that day means that Cisco will have a central location from which to build. No more archaic terminal windows. No more console cables. Just the clean purity of the user interface built by Insieme and heavily influenced by Cisco UCS Director.


Tom’s Take

Nothing goes away because it’s too old. I still have a VCR in my house. I don’t even use it any longer. It sits in a closet for the day that my wife decides she wants to watch our wedding video. And then I spend an hour hooking it up. But, one of these days I’m going to take that tape and transfer it to our Plex server. The intent is still the same – my wife gets to watch videos. But I didn’t tell her not to use the VCR. Instead, I will give her a better way to accomplish her task. And on that day, I can retire that old VCR to the same pile as the CLI. Because I think the ACI-based terminology that Cisco is focusing on is the beginning of the end of the CLI as we know it.

Legacy IT Is Not A Monument

During Networking Field Day 17, there was a lot of talk about legacy IT constructs, especially as they relate to the cloud. Cloud workloads are much better when they are new things with new applications and new processes. Existing legacy workloads are harder to move to the cloud, especially if they require some specific Java version or special hardware to work properly.

We talk a lot about how painful legacy IT is. So why do we turn it into a monument that spans the test of time?’

Keeping Things Around

Most monuments that we have from ancient times are things that we never really intended to keep. Aside from the things that were supposed to be saved from the beginning, most iconic things were never built to last. Even things like the Parthenon or the Eiffel Tower. These buildings were always envisioned to be torn down sooner or later.

Today, we can’t imagine a world without those monuments. We can’t conceive of a time without them. And, depending on the amount of time that elapsed between the building or creation of things and the decision to preserve it there could be irreparable damage. Yet that just adds to the charm.

Now, apply those factors to legacy IT. We have software that is outdated. We have applications that need old Java versions to run correctly. Or outdated DLLs. Or some other kind of thing that could cause complication or damage to our systems. Yet, we can’t bear to part with our old familiar IT. Maybe it was a UI change. Maybe it never really ran correctly on a new operating system. Or maybe the new version cost so much to upgrade that it was just cheaper to keep hacking the old thing to work slightly better each time.

Rather than examining how we could replicate the workload or find a better, quicker way to do things, we find ourselves building legacy IT into a preserved monument. We freeze the software or hardware and we never update it. We build crazier and more complicated solutions to keep something running that is well past the retirement date.

View Only

The other complication of legacy IT is that we use the programs but we never really do more than look at them. We don’t focus on the data or the process. Instead, we just plug things into a system and run what we need to run. I remember working for IBM and having to enter my weekly timesheet twice. Why? Because the new timesheet system was a Java app that ran on Windows 2000. But the system that paychecks were generated from for hourly employees (like me) was run from an AS/400 terminal window. So, after I spent half an hour entering my time for the week, I had to spend another half hour entering those exact same time entries into a console.

Would it have been easier to replicate the functions of the terminal program in Java and make time entry a single thing? Sure. Would it be easier for everything to integrate and reduce time for employees? Sure. But the people that had been using the console program for the last decade had it down. They could enter their time in a few minutes. In fact, even though the Java program was more precise for time increments most employees hated it. They’d rather use an imprecise and outdated program because it was faster and more familiar. Even though they had to manually edit their timesheet after the fact because the newer reason codes weren’t loaded in the old system.

Read-only legacy IT makes everyone’s life miserable. All kinds of crazy patches and hacks are necessary to make it run correctly with new functions. And no matter what the replacement solution is something that people will hate. Simply because it’s not the old system.

Hands Off

This, for me, is the hardest part of legacy IT monuments. Once it works, NEVER TOUCH IT AGAIN. You can’t migrate, upgrade, or move a machine. You can’t get newer hardware that’s under support. The number of times that I’ve had to buy parts from Ebay to fix broken legacy systems is much, much to high.

Now, we have to worry about what happens when the system never comes back. Or when we push it past the breaking point for some legacy app. Look at the shift from 32-bit applications to 64-bit. We’re in the transition process and yet, still, there are people that have some old application or hardware device that can’t run on newer software. Once we force a cutoff, we have to find a way to build band-aids that people can use to make a decade-old thing work.

This hands-off mentality is also part of the reason why cloud migration projects fail. Even if you can get 90% of the software in your environment to work the way you want, the odds are that the remaining 10% is composed of legacy applications that haven’t been retired because they are mission critical and very finicky. They won’t migrate. They might even be running in VMs that have to emulate old OSes because they can’t ever be upgraded. And those kinds of old familiar pets are the ones that take too much of your time.


Tom’s Take

Monuments are old buildings. We keep them around because they remind us of how things used to be. They might be falling apart. They may take more time to keep up but people love to see them and enjoy them for the nostalgia. Legacy IT is not that. It’s a headache. It’s a pain to have read-only apps that we can never change because we don’t want to or can’t get them running on something new. Rather than building them into a static monument, we need to retire them and find a way to build something new. Because no matter how beautiful they may be, no legacy IT project will ever stand the test of time like the Parthenon.

Can Routing Be Oversimplified?

I don’t know if you’ve had a chance to see this Reddit thread yet, but it’s a funny one:

We eliminated routing protocols from our network!

Short non-clickbait summary: We deployed SD-WAN and turned off OSPF. We now have a /16 route for the internal network and a default route to the Internet where a lot of our workloads were moved into the cloud.

Bravo for this networking team for simplifying their network to this point. All other considerations aside, does this kind of future really bode well for SD-WAN?

Now You See Me

As pointed out in the thread above, the network team didn’t really get rid of their dynamic routing protocols. The SD-WAN boxes that they put in place are still running BGP or some other kind of setup under the hood. It’s just invisible to the user. That’s nothing new. Six years ago, Ivan Pepelnjak found out Juniper QFabric was running BGP behind the scenes too.

Hiding the networking infrastructure from the end user is nothing new. It’s a trick that has been used for years to allow infrastructures to be tuned and configured in such a way as to deliver maximum performance without letting anyone tinker with the secret sauce under the hood. You’ve been using it for years whether you realize it or not. Have MPLS? Core BGP routing is “hidden” from you. SD-WAN? Routing protocols are running between those boxes. Moved a bunch of workloads to AWS/Azure/GCE? You can better believe there is some routing protocol running under that stack.

Making things complex for the sake of making them hard to work on is foolish. We’ve spent decades and millions of dollars trying to make things easy. If you don’t believe me, look at the Apple iPhone. That device is a marvel at hiding all the complexity underneath. But, it also makes it really hard to troubleshoot when things go wrong.

Building On Shoulders

SD-WAN is doing great things for networking. I can remember years ago the thought of turning up a multi-site IPSec VPN configuration was enough to give me hives, let alone trying to actually do it. Today, companies like Viptela, VeloCloud, and Silver Peak make it easy to do. They’re innovating on top of the stack instead of inside it.

So much discussion in the community happens around building pieces of the stack. We spend time and effort making a better message protocol for routing information exchange. Or we build a piece of the HTTP stack that should be used in a bigger platform. We geek out about technical pieces because that’s where our energy feels the most useful.

When someone collects those stack pieces and tries to make them “easy”, we shout that company down and say that they’re hiding complexity and making the administrators and engineers “forget” how to do the real work. We spend more time focusing on what’s hidden and not enough on what’s being accomplished with the pieces. If you are the person that developed the fuel injection system in a car, are you going to sit there and tell Ford and Chevrolet than bundling it into a automotive platform is wrong?

So, while the end goal of any project like the one undertaken above is simplification or reducing problems because of less complex troubleshooting it is not a silver bullet. Hiding complexity doesn’t make it magically go away. Removing all your routing protocols in favor of a /16 doesn’t mean your routing networking runs any better. It means that your going to have to spend more time trying to figure out what went wrong when something does break.

Ask yourself this question: Would you rather spend more time building out the network and understand every nook and cranny of it or would you rather learn it on the fly when you’re trying to figure out why something isn’t working the way that it should? The odds are very good that you’re going to put the same amount of time into the network either way. Do you want to front load that time? Or back load it?


Tom’s Take

The Reddit thread is funny. Because half the people are dumping on the poster for his decision and the rest are trying to understand the benefits. It surely was created in such a way as to get views. And that worked admirably. But I also think there’s an important lesson to learn there. Simplicity for the sake of being simple isn’t enough. You have to replace that simplicity with due diligence. Because the alternative is a lot more time spent doing things you don’t want to do when you really don’t want to be doing them.

Chipping Away At Technical Debt

We’re surrounded by technical debt every day. We have a mountain of it sitting in distribution closets and a yard full of it out behind the data center. We make compromises for budget reasons, for technology reasons, and for political reasons. We tell ourselves every time that this is the last time we’re giving in and the next time it’s going to be different. Yet we find ourselves staring at the landscape of technical debt time and time again. But how can we start chipping away at it?

Time Is On Your Side

You may think you don’t have any time to work on the technical debt problem. This is especially true if you don’t have the time due to fixing problems caused by your technical debt. The hours get longer and the effort goes up exponentially to get simple things done. But it doesn’t have to be that way.

Every minute you spend trying to figure out where a link goes or a how a server is connected to the rest of the pod is a minute that should have been spent documenting it somewhere. In a text document, in a picture, or even on the back of a napkin stuck to the faceplate of said server!

I once watch an engineer get paid an obscene amount of money to diagram a network. Because it had never been done. He didn’t modify anything. He didn’t change a setting. All he did was figure out what went where and what the interface addresses were. He put it in Visio and showed it to the network administrators. They were so thrilled they had it printed in poster size and framed! He didn’t do anything above and beyond showing them what they already had.

It’s easy to say that you’re going to document something. It’s hard to remember to do it. It’s even harder when you forget to do it and have to spend an hour remembering why you did it in the first place. So it’s better to take a minute to write in an interface description. Or perhaps a comment about why you did a thing the way you did it. Every one of those minutes chips away a little more technical debt.

Ask yourself this question next time: If I spent less time solving documentation problems, what could I do to reduce debt overall?

Mad As Hell And Not Going To Design It Anymore

The other sources of technical debt are money and politics. And those usually come into play in the design phase. People jockeying for position or trying to save money for something else. I don’t mind honest budget saving measures. What I mind is trying to cut budgets for things like standing desks and quarterly bonuses. The perception of IT is that it is still a budget sink with no real purpose to support the business. And then the email server goes down. That’s why IT’s real value comes out.

When you design a network, you need to take into account many factors. What you shouldn’t take into account is someone dictating how things are going to be just because they like the way that something looks or sounds. When I worked at a VAR there were always discussions like this. Who should we use for the switching hardware? Which company has a great rebate this quarter? I have a buddy that works at Vendor X so we should throw him a bone this time.

As the networking professional, you need to stand up for your decisions. If you put the hard work and effort into making something happen you shouldn’t sit down and let it get destroyed by someone else’s bad idea. That’s how technical debt starts. And if you let it start this way you’ll never be able to get rid of it. Both because it will be outside of your control and because you’ll always identify someone else as the source of the problem.

So, how do you get mad and fix it before it starts? Know your project cold. Know every bolt and screw. Justify every decision. Point out the mistakes when they happen in front of people that don’t like to be embarrassed. In short, be a jerk. The kind of self-important, overly cerebral jerk that is smug because he knows he’s right. And when people know you’re right they’ll either take your word for things or only challenge you when they know they’re right.

That’s not to say that you should be smug and dismissive all the time. You’re going to be wrong. We all are. Well, everyone except my wife. But for the rest of us, you need to know that when you’re wrong you need to accept it and discuss the situation. No one is 100% infallible, but someone that can’t recognize when they are wrong can create a whole mountain range of technical debt before they’re stopped.


Tom’s Take

Technical Debt is just a convenient term for something we all face. We don’t like it, but it’s part of the job. Technical Debt compounds faster than any finance construct. And it’s stickier than student loans. But, we can chip away at it slowly with some care. Take a little time to stop the little problems before they happen. Take a minute to save an hour. And when you get into that next big meeting about a project and you see a huge tidal wave of technical debt headed your way, stand up and let it be known. Don’t be afraid to speak out. Just make sure you’re right and you won’t drown.

2018 Is The Year Of Writing Everything

Welcome back to a year divisible by 2! 2018 is going to be a good year through the power of positive thinking. It’s going to be a fun year for everyone. And I’m going to do my best to have fun in 2018 as well.

Per my tradition, today is a day to look at what is going to be coming in 2018. I don’t make predictions, even if I take some shots at people that do. I also try not to look back to heavily on the things I’ve done over the past year. Google and blog searches are your friend there. Likely as not, you’ve read what I wrote this year and found one or two things useful, insightful, or amusing. What I want to do is set up what the next 52 weeks are going to look like for everyone that comes to this blog to find content.

Wearing Out The Keyboard

The past couple of years has shown me that the written word is starting to lose a bit of luster for content consumers. There’s been a bit push to video. Friends like Keith Townsend, Robb Boardman, and Rowell Dionicio have started making more video content to capture people’s eyes and ears. I myself have even noticed that I spend more time listening to 10-minute video recaps of things as opposed to reading 500-600 words on the subject.

I believe that my strengths lie in my writing. I’m going to continue to produce content around that here for the foreseeable future. But that’s not so say that I can’t dabble. With the help of my media-minded co-worker Rich Stroffolino, we’ve created a weekly video discussion called the Gestalt IT Rundown. It’s mostly what you would expect from a weekly video series. Rich and I find 2-3 tech news articles and make fun of them while imparting some knowledge and perspective. We’ve been having a blast recording them and we’re going to be bringing you more of them every Wednesday in 2018.

That’s not to say that my only forays on GestaltIT.com are going to be video focused. I’m also going to be writing more there as well. I’ve had some articles go up there this year that talked about some of the briefings that I’ve been taking from networking companies. That is going to continue through 2018 as well. I’ve also had some great long form writing, including my favorite article from last year about vendors and VARs. I’m going to continue to post articles on Gestalt IT in 2018 as well as posting them here. I’m going to need to figure out the right mix of what to put up for my day job and what to put up here for my Batman job. If there’s a kind of article you prefer to see here please make sure to let me know so I can find a way to make more of them.

Lastly, in 2018 I’m going to try some new things as well with regard to my writing. I’m going to write more coverage of the events I attend. Things like Cisco Live, WLPC, and other industry events will have some discussions of what goes on there for people that can’t go or are looking for the kind of information that you can’t get from an event website. I’m also going to try to find a way to incorporate some of my more detailed pieces together as reports for consumption. I’m not sure how this is going to happen, but I figured if I put it down here as a goal then I would be required to make it happen next year.


Tom’s Take

I like writing. I like teaching and informing and telling the occasional bad joke about BGP. It’s all because I get regular readers that want to hear what I have to say that I continue to do what I do. Rather than worry about the lack of content and coverage that I’m seeing in the community, I’ve decided to do my part to put out more. It is a challenging exercise to come up with new ideas every week and put them out for the world to see. But as long as folks like you keep reading what I have to say I’ll keep writing it all down.

Programming Unbound

I’m doing some research on Facebook’s Open/R routing platform for a future blog post. I’m starting to understand the nuances a bit compared to OSPF or IS-IS, but during my reading I got stopped cold by one particular passage:

Many traditional routing protocols were designed in the past, with a strong focus on optimizing for hardware-limited embedded systems such as CPUs and RAM. In addition, protocols were designed as purpose-built solutions to solve the particular problem of routing for connectivity, rather than as a flexible software platform to build new applications in the network.

Uh oh. I’ve seen language like this before related to other software projects. And quite frankly, it worries me to death. Because it means that people aren’t learning their lessons.

New and Improved

Any time I see an article about how a project was rewritten from the ground up to “take advantage of new changes in protocols and resources”, it usually signals to me that some grad student decided to rewrite the whole thing in Java because they didn’t understand C. It sounds a bit cynical, but it’s not often wrong.

Want proof? Check out Linus Torvalds and his opinion about rewriting the Linux kernel in C++. Spoiler alert – “C++ is a horrible language.” And it gets more colorful from there. Linus has some very valid points about C++ that have been debated by lots of communities for the past ten years. But the fact remains that he has decided that completely rewriting the entire kernel in C++ is an exercise in futility.

In today’s world, we’re faced with a multitude of programming languages fighting for our attention. We’re evolved past FORTRAN, COBOL, C, and C++. We now live a world of Python, C#, J#, R, Ruby, and dozens more. And those don’t even include the languages that aren’t low-level and are more scripting. Every one of these languages was designed to solve a particular problem. And every one of them is begging to be used.

But it’s not enough that we have ten ways to write a function today. What’s more troublesome is that we’ve forgotten why certain languages were preferred over others in the past. We’ve forgotten that things used to be done the way they were done because we had no other alternatives. I can remember studying for my Novell CNE and taking 50-649, wherein Novell kept referring to OSPF as an “expensive” protocol to use on a NetWare server. At the time that test was created, OSPF was expensive from a CPU cycle standpoint. If the server was doing other things besides running a routing protocol you might see a potential impact if the CPU was slowed. And having OSPF calculations interrupted because someone was doing an FTP transfer could be expensive indeed.

No Wasted Space

More to the point, when people are faced with a limitation they have to be creative and concise. And nowhere is that more apparent than in E.T. the Extraterrestrial for the Atari 2600. Infamously, Howard Scott Warshaw had just one month to write a video game that would go on to be blasted critically, considered one of the worst of all time, and be blamed for the Video Game Crash of 1983. Yet, as one fan discovered years later when he set out to “fix” the game’s code, as bad is it may have been it was very well coded. From the article:

…it’s unlikely that Howard Scott Warshaw (the developer) included some useless code for us to replace…

So, a programmer for an outdated video game system had a month to code a complex game and managed to do it in such a way as to leave very little empty space to insert code patches? And yet my Facebook app on my iPhone requires how much space?!?

All joking aside, the issue with E.T. wasn’t code quality. The problems with the game have been documented over the years, but almost no one blames Warshaw’s coding capabilities. That’s because Warshaw was working within his limitations. He couldn’t magically make the Atari 2600 cartridge bigger. He couldn’t increase the CPU size on the system. He worked within him limitations and made the best game that he could make.

Now, let’s look at an article about Open/R and see some of Facebook’s criticisms of OSPF and IS-IS:

We didn’t want to get bogged down in discussions over the lower-level protocol details, such as frame formatting and handshakes…

While it might sound heavyweight compared with OSPF and ISIS, which use their own “lightweight” transports, we haven’t found this to be an issue in modern networking hardware…

Whether or not they were intended to be taken as such, these are some pretty interesting knocks against OSPF and IS-IS. What Facebook is essentially saying is that they didn’t want to worry about building the low level parts of the messaging system, so they picked something off the shelf. They also built it to be more resource intensive than necessary because they didn’t need to compromise when running it on Six Pack and Wedge.

So long as your routers have ample CPU cycles and memory, Open/R will run just fine. But how many people out there are running a data center server board in their edge router? How many routers out there have to take a reduced BGP table because they don’t have enough memory to fit the entire global IPv4 routing table in memory, let alone IPv6? If resources are infinite and time is irrelevant than building your protocols the way you want is of no consequence. But as soon as you add constraints to the equation, like support for older hardware or limited memory, you have to start making compromises to make things work.


Tom’s Take

I’m not saying that Open/R is a bad routing protocol. I’m going to save that analysis for a later time. But I do take a bit of umbrage with Facebook’s idea that OSPF and IS-IS are a bit outdated simply because they were programmed for a different era. If they were really that inept they would have been replaced or expanded by now. The fact that twenty-somethings got a bug to rewrite a routing protocol because they could and threw all caution to the wind with regard to resource usage should be a cautionary tale to any programmer out there. Never assume that you have more space than you need. Train yourself to do more with less. And be ready to compromise in case the worst case scenario becomes reality.