Documentation | The Networking Nerd

Rocket Science

Most people have a negative outlook on helpdesk work. They see it as entry-level and not worth admitting to. They also don’t quite understand just how hard it is to do this kind of work. It may be different today than when I did it twenty years ago thanks to advances in technology, but there is one group of people that know just how hard it is to do remote work on a system you can’t physically touch:

NASA Engineers

If you think it’s easy to just jump in front of a keyboard and fix an issue with Outlook or Chrome, try doing it without being able to see what you’re looking at. Now try and describe what you want to do without technical jargon and without using the term “thingie”. Now do it with the patience that it takes to tell someone more than once what they need to do and how to get them to read you the error message verbatim so you can figure out what’s actually going on.

Inbound phone support is a lot like fixing a space probe. You have a lag time between commands being sent and confirmation. You can’t make it too complicated because the memory of the thing you’re working with isn’t unlimited. And you have to hope what you did actually fixes the issue because you may not get a second chance at it. It might sound a touch hyperbolic but I’d argue that phone support teaches the skills needed to communicate effectively and concisely to people less technical than you. You know, executives.

Joking aside, being able to distill the issue from incomplete information and formulate a response while simultaneously describing it in non-technical terms to ease implementation is the kind of skill you don’t read out of a book. Don’t believe me? Call your parents and help them update their DNS settings. Unless your parents are in IT I doubt it’s going to be a fun call. Tech is obtuse on purpose to keep people from playing with it. The people that can cut through the jargon and keep users going are magicians.

Visualize the Result

I have a gift for analogies. They aren’t always 100% accurate but I can usually relate some kind of a technology to some real-world non-technical thing to help users understand what’s going on. That gift was developed in the six months I spent trying to explain bad capacitors, startup TSRs, and exactly what an FDISK, Format, Reload process did to the computer. It’s a super valuable skill to have outside of tech too.

Go ask your doctor to explain how the endocrine system works in non-medical terms. I bet they can because they have a way to describe most of the body functions in terms that someone who has never taken human anatomy can understand. You can’t explain technical subjects with technical terms to just anyone. And even the technical people that know what you’re saying often have a hard time visualizing what you’re talking about. That’s how we ended up with the “Internet as a series of tubes” meme.

Where does this skill come in handy? How about teaching advanced tech to your junior staff? Or educating the executives in a board meeting? Or even just trying to tell the users in your organization why it’s not the network this time? If you can only talk in technical jargon without giving analogies or helping them visualize what you’re saying you will only end up with confused angry users that think you’re being patronizing.

The helpdesk forces you to get better at being descriptive with shorter words and more visual descriptions. I’d argue that my helpdesk time made my writing better because it reminds me not to be loquacious when I should be economical in my word choices and sentence lengths. If you can’t help your user figure out what you’re doing they’ll never learn how to be better at telling people what they need support for.

Documenting All the Things

The last skill that the helpdesk instilled in me is documentation. I’m a bad note taker. I get distracted (thanks ADHD) and often forget to write down important things. Helpdesk work is a place where you absolutely need to write everything down. Ask questions and record info. Dig into error messages and find out when they started happening. Listen to what people are telling you and don’t jump to conclusions until you’ve written it all down.

I once had a call where the user told me that they couldn’t get into their computer. I thought this was an easy fix. I spent ten minutes walking through the Windows XP login process. Nothing worked. I was getting frustrated and was about to reboot to safe mode and start erasing password hashes. By chance the user mentioned that they saw the pop up for the buzzer and it still wasn’t working. After I questioned them on this I realized that they could log in to WinXP just fine. The “buzzer popup” was AOL’s login screen. They had an issue with the modem, not the operating system. If I’d asked more questions about when the problem started and what they saw instead of just jumping to the wrong conclusion I could have saved a ton of pain and wasted effort.

Likewise, when you’re doing IT work you have to write it all down. Troubleshooting makes a lot of sense here but so does implementation or other kinds of design work. If you’re doing a wireless survey on site and forget to write down what the walls are made out of you may find yourself with an ineffective design or, worse, having to spend extra time and money fixing it on the fly because those reinforced concrete walls are way better at signal attenuation than you recalled from a spotty memory.

Get in the habit of taking good notes from the start and summarizing them when needed to weed out the working things from the non-working things. Honestly, having a record of all the steps you took to arrive at your conclusion helps in the future if the issue happens again or you find yourself at a brick wall and need to retrace your steps to figure out where you took the wrong turn. If you record it all you won’t need to spend too much effort scratching your head to figure out what you were thinking.

Tom’s Take

Every job teaches you skills you can carry forward. Even the worst job in the world teaches you something you can take with you. Some jobs have a negative stigma and really shouldn’t. Like Swift said above you should accentuate the positive aspects of your journey. Don’t look at the helpdesk as a slog through the trenches. See yourself as a NASA rocket scientist that can talk to normal people and document like a champion. That’s a career anyone would be proud to have.

Imagine you’re deep into a massive issue. You’ve been troubleshooting for hours trying to figure out why something isn’t working. You’ve pulled in resources to help and you’re on the line with the TAC to try and get a resolution. You know this has to be related to something recent because you just got notified about it yesterday. You’re working through logs and configuration settings trying to gain insights into what went wrong. That’s when the TAC engineer hits you with an armor-piercing question:

When did this start happening?

Now you’re sunk. When did you first start seeing it? Was it happening before and no one noticed? Did a tree fall in the forest and no one was around to hear the sound? What is the meaning of life now?

It’s not too hard to imagine the above scenario because we’ve found ourselves in it more times than we can count. We’ve started working on a problem and traced it back to a root cause only to find out that the actual inciting incident goes back even further than that. Maybe the symptoms just took a while to show up. Perhaps someone unknowingly “fixed” the issue with a reboot or a process reload over and over again until it couldn’t work any longer. How do we find ourselves in this mess? And how do we keep it from happening?

Quirky Worky

Glitches happen. Weird bugs crop up temporarily. It happens every day. I had to reboot my laptop the other day after being up for about two months because of a series of weird errors that I couldn’t resolve. Little things that weren’t super important that eventually snowballed into every service shutting down and forcing me to restart. But what was the cause? I can’t say for sure and I can’t tell you when it started. Because I just ignored the little glitches until the major ones forced me to do something.

Unless you work in security you probably ignore little strange things that happen. Maybe an application takes twice as long to load one morning. Perhaps you click on a link and it pops up a weird page before going through to the actual website. You could even see a storage notification in the logs for a router that randomly rebooted itself for no reason in the middle of the night. Occasional weirdness gets dismissed by us because we’ve come to expect that things are just going to act strangely. I once saw a quote from a developer that said, “If you think building a steady-state machine is easy, just look at how many issues are solved with a reboot.”

We tend to ignore weirdness unless it presents itself as a more frequent issue. If a router reboots itself once we don’t think much about it. The fourth time it reboots itself in a day we know we’ve got a problem we need to fix. Could we have solved it when the first reboot happened? Maybe. But we also didn’t know we were looking at a pattern of behavior either. Human brains are wired to look for patterns of things and pick them out. It’s likely an old survival trait of years gone by that we apply to current technology.
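To make the "fourth reboot in a day" idea concrete, here is a minimal Python sketch that counts recorded reboots per device per day and flags the ones that hit a threshold. The device names, the timestamps, and the threshold of four are all assumptions for illustration, and the events are assumed to already exist as (device, ISO timestamp) pairs pulled from whatever record you keep.

```python
from collections import Counter
from datetime import datetime


def devices_over_threshold(events, threshold=4):
    """Return {(device, date): count} for days with at least `threshold` reboots.

    `events` is an iterable of (device_name, iso_timestamp) pairs.
    """
    per_day = Counter(
        (device, datetime.fromisoformat(stamp).date())
        for device, stamp in events
    )
    return {key: count for key, count in per_day.items() if count >= threshold}


# Hypothetical sample data: one router reboots four times in a day.
events = [
    ("edge-rtr-1", "2024-03-05T01:10:00"),
    ("edge-rtr-1", "2024-03-05T06:42:00"),
    ("edge-rtr-1", "2024-03-05T13:05:00"),
    ("edge-rtr-1", "2024-03-05T19:30:00"),
    ("core-sw-2", "2024-03-05T02:00:00"),
]

flagged = devices_over_threshold(events)
for (device, day), count in flagged.items():
    print(f"{device} rebooted {count} times on {day}")
```

The counting only works if the first reboot was recorded along with the fourth, which is the whole point of writing the one-off weirdness down.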

Notice that I said “unless you work in security”. That’s because the security folks have learned over many years and countless incidents that nothing is truly random or strange. They look for odd remote access requests or strange configuration changes on devices. They wonder why random IP addresses from outside the company are trying to access protected systems. Security professionals treat every random thing as a potential problem. However, that kind of behavior also demonstrates the downside of analyzing every little piece of information for potential threats. You quickly become paranoid about everything and spend a lot of time and energy trying to make sense out of potential nonsense. Is it any wonder that many security pros find themselves jumping at every little shadow in case it’s hiding a beast?

Middle Ground of Mentions

On the one hand, we have systems people that dismiss weirdness until it’s a pattern. On the other we have security pros that are trying to make patterns out of the noise. I’m sure you’re probably wondering if there has to be some kind of middle ground to ensure we’re keeping track of issues without driving ourselves insane.

In fact, there is a good habit that you need to get into. You need to write it all down somewhere. Yes, I’m talking about the dreaded documentation monster. The kind of thing that no one outside of developers likes to do. The mean, nasty, boring process of taking the stuff in your brain and putting it down somewhere so someone can figure out what you’re thinking without the need to read your mind.

You have to write it down because you need to have a record to work from if something goes wrong later. One of the greatest features I’ve ever worked with that seems to be ignored by just about everyone is the Windows Shutdown Reason dialog box in Windows Server 2003 and above. Rebooting a box? You need to write in why and give a good justification. That way if someone wants to know why the server was shut off at 11:15am on a Tuesday they can examine the logs. Unfortunately in my experience the usual reason for these shutdowns was either “a;lkjsdfl;kajsdf” or “because I am doing it”. Which aren’t great justifications for later.

You don’t have to be overly specific with your documentation but you need to give enough detail so that later you can figure out if this is part of a larger issue. Did an application stop responding and need to be restarted? Jot that down. Did you need to kill a process to get another thing running again? Write down that exact sentence. If you needed to restart a router and you ended up needing to restore a configuration you need to jot that part down too. Because you may not even realize you have an issue until you have documentation to point it out.
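If a notebook isn't your thing, even a tiny script can capture that level of detail. The sketch below is one possible shape for it, not a standard tool: the field names, the device name, and the one-JSON-object-per-line file format are all assumptions made for the example.

```python
import json
from datetime import datetime, timezone


def make_entry(device, action, reason, when=None):
    """Build one note: what was touched, what was done, why, and when."""
    when = when or datetime.now(timezone.utc)
    return {
        "when": when.isoformat(timespec="seconds"),
        "device": device,
        "action": action,
        "reason": reason,
    }


def append_entry(path, entry):
    """Append the note to a plain text file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


# Hypothetical example: the kind of sentence worth writing down verbatim.
entry = make_entry(
    "branch-rtr-7",
    "restart",
    "Router came up blank after restart; restored configuration from backup",
    when=datetime(2024, 3, 5, 11, 15, tzinfo=timezone.utc),
)
print(json.dumps(entry, indent=2))
```

Calling `append_entry("notes.jsonl", entry)` after each fix would build up a file you can search or sort by device later. The file name is only a placeholder.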

I can remember doing a support call years ago with a customer and in conversation he asked me if I knew much about Cisco routers. I chuckled and said I knew a bit. He said that he had one that he kept having to copy the configuration files to every time it restarted because it came up blank. He even kept a console cable plugged into it for just that reason. Any CCNA out there knows that’s probably a config register issue so I asked when it started happening. The person told me at least a year ago. I asked if anyone had to get into the router because of a forgotten password or some other lockout. He said that he did have someone come out a year and a half prior to reset a messed up password. Ever since then he had to keep putting the configuration back in. Sure enough, the previous tech hadn’t reset the config register. One quick fix and the customer was very happy to not have to worry about power outages any longer.
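For anyone who hasn't run into this one, the check is easy to script. The sketch below assumes you've already captured the text of `show version` from the router, and it relies on details that aren't in the story above: that the output contains a line starting with "Configuration register is", that 0x2102 is the usual normal value, and that bit 6 (0x0040) tells the router to ignore the startup configuration, which is what password recovery typically sets. Treat it as an illustration and check the documentation for your platform.

```python
import re

NORMAL_REGISTER = 0x2102    # assumed typical value: load the startup config at boot
IGNORE_CONFIG_BIT = 0x0040  # bit 6: ignore the startup config (set for password recovery)


def config_register(show_version_output):
    """Pull the configuration register value out of captured `show version` text."""
    match = re.search(
        r"Configuration register is (0x[0-9A-Fa-f]+)", show_version_output
    )
    return int(match.group(1), 16) if match else None


def ignores_startup_config(register):
    """True when the register tells the router to skip its saved configuration."""
    return bool(register & IGNORE_CONFIG_BIT)


# Hypothetical captured output, trimmed to the one line that matters here.
sample = "Configuration register is 0x2142\n"

register = config_register(sample)
print(hex(register), ignores_startup_config(register))
```

A router left at 0x2142 will happily boot with a blank configuration every time, which matches what the customer was seeing.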

Documenting when things happen means you can build a timeline per device or per service to understand when things are acting up. You don’t even need to figure it out yourself. The magic of modern systems lies in machine learning. You may think to yourself that machine learning is just fancy linear regression at this point and you would be right more often than not. But one thing linear regression is great at doing is surfacing patterns of behavior for specific data points. If your router reboots on the third Wednesday of every month precisely at 3:33am the ML algorithms will pick up on that and tell you about it. But that’s only if your system catches the reboot through logs or some other record keeping. That’s why you have to document all your weirdness. Because the ML systems can’t analyze what they don’t know about.
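You don't need actual machine learning to see why the record matters. The sketch below swaps in plain counting: it groups recorded reboots by device, by which weekday of the month they fell on, and by time of day, then reports any slot that repeats. The device name, the sample dates, and the minimum of three repeats are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def recurring_slots(events, min_repeats=3):
    """Group reboots by (device, nth weekday of month, weekday, HH:MM).

    Returns only the slots seen at least `min_repeats` times, with their dates.
    """
    slots = defaultdict(list)
    for device, stamp in events:
        when = datetime.fromisoformat(stamp)
        nth = (when.day - 1) // 7 + 1  # 3 means the third such weekday of the month
        slot = (device, nth, DAYS[when.weekday()], when.strftime("%H:%M"))
        slots[slot].append(when.date())
    return {slot: dates for slot, dates in slots.items() if len(dates) >= min_repeats}


# Hypothetical timeline: three third-Wednesday reboots plus one unrelated restart.
events = [
    ("edge-rtr-1", "2024-01-17T03:33:00"),
    ("edge-rtr-1", "2024-02-02T14:05:00"),
    ("edge-rtr-1", "2024-02-21T03:33:00"),
    ("edge-rtr-1", "2024-03-20T03:33:00"),
]

patterns = recurring_slots(events)
for (device, nth, weekday, time_of_day), dates in patterns.items():
    print(f"{device}: {weekday} #{nth} of the month at {time_of_day}, seen {len(dates)} times")
```

Remove any one of those three entries from the record and the pattern disappears, no matter how clever the analysis on top of it is.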

Tom’s Take

I love writing. And I hate documentation. Documentation is boring, stuffy, and super direct. It’s like writing book reports over and over again. I’d rather write a fun blog post or imagine an exciting short story. However, documentation of issues is critical to modern organizations because these issues can spiral out of hand before you know it. If you don’t write it down it didn’t happen. And you need to know when it happened if you hope to prevent it in the future.

Helpdesk Skills Fit the Bill

Document The First Time, Every Time
