Tedium is the enemy of productivity. The fastest way for a task to not be done is to make it long, boring, and somewhat complicated. People who feel that something is tedious or repetitive are the ones more likely to marginalize a task. And I think I speak for the entire industry when I say that there is no task more tedious and boring than documentation. So how can we fix it?
Tell Me What You Did
I’m not a huge fan of documentation. When I decide on a plan of action, I rarely write it down step-by-step unless I’m trying to train someone. Even then, it looks more like notes with keywords instead of a narrative to follow. It’s a habit that has been borne out of years of firefighting in networks and calls to “do it faster”. The essential items of a task are refined and reduced until all that remains is the work and none of the ancillary items, like documentation.
Based on my previous life as a network engineer, I can honestly say that I’m not alone in this either. My old company made lots of money doing network discovery engagements. Sometimes these came because the previous admins walked out the door with no documentation. Other times, it was simply because the network had changed so much since the last person made any notes that what was going on didn’t resemble anything like what they thought it was supposed to look like.
This happens everywhere. It doesn’t take many instances of an network or systems professional telling themselves, “Oh, I’ll write it down later…” for later to never come. Devices get added, settings get changed, and not one word is ever written down. That’s the kind of chaos that causes disorganization at best and outages at worst. And I doubt there’s any networking pro out there that hasn’t been affected by bad documentation at one time or another.
So, how do we fix documentation? It’s tedious for sure. Requiring it as part of the process just invites people to find ways around it. And good documentation takes time. Is there a way to combine the lack of time, lack of requirement, and repetition and make documentation something that is done again? I think there is. And it requires a little help from process.
Not Too Late To Automate
Automation is a big thing right now. SDN is driving it. Network complexity is practically requiring it. Yet networking professionals are having a hard time embracing it. Why?
In part, networking pros don’t like to spend hours solving a problem that can be done in minutes. If you don’t believe me, watch one of the old SNL Nick Burns sketches. Nick is more likely to tell you to move than tell you how to fix your problem. Likewise, if a network pro is spending four hours writing an automation script that is supposed to execute a change that can be made in 20 minutes, they’re not going to want to do it. It’s just the nature of the job and the desire of the network professional to make every minute count.
So, how can we drive adoption of automation? As it turns out, automating documentation can be a huge driver. Automation of tedious tasks is exactly the thing that scripting and automation was designed to solve. Instead of focusing on the automation of the task, like adding VLANs to a set of switches, focus on the ability of the system to create documentation on the fly from the change.
Let’s walk through an example. In order for documentation to matter, it has to answer the 5 Ws. How can we automate that?
Let’s start with Who. Automation can create documentation saying user Hollingsworth made a change through an automated process. That helps the accounting side of the house figure out the person making changes in the network. If that person is actually a script, the Who can be changed to reflect that it was an automated process called by a person related to a change ticket. That gives everyone the ability to track the changes back to a given problem. And it can all be pulled in without user intervention.
What is also an easy automation task. List the configuration being applied. At first, the system can simply list the configuration to be programmed. But for menial and repetitive tasks like VLAN additions you can program the system with a real description like “Adding VLANs to $Switch to support $ticket”. Those variables can be autopopulated based on the work to be done. Again, we reference a ticket number in order to prove that these changes are coming from somewhere.
When is also critical. Are these changes happening in a maintenance window? Or did someone check them in in the middle of the day because they won’t cause any problems? (SPOILER ALERT: They will) By required a timestamp for changes, you can track which professionals are being cavalier with their change management. You can also find out if someone is getting into the system after hours to cause problems or attempt to compromise things. Even if the cause of the change is “immediately” due to downtime or emergency, knowing why it had to be checked in right away is a clue to finding problems that recur in the network.
Where is a two-pronged reason. It’s important to check where the changes are going to be applied. Is it going to be done to all switches in the organization? Or just a set in a remote office. Sanity checking via documentation will keep you from bricking your entire organization in one fell swoop. Likewise, knowing where the change is being checked in from is important. Is a remote office trying to change config on HQ switches? Is a remote engineer dialed in making changes related to an open support case? Is someone from a foreign nation making changes via VPN at 4:30am local time? In every case, you’d really want to know what’s going on before those changes get made.
Why is the one that will trip up everything. If you don’t believe me, I’d like to give you the top two reasons why Windows Server 2003 is shut down and rebooted with the shutdown justification dialog box:
- JUST ****ing SHUT DOWN!!!!!
People don’t like justifying their decisions. Even when I worked for Gateway 2000 on their national help desk, our required call documentation was a bit spotty when it came to justification for changes. Why did you decide to FDISK and reload? Why are you going into the registry to fix the icon colors? Change justification is half of documentation. It gives people something to audit. It gives people a way to look at things and figure out why you started down the path of a particular reasoning for problem solving. It also provides context for you after the fact when you can’t figure out why you did it the way you did.
Automation isn’t going to take away your job. Automation is going to do the jobs you hate doing. It’s going to make your life easier to concentrate on the tasks that need to be done by freeing you from the tasks that should be done and aren’t. If we can make automation document our networks for just six months, I think you’ll find the value in programming things to work this way. I also think you’ll be happier with the level of detail on your network. And once you can prove the value of automating just one task to your teams, I’m sure they’ll see the value of increasing automation all around.