APIs and Department Stores

This week I tweeted something from a discussion we had during Networking Field Day that summed up my feelings about the state of documentation of application programming interfaces (APIs):

I laughed a bit as I wrote it because I’ve worked in department stores like Walmart in the past and I know the reasons why they tend to move things around. Comparing that to the way that APIs are documented is an interesting exercise in how people think about things like new capabilities and notification of changes.

Branding Exercises

In case you weren’t aware, everything in your average department store is carefully planned out. The things placed in the main aisles are decided weeks in advance because of the high traffic there. The items at the ends of the aisles, or endcaps, are there to highlight high-margin items or things popular enough that customers will seek them out. The makeup of the rest of the store is determined by a lot of metrics.

There are a few restrictions that have to be taken into account. In department stores with grocery departments, the refrigerated sections have to sit along the outside walls because of their power requirements. Within those restrictions, planners put the high-traffic items in the back of the store so everyone has to walk past all the other stuff in hopes they might buy it. That’s why the milk and bread and electronics areas are always the furthest from the front of the store. You’re likely headed there anyway, so why not make you work for it?

Every few months the store employees receive new floor plans that move items to different locations. Why would they do that? Those metrics help the planners understand where people are more likely to purchase certain items. They also tell the planners which items should be located together, which is how the whole aisle is laid out. Once everything gets moved they start gathering new metrics to find out how well the plan worked, aside from the inevitable grumbles. Even with fair warning, no one is happy to find out something has moved.

Who Needs Documentation?

You might think that, on the surface, there’s not much similarity between a department store aisle and an API. One is a fixture. The other is code. Yet, think about how APIs are typically changed and you might find some of the parallels. Change is a constant in the world of software development, after all.

The APIs that we used a decade ago are almost assuredly different from the ones we program for today. Every year brings updated methods, new functions, and even changes in programming languages or access methods. How can you be sure that developers are accessing the latest and greatest technology that you’ve put into place? You can’t just ask them. Instead, you have to deprecate the methods that you don’t want them to use any longer.
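To make that concrete, here’s a minimal sketch of what deprecation can look like on the provider side. It’s Python with invented function names, not any particular vendor’s API, but it shows the pattern of keeping the old method alive while nudging developers toward its replacement:

```python
import functools
import warnings

def deprecated(replacement):
    """Flag an old API function and point callers at its successor."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__}() is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's code
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(replacement="get_interfaces_v2()")
def get_interfaces():
    # Old shape, kept alive so existing callers don't break overnight
    return ["eth0", "eth1"]

def get_interfaces_v2():
    # New shape, where new development should happen
    return [{"name": "eth0", "up": True}, {"name": "eth1", "up": False}]

# Existing code keeps working, but every call now nags the developer
interfaces = get_interfaces()
```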

Ask any developer writing against an API about deprecation and you’re probably going to hear a string of profanity. Spending time writing a perfectly good piece of software only to have it wrecked by someone’s decision to do things differently is infuriating, to say the least. Solving a hard problem with a novel approach is one thing. Having to do it all over again a month later when a new update is released is something else entirely.

It’s the same fury that you feel when the peanut butter is moved from aisle four to aisle eight. How dare you! It took me a week last time to remember where it was and now you’ve gone and moved it. Just like when I spent all that time learning which methods to query to pull the data I needed for my applications.

No matter how much notice you give or how much you warn people that change is coming they’re always going to be irritated at you for making those changes. It feels like a waste of effort to need to rewrite an interface or to walk a little further in the store to locate the item you wanted. Humans aren’t fond of wasted effort or of needing to learn new things without good reason.

Poor API documentation is only partly to blame for this. Even the most poorly documented API will eventually be mapped out by someone that needs the info. It’s also the fact that the constant change in methods and protocols forces people to spend a significant amount of time learning the same things over and over again for very little gain.

The Light at the End of the Aisle

Ironically enough, both of these kinds of issues are likely to be solved in a similar way. Thanks to the large explosion of people doing their shopping online or with pickup and delivery services there is a huge need to have things more strictly documented and updated very frequently. It’s not enough to move the peanut butter to a better location. Now you need to update your online ordering system so the customers as well as the staff members pulling it for a pickup order can find it quickly and get more orders done in a shorter time.

Likewise, the vast number of programs relying on API calls today necessitates that older versions of functionality be supported for longer and that newer functions be more rigorously tested before implementation. You don’t want to knock out a huge section of your user base because you deprecated something you didn’t want to maintain any longer. Unless you are the only application in the market, creating chaos will just lead to users fleeing for someone that doesn’t upset their apple cart on a regular basis.
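One common way to honor that, sketched below with a hypothetical Flask-style app and made-up routes and device names, is to keep the old version of an endpoint serving alongside the new one instead of ripping it out:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# v1 stays up for existing integrations, even after v2 ships
@app.route("/api/v1/devices")
def devices_v1():
    return jsonify(["switch1", "switch2"])

# v2 carries the new data shape; new clients are pointed here
@app.route("/api/v2/devices")
def devices_v2():
    return jsonify([
        {"name": "switch1", "model": "hypothetical-9300"},
        {"name": "switch2", "model": "hypothetical-9500"},
    ])

if __name__ == "__main__":
    app.run()
```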


Tom’s Take

Documentation is key for us to understand change. We can’t just say we changed something. We have to give warning, ensure that people have seen the warning, tell them we’ve changed it, and then give them some way to transform the old way of things into the new one. And even that might not be enough. However, the pace of change we’re seeing also means that some of these changes may not be required for much longer. With people choosing to order online and never set foot inside the store, the need to rearrange the shelves frequently may become a thing of the past. With new methods and languages being developed so rapidly, it may be faster to move everyone to a new API and leave the old one intact instead of forcing developers to work with technology that is years old at this point. The delicious irony of the people forcing change on us needing to accept change themselves is something I’d happily shop for.

Fast Friday – Podcasts Galore!

It’s been a hectic week and I realized that I haven’t had a chance to share some of the latest stuff that I’ve been working on outside of Tech Field Day. I’ve been a guest on a couple of recent podcasts that I loved.

Art of Network Engineering

I was happy to be a guest on Episode 57 of the Art of Network Engineering podcast. AJ Murray invited me to take part with all the amazing co-hosts. We talked about some fun stuff including my CCIE study attempts, my journey through technology, and my role at Tech Field Day and how it came to be that I went from being a network engineer to an event lead.

The interplay between the hosts and me during the discussion was great. I felt like we could have gone another hour if we’d really wanted to. You should definitely take a listen and learn how I kept getting my butt kicked by the CCIE open-ended questions or what it’s like to be a technical person on a non-technical briefing.

IPv6, Wireless, and the Buzz

I love being able to record episodes of Tomversations on YouTube. One of my latest was all about IPv6 and Wi-Fi 6E. As soon as I hit the button to publish the episode I knew I was going to get a call from my friends over at the IPv6 Buzz podcast. Sure enough, I was able to record an episode with them talking all about the parallels I see between the two technologies.

What I love about this podcast is that these are the experts when it comes to IPv6. Ed and Tom and Scott are the people that I would talk to about IPv6 any day of the week. And having them challenge my assertions about what I’m seeing helps me understand the other side of the coin. Maybe the two aren’t as close as I might have thought at first but I promise you that the discussion is well worth your time.


Tom’s Take

I don’t have a regular podcast aside from Tomversations so I’m not as practiced in the art of discussion as the people above. Make sure you check out those episodes but also make sure to subscribe to the whole thing because you’re going to love all the episodes they record.

Getting Blasted by Backdoors

Open Door from http://viktoria-lyn.deviantart.com/

I wanted to take a minute to talk about a story I’ve been following that’s had some new developments this week. You may have seen an article talking about a backdoor in Juniper equipment that caused some issues. The issue at hand is complicated, and the linked article does a good job of explaining some of the nuance. Here’s the short version:

  • The NSA develops a version of Dual EC random number generation that includes a pretty substantial flaw.
  • That flaw? If you know the secret value used to seed the process you can predict every value it generates, which means you can decrypt any traffic that uses the algorithm (see the toy sketch after this list).
  • NIST proposes the use of Dual EC and makes supporting it a requirement for vendors that want to be included in future work. Don’t support it? You don’t even get considered.
  • Vendors adopt the standard per the requirement but don’t make it the default for some pretty obvious reasons.
  • Netscreen, a part of Juniper, does use Dual EC as part of their default setup.
  • The Chinese APT 5 hacking group figures out the vulnerability and breaks into Juniper to add code to Netscreen’s OS.
  • They use their own seed value, which allows them to decrypt packets being encrypted through the firewall.
  • Hilarity does not ensue and we spend the better part of a decade figuring out what has happened.
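The mechanics are easier to see with a toy stand-in. This sketch is emphatically not Dual EC (the real algorithm is built on elliptic curve points); it’s just a simple generator that shows why a known seed is game over: anyone holding it can regenerate the exact keystream and read the traffic without ever touching a key exchange.

```python
# Toy linear congruential generator standing in for a DRBG.
# NOT Dual EC -- just an illustration that known state = no secrecy.
def toy_prng(state):
    while True:
        state = (1103515245 * state + 12345) % (2**31)
        yield state & 0xFF  # emit one "random" byte at a time

def xor_crypt(data, keystream):
    # XOR stream cipher: the same operation encrypts and decrypts
    return bytes(b ^ next(keystream) for b in data)

seed = 4242  # the value the attacker controls or can derive

ciphertext = xor_crypt(b"routing table update", toy_prng(seed))

# Anyone who knows the seed regenerates the identical keystream
# and recovers the plaintext from passively captured traffic.
recovered = xor_crypt(ciphertext, toy_prng(seed))
assert recovered == b"routing table update"
```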

That any of this even came to light is impressive considering the government agencies involved have stonewalled reporters and it took a probe from a US Senator, Ron Wyden, to get as far as we have in the investigation.

Protecting Your Platform

My readers know that I’m a pretty staunch advocate for not weakening encryption. Backdoors and “special” keys for organizations that claim they need them are a horrible idea. The safest lock is one that can’t be bypassed. The best secret is one that no one knows about. Likewise, the best encryption algorithms are ones that can’t be reversed or calculated by anyone other than the people using them to send messages.

I get that the flood of encrypted communications today is making life difficult for law enforcement agencies all over the world. It’s tempting to make it a requirement to allow them a special code that will decrypt messages to keep us safe and secure. That’s the messaging I see every time a politician wants to compel a security company to create a vulnerability in their software just for them. It’s all about being safe.

Once you create that special key you’ve already lost. As we saw above, the intentions of creating a backdoor into an OS so that we could spy on other people using it backfired spectacularly. Once someone else figured out that you could guess the values and decrypt the traffic they set about doing it for themselves. I can only imagine the surprise at the NSA when they realized that someone had changed the values in the OS and that, while they themselves were no longer able to spy with impunity, someone else could be decrypting their communications at that very moment. If you make a key for a lock someone will figure out how to make a copy. It’s that simple.

We focus so much on the responsible use of these backdoors that we miss the bigger picture. Sure, maybe we can make it extremely difficult for someone in law enforcement to get the information needed to access the backdoor in the name of national security. But what about other nations? What about actors not tied to a political process or bound by oversight from the populace? I’m more scared that someone who actively wishes to do me harm could find a way to exploit something that I was told had to be there for my own safety.

The Juniper story gets worse the more we read into it, but they were just the unlucky player left standing when the music stopped. Any one of the other companies that were compelled to include Dual EC by government order could have drawn the short straw here. It’s one thing to create a known-bad version of software and hope that someone installs it. It’s an entirely different matter to force people to include it. I’m honestly shocked the government didn’t try to mandate that it be used to the exclusion of other algorithms. In some other timeline Cisco or Palo Alto or even Fortinet are having very bad days unwinding what happened.


Tom’s Take

The easiest way to avoid having your software exploited is not to create your own exploit for it. Bugs happen. Strange things occur in development. Even the most powerful algorithms must eventually yield to Moore’s Law or Shor’s Algorithm. Why accelerate the process by cutting a master key? Why weaken yourself on purpose by repeating over and over again that this is “for the greater good”? Remember that the greater good may not include people that want the best for you. If you’re willing to hand them a key to unlock the chaos that we’re seeing in this case, then you have overestimated your value to the process and become the very bad actor you hoped to stop.

Sharing Failure as a Learning Model

Earlier this week there was a great tweet from my friends over at Juniper Networks about mistakes we’ve made in networking:

It got some interactions with the community, which is always nice, but it got me to thinking about how we solve problems and learn from our mistakes. I feel that we’ve reached a point where we’re learning from the things we’ve screwed up but we’re not passing it along like we used to.

Write It Down For the Future

Part of the reason why I started my blog was to capture ideas that had been floating in my head for a while. Troubleshooting steps or perhaps even ideas that I wanted to make sure I didn’t forget down the line. All of it was important to capture for the sake of posterity. After all, if you didn’t write it down did it even happen?

Along the way I found that the posts that got significant traction on my site were the ones that involved mistakes. Something I’d done that caused an issue, or something I needed to look up through a lot of sources that I distilled down into an easy reference. These kinds of posts are the ones that fly right up to the top of the Google search results. They are how people know you. It could be a terminology post like defining trunks. Or perhaps it’s a question about why your SFP modules aren’t working in a switch.

Once I realized that people loved finding posts that solved problems I made sure to write more of them down. If I found a weird error message I made sure to figure out what it was and then put it up for everyone to find. When I documented weird behaviors of BPDUGuard and BPDUFilter that didn’t match the documentation I wrote it all down, including how I’d made a mistake in the way that I interpreted things. It was just part of the experience for me. Documenting my failures and my learning process could help someone in the future. My hope was that someone in the future would find my post and learn from it like I had.

Chit Chat Channels

It used to be that when you Googled error messages you got lots of results from forum sites or Reddit or other blogs detailing what went wrong and how it was fixed. I assume that’s because, just like me, people were doing their research, figuring out what went wrong, and then documenting the process. Today I feel like a lot of that type of conversation is missing. I know it can’t have gone away permanently, because all network engineers make mistakes and solve problems, and someone has to know where that knowledge went, right?

The answer came to me when I read a Reddit post about networking message boards. The suggestions in the comments weren’t about places to go to learn more. Instead, they linked to Slack channels or Discord servers where people talk about networking. That answer made me realize why the discourse around problem solving and learning from mistakes seems to have vanished.

Slack and Discord are great tools for communication. They’re also very private. I’m not talking about gatekeeping or restrictions on joining. I’m talking about the fact that the conversations that happen there don’t get posted anywhere else. You can join, ask about a problem, get advice, try it, see it fail, try something else, and succeed all without ever documenting a thing. Once you solve the problem you don’t have a paper trail of all the things you tried that didn’t work. You just have the best solution that you did and that’s that.

You know what you can’t do with Slack and Discord? Search them through Google. The logs are private. The free tiers age messages out after a certain limit. All that knowledge disappears into thin air. Unlike the Wisdom of the Ancients, the issues we solve in Slack are gone as soon as you hit your message limit. No one learns from the mistakes because it looks like no one has made them before.

Going the Extra Mile

I’m not advocating for removing Slack and Discord from our daily conversations. Instead, I’m proposing that when we do solve a hard problem or we make a mistake that others might learn from we say something about it somewhere that people can find it. It could be a blog post or a Reddit thread or some kind of indexable site somewhere.

Even the process of taking what you’ve done and consolidating it into something that makes sense can be helpful. I saw X, tried Y and Z, and ended up doing B because it worked the best of all. Just the process of how you got to B through the things that didn’t work will go a long way toward helping others. Yes, it can be a bit humbling and embarrassing to publish something that admits you made a mistake. But it’s also part of the way that we learn as humans. If others can see where we went and understand why that path doesn’t lead to a solution then we’ve effectively taught them too.


Tom’s Take

It may be a bit self-serving for me to say that more people need to be blogging about solutions and problems and such, but I feel that we don’t really learn from something unless we internalize it. That means figuring it out and writing it down. Whether it’s a discussion on a podcast or a back-and-forth conversation in Discord, we need to find ways to get the words out into the world so that others can build on what we’ve accomplished. Google can’t search archives that aren’t on the web. If we want to leave a legacy for the DenverCoder10s of the future, that means we do the work now of sharing our failures as well as our successes and letting the next generation learn from us.

The Mystery of Known Issues

I’ve spent the better part of the last month fighting a transient issue with my home ISP. I thought I had it figured out after a hardware failure at the connection point, but it crept back up after I got back from my Philmont trip. I spent a lot of energy upgrading my home equipment firmware and charting the seemingly random timing of the issue. I also called the technical support line and carefully explained what I was seeing and what had already been done to work on the problem.

The responses usually ranged from confused reactions to attempts to reset my cable modem, which never worked. It took several phone calls and lots of repeated explanations before I finally got a different answer from a technician. It turns out there was a known issue with the modem hardware! It’s something they’ve been working on for a few weeks and they’re not entirely sure what the ultimate fix is going to be. So for now I’m going to have to endure the daily resets. But at least I know I’m not going crazy!

Issues for Days

Known issues are a way of life in technology. If you’ve worked with any system for any length of time you’ve seen the list of things that aren’t working or that have weird interactions with other things. Given the increasing number of interactions between systems that are ever more dependent on each other, it’s no wonder those known issue lists are miles long by now.

Whether it’s a bug or an advisory or a listing of an incompatibility on a site, the nature of all known issues is the same. They are things that don’t work that we can’t fix yet. They could be on a list of issues to resolve or something that may never be able to be fixed. The key is that we know all about them so we can plan around them. Maybe it’s something like a bug in a floating point unit that causes long division calculations to be inaccurate to a certain number of decimal places. If you know what the issue is you know how to either plan around it or use something different. Maybe you don’t calculate to that level of precision. Maybe you do that on a different system with another chip. Whatever the case, you need to know about the issue before you can work around it.
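In code, planning around a documented flaw usually looks like a guard in front of the affected path. Here’s a hedged sketch with an invented erratum list and a placeholder hardware check, routing the sensitive calculation through software math instead of the suspect hardware path:

```python
from decimal import Decimal

# Hypothetical erratum registry; in practice this would come from the
# vendor's published known-issues list for the part you're running on.
KNOWN_BAD_FPU_STEPPINGS = {"rev-a0", "rev-a1"}

def detect_stepping():
    # Placeholder: real code would query the hardware or an inventory DB
    return "rev-a0"

def safe_divide(a, b):
    if detect_stepping() in KNOWN_BAD_FPU_STEPPINGS:
        # Plan around the flaw: do this calculation in software-based
        # decimal arithmetic instead of trusting the affected FPU path
        return float(Decimal(str(a)) / Decimal(str(b)))
    return a / b

print(safe_divide(355, 113))
```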

Not all known issues are publicly known. They could involve sensitive information about a system. Perhaps the issue itself is a potential security risk. Most advisories about remote exploits are known issues internally at companies before they are patched. While they aren’t immediately disclosed, they are eventually found out when the patch is released or when someone outside the company discovers the same issue. Putting these kinds of things under an embargo of sorts isn’t always bad if it protects against a wider potential to exploit them. However, the truth must eventually come out or things can’t get resolved.

Knowing the Unknown

What happens when the reasons for not disclosing known problems are less than noble? What if the reasoning behind hiding an issue has more to do with covering up bad decision making or saving face or even keeping investors or customers from fleeing? Welcome to the dark side of disclosure.

When I worked for Gateway 2000 back in the early part of the millennium, we had a particularly nasty known issue in the system. The ultimate root cause was that the capacitors on a series of motherboards were made with poor quality controls or bad components and would swell and eventually burst, causing the system to come to a halt. The symptoms manifested in all manner of strange ways, like race conditions or random software errors. We would sometimes spend hours troubleshooting an unrelated issue only to find out the motherboard was afflicted with “bad caps”.

The issue was well documented in the tech support database for the affected boards. Once we could determine that it was a capacitor issue it was very easy to get the parts replaced. Getting to that point was the trick, though. Because at the top of the article describing the problem was a big, bold statement:

Do Not Tell The Customer This Is A Known Issue!!!

What? I can’t tell them that their system has an issue that we need to correct before everything pops and shuts it down for good? I can’t even tell them what to look for specifically when we open the case? Have you ever tried to tell a 75-year-old grandmother to look for something “strange” in a computer case? You get all kinds of fun answers!

We ended up getting creative in finding ways to look for those issues and getting the boards replaced where we could. When I moved on to my next job working for a VAR, I found out some of those same machines had been sold to a customer. I opened the case and found bad capacitors right away. I told my manager, explained the issue, and we started getting boards replaced under warranty at the first sign of problems. After the warranty expired we kept ordering good boards from suppliers until we were able to retire all of those bad machines. If I hadn’t known about the bad cap issue from my help desk time I never would have known what to look for.

Known issues like these are exactly the kind of thing you need to tell your customers about. It’s something that impacts their computer. It needs to be fixed. Maybe the company didn’t want to have to replace thousands of boards at once. Maybe they didn’t want to have to admit they cut corners when they were buying the parts and now the money they saved is going to haunt them in increased support costs. Whatever the reason it’s not the fault of the customer that the issue is present. They should have the option to get things fixed properly. Hiding what has happened is only going to create stress for the relations between consumer and provider.

Which brings me back to my issue from above. Maybe it wasn’t “known” when I called the first time. But by the third or fourth time I called about the same thing they should have been able to tell me it’s a known problem with this specific behavior and that a fix is coming soon. The solution wasn’t to keep using the first-tier support fixes of resets or transfers to another department. I would have appreciated knowing it was an issue so I didn’t have to spend as much time upgrading and isolating and documenting the hell out of everything just to exclude other issues. After all, my troubleshooting skills haven’t disappeared completely!

Vendors and providers, if you have a known issue you should admit it. Be up front. Honesty will get you far in this world. Tell everyone there’s a problem and that you’re working on a fix you don’t have just yet. It may not make the customer happy at first, but they’ll understand a lot more than if you hide it for days or weeks while you scramble to fix it without telling anyone. If that customer has more than a basic level of knowledge about systems they’ll probably figure it out anyway, and then you’re going to be the one with egg on your face when they tell you all about the problem you don’t want to admit you have.


Tom’s Take

I’ve been on both sides of this fence in a number of situations. Do we admit we have a known problem and try to get it fixed? Or do we get creative and try to hide it so we don’t have to own up to the uncomfortable questions that get asked about bad development or cutting corners? The answer should always be to own up to things. Make everyone aware of what’s going on and make it right. I’d rather deal with an honest company working hard to make things better than a dishonest vendor that miraculously manages to fix things out of nowhere. An ounce of honesty prevents a pound of bad reputation.

Slow and Steady and Complete


I was saddened to learn last week that one of my former coworkers passed away unexpectedly. Duane Mersman started at the same time I did at United Systems and we both spent most of our time in the engineering area working on projects. We worked together on so many things that I honestly couldn’t keep count of them if I tried. He’s going to be missed by so many people.

A Hare’s Breadth

Duane was, in many ways, my polar opposite at work. I was the hard-charging young buck that wanted to learn everything there was to know about stuff in about a week and just get my hands dirty trying to break it and learn from my mistakes. If you needed someone to install a phone system next week with zero formal training or learn how iSCSI was supposed to operate based on notes sketched on the back of a cocktail napkin I was your nerd. That meant we could often get things running quickly. It also meant I spent a lot of time trying to figure out why things weren’t working. I left quite a few forehead-shaped dents in data center walls.

Duane was not any of those things. He was deliberate and methodical. He spent so much time researching technology that he knew it backwards and forwards and inside out. He documented everything he did while he was working on it instead of going back after the fact to scribble down some awkward prose from his notes. He triple checked all his settings before he ever implemented them. Duane wouldn’t do anything until he was absolutely sure it was going to work. And even then he checked it again just to be sure.

I used to joke that we were two sides of the same coin. You sent me in to clean things up. Then you sent Duane in to clean up after me. I got in and out quickly but I wasn’t always the most deliberate. Duane would get in behind me and spend time making sure whatever I did was the right way. I honestly felt more comfortable knowing he would ensure whatever I did wasn’t going to break next week.

Turtle Soup

Management knew how to use us both effectively. When the customer was screaming and needed it done right now I was the guy. When you wanted things documented in triplicate Duane was the right man for the job. I can remember him working on a network discovery diagram for a medical client that was so detailed that we ended up framing it as a work of art for the customer. It was something that he was so proud of given the months that he toiled away on it.

In your organization you need to recognize the way that people work and use them effectively. If you have an engineer that just can’t be rushed no matter what, you need to find projects for them that can take the time to be worked out correctly. You can’t rush people if they don’t work well that way. Duane had many gears, but all of them served his need to complete every aspect of every part of the project. Likewise, hard chargers like me need to be able to get in and get things done with a minimum of distraction.

Think of it somewhat like an episode of The Witcher. You need a person to get the monsters taken care of, but you also need someone to chronicle what happened. Duane was my bard. He documented what we did and made sure that future generations would remember it. He even made sure that I would remember the things we did later, when someone asked a question about it or I started blaming the idiot that programmed it (spoiler alert: I was that idiot).

Lastly, Duane taught me the value of being a patient teacher. When he was studying to take his CCNP exams he spent a significant amount of time on the SWITCH exam learning the various states of spanning tree. I breezed through it because it mostly made sense to me. When he went through it he labbed up every example and investigated every aspect of the settings. He would ask me questions about why something behaved the way it did or how a setting could mess things up. As he asked, I tried to explain how I saw it. My explanations created more questions. But those questions helped me investigate why things worked the way they did. His need to know all about the protocol made me understand it at a more fundamental level than just passing an exam requires. He slowed me down and made sure I didn’t miss anything.


Tom’s Take

Duane was as much a mentor in my career as anyone. We learned from each other and we made sure to check each other’s work. He taught me that slow and steady is just as important as getting things done at warp speed. His need to triple-check everything led me to do the same in the CCIE lab and is probably the reason why I eventually passed. His documentation and diagrams taught me to pay attention to the details. In the end he helped me become who I am today. Treasure the people you work with who take the time to do things right. It may take them a little longer than you’d like, but in the end you’ll be happier knowing that they are there to make sure it’s done right.

Follow My Leader

I spent the past two weeks enjoying the scenic views at the Philmont Scout Ranch with my son and some of his fellow Scouts BSA troop mates. It was very much the kind of vacation that involved a lot of hiking, mountain climbing, and even some inclement weather. We all completely enjoyed ourselves and I learned a lot about hanging bear bags and taking care of blisters. I also learned a lot about leadership by watching the boys in the crew interact with each other.

Storm Warnings

Leadership styles are nothing new to the people that read my blog. I’ve talked about them at length in the past. One thing I noticed when I was on the trek was how different leadership styles can clash and create friction among teenagers. As adults we tend to gloss over delivery and just accept that people are the way they are. When you’re fourteen or fifteen you haven’t quite taken that lesson to heart yet. That means more pushing against styles that don’t work for you.

We have all worked for or with someone that has a very authoritarian style in the past. The kind of people that say, “Do this right now” frequently. It’s a style that works well for things like military units or other places where decisions need to be quick and final. The crew leader exhibited that kind of leadership style to our crew. I sat back and watched how the other boys in the unit handled it.

If you’ve never gotten to watch the stages of team development play out in real time, you’re missing out on a treat. I won’t go into too much depth here, but the important stage happens after formation, in the Storming phase. This is where motivation and skill sets are low and the interaction between members is primarily antagonistic. Arguments and defensiveness are more prevalent during Storming. It happens every time, and it frequently recurs as team members interact. It’s important to recognize the barriers that Storming creates and move past them to a place where the team puts the mission before their egos.

Easier said than done when you’re with a group of teenagers. I swear our group never really got past the Storming phase for long. The end of the trek still saw some friction among the members. I couldn’t quite put my finger on why that was. After all, we grown-ups can put things aside to focus on the mission, right? We can check our egos at the door and hope that we can just get past this next part to make things easier overall.

Style Points

That’s when our lead Crew Advisor pointed out a key piece of the puzzle I’d missed, even after all my time dealing with team development. He said to the crew on the last day, “There are a lot of leaders in this group. That’s why there was so much friction between you all.” It was like a lightbulb going off in my mind. The friction wasn’t the result of leadership styles so much as the clash between styles that kids aren’t so good at hiding.

I’m not an authoritarian. I don’t demand people do things. I ask people to do things. Maybe what I want isn’t a request, but it is almost always phrased that way. “Please walk the dog” or “Can you get me the hammer from the garage?” are common ways for me to direct my family or my unit. I was raised not to be a demanding person. However, in my house growing up those statements were never questions. I’ve continued that method of leadership as my own family has grown. Dad asks you to do something, but it’s not optional.

Where my leadership style clashes is with people who tell you to do something right now. “Get this done” or “You go do this thing over here” rankle me. Moreover, I get frustrated when I don’t understand the why behind it. I’m happy to help if you just help me understand why it needs to be done. Bear bags need to be hung right away to keep animals from devouring the human food. The dining fly needs to go up so there’s a place to put things in case of inclement weather. There’s an order to things that makes sense. You need to explain why instead of just giving orders.

As I watched the teenagers in the crew interact with each other I couldn’t understand the defensive nature of the interactions. Some of the crew mates flat out refused to do things because they didn’t get it. They took their time getting necessary tasks done because they felt like they were doing all the work. Until the end of the trip I didn’t understand that the reason for their lack of motivation wasn’t inspired by laziness, but instead by a clash in style.

My son is like me in that he asks people to do things. So when he was ordered to do something he felt the need to push back or express displeasure with the leadership style. It looked like defiance, but he was really trying to communicate that politeness and explanation go a long way toward making people feel motivated to pitch in.

For example, asking someone to help hang the bear bags because there is a storm coming in and they are the most efficient at it is a better approach than telling them to just do it. Explaining that you want someone to train another person in a job because they excel at it helps them understand this is about education rather than making them do the same job over and over again. I’ve written before about leaders leaning on the people that always get the job done without explaining why. It’s important to help people understand that they have unique skills that are critical to helping out.

Promoting From Within

Leaders chafe at styles that don’t match their own. One of the ways to ease this is through delegation. Instead of punishing those that talk back to you, make them responsible for leading the group. Let them show off their leadership style and see how it is received. You’re essentially giving that person the power to express themselves to see if their way is better. Depending on your leadership style this may be difficult to do. Authoritarians don’t like letting go of their power. People with no patience are more likely to just do the job themselves instead of letting others learn. However, you need to do it.

Leaders will excel in the right environment. Give someone responsibility and let them accomplish things. Instead of simply handing out tasks, let the leaders figure out how to accomplish the goals. I ran a small experiment where I told our crew leader to take care of his one responsibility and then leave the crew to their own devices. By this point in the trek they knew what needed to be done. If they couldn’t find the motivation to get it done then it was on them and not the leader. Weather forced my hand before I could finish the experiment, but when a leader is having issues with those under them chafing at their leadership style, they need to empower the group to lead their own way and see how effective it can be, instead of falling back on “I’m in charge so you do what I say”.


Tom’s Take

My leadership experience and training has been all about creating artificial situations where people are required to step up to lead. Seeing it happen organically was a new experience for me. Leaders emerge naturally, but they don’t all grow at the same rate or in the same way. The insight gained at the end of the trip helped me understand the source of friction over the twelve days we were in the backcountry. I think I’d do things a little differently next time, given the opportunity, to allow those that needed a different style to come forward and provide their own way of doing things. I’ll be interested to see how those leaders develop as well as how I approach these situations in the future.

Pegasus Pisses Me Off


In this week’s episode of the Gestalt IT Rundown, I jumped on my soapbox a bit regarding the latest Pegasus exploit. If you’re not familiar with Pegasus you should catch up with the latest news.

Pegasus is a toolkit designed by NSO Group from Israel. It’s designed for counterterrorism investigations. It’s essentially a piece of malware that can be dropped on a mobile phone through a series of unpatched exploits that allows you to create records of text messages, photos, and phone calls and send them to a location for analysis. On the surface it sounds like a tool that could be used to covertly gather intelligence on someone of interest and ensure that they’re known to law enforcement agencies so they can be stopped in the event of some kind of criminal activity.

Letting the Horses Out

If that’s where Pegasus stopped, I’d probably not care one way or the other. A tool used by law enforcement to figure out how to stop things that are tough to defend against. But because you’re reading this post you know that’s not where it stopped. Pegasus wasn’t merely a tool developed by intelligence agencies for targeted use. If I had to guess, I’d say the groundwork for it was laid when the creators did work in some intelligence capacity. Where things went off the rails was when they no longer did.

I’m sure that all of the development work on the tool that was done for the government they worked for stayed there. However, things like Pegasus evolve all the time. Exploits get patched. Avenues of installation get closed. And some smart targets figure out how to avoid getting caught, or even how to detect that they’ve been compromised. That means the work has to continue for the tool to remain effective in the future. And if the government isn’t paying for it, who is?

If you guessed interested parties, you’d be right! Pegasus is for sale to anyone that wants to buy it. I’m sure there are cursory checks done to ensure that people that aren’t supposed to be using it can’t buy it. But I also know that in those cases a few extra zeros at the end of a wire transfer can work wonders to alleviate those concerns. Whether or not it was supposed to be sold to everyone or just a select group of people, it got out.

Here’s where my hackles get raised a bit. The best way to prevent a tool like this from escaping is to never have created it in the first place. Just like a biological or nuclear weapon, the only way to be sure it can never be used is to never have it. Weapons are a temptation. Bombs were built to be dropped. Pegasus was built to be installed somewhere. Sure, the original intentions were pure. This tool was designed to save lives. What happens when the intentions aren’t so pure? What happens when your enemies aren’t terrorists but politicians with different views? You might scoff at the suggestion of using a counterterrorism tool to spy on your ideological opponents, but look around the world today and ask yourself if your opponents are so inclined.

Once Pegasus was more widely available I’m sure it became a very tempting way to eavesdrop on people you wanted to know more about. Journalists getting leaks from someone in your government? Just drop Pegasus on their phones and find out who it is. Annoying activist making the media hate you? Text him the Pegasus installer and dump his phone looking for incriminating evidence to shut him up. Suspect your girlfriend of being unfaithful? Pegasus can tell you for sure! See how quickly we went from “necessary evil to protect the people” to “petty personal reasons”?

The danger of the slippery slope is that once you’re on it you can’t stop. Pegasus may have saved some lives but it has undoubtedly cost many others too. It has been detected as far back as 2014. That means every source that has been compromised or every journalist killed doing their work could have been found out thanks to this tool. That’s an awful lot of unknowns to carry on your shoulders. I’m sure that NSO Group will protest and say that they never knowingly sold it to someone that used it for less-than-honorable purposes. Can they say for sure that their clients never shared it? Or that it was never stolen and used by the very people that it was designed to be deployed against?

Closing the Barn Door

The escalation of digital espionage is only going to increase. In the US we already have political leaders calling on manufacturers and developers to create special backdoors for law enforcement to use to detect criminals and arrest them as needed. This is along the same lines as Pegasus, just formalized and legislated. It’s a terrible idea. If the backdoor is created it will be misused. Count on that. Even if the people that developed it never intended to use it improperly someone without the same moral fortitude will eventually. Oppenheimer and Einstein may have regretted the development of nuclear weapons but you can believe that by 1983 the powers that held onto them weren’t so opposed to using them if the need should arise.

I’m also not so naive as to believe for an instant that the governments of the world are just going to agree to play nice and not develop these tools any longer. They represent a competitive advantage over their opponents and that’s not something they’re going to give up easily. The only thing holding them back is oversight and accountability to the people they protect.

What about commercial entities, though? If governments are restrained by the people, then businesses are only restrained by their stakeholders and shareholders. And those people only seem to care about making money. So if the best tool to do the thing appears and it can make them a fortune, would they forgo those profits to take a stand against categorically evil behavior? Can you say for certain that would always be the case?


Tom’s Take

Governments may not ever stop making these weapons, but perhaps it’s time for the private sector to stop. The best way to keep the barn doors closed so the horses can’t get out is to not build doors in the first place. If you build a tool like Pegasus it will get out. If you sell it, even to the most elite clientele, someone you don’t want to have it will end up with it. That sounds like a pretty optimistic viewpoint, for sure. So maybe the other solution is to have the developers install their tool on their own devices and send the keys to a random person. That way they will know they are being watched, and that whoever is watching them can decide when and where to expose the things they don’t want known. And if that doesn’t scare them into no longer developing tools like this then nothing will.

Should We Embrace Points of Failure?


There was a tweet making the rounds this last week that gave me pause. Max Clark said that we should embrace single points of failure in the network. His post was an impassioned plea to networking rock stars out there to drop redundancy out of their networks and instead embrace these Single Points of Failure (SPoF). The main points Mr. Clark made boil down to a couple of major statements:

  1. Single-device networks are less complex and easier to manage and troubleshoot. Don’t have multiple devices when an all-in-one works better.
  2. Consumer-grade hardware is cheaper and easier to understand, therefore it’s better. Plus, if you need a backup you can just buy a second one and keep it on the shelf.

I’m sure most networking pros out there are practically bristling at these suggestions. Others may read through the original tweet and think it was a tongue-in-cheek post. Let’s look at the argument logically and understand why it has some merit but is ultimately flawed.

Missing Minutes Matter

I’m going to tackle the second point first. The idea that you can use cheaper gear and have cold standby equipment just sitting on the shelf is one that I’ve heard of many times in the past. Why pay more money to have a hot spare or a redundant device when you can just stock a spare part and swap it out when necessary? If your network design decisions are driven completely by cost then this is the most appealing thing you could probably do.

I once worked with a technology director who insisted that we forgo our usual configuration of RAID-5 with a hot spare drive in the servers we deployed. His logic was that the hot spare drive was spinning without actually doing anything. If something went wrong it was just as easy to take a spare drive off the shelf and slot it in, rather than running the risk that the drive might fail while sitting in the server. His logic seemed reasonable enough, but there was one variable he wasn’t thinking about.

Time is always the deciding factor in redundancy planning. In the world of backup and disaster recovery they use the acronym RTO, which stands for Recovery Time Objective. Essentially, how long do you want your systems to be offline before the data is restored? Can you go days without getting your data back? Or do you need it back up and running within hours or even minutes? For some organizations the RTO could even be measured in mere seconds. Every RTO measurement adds additional complexity and cost.

If you can go days without your data then a less expensive tape solution is best because it is the cheapest per byte stored and lasts forever. If your RTO is minutes or less you need to add hardware that replicates changes or mirrors the data between sites to ensure there is always an available copy somewhere out there. Time is the deciding factor here, just as it is in the redundancy example above.
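A quick back-of-the-envelope comparison (all numbers invented for illustration) shows how the RTO drives the choice:

```python
# Invented estimates, in minutes, for how long each strategy takes
# to get a failed device or dataset back in service.
strategies = {
    "cold spare on the shelf": 45,   # find it, rack it, restore the config
    "hot standby device": 1,         # failover happens automatically
    "tape restore": 24 * 60,         # retrieve media, restore the data
    "replicated second site": 5,     # redirect traffic to the mirror
}

rto_minutes = 15  # what the business says it can tolerate

for name, minutes in strategies.items():
    verdict = "meets" if minutes <= rto_minutes else "misses"
    print(f"{name}: ~{minutes} min, {verdict} a {rto_minutes}-minute RTO")
```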

Can your network tolerate hours of downtime while you swap in a part from off the shelf? Remember that you’re going to need to copy the configuration over to it and ensure it’s back up and running. If it is a consumer-grade device there probably isn’t an easy way to console in and paste the config. Maybe you can upload a file from the web GUI but the odds are pretty good that you’re looking at downtime at least in the half-hour range if not more. If your office can deal with that then Max’s suggestions should work just fine.

For organizations that need to be back up and running in less than hours, you need to have fault tolerance in your network. Redundant paths for traffic or multiple devices to eliminate single points of failure are the only way to ensure that traffic keeps flowing in the event of a hardware failure. Sure, it’s more complicated to troubleshoot. But the time you spend making it work correctly is not time you’re going to spend copying configurations to a cold device while users and stakeholders are yelling at you to get things back online.

All-In-Wonder

Let’s look at the first point here. Single box solutions are better because they are simple to manage and give you everything you could need. Why buy a separate switch, firewall, and access point when you can get them all in one package? This is the small office / branch office model. SD-WAN has even started moving down this path for smaller deployments by pushing all the devices you need into one footprint.

It’s not unlike the TVs you can buy in the big box stores that have DVD players, VHS players, and even some streaming services built in. They’re easy to use because there are no extra wires to plug in and no additional remote controls to lose. Everything works from one central location and it’s simple to manage. The package is a great solution when you need to watch old VHS tapes or DVDs from your collection infrequently.

Of course, most people understand the drawbacks of this model. Those devices can break. They are much harder to repair when they’re all combined. Worse yet, if the DVD player breaks and you need to get it repaired, you lose the TV completely during the process instead of just the DVD player. You also can’t upgrade the components individually. Want to trade out that DVD player for a Blu-ray player? You can’t, unless you install one on its own. Want to keep those streaming apps up to date? Better hope the TV has enough memory to stay current. Even state-of-the-art streaming boxes will eventually be incapable of running the latest version of popular software.

All-in-one devices are best left to the edges of the network. They function well in offices with a dozen or so workers. If something goes bad on the device it’s easier to just swap the whole thing instead of trying to repair the individual parts. That same kind of mentality doesn’t work quite so well in a larger data center. The fact that most of these unified devices don’t take rack mounting ears or fit into a standard data center rack should be a big hint that they aren’t designed for use in a place that keeps the networking pieces off of someone’s desk.


Tom’s Take

I smiled a bit when I read the tweet that started this whole post. I’m sure that the networks Max has worked on run much better with consumer all-in-one devices. Simple configurations and cold spares are a perfectly acceptable solution for law offices or tag agencies or other places that don’t measure their downtime in thousands of dollars per second. I’m not saying he’s wrong. I’m saying that his solution doesn’t work everywhere. You can’t run the core of an ISP with some SMB switches. Nor should you run your three-person law office with a Cat6500. You need to decide what factors are the most important for you. Don’t embrace failure without thought. Figure out how tolerant you or your customers are of failure and design around it as best you can. Once you can do that you’ll have a much better idea of how to build your network with the fewest points of failure.

VARs See You As Technical Debt

I’ve worked for a Value Added Reseller (VAR) in the past and it was a good run of my career before I started working at Tech Field Day. The market was already changing eight years ago when I got out of the game, and with the advent of the pandemic that’s especially true today. Quite a few of my friends say they’re feeling pressure from their VAR employers to stretch beyond what they’re accustomed to doing, or are outright being treated in ways designed to force them out or make them leave on their own. They tell me they can’t quite understand why that’s happening. After some thought on the matter, I think I know. It’s because you represent debt they need to retire.

Skill Up

We don’t start our careers knowing everything we need to know to make it. The industry spends a lot of time talking about careers and skill paths and getting your legs under you. Networking people need to learn Cisco or Juniper or whatever configuration language makes the most sense for them. Wireless people need to learn how to do site surveys and configure access points. Server people need to learn operating systems and hypervisors. We start accumulating skills to land jobs to earn money and hopefully learn more important skills to benefit our careers.

Who benefits from that learning though? You certainly do because you gain new ways to further your career. But your VAR gains value as well because they’re selling your skills. The “value added” part is you. When you configure a device or deploy a network or design a system you’re adding value through your skills. That’s what the VAR is charging for. Your skills are their business model. No VAR stays in business just reselling hardware.

Accumulating skills is the name of the game. Those skills lead to new roles and more responsibility. Those new roles lead to more money. Perhaps that means moving on to new companies looking to hire someone that has your particular expertise in an area. That’s a part of the game too, especially for VARs. And that’s where the whole debt mess starts.

Double Down on Debt

Your skills are valuable. They’re also debt. They represent a cost in time, money, and resources. The investment that your VAR makes in you is a calculated return on that debt. If your company primarily deploys Cisco networks then the training you get to install and configure Cisco switches is a return on your VAR being able to hire you out to do that skill. Being able to install and configure Juniper switches isn’t a valuable skill set for them unless they move into a new line of business.

People are no different. We acquire skills that suit us for a time that we may or may not use forever. It’s like riding a bike. We use it a lot when we’re young. We stop using it when we start to drive. We may start again when we need to use a bike for college or for living in a large city or if we pick up cycling or mountain biking as a sport. However, the bike riding skill is always there. It is a sunk cost for us because we acquired it and keep it with us.

For a VAR, your skill is not a sunk cost. It’s a running graph of the billable hours you contribute plotted against the cost you represent to the company, and they need the first line to stay above the second. If you spend 85% of your time installing Cisco switches you are well above the debt line. But if your company stops installing so many switches, your value starts to fall as well. It could be that the technology is old and no one is buying it. It could be that companies have shifted the way they do business and need different resources and technology. It could be that a new partnership has created competition inside your organization.
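Back-of-the-envelope, the math the bean counters run looks something like this (all numbers invented):

```python
billable_rate = 175           # dollars per hour the VAR charges for you
loaded_cost = 120_000 * 1.4   # salary plus benefits and overhead, per year
available_hours = 2000        # roughly a working year

def annual_margin(utilization):
    """Revenue you generate minus what you cost, at a given utilization."""
    revenue = billable_rate * available_hours * utilization
    return revenue - loaded_cost

# As the work for your skill set dries up, utilization falls and the
# margin swings from healthy to negative -- the "debt line" in action.
for u in (0.85, 0.50, 0.25):
    print(f"{u:.0%} utilization: ${annual_margin(u):,.0f} margin")
```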

No one wants to be the last buggy whip manufacturer. VARs thrive on attacking markets that are hot, with huge potential for profits. When a skill set becomes a commodity, VARs are competing on price, and that’s a race they can’t always win. That drives them to investigate new markets to offer to their customer base. In order to deliver those new technologies and solutions they need skilled people to install and configure them. The easiest solution is to acquire talent to make that happen. As above, VARs are always willing to pay top dollar to professionals with the specific skill sets they need. Bringing someone in to do that new line of business means they’re producing from day one and keeping their value above the debt line of their salary.

The other way that VARs compete in these new markets is by training existing professionals on the new technology. Everyone that has ever worked in a VAR knows of the people that get tasked with learning how to deploy new storage systems, new network equipment, and even entirely new solutions that customers are asking for. I know I was that person at my old VAR. If it needed to be learned I was the one to do it first. I jumped in to deploying iSCSI storage, wireless access points, and even VoIP phone systems. Each time I had to spend time learning those new skills and adding them to my existing set. It was a cheaper method in the short term than bringing entirely new talent on board.

Get Out of Town

The friction in the training approach comes when it’s time to value your employees and their skill sets. If I’m getting paid to deploy Cisco switches and now my company wants me to learn how to install Palo Alto firewalls then I’m going to eventually get a raise or a new role to cover this expanded skill set. And rarely, if ever, do employee salaries get adjusted downward to compensate for old skills that are no longer relevant being supplanted by new marketable skills. Suddenly all those technologies I spent so much time learning are technical debt my VAR is paying for.

VARs need to be able to jump into new lines of business in order to survive. And that sometimes means shedding technical debt. If you’re a highly paid employee that earns twice as much as someone that has the specific skill set your VAR needs for a new project, then your value to them at this moment is likely much closer to the negative side of the skills-versus-debt line. You may have more experience or more familiarity with the process, but that doesn’t translate as well into real value. If it did, contractors wouldn’t be as well compensated as they are.

Now your VAR has a choice: keep paying you a lot and investing in their technical debt, or bring on someone new that aligns more closely with their new lines of business and start the escalator ride all over again. Unless you’re an exceptional employee or you are moved into a management role, that usually means you’re let go or encouraged to find another role somewhere else. Maybe you get lucky and another VAR needs exactly what you offer and they’re willing to pay to get it. No matter what, the VAR is ridding themselves of technical debt. It should be no different than retiring an old laptop or installing new help desk ticketing software. But because it’s a person with a life and a family, it feels wrong.

Rise Above

Is there an answer to this problem? If there is I don’t think we’ve found it yet. Obviously the solution would be to keep people on staff and pay them what their skill set is worth to the company. But that could entail retraining or readjustment in compensation that people aren’t always willing to do. VARs aren’t going to pay hefty salaries for skills that aren’t making them money. Other VARs may want to pay you for your skills but that’s not always a guarantee, especially if your skill set is extremely specific.

The other possibility is more akin to the contractor system, where you’re only hired for your skills for the period of time that they are needed. In theory that works very well. In practice the challenges of capital asset acquisition and personal benefits make contracting full-time almost as much of a hassle as changing jobs every few years chasing a bigger paycheck or a company that values your skills. There isn’t a clear-cut answer. Part of that reasoning is because the system works just fine the way it is. Why fix it if it’s not broken? It would take a massive shift in IT toward a new paradigm to force the kind of soul searching necessary to change the way VARs handle their staff. Cloud is close. So too is DevOps and programmatic IT. But for the kind of change we’re talking about it’s going to take something even bigger than those two things combined.


Tom’s Take

After reading this I’m sure half of you are scared to death and swear you will never work for a VAR. That’s a bit short-sighted. Remember that they’re a great source of training and experience. Customer networks stay fairly static and only require specific kinds of maintenance from time to time outside of deployments. If you want to hone your skills on a variety of technologies and get very good at troubleshooting then VAR life is absolutely where you need to be. Just remember that you are a resource with a value and a burden. Despite the mantra of being a “family” or other feel-good nonsense you will eventually reach the point of being the uncle that constantly incurs debt for very little return. Every family shuns those kinds of members. Make sure you know your value and how you can contribute. If that’s not possible where you are now then make sure it is wherever you land with whatever skills you need.