APIs and Department Stores

This week I tweeted something from a discussion we had during Networking Field Day that summed up my feelings about the state of documentation of application programming interfaces (APIs):

I laughed a bit as I wrote it because I’ve worked in department stores like Walmart in the past and I know the reasons why they tend to move things around. Comparing that to the way that APIs are documented is an interesting exercise in how people think about things like new capabilities and notification of changes.

Branding Exercises

In case you weren’t aware, everything in your average department store is carefully planned out. The things placed in the main aisles are decided on weeks in advance due to high traffic. The items placed at the ends of the aisles, or endcaps, are placed there to highlight high margin items or things that are popular enough to be sought out by customers. The makeup of the rest of the store is determined by a lot of metrics.

There are a few restrictions that have to be taken into account. In department stores with grocery departments, the locations of the refrigerated sections must be around the outside because of power requirements. Within those restrictions, plans put the high traffic items in the back of the store to require everyone to walk past all the other stuff in hopes they might buy it. That’s why the milk and bread and electronics areas are always the furthest away from the front of the store. You’re likely headed there anyway so why not make you work for it?

Every few months the store employees receive new floor plans that move items to different locations. Why would they do that? Well, those metrics help them understand where people are more likely to purchase certain items. Those metrics also tell the planners what items should be located together, which is how the whole aisle is planned out. Once everything gets moved they start gathering new metrics and find out how well their planning works, aside from the inevitable grumbles. Even with some fair warning no one is happy to find out something has moved.

Who Needs Documentation?

You might think that, on the surface, there’s not much similarity between a department store aisle and an API. One is a fixture. The other is code. Yet, think about how APIs are typically changed and you might find some of the parallels. Change is a constant in the world of software development, after all.

The APIs that we used a decade ago are almost assuredly different from the ones we program for today. Every year brings updated methods, new functions, and even changes in programming languages or access methods. How can you be sure that developers are accessing the latest and greatest technology that you’ve put into place? You can’t just ask them. Instead, you have to deprecate the methods that you don’t want them to use any longer.

Ask any developer writing for an API about deprecation and you’re probably going to hear a string of profanity. Spending time to write a perfectly good piece of software only to have it wrecked by someone’s decision to do things differently is infuriating to say the least. Trying to solve a hard problem with a novel concept is one thing. Having to do it all over again a month later when a new update is released is even more infuriating.

It’s the same fury that you feel when the peanut butter is moved from aisle four to aisle eight. How dare you! It took me a week last time to remember where it was and now you’ve gone and moved it. Just like when I spent all that time learning which methods to query to pull the data I needed for my applications.

No matter how much notice you give or how much you warn people that change is coming they’re always going to be irritated at you for making those changes. It feels like a waste of effort to need to rewrite an interface or to walk a little further in the store to locate the item you wanted. Humans aren’t fond of wasted effort or of needing to learn new things without good reason.

Poor API documentation is only partly to blame for this. Even the most poorly documented API will eventually be mapped out by someone that needs the info. It’s also the fact that the constant change in methods and protocols forces people to spend a significant amount of time learning the same things over and over again for very little gain.

The Light at the End of the Aisle

Ironically enough, both of these kinds of issues are likely to be solved in a similar way. Thanks to the large explosion of people doing their shopping online or with pickup and delivery services there is a huge need to have things more strictly documented and updated very frequently. It’s not enough to move the peanut butter to a better location. Now you need to update your online ordering system so the customers as well as the staff members pulling it for a pickup order can find it quickly and get more orders done in a shorter time.

Likewise, the vast number of programs that are relying on API calls today necessitate that older versions of functionality are supported for longer or newer functions are more rigorously tested before implementation. You don’t want to disable a huge section of your userbase because you deprecated something you didn’t like to maintain any longer. Unless you are the only application in the market you will find that creating chaos will just lead to users fleeing for someone that doesn’t upset their apple cart on a regular basis.


Tom’s Take

Documentation is key for us to understand change. We can’t just say we changed something. We have to give warning, ensure that people have seen the warning, tell them we’ve changed it, and then give them some way to transform the old way of things into the new one. And even that might not be enough. However, the pace of change that we’re seeing also means that rapid changes may not even be required for much longer. With people choosing to order online and never set foot inside the store the need to change the shelves frequently may be a thing of the past. With new methods and languages being developed so rapidly today it may be much faster to move everyone to a new API and leave the old one intact instead of forcing developers to look at technology that is years old at this point. The delicious irony of the people forcing change on us needing to accept change themselves is something I’d happily shop for.
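
To make the "warn, then give them a way to transform the old into the new" idea a bit more concrete, here is a minimal sketch in Python. The function names (get_port_stats, get_interface_stats) and the returned fields are made up for illustration and don't come from any real API. The old call stays in place as a thin wrapper that warns and then hands off to the new one.

```python
import warnings


def get_interface_stats(device, interface):
    """New, preferred call (hypothetical name and return shape)."""
    return {"device": device, "interface": interface, "status": "up"}


def get_port_stats(device, port):
    """Old call, left intact as a thin wrapper around the new one."""
    # Tell the caller what changed and what to use instead.
    warnings.warn(
        "get_port_stats() is deprecated; use get_interface_stats() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's line, not this one
    )
    # Translate the old arguments into the new call so old code keeps working.
    return get_interface_stats(device, port)
```

Old code that still calls get_port_stats() keeps working and gets a nudge every time it runs, which is about as close as software gets to a sign in aisle four that says the peanut butter moved to aisle eight.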

When Hardware Drives Software Upgrades

What’s your favorite version of Microsoft Windows? Is it Windows 10? Maybe it’s Windows XP? Windows 95? Odds are good that you have one version you appreciated more than most. Windows XP, Windows 7, and Windows 10 tend to rank high on the list. Windows ME and Windows 8 seem to rank pretty low. Yet, for all their impressive love and all the users clinging to them we don’t really use anything other than Windows 10 any more.

You might be tempted to say that the OS isn’t supported any longer so there’s no reason to run it. Yet we still drive vehicles that are no longer under warranty. We still buy classic cars that are older than we are and put parts in them to keep them running. Why is software different? What drives us to keep needing to upgrade our programs?

You might be shocked to learn that the most popular reason to upgrade software is, in fact, driven by hardware. It’s not the memory requirements or the fancy new user interface that drives people to move to the new platform. More often than not it’s because a new piece of hardware has requirements that only work on the latest version of the system.

It happened to me once or twice. I can distinctly remember needing to go out and buy a new printer to replace some cheap HP Inkjet I purchased for a project because when I upgraded from Windows XP to Windows 7 the drivers didn’t support the move. Why spend money writing new drivers for a cheap printer when you could just make people go out and buy another new cheap printer? I swear that’s what happened. And, of course, the more expensive the device you purchase the more likely it stays supported, right?

The Lords of COBOL

By now I’m sure you’re all familiar with the little tidbit of information that most of the world’s insurance companies run their databases on ancient mainframes. Why? Well, most of their software still requires COBOL to run. Large organizations don’t like to move to new platforms very often. It wasn’t that long ago that Southwest Airlines moved to a new booking system because the old one only had two options for scheduling flights – Monday through Friday and Sunday. If you scheduled a flight for Monday through Friday you had to have the same flight at the same time every day no matter what. It’s even widely believed that part of the reason that United Airlines merged with Continental was because they wanted to switch to a better booking system.

Why do companies keep these systems around? It should be easy to just migrate off of them, right? Well, reality is that between the sunk cost of operating a mainframe for years and patching the software that you’ve built to operate your business the desire to move to something else isn’t always a driver. After all, if it ain’t broke why fix it? Companies can keep maintaining old systems as long as someone sticks around to keep the lights on. I can remember working with a number of IT professionals over the years that had their jobs mostly because they were the final remaining mainframe wizard that knew how to put the system into maintenance mode or remembered the magical incantations to reboot the old machine after a power failure.

Alas, nothing lasts forever. The current pattern seems to be pretty standard. The old wizards finally decide to retire. They’ve had enough and they’re ready to move to somewhere warm and enjoy not working. The management keeps the lights going because it’s not that hard, right? It would take way too much to rewrite the software or move people to a new platform. Until the day when the system stops working. The day when everything doesn’t come back up. Then it’s panic mode. Was that just the database? What if it’s the actual hardware? Do they still make parts for this? Does anyone even know what this button does? Eventually either the hard decision is made to cut over somehow or an exorbitant amount of money is paid to the former operations people to come back and get things running again long enough to figure out how to keep this from happening again. And if you think you’re going to be able to train a developer to just pick up where the grizzled old wizard left off, good luck. Go find a COBOL training course somewhere. I’ll wait.

Modern Makers Make Mistakes Too

If you think that the modern era of cloud development is any different than writing FORTRAN or COBOL on a mainframe you’ve got a nice set of rose-colored glasses. We’re locking ourselves into the same patterns of thought that brought on the monoliths we’re currently trying to tear down. Every time you enable a feature that only works on one cloud platform or you choose to develop in a hot new language that isn’t fully supported everywhere you’re putting up a barrier that will eventually lead to you making hard choices.

You know what’s different this time, though? You don’t have the luxury of a position where you get to be the wizard that knows how to keep the lights on. As the article above mentions, the race is on to get the COBOL migrated to a modern platform that allows integration with languages like C# and Java. Do you believe that having platforms like that means you’ll get to a point where you can be the last remaining person around that remembers what crazy setup you used to minimize the number of containers an app was using? Or do you think it’s more likely they’ll just fire you, figure out how to integrate your legacy code into a new platform, and go on painting themselves right back into corners?

Hardware is the last true driver to keep people moving along into a place where they are forced to do things the right way. If your hardware doesn’t support something you don’t do it. If you need to ensure that your code is portable you don’t bake in features that require specific hardware or you create a situation where you’re tied to that platform forever. That’s why cloud is a bit scary in my mind. Because you’re agnostic from the hardware. You can do whatever you want without limit.

Want to write software that requires the use of hundreds of processing threads? You can do it because why not? You aren’t limited to just one chip any longer. Want to eat up tons of memory and storage? Go for it. You get to use as much as your credit card can hold. Now the bounds of a programmer’s imagination are no longer limited to physical hardware limitations. If you don’t believe me then ask yourself why there are apps on the App Store today that are bigger than the entire amount of storage that the original iPhone was capable of. Sure, hardware brought us to this point. But ditching the hardware for the magic of the cloud means there isn’t anything holding back those that want to build the biggest, burliest, baddest application they can!


Tom’s Take

Somewhat ironically, I’m not really that worried about the cloud letting people build ugly things to their hearts’ content. Why? Just like terrible movie directors, once you’ve removed their limitations you expose their vulnerabilities and they build something that is unsustainable. Build the biggest app you can. You’ll find out that it collapses under its own weight. Even the promise of the mythical giant virtual machines with 1 TB of RAM hasn’t made them materialize. Why? Because it turns out that removing restrictions just enforces them through trial and error. If you have to build small because you can’t get crazy due to hardware you’re held back by external forces. But when you are held back because you tried it that way the last time and you failed by creating an app that takes ten minutes to load you learned your lesson. You get leaner and better and more portable next time. And that’s the kind of driver that makes software and hardware better for us all.

Programming Unbound

I’m doing some research on Facebook’s Open/R routing platform for a future blog post. I’m starting to understand the nuances a bit compared to OSPF or IS-IS, but during my reading I got stopped cold by one particular passage:

Many traditional routing protocols were designed in the past, with a strong focus on optimizing for hardware-limited embedded systems such as CPUs and RAM. In addition, protocols were designed as purpose-built solutions to solve the particular problem of routing for connectivity, rather than as a flexible software platform to build new applications in the network.

Uh oh. I’ve seen language like this before related to other software projects. And quite frankly, it worries me to death. Because it means that people aren’t learning their lessons.

New and Improved

Any time I see an article about how a project was rewritten from the ground up to “take advantage of new changes in protocols and resources”, it usually signals to me that some grad student decided to rewrite the whole thing in Java because they didn’t understand C. It sounds a bit cynical, but it’s not often wrong.

Want proof? Check out Linus Torvalds and his opinion about rewriting the Linux kernel in C++. Spoiler alert – “C++ is a horrible language.” And it gets more colorful from there. Linus has some very valid points about C++ that have been debated by lots of communities for the past ten years. But the fact remains that he has decided that completely rewriting the entire kernel in C++ is an exercise in futility.

In today’s world, we’re faced with a multitude of programming languages fighting for our attention. We’ve evolved past FORTRAN, COBOL, C, and C++. We now live in a world of Python, C#, J#, R, Ruby, and dozens more. And those don’t even include the languages that aren’t low-level and are more for scripting. Every one of these languages was designed to solve a particular problem. And every one of them is begging to be used.

But it’s not enough that we have ten ways to write a function today. What’s more troublesome is that we’ve forgotten why certain languages were preferred over others in the past. We’ve forgotten that things used to be done the way they were done because we had no other alternatives. I can remember studying for my Novell CNE and taking 50-649, wherein Novell kept referring to OSPF as an “expensive” protocol to use on a NetWare server. At the time that test was created, OSPF was expensive from a CPU cycle standpoint. If the server was doing other things besides running a routing protocol you might see a potential impact if the CPU was slowed. And having OSPF calculations interrupted because someone was doing an FTP transfer could be expensive indeed.

No Wasted Space

More to the point, when people are faced with a limitation they have to be creative and concise. And nowhere is that more apparent than in E.T. the Extra-Terrestrial for the Atari 2600. Infamously, Howard Scott Warshaw had just one month to write a video game that would go on to be blasted critically, considered one of the worst of all time, and be blamed for the Video Game Crash of 1983. Yet, as one fan discovered years later when he set out to “fix” the game’s code, as bad as it may have been it was very well coded. From the article:

…it’s unlikely that Howard Scott Warshaw (the developer) included some useless code for us to replace…

So, a programmer for an outdated video game system had a month to code a complex game and managed to do it in such a way as to leave very little empty space to insert code patches? And yet my Facebook app on my iPhone requires how much space?!?

All joking aside, the issue with E.T. wasn’t code quality. The problems with the game have been documented over the years, but almost no one blames Warshaw’s coding capabilities. That’s because Warshaw was working within his limitations. He couldn’t magically make the Atari 2600 cartridge bigger. He couldn’t increase the CPU size on the system. He worked within his limitations and made the best game that he could make.

Now, let’s look at an article about Open/R and see some of Facebook’s criticisms of OSPF and IS-IS:

We didn’t want to get bogged down in discussions over the lower-level protocol details, such as frame formatting and handshakes…

While it might sound heavyweight compared with OSPF and ISIS, which use their own “lightweight” transports, we haven’t found this to be an issue in modern networking hardware…

Whether or not they were intended to be taken as such, these are some pretty interesting knocks against OSPF and IS-IS. What Facebook is essentially saying is that they didn’t want to worry about building the low level parts of the messaging system, so they picked something off the shelf. They also built it to be more resource intensive than necessary because they didn’t need to compromise when running it on Six Pack and Wedge.

So long as your routers have ample CPU cycles and memory, Open/R will run just fine. But how many people out there are running a data center server board in their edge router? How many routers out there have to take a reduced BGP table because they don’t have enough memory to fit the entire global IPv4 routing table in memory, let alone IPv6? If resources are infinite and time is irrelevant then building your protocols the way you want is of no consequence. But as soon as you add constraints to the equation, like support for older hardware or limited memory, you have to start making compromises to make things work.


Tom’s Take

I’m not saying that Open/R is a bad routing protocol. I’m going to save that analysis for a later time. But I do take a bit of umbrage with Facebook’s idea that OSPF and IS-IS are a bit outdated simply because they were programmed for a different era. If they were really that inept they would have been replaced or expanded by now. The fact that twenty-somethings got a bug to rewrite a routing protocol because they could and threw all caution to the wind with regard to resource usage should be a cautionary tale to any programmer out there. Never assume that you have more space than you need. Train yourself to do more with less. And be ready to compromise in case the worst case scenario becomes reality.

Do Network Professionals Need To Be Programmers?

With the advent of software defined networking (SDN) and the move to incorporate automation, orchestration, and extensive programmability into modern network design, it could easily be argued that programming is a must-have skill. Many networking professionals are asking themselves if it’s time to pick up Python, Ruby or some other language to create programs in the network. But is it a necessity?

Interfaces In Your Faces

The move toward using APIs is one of the more striking aspects of SDN that has been picked up quickly. Instead of forcing information to be input via CLI or information to be collected from the network via scraping the same CLI, APIs have unlocked more power than we ever imagined. RESTful APIs have given nascent programmers the ability to query devices and push configurations without the need to learn cumbersome syntax. The ability to grab this information and feed it to a network management system and analytics platform has extended the capabilities of the systems that support these architectures.
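
As a rough illustration of the difference between scraping a CLI and reading an API response, here is a small Python sketch. Both the CLI output and the JSON body are invented sample data, not the output of any real device or vendor API, and the function names are mine. The point is only that the scraper has to know about column layout while the API consumer just loads structured data.

```python
import json
import re

# Hypothetical CLI output; real devices differ by vendor and software version.
CLI_OUTPUT = """\
Interface   Status   MTU
eth0        up       1500
eth1        down     9000
"""

# Hypothetical JSON body of the kind a REST API might return for the same data.
API_RESPONSE = (
    '[{"name": "eth0", "status": "up", "mtu": 1500},'
    ' {"name": "eth1", "status": "down", "mtu": 9000}]'
)


def scrape_cli(text):
    """Pull interface data out of column-formatted text with a regex."""
    # Matches data rows only; the header row has no up/down status column.
    pattern = re.compile(r"^(\S+)\s+(up|down)\s+(\d+)\s*$", re.MULTILINE)
    return [
        {"name": name, "status": status, "mtu": int(mtu)}
        for name, status, mtu in pattern.findall(text)
    ]


def parse_api(body):
    """Structured data needs no knowledge of column widths or headers."""
    return json.loads(body)
```

Both functions produce the same list of dictionaries here, but only the scraper breaks when someone adds a column or renames a header.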

The syntaxes that power these new APIs aren’t the copyrighted CLIs that networking professionals spend their waking hours memorizing in excruciating detail. JUNOS and Cisco’s “standard” CLI are as much relics of the past as CatOS. At least, that’s the refrain that comes from both sides of the discussion. The traditional networking professionals hold tight to the access methods they have experience with and can tune like a fine instrument. More progressive networkers argue that standardizing around programming languages is the way to go. Why learn a proprietary access method when Python can do it for you?

Who is right here? Is there a middle ground? Is the issue really about programming? Is the prattle from programming proponents posturing about potential pitfalls in the perfect positioning of professional progress? Or are anti-programmers arguing against attacks, aghast at an area absent archetypical architecture?

Who You Gonna Call?

One clue in this discussion comes from the world of the smartphone. The very first devices that could be called “smartphones” were really very dumb. They were computing devices with strict user interfaces designed to mimic phone functions. Only when the device potential was recognized did phone manufacturers start to realize that things other than address books and phone dialers could be created. Even the initial plans for application development weren’t straightforward. It took time for smartphone developers to understand how to create smartphone apps.

Today, it’s difficult to imagine using a phone without social media, augmented reality, and other important applications. But do you need to be a programmer to use a phone with all these functions? There is a huge market for smartphone apps and a ton of courses that can teach someone how to write apps in very little time. People can create simple apps in their spare time or dedicate themselves to make something truly spectacular. However, users of these phones don’t need to have any specific programming knowledge. Operators can just use their devices and install applications as needed without the requirement to learn Swift or Java or Objective C.

That doesn’t mean that programming isn’t important to the mobile device community. It does mean that programming isn’t a requirement for all mobile device users. Programming is something that can be used to extend the device and provide additional functionality. But no one in an AT&T or Verizon store is going to give an average user a programming test before they sell them the phone.

This, to me, is the argument for network programmability in a nutshell. Network operators aren’t going to learn programming. They don’t need to. Programmers can create software that gathers information and provides interfaces to make configuration changes. But the rank-and-file administrator isn’t going to need to pull out a Java manual to do their job. Instead, they can leverage the experience and intelligence of people that do know how to program in order to extend their network functionality.


Tom’s Take

It seems like this should be a fairly open-and-shut case, but there is a bit of debate yet left to have on the subject. I’m going to be moderating a discussion between Truman Boyes of Bloomberg and Vijay Gill of Salesforce around this topic on April 25th. Will they agree that networking professionals don’t need to be programmers? Will we find a middle ground? Or is there some aspect to this discussion that will surprise us all? I’ll make sure to keep you updated!

Automating Your Job Away Isn’t Easy

One of the most common complaints about SDN that comes from entry-level networking folks is that SDN is going to take their job away. People fear what SDN represents because it has the ability to replace their everyday tasks and put them out of a job. While this is nowhere close to reality, it’s a common enough argument that I hear it very often during Q&A sessions. How is it that SDN has the ability to ruin so many jobs? And how is it that we just now have found a way to do this?

Measure Twice

One of the biggest reasons that the automation portion of SDN has become so effective in today’s IT environment is that we can finally measure what it is that networks are supposed to be doing and how best to configure them. Think about the work that was done in the past to configure and troubleshoot networks. It’s often a very difficult task that involves a lot of intuition and guesswork. If you tried to explain to someone the best way to do things, you’d likely find yourself at a loss for words.

However, we’ve had boring, predictable standards for many years. Instead of cobbling together half-built networks and integrating them in the most obscene ways possible, we’ve instead worked toward planning and architecting things properly so they are built correctly from the ground up. No more guess work. No more last minute decisions that come back to haunt us years down the road. Those kinds of things are the basic building blocks for automation.

When something is built along the lines of predictable rules with proper adherence to standards, it’s something that can be understood by a non-human. Going all the way back to Basic Computing 101, the inputs of a system determine the outputs. More simply, Garbage In, Garbage Out. If your network configuration looks like a messy pile of barely operational commands it will only really work when a human can understand what’s going on. Machines don’t guess. They do exactly what they are told to do. Which means that they tend to break when the decisions aren’t clear.

Cut Once

When a system, script, or program can read inputs and make procedural decisions on those inputs, you can make some very powerful things happen. Provided, that is, that your chosen language is powerful enough to do those things. I’m reminded of a problem I worked on fifteen years ago during my internship at IBM. I needed to change the MTU size for a network adapter in the Windows 2000 registry. My programming language of choice wasn’t powerful enough for me to say something like, “Read these values into an array and change the last 2 or 3 to the following MTU”. So instead, I built a nested if statement that was about 15 levels deep to ensure I caught every possible permutation of the adapter binding order. It was messy. It was ugly. And it worked. But there was no way it would scale.
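
For comparison, here is roughly what the “read these values into an array and change the last 2 or 3” approach might look like in Python. This is only a sketch with made-up adapter data. It works on a plain list of dictionaries and doesn’t touch the Windows registry, and the function name is hypothetical.

```python
def set_mtu_on_last(adapters, count, mtu):
    """Return a copy of adapters with the MTU changed on the last `count` entries."""
    if count <= 0:
        # Nothing to change; still return copies so the input is never shared.
        return [dict(adapter) for adapter in adapters]
    # Index of the first adapter to change; clamp so an oversized count is safe.
    start = max(len(adapters) - count, 0)
    return [
        {**adapter, "mtu": mtu} if index >= start else dict(adapter)
        for index, adapter in enumerate(adapters)
    ]


# Hypothetical binding order, first to last.
bindings = [
    {"name": "adapter-a", "mtu": 1500},
    {"name": "adapter-b", "mtu": 1500},
    {"name": "adapter-c", "mtu": 1500},
]
updated = set_mtu_on_last(bindings, 2, 1400)
```

One loop handles any number of adapters in any binding order, which is exactly what the 15-level nested if statement couldn’t do.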

The most important thing to realize about SDN and automation is that we’ve moved past simply understanding basic values. We’ve finally graduated to a place where programs can make complex decisions based on a number of inputs. We’ve graduated from simple if-then-else constructs and up to a point where programs can take a number of inputs and make decisions based on them. Sure, in many cases the inputs are simple little things like tags or labels. But what we’re gaining is the ability to process more and more of those labels. We can create provisioning scripts that ensure that prod never talks to dev. We can automate turn-up of a new switch with multiple VLANs on different ports through the use of labels and object classes. We can even extrapolate this to a policy-based network language that we can use to build a task once and execute it over and over again on different hardware because we’re doing higher level processing instead of being hamstrung by specific device syntax.
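
Here is a small sketch of the “prod never talks to dev” idea using labels. The endpoint names, labels, and function names are all hypothetical, and a real policy engine would be far more involved. The decision is made on the labels instead of on any device-specific syntax.

```python
# Hypothetical label assignments for a handful of endpoints.
ENDPOINT_LABELS = {
    "web-01": "prod",
    "db-01": "prod",
    "build-01": "dev",
}

# Label pairs that may talk to each other; anything not listed is denied.
ALLOWED_PAIRS = {("prod", "prod"), ("dev", "dev")}


def is_allowed(src, dst, labels=ENDPOINT_LABELS, allowed=ALLOWED_PAIRS):
    """Decide on the endpoints' labels, not on their names or addresses."""
    return (labels[src], labels[dst]) in allowed


def find_violations(flows, labels=ENDPOINT_LABELS, allowed=ALLOWED_PAIRS):
    """Return every (source, destination) flow the policy would deny."""
    return [
        (src, dst) for src, dst in flows if not is_allowed(src, dst, labels, allowed)
    ]


# Proposed flows to check before anything gets provisioned.
proposed = [("web-01", "db-01"), ("build-01", "db-01")]
violations = find_violations(proposed)
```

Adding a new server means giving it a label and nothing else. The policy itself doesn’t change, which is what lets the same task run over and over on different hardware.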

Automation is going to cost some people their jobs. That’s a given. Just like every other manufacturing position, the menial tasks of assembling simple pieces or performing repetitive tasks can easily be accomplished by a machine or software construct. But writing those programs and working on those machines is a new kind of job in and of itself. A humorous anecdote from the auto industry says that the introduction of robots onto assembly lines caused many workers to complain and threaten to walk off the job. However, one worker picked up the manual for the robot and realized that he could easily start working on it instead of the assembly line.


Tom’s Take

Automation isn’t a magic bullet to fix all your problems. It only works if things are ordered and structured in such a way that you can predictably repeat tasks over and over. And it’s not going to stop with one script or process. You need to continue to build, change, and extend your environment. Which means that your job of programming switches should now be looked at in light of building the programs that program switches. Does it mean that you need to forget the basics of networking? No, but it does mean that the way in which you think about them will change.