You should definitely watch this amazing video from Ben Sigelman of LightStep that was recorded at Cloud Field Day 4. The good stuff comes right up front.
In less than five minutes, he takes apart crazy notions that we have in the world today. I like the observation that you can’t build a system more than three or four orders of magnitude. Yes, you really shouldn’t be using Hadoop for simple things. And Machine Learning is not a magic wand that fixes every problem.
However, my favorite thing was the quick mention of how emulating Google for the sake of using their tools for every solution is folly. Ben should know, because he is an ex-Googler. I think I can sum up this entire discussion in less than a minute of his talk here:
Google’s solutions were built for scale that basically doesn’t exist outside of a maybe a handful of companies with a trillion dollar valuation. It’s foolish to assume that their solutions are better. They’re just more scalable. But they are actually very feature-poor. There’s a tradeoff there. We should not be imitating what Google did without thinking about why they did it. Sometimes the “whys” will apply to us, sometimes they won’t.
Gee, where have I heard something like this before? Oh yeah. How about this post. Or maybe this one on OCP. If I had a microphone I would have handed it to Ben so he could drop it.
Building a Laser Moustrap
We’ve reached the point in networking and other IT disciplines where we have built cargo cults around Facebook and Google. We practically worship every tool they release into the wild and try to emulate that style in our own networks. And it’s not just the tools we use, either. We also keep trying to emulate the service provider style of Facebook and Google where they treated their primary users and consumers of services like your ISP treats you. That architectural style is being lauded by so many analysts and forward-thinking firms that you’re probably sick of hearing about it.
Guess what? You are not Google. Or Facebook. Or LinkedIn. You are not solving massive problems at the scale that they are solving them. Your 50-person office does not need Cassandra or Hadoop or TensorFlow. Why?
- Google Has Massive Scale – Ben mentioned it in the video above. The published scale of Google is massive, and even it’s on the low side of the number. The real numbers could even be an order of magnitude higher than what we realize. When you have to start quoting throughput numbers in “Library of Congress” numbers to make sense to normal people, you’re in a class by yourself.
- Google Builds Solutions For Their Problems – It’s all well and good that Google has built a ton of tools to solve their issues. It’s even nice of them to have shared those tools with the community through open source. But realistically speaking, when are you really going to use Cassandra to solve all but the most complicated and complex database issues? It’s like a guy that goes out to buy a pneumatic impact wrench to fix the training wheels on his daughter’s bike. Sure, it will get the job done. But it’s going to be way overpowered and cause more problems than it solves.
- Google’s Tools Don’t Solve Your Problems – This is the crux of Ben’s argument above. Google’s tools aren’t designed to solve a small flow issue in an SME network. They’re designed to keep the lights on in an organization that maps the world and provides video content to billions of people. Google tools are purpose built. And they aren’t flexible outside that purpose. They are built to be scalable, not flexible.
Down To Earth
Since Google’s scale numbers are hard to comprehend, let’s look at a better example from days gone by. I’m talking about the Cisco Aironet-to-LWAPP Upgrade Tool:
I used this a lot back in the day to upgrade autonomous APs to LWAPP controller-based APs. It was a very simple tool. It did exactly what it said in the title. And it didn’t do much more than that. You fed it an image and pointed it at an AP and it did the rest. There was some magic on the backend of removing and installing certificates and other necessary things to pave the way for the upgrade, but it was essentially a batch TFTP server.
It was simple. It didn’t check that you had the right image for the AP. It didn’t throw out good error codes when you blew something up. It only ran on a maximum of 5 APs at a time. And you had to close the tool every three or four uses because it had a memory leak! But, it was a still a better choice than trying to upgrade those APs by hand through the CLI.
This tool is over ten years old at this point and is still available for download on Cisco’s site. Why? Because you may still need it. It doesn’t scale to 1,000 APs. It doesn’t give you any other functionality other than upgrading 5 Aironet APs at a time to LWAPP (or CAPWAP) images. That’s it. That’s the purpose of the tool. And it’s still useful.
Tools like this aren’t built to be the ultimate solution to every problem. They don’t try to pack in every possible feature to be a “single pane of glass” problem solver. Instead, they focus on one problem and solve it better than anything else. Now, imagine that tool running at a scale your mind can’t comprehend. And you’ll know now why Google builds their tools the way they do.
I have a constant discussion on Twitter about the phrase “begs the question”. Begging the question is a logical fallacy. Almost every time the speaker really means “raises the question”. Likewise, every time you think you need to use a Google tool to solve a problem, you’re almost always wrong. You’re not operating at the scale necessary to need that solution. Instead, the majority of people looking to implement Google solutions in their networks are like people that put chrome everything on a car. They’re looking to show off instead of get things done. It’s time to retire the Google Cargo Cult and instead ask ourselves what problems we’re really trying to solve, as Ben Sigelman mentions above. I think we’ll end up much happier in the long run and find our work lives much less complicated.