Facebook announced at Interop that they are soliciting ideas for building their own top-of-rack (ToR) switch via the Open Compute Project. This sent the tech media into a frenzy. Some are talking about the end of the Cisco monopoly on switches. Others claim that the world will be a much different place now that switches are going to be built by non-vendors and open sourced to everyone. I yawned and went back to my lunch. Why?
BYO Networking Gear
As you browse the article that you’re reading about how Facebook is going to destroy the networking industry, do me a favor and take note of what kind of computer you’re using. Is it a home-built desktop? Is it something ordered from a vendor? Is it a laptop or mobile device that you built? Or bought?
The idea that Facebook is building switches isn’t far-fetched to me. They’ve been doing their own servers for a while. That’s because their environment looks wholly different than any other enterprise on the planet, with the exception of maybe Google (who also builds their own stuff). Facebook has some very specialized needs when it comes to servers and to networking. As they mention at conferences, the amount of data rushing into their network on an hourly, let alone daily, basis is mind-boggling. Shaving milliseconds off query times or reducing traffic by a few KB per flow translates into massive savings when you consider the scale they are operating at.
To that end, anything they can do to optimize their equipment to meet their needs is going to be a big deal. They’ve got a significant motivation to ensure that the devices doing the heavy lifting for them are doing the best job they can. That means they can invest a significant amount of capital into building their own network devices and still get a good return on the investment. It’s much like the last time I built my own home desktop. I didn’t find a single machine that met all of my needs and desires, so I decided to cannibalize some parts out of an old machine and just build the rest myself. Sure, it took me about a month to buy all the parts, ship them to my house, and then assemble the whole package. But in the end I was very happy with the design. In fact, I still use it at home today.
That’s not to say that my design is the best for everyone, or anyone for that matter. The decisions I made in building my own computer were ones that suited me. In much the same way, Facebook’s ToR switches probably serve very different needs than existing data centers. Are your ToR switches optimized for east-west traffic flow? I don’t see a lot of data at Facebook directed to other internal devices. I think Facebook is really pushing their systems for north-south flow. Data requests coming in from users and going back out to them are more in line with what they’re doing. If that’s the case, Facebook will have a switch optimized for really fast data flows. Only they’ll be flowing in the wrong direction for what most people are using data flows for today. It’s like having a Bugatti Veyron and living in a city with dirt roads.
Facebook admitted that there are things about networking vendors they don’t like. They don’t want to be locked into a proprietary OS like IOS, EOS, or Junos. They want a whitebox solution that will run any OS on the planet efficiently. I think that’s because they don’t want to get locked into a specific hardware supplier either. They want to buy what’s cheapest at the time and build large portions of their network rapidly as needed to embrace new technology and data flows. You can’t get married to a single supplier in that case. If you do, a hiccup in the production line or a delay could cost you thousands, if not millions. Just look at how Apple ensures diversity in the iPhone supply chain to get an idea of what Facebook is trying to do. If Apple were to lose a single part supplier, there would be chaos in the supply chain. In order to ensure that everything works like a well-oiled machine, they have multiple companies supplying each outsourced part. I think Facebook is driving for something similar in their switch design.
One Throat To Choke
The other thing that gives me pause here is support. I’ve long held that one of the reasons why people still buy computers from vendors or run Windows and OS X on machines is because they don’t want the headache of fixing things. A warranty or support contract is a very reassuring thing. Knowing that you can pick up the phone and call someone to get a new power supply or tell you why you’re getting a MAC flap error lets you go to sleep at night. When you roll your own devices, the buck stops with you when you need to support something. Can’t figure out how to get your web server running on Ubuntu? Better head to the support forums. Wondering why your BYOSwitch is dropping frames under load? Hope you’re a Wireshark wizard. Most enterprises don’t care that a support contract costs them money. They want the assurance that things are going to get fixed when they break. When you develop everything yourself, you are putting a tremendous amount of faith into those developers to ensure that bugs are worked out and hardware failures are taken care of. Again, when you consider the scale of what Facebook is doing, the idea of having purpose-built devices makes sense. It also makes sense that having people on staff who can fix those specialized devices is cost-effective for them.
Face it. The idea that Facebook is going to destroy the switching market is ludicrous. You’re never going to buy a switch from Facebook. Maybe you want to tinker around with Intel’s DPDK with a lab switch so you can install OpenFlow or something similar. But when it comes time to forklift the data center or populate a new campus building with switches, I can almost guarantee that you’re going to pick up the phone and call Cisco, Arista, Juniper, Brocade, or HP. Why? Because they can build those switches faster than you can. Because even though they are a big capital expenditure (capex), it’s still cheaper in the long run if you don’t have the time to dedicate to building your own stuff. And when something blows up (and something always blows up), you’re going to want a TAC engineer on the phone sharing the heat with you when the CxOs come headhunting in the data center after everything goes down.
Facebook will go on doing their thing their way with their own servers and switches. They’ll do amazing things with data that you never dreamed possible. But just like buying a Sherman tank for city driving, their solution isn’t going to work for most people. Because it’s built by them, for them. Just like Google’s server farms and search appliances. Facebook may end up contributing a lot to the Open Compute Project and advancing the overall knowledge and independence of networking hardware. But to think they’re starting a revolution in networking is about as far-fetched as thinking that Myspace was going to be the top social network forever.
Nice post! I agree 100%. All this build-your-own-stuff-on-x86/SDN hype is getting ludicrous. Cisco, Juniper, HP and the usual suspects will keep doing what they do, and what they do well. When you are at the scale of Google and Facebook it does make sense to build your own stuff, but it requires lots of skilled engineers, programmers, and large sums of money to invest.
Couldn’t have worded it better myself and completely agree.
It’s the same thing floating around at my place at the moment, with certain people wanting to shave minimal amounts off the CapEx to go with some cheap unbranded routers. Yet we all know that when shit hits the fan, the support is going to be non-existent, rather than just going with a big vendor where the support is a guarantee!
Here is the key difference though. A 48-port 10GbE switch is roughly one quarter the price of a name-brand unit. So, if you are buying enough switches, the cost difference adds up to a lot of engineers.
For some cloud companies, having a lot of engineers in the resource pool is a good thing since it increases your available capacity to implement new features and maintain skills.
I’m cautious like you about the initial announcements, but it could lead to some interesting changes in the network marketplace, which I expect to welcome.
A couple of comments.
To EtherealMind’s comments, these switches are much cheaper. A 48x10G + 4x40G Broadcom TOR switch can be acquired for less than $4000. This is more like less than 10% of the list price of name brand switches.
You don’t have to build the switches from scratch or from components – you just buy a SKU – http://www.accton.com/prodcat.asp?c=1 or http://www.dninetworks.com/product.aspx?ObjectId=35 or http://www.quantaqct.com/en/01_product/01_list.php?mid=30&sid=114&id=116
In terms of support and one throat to choke – the way most people put together servers today is they buy x86 from a server seller, i.e. HP, Dell, IBM etc., and then they separately procure and add software themselves, i.e. Red Hat Linux. You do HW support and RMA with whoever you buy the HW from, and you have a support contract with Red Hat for the SW. Keep in mind that 1 in 7 servers sold today worldwide is bought directly via Quanta – http://www.wired.com/wiredenterprise/2013/03/quanta-growth/
Three words: Economies of Scale.
Facebook & Google are large enough that they can actually design/manufacture/build networking gear at economies of scale. 99% of companies aren’t.
What is the point where the cost of additional units begins to increase? Hard to say, but I’m guessing stuff like getting into low-level chip design. I don’t think Facebook/Google will start creating their own x86 compatible chips any time soon.
I’m late on this one but I agree – if I can get my hands on one or two for my lab, I will, but the fact of the matter is that this kind of platform follows the same rules as open source software. Enterprise *typically* stays away from open source because of the support component, as you pointed out. So while I’m personally excited about the idea, I don’t think we’ll be seeing it en masse anywhere, at least not for a really really long time.