The race to make things just a little bit faster in the networking world has heated up in recent weeks thanks to the formation of the 25Gig Ethernet Consortium. Arista Networks, along with Mellanox, Google, Microsoft, and Broadcom, has decided that 40Gig Ethernet is too expensive for most data center applications. Instead, they’re offering up an alternative in the 25Gig range.
This podcast with Greg Ferro (@EtherealMind) and Andrew Conry-Murray (@Interop_Andrew) does a great job of breaking down the technical reasoning behind 25Gig Ethernet. In short, the current 10Gig connection is made of four multiplexed 2.5Gig connections. To get to 25Gig, all you need to do is overclock those connections a little. That’s not unprecedented, as 40Gig Ethernet accomplishes this by overclocking them to 10Gig, albeit with different optics. Aside from a technical merit badge, one has to ask “Why?”
As always, money is the factor here. The 25Gig Consortium is betting that you don’t like paying a lot of money for your 40Gig optics. They want to offer an alternative that is faster than 10Gig but cheaper than the next standard step up. By giving you a cheaper option for things like uplinks, they free up money for you to spend elsewhere. Probably on more switches, but that’s beside the point right now.
The other thing to keep in mind, as mentioned on the Coffee Break podcast, is that the cable runs for these 25Gig connectors will likely be much shorter. Short term that won’t mean much. There aren’t as many long-haul connections inside of a data center as one might think. A short hop to the top-of-rack (ToR) switch, then another hop to the end-of-row (EoR) or core switch. That’s really about it. One of the arguments against 40/100Gig is that it was designed for carriers for long-haul purposes. 25G can give you roughly 60% of the speed of a 40Gig link at a much lower cost. You aren’t paying for functionality you likely won’t use.
Is this a good move? That depends. There aren’t any 25Gig cards for servers right now, so the obvious use for these connectors will be uplinks. Uplinks that can only be used by switches that share 25Gig (and later 50Gig) connections. As of today, that means you’re using Arista, Dell, or Brocade. And that’s when the optics and switches actually start shipping. I assume that existing switching lines will be able to retrofit with firmware upgrades to support the links, but that’s anyone’s guess right now.
If Mellanox and Broadcom do eventually start shipping cards to upgrade existing server hardware to 25Gig, then you’ll have to ask yourself if you want to take on the upgrade costs to drive that little extra bit of speed out of the servers. Are you pushing the 10Gig links in your servers today? Are they the limiting factor in your data center? And will upgrading your servers to support two and a half times the bandwidth per network connection help alleviate your bottlenecks? Or will the bottlenecks just move to the uplinks on the switches? It’s a quandary that you have to investigate. And that takes time and effort.
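One rough way to frame the "will the bottleneck just move" question is the oversubscription ratio of a top-of-rack switch: total server-facing bandwidth divided by total uplink bandwidth. Here is a minimal sketch in Python. The port counts (48 server ports, 4 uplinks) are hypothetical numbers I picked for illustration, not the specs of any shipping switch, and the function name is my own.

```python
def oversubscription_ratio(server_ports, server_gbps, uplink_ports, uplink_gbps):
    """Return server-facing capacity divided by uplink capacity for one switch."""
    downlink_capacity = server_ports * server_gbps
    uplink_capacity = uplink_ports * uplink_gbps
    return downlink_capacity / uplink_capacity


# Hypothetical ToR switch: 48 server-facing ports and 4 uplinks (assumed values).
scenarios = {
    "10G servers, 40G uplinks": oversubscription_ratio(48, 10, 4, 40),
    "25G servers, 40G uplinks": oversubscription_ratio(48, 25, 4, 40),
    "25G servers, 100G uplinks": oversubscription_ratio(48, 25, 4, 100),
}

for name, ratio in scenarios.items():
    print(f"{name}: {ratio:.1f}:1")
```

Under those assumed port counts, moving the servers to 25Gig while leaving the uplinks at 40Gig pushes the ratio from 3:1 to 7.5:1. It only comes back to 3:1 if the uplinks move to 100Gig as well. Your own numbers will differ, but the shape of the problem is the same.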
The very first thing I ever tweeted (4 years ago):
We’ve come a long way from ratified standards to deployment of 40Gig and 100Gig. Uplinks in crowded data centers are going to 40Gig. I’ve seen a 100Gig optic in the wild running a research network. It’s interesting to see that there is now a push to get to a marginally faster connection method with 25Gig. It reminds me of all the competing 100Mbit standards back in the day. Every standard was close but not quite the same. I feel that 25Gig will get some adoption in the market. So now we’ll have to choose from 10Gig, 40Gig, or something in between to connect servers and uplinks. It will either get sent to the standards body for ratification or die on the vine with no adoption at all. Time will tell.
A lot of these details are off. 10G has been serial for years now; it’s not 4×2.5G. The 25G proposal is basically one-fourth of a 100G port (which is 4x25G).
25G won’t be used for uplinks; it’s not even fast enough today and it won’t be released for a little while. By the time it comes out, 25G will be the downlink from the TOR and 100G will be the uplink. The core of the network will be all 100G at that point.
Of course it doesn’t make sense to upgrade one part of a server. 25G NICs will be going into new Haswell/Broadwell servers where a “cheap” server has 32 cores and ~512GB RAM.