While I was at Brocade Tech Day, I had the wonderful opportunity to sit down with Jon Hudson (@the_solutioneer) and just talk for about half an hour. While the rest of the day was a whirlwind of presentations and interviews, Jon and I talked about all manner of things not related to VDX or VCS. Instead, we had a very fascinating discussion about TRILL and SPB.
For those not familiar, TRILL is the IETF standard for the issue of layer 2 multipath. It’s a very elegant solution for the spanning tree problem. Our data centers today are running at half capacity. That’s not because we don’t have enough bandwidth, though. It’s because half our links are shut down, waiting for a link failure. Thanks to 802.1D spanning tree, we can’t run two links at the same time unless they are bundled into a link aggregation (LAG) solution. And heaven forbid we want to terminate that LAG bundle on two different switches to prevent single-switch failure from affecting our traffic. Transparent Interconnection of Lots of Links (TRILL) fixes this by creating a layer 2 network with link state. It accomplishes this by running a form of IS-IS, which allows the layer 2 nodes to create an SPF table and determine not only the best path to a node, but also other paths that are equally good. This means that we have a real fabric of interconnections with no blocked links.
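To make the “equally good paths” idea concrete, here is a minimal sketch of a shortest-path-first calculation that remembers every equal-cost predecessor instead of just one. It is not TRILL’s actual IS-IS implementation; the four-switch topology, the link cost of 10, and the function names are all assumptions made up for illustration.

```python
import heapq
from collections import defaultdict

def equal_cost_paths(links, source):
    """Dijkstra SPF that keeps every equal-cost predecessor of each node."""
    graph = defaultdict(list)
    for a, b, cost in links:          # links are bidirectional
        graph[a].append((b, cost))
        graph[b].append((a, cost))

    dist = {source: 0}
    preds = defaultdict(set)          # node -> set of best previous hops
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                  # stale heap entry
        for nbr, cost in graph[node]:
            nd = d + cost
            best = dist.get(nbr, float("inf"))
            if nd < best:             # strictly better: replace predecessors
                dist[nbr] = nd
                preds[nbr] = {node}
                heapq.heappush(heap, (nd, nbr))
            elif nd == best:          # tie: remember the extra path
                preds[nbr].add(node)
    return dist, preds

def enumerate_paths(preds, source, target):
    """Expand the predecessor sets into full hop-by-hop paths."""
    if target == source:
        return [[source]]
    paths = []
    for prev in sorted(preds[target]):
        for path in enumerate_paths(preds, source, prev):
            paths.append(path + [target])
    return paths

# Hypothetical two-leaf, two-spine fabric; every link has an assumed cost of 10.
links = [
    ("leaf1", "spine1", 10), ("leaf1", "spine2", 10),
    ("leaf2", "spine1", 10), ("leaf2", "spine2", 10),
]
dist, preds = equal_cost_paths(links, "leaf1")
paths = enumerate_paths(preds, "leaf1", "leaf2")
for path in paths:
    print(" -> ".join(path))
```

In this toy fabric both spine switches offer a path of the same cost from leaf1 to leaf2, so both are kept and traffic can be spread across them. Spanning tree would block one of those links to break the loop.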
802.1aq Shortest Path Bridging, or SPB informally, is the IEEE version of a layer 2 multipathing replacement for spanning tree. It looks a lot like TRILL and even uses IS-IS for the layer 2 protocol as well. It does differ in some respects, such as using MAC-in-MAC encapsulation for frames as opposed to rewriting the header like TRILL does. This makes it very attractive to the service provider market, as they don’t have to buy a bunch of new gear to get everything up and running quickly on SPB. Looking at the proponents of SPB, such as Avaya and Alcatel-Lucent, that really comes as no surprise. Those companies are heavily invested in the service provider space and would really love to see SPB adoption take off, as it would protect their initial investments.
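For a rough picture of what MAC-in-MAC means on the wire, the sketch below wraps an untouched customer frame in an outer backbone header. It assumes the 802.1ah (Provider Backbone Bridging) layout as I understand it: backbone destination and source MACs, a B-TAG carrying the backbone VLAN, and an I-TAG carrying a 24-bit service ID. The layout is simplified (priority and flag bits are left at zero), and the function name, addresses, and ID values are invented for illustration.

```python
import struct

BTAG_ETHERTYPE = 0x88A8  # backbone VLAN tag
ITAG_ETHERTYPE = 0x88E7  # service instance tag

def mac_in_mac_encap(customer_frame, b_da, b_sa, b_vid, i_sid):
    """Prepend a simplified backbone header to an unmodified customer frame."""
    if len(b_da) != 6 or len(b_sa) != 6:
        raise ValueError("backbone MAC addresses must be 6 bytes")
    if not 0 <= b_vid < 2**12:
        raise ValueError("B-VID is a 12-bit value")
    if not 0 <= i_sid < 2**24:
        raise ValueError("I-SID is a 24-bit value")
    # B-TAG: EtherType + 16-bit TCI (priority bits zero, B-VID in low 12 bits)
    b_tag = struct.pack("!HH", BTAG_ETHERTYPE, b_vid)
    # I-TAG: EtherType + 32-bit TCI (flag bits zero, I-SID in low 24 bits)
    i_tag = struct.pack("!HI", ITAG_ETHERTYPE, i_sid)
    return b_da + b_sa + b_tag + i_tag + customer_frame

# Made-up customer frame: destination MAC, source MAC, EtherType, payload.
customer = bytes.fromhex("0200000000aa" "0200000000bb" "0800") + b"payload"
backbone = mac_in_mac_encap(
    customer,
    b_da=bytes.fromhex("0200000000 01".replace(" ", "")),
    b_sa=bytes.fromhex("0200000000 02".replace(" ", "")),
    b_vid=100,
    i_sid=20001,
)
print(len(backbone) - len(customer), "bytes of backbone header added")
```

The customer frame rides inside unchanged, so core switches only ever look at the backbone addresses, and the 24-bit service ID leaves room for far more service instances than a 12-bit VLAN ID.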
The showdown between TRILL and SPB isn’t that far removed from the old showdown between VHS and Betamax. For those not entirely familiar, this was a case of two competing standards that was eventually settled in the court of the consumer. While many regard the early Betamax units as technologically superior, there was an issue of tape length (one hour versus two hours for VHS). As time wore on, there was significant development done on both sides that stretched the formats to their absolute limits. However, by the end, VHS had won due to simple popularity. Since VHS had become the most popular format for consumers, even the supposed superiority of Betamax couldn’t save it from being relegated to the junk pile of history. Another more recent case is the battle between HD DVD and Blu-ray. Much like the analog format war decades earlier, the digital disc war erupted from two alliances thinking they had the best solution to the problem. Blu-ray eventually won out in much the same way that VHS did: by becoming the format that most people wanted to use. The irony that Sony actually won a format war isn’t lost on a lot of people either.
I believe that we’re going to see something like these showdowns in TRILL vs. SPB. Right now, the battle lines seem drawn between the data center vendors supporting TRILL and the service provider vendors getting ready to implement SPB. Whether or not one of the solutions is technically superior to the other is inconsequential at this point. It’s all going to come down to popularity. Brocade and Cisco have non-standard TRILL implementations in VCS and FabricPath. The assumption is that they will be compatible with TRILL when a working solution is finally released. I’m also guessing that we’re going to see more support for TRILL among the cloud providers, who can maximize their revenue potential by offering non-blocking paths to increase throughput for those hungry cloud applications. Brocade showcased some providers moving to VCS at Brocade Tech Day. If that’s the case, we’re going to see TRILL at the enterprise level and the cloud provider level connected by an SP core running SPB. Just like Betamax being the favorite of the professional video industry, SPB will be the go-to protocol for providers, as they can put off yet one more round of equipment upgrades. I think by that point, however, TRILL will have obtained enough critical mass to drive adoption to the point where TRILL silicon will be a very inexpensive option on most new equipment in a few years, perhaps even becoming the default configuration. If that is the case, then TRILL will indeed become the VHS or Blu-ray of this protocol war.
I can still remember going into the video store and seeing the great divide. On one side, Betamax. On the other, VHS. Slowly, the Betamax side of the house shrank away to nothing. It happened again with HD DVD and Blu-ray. In the end, both format wars came down to popularity. VHS was in more households and offered the ability to record two hours’ worth of programming instead of one. Blu-ray got the popular movie studios, like Disney, on board quickly. Once the top-selling movies were on Blu-ray, the outcome was all but guaranteed. In the big debate of TRILL against SPB, it’s going to come down to popularity. I think we’re already seeing the beginning of TRILL winning this fight. Sure, the service providers are going to use SPB as long as they can to avoid upgrading to TRILL-compatible hardware. I could even make a pretty compelling case that neither of these two layer 2 protocols would make a bunch of sense for a service provider. At the end of the day, though, I’m pretty sure that we’ll eventually be speaking about SPB with the same hushed nostalgia we reserve for the losers of the format wars so many years ago.
Here are a few posts about TRILL and SPB that generated some ideas for me. You should check them out too:
Does TRILL Stand A Chance At Wide Adoption – Ethan Banks
Why SPB Doesn’t Get Any Attention – Greg Ferro
TRILL and 802.1aq (SPB) Are Like Apples and Oranges – Ivan Pepelnjak
NANOG 50 TRILL vs. SPB Great Debate – PDF of a huge discussion presentation
The difference is that these protocols are not for consumers, but rather for engineers. In this case the strengths of one are the weaknesses of the other. I view this battle as more along the lines of IS-IS/OSPF, where both protocols coexist in their separate spaces.
It’s often claimed that VHS won because the porn industry adopted it as their format of choice.
TRILL right now has a couple of advantages, and Cisco’s FabricPath (albeit “an enhanced superset of TRILL”) is one of them. It’s out there on a key product from a market leader, so assuming the future “standards mode” compatibility they promise, even if you have other vendors’ switches in your data center, you’re going to want compatibility with the big guys. SPB’s big selling point was existing OA&M and support from existing ASICs, but I’m not sure that’s enough. Still, anything to get rid of STP is a win in my book.
I think there’s a better comparison to be made.
I would compare it to AMD vs. Intel in the battle for the 64-bit desktop processor, with a few distinctions. AMD was clearly the early winner in that space, but once they woke up the 800 lb gorilla they didn’t stand much of a chance in the long haul. I would argue that Avaya (formerly Nortel) is in pretty much the same scenario, although there are some lines to be drawn between enterprise and service provider networks. I’m not sure I’d declare Avaya the early winner though; only time will truly tell the story. However, Avaya/Nortel has been successfully deploying SPB for a number of years in their Metro Ethernet product for service providers. Avaya has only recently brought that technology into their enterprise solutions.
In my opinion it’s very similar to the early days of active/active Layer 2 pathing and dual core redundancy. Avaya was the only leader in that space with their IST/SMLT technology. Eventually Cisco realized that Spanning Tree wasn’t going to hold its own, and out came Virtual Port Channels, which I would probably agree is technically superior to SMLT today. However, it took a long time for Cisco to come to that realization, just as it took Intel quite some time to realize that consumers really did want a 64-bit desktop processor.
I’m not sure who will win out, but I know that there are enough differences between service provider networks and enterprise networks that there might be room for both protocols.
FYI – Most ASICs coming out this year can do TRILL and SPB for L2VPN and can do L3VPN over SPB.
Avaya has a nice edge with their SPB in that it runs both L2VPNs and L3VPNs over the MAC-in-MAC forwarding. That is extremely clean, because one IS-IS protocol instance can do everything for a small enterprise, whereas with other solutions they would have to deploy an L2VPN solution and an L3VPN solution in parallel. So if you need both types of VPN, you end up with twice the OPEX with other solutions. Since customers don’t care about protocol wars, they pick the lowest-cost solution.
It’s interesting to compare SPB to MPLS, in that MPLS transport labels are the underlay on which L2VPNs and L3VPNs can reside with a common transport routing protocol. It’s very similar with SPB, and in particular with Avaya’s implementation.
Anyway the competition is good for both protocols. TRILL is busy adding proper OA&M and 24 bit service instances as a result of competition and SPB is adding hop by hop ECMP in the form of 802.1Qbp. In the end the customer wins, and has more choice, better choices.
Both protocols however will have to face the VXLAN/NVGRE/NVO3 stuff which will be stiff competition in the next round of ASICs.
Love Betamax/VHS comparisons! Thanks, very funny and very informative.
Is it about replacing Spanning Tree, or is it about network virtualization with optimal usage of the infrastructure? 10 years ago, the focus was on the former, but now with the compute virtualization driving huge changes, network virtualization is the name of the game.
In order to address today’s and tomorrow’s network virtualization requirements, the underlying infrastructure and protocols need to be able to support this in an elegant and cost effective way. The IEEE 802.1 group, with all its members from leading networking companies, recognized this trend early and standardized a service instance in the Ethernet packet (more than 4 years ago) which allows true network virtualization. It has the significant benefit that infrastructure and end-to-end connectivity layer can be logically separated from each other, enabling true service abstraction.
This service abstraction can be used to provide L2 services, L3 services or can go even further to Application based services…. Networking is changing, we have to face it!
Great article and thought-provoking comments. No disagreements. I’ll add a few thoughts.
Thinking about the VHS/Betamax analogy, it’s worth noting that Betamax was considered technically superior. Consumers may not have cared about this, but professionals did (one of the reasons Betamax/Betacam were widely used in the professional video industry for many years). My point: in discussing technical standards, it comes down to the market requirements. Service provider and enterprise networks are usually different in that regard (though I’m starting to see large enterprises become much more like service providers in some aspects). I could easily see SPB, TRILL, and many of the other proprietary L2 multipath solutions co-existing for many years, just like VHS and Betacam did. As mentioned in the comments, it’s probably not that hard to build all of these features into an ASIC if there is a desire to do so.
Some protocols are widely used in service providers but never make it down to the enterprise space (like SS7). Others, such as BGP, were used for a long time in carrier networks before moving into the enterprise.
It came down to requirements.
Well, it won’t be either of the two, at least not in my datacenter. L3 scales; L2 doesn’t, or only at the price of a control plane that no one would ever want to manage, at least not in my reality. These problems belong to the edge (because complexity belongs to the edge) and are solved either by overlays or by scale-out applications.