VMware and VeloCloud: A Hedge Against Hyperconvergence?

VMware announced on Thursday that they are buying VeloCloud. This was a big move in the market that immediately set off a huge discussion about the implications. I had originally thought AT&T would buy VeloCloud based on their past relationship, but AT&T’s acquisition of Vyatta from Brocade over the summer should have been a hint that wasn’t going to happen. Instead, VMware swooped in and picked up the company for an undisclosed amount.

The conversations have been going wild so far. Everyone wants to know how this is going to affect the relationship with Cisco, especially given that Cisco put money into VeloCloud in both 2016 and 2017. Given the acquisition of Viptela by Cisco earlier this year, it’s easy to see that these two companies might find themselves competing for market share in the SD-WAN space. However, I think that this is actually a different play from VMware. One that’s striking back at hyperconverged vendors.

Adding The Value

If you look at the marketing coming out of hyperconvergence vendors right now, you’ll see there’s a lot of discussion around the platform. Fast storage, small footprints, and the ability to deploy anywhere. Hyperconverged solutions are also starting to focus on the hot new trends in compute, like containers. Along the way, this means that traditional workloads that run on VMware ESX hypervisors aren’t getting the spotlight they once did.

In fact, the leading hyperconvergence vendor, Nutanix, has been aggressively selling their own hypervisor, Acropolis, as a competitor to VMware. They tout new features and easy configuration as the major reasons to use Acropolis over ESX. The push by Nutanix is to get their customers off of ESX and onto Acropolis to get a share of the VMware budget that companies are currently paying.

For VMware, it’s a tough sell to keep their customers on ESX. There’s a very big ecosystem of software out there that runs on ESX, but if you can replicate a large portion of it natively, like Acropolis and other hypervisors do, there’s not much of a reason to stick with ESX. And if the VMware solution is more expensive over time, you will find yourself choosing the cheaper alternative when the negotiations come up for renewal.

For VMware NSX, it’s an even harder road. Most of the organizations that I’ve seen deploying hyperconverged solutions are not huge enterprises with massive centralized data centers. Instead, they are the kind of small-to-medium businesses that need some of those functions but are very budget conscious. They’re also very geographically diverse, with smaller branch offices taking the place of a few massive headquarters locations. While NSX has some advantages for these companies, it’s not the best fit for them. NSX works optimally in a data center with high-speed links and a well-built underlay network.

vWAN with VeloCloud

So how is VeloCloud going to play into this? VeloCloud already has a lot of advantages that made them a great complement to VMware’s model. They have built-in multi-tenancy. Their service delivery is virtualized. They were already looking to move toward service providers as their primary market, not just network service providers but managed service providers as well. This sounds like their interests are aligning quite well with VMware already.

The key advantage for VMware with VeloCloud is how it will allow NSX to extend into the branch. Remember how I said that NSX loves an environment with a stable underlay? That’s what VeloCloud can deliver: a stable, encrypted VPN underlay. An underlay that can be managed from one central location, or, in the future, perhaps even from a vCenter plugin. That gives VeloCloud a huge advantage in building the underlay that provides connectivity between branches.

Now, with an underlay built out, NSX can be pushed down into the branch. Branches can now use all the great features of NSX like analytics, some of which will be bolstered by VeloCloud, as well as microsegmentation and other heretofore unseen features in the branch. The large headquarters data center is now available in a smaller remote size for branches. That’s a huge advantage for organizations that need those features in places that don’t have data centers.

And the pitch against using other hypervisors with your hyperconverged solution? NSX works best with ESX. Now the argument for keeping ESX in your remote branches isn’t cost, or features that you may one day hope to use if your WAN connection gets upgraded to ludicrous speed. Instead, VeloCloud can be deployed between your HQ or main office and your remote site to bring those NSX functions down into your environment over a secure tunnel.

While this does compete a bit with Cisco from a delivery standpoint, the overlap is far from complete. In this scenario, VeloCloud is a service delivery platform for NSX and not a piece of hardware at the edge. Absent VeloCloud, this kind of setup could still be replicated with a Cisco Viptela box running the underlay and NSX riding on top in the overlay. But I think the market that VMware is going after will be building this from the ground up with VMware solutions from the start.


Tom’s Take

Not every issue is “Us vs. Them”. I get that VMware and Cisco seem to be spending more time moving closer together on the networking side of things. SD-WAN is a technology that was inevitably going to bring Cisco into conflict with someone. The third generation of SD-WAN vendors is really just companies that didn’t have a proper offering of their own buying up all the first-generation startups. Viptela and VeloCloud are now off the market, and they’ll soon be integral parts of their respective parents’ strategies going forward. Whether VeloCloud is focused on enabling cloud connectivity for VMware or retaking the branch from the hyperconverged vendors is going to play out in the next few months. But instead of focusing on conflict with anyone else, VeloCloud should be judged by the value it brings to VMware in the near term.


Back In The Saddle Of A Horse Of A Different Color

I’ve been asked a few times in the past year if I missed being behind a CLI screen or if I ever got a hankering to configure some networking gear. The answer is a guarded “yes”, but not for the reason that you think.

Type Casting

CCIEs are keyboard jockeys. Well, the R&S folks are for sure. Every exam has quirks, but the R&S folks have quirky QWERTY keyboard madness. We spend a lot of time not just learning commands but learning how to input them quickly without typos. So we spend a lot of time with keys and a lot less time with the mouse poking around in a GUI.

However, the trend in networking has been to move away from these kinds of input methods. Take the new Aruba 8400, for instance. The ArubaOS-CX platform that runs it seems to have been built to require the least amount of keyboard input possible. The whole system runs with an API backend and presents a GUI that is a series of API calls. There is a CLI, but anything that you can do there can easily be replicated elsewhere by some other function.

Why would a company do this? To eliminate wasted effort. Think to yourself how many times you’ve typed the same series of commands into a switch. VLAN configuration, vty configs, PortFast settings. The list goes on and on. Most of us even have some kind of notepad that we keep the skeleton configs in so we can paste them into a console port to get a switch up and running quickly. That’s what Puppet was designed to replace!
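As a rough sketch of what that notepad skeleton becomes once it turns into code, here’s a minimal example using a Jinja2 template. The hostnames, VLAN numbers, and port ranges are invented for illustration; the point is that the only thing you type per switch is the handful of values that actually change.

```python
# A minimal sketch: render a skeleton switch config from a template
# instead of pasting it from a notepad and hand-editing it.
# All names and values below are examples, not a real deployment.
from jinja2 import Template

SKELETON = Template("""\
hostname {{ hostname }}
{% for vlan in vlans %}
vlan {{ vlan.id }}
 name {{ vlan.name }}
{% endfor %}
interface range {{ access_ports }}
 switchport mode access
 switchport access vlan {{ vlans[0].id }}
 spanning-tree portfast
""")

switch = {
    "hostname": "branch-sw-01",   # hypothetical device
    "vlans": [{"id": 10, "name": "users"}, {"id": 20, "name": "voice"}],
    "access_ports": "gi1/0/1-24",
}

print(SKELETON.render(**switch))
```

The same dictionary of values can just as easily feed Puppet, Ansible, or a homegrown script; the template is the part that never gets retyped.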

By using APIs and other input methods, Aruba and other companies are hoping that we can build tools that either accept the minimum input necessary to configure switches or that eliminate a large portion of the retyping necessary to build them in the first place. It’s not the first command you type into a switch that kills you. It’s the 45th time you paste the command in. It’s the 68th time you get bored typing the same set of arguments from a remote terminal and accidentally mess one up so badly that it requires a physical presence on site to reset your mistake.
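And here’s the same idea pointed at an API-driven switch, sketched against a purely hypothetical REST endpoint. The URL, payload, and authentication are placeholders, not the actual ArubaOS-CX API or any other vendor’s; the tool takes the minimum input necessary and the switch does the rest, so there is no 45th paste to fat-finger.

```python
# A sketch of pushing a VLAN through a REST API instead of a CLI session.
# The endpoint, payload shape, and credentials are invented placeholders.
import requests

def create_vlan(switch_ip: str, token: str, vlan_id: int, name: str) -> None:
    """Create a VLAN with the minimum input necessary."""
    resp = requests.post(
        f"https://{switch_ip}/api/v1/vlans",         # hypothetical endpoint
        json={"id": vlan_id, "name": name},
        headers={"Authorization": f"Bearer {token}"},
        verify=False,                                # lab shortcut; use real certs in production
        timeout=10,
    )
    resp.raise_for_status()

# Push the same VLAN to a whole closet of switches without retyping anything.
for ip in ["10.0.0.2", "10.0.0.3", "10.0.0.4"]:
    create_vlan(ip, token="EXAMPLE-TOKEN", vlan_id=10, name="users")
```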

Typing is boring, error prone, and costs significant time for little gain. Building scripts, programs, and platforms that take care of all that messy input for us makes us more productive. But it also changes the way we look at systems.

Bird’s Eye Views

The other reason why my fondness for keyboard jockeying isn’t as great as it could be is because of the way that my perspective has shifted thanks to the new aspects of networking technology that I focus on. I tell people that I’m less of an engineer now and more of an architect. I see how the technologies fit together. I see why they need to complement each other. I may not be able to configure a virtual link without documentation or turn up a storage LUN like I used to, but I understand why flash SSDs are important and how APIs are going to change things.

This goes all the way back to my conversations at VMunderground years ago about shifting the focus of networking and where people will go. You remember? The “ditch digger” discussion?


This is more true now than ever before. There are always going to be people racking and stacking. Or doing basic types of configuration. These folks are usually trained with basic knowledge of their task and have no vision outside of their job role. Networking apprentices or journeymen, as the case may be. Maybe one out of ten or one out of twenty of them is going to want to move up to something bigger or better.

But for the people that read blogs like this regularly the shift has happened. We don’t think in single switches or routers. We don’t worry about a single access point in a closet. We think in terms of systems. We configure routing protocols across multiple systems. We don’t worry about a single port VLAN issue. Instead, we’re trying to configure layer 2 DCI extensions or bring racks and pods online at the same time. Our visibility matters more than our typing skills.

That’s why the next wave of devices like the Aruba 8400 and the Software Defined Access things coming from Cisco are more important than simple checkboxes on a feature sheet. They remove the visibility of protocols and products and instead give us platforms that need to be configured for maximum effect. The gap between the people that “rack and stack” and those that build the architecture that runs the organization has grown, but only because the middle ground of administration is changing so fast that it’s tough to keep up.


Tom’s Take

If I were to change jobs tomorrow I’m sure that I could get back in the saddle with a couple of weeks of hard study. But the question I keep asking myself is “Why would I want to?” I’ve learned that my value doesn’t come from my typing speed or my encyclopedia of networking command arguments any more. It comes from a greater knowledge of making networking work better and integrate more tightly into the organization. I’m a resource, not a reactionary. And so when I look to what I would end up doing in a new role I see myself learning more and more about Python and automation and less about what new features were added in the latest OSPF release on Cisco IOS. Because knowing how to integrate technology at a high level is more valuable to everyone than just knowing the commands to type to turn the lights on.

Changing The Baby With The Bathwater In IT

If you’re sitting in a presentation about the “new IT”, there’s bound to be a guest speaker talking about their digital transformation or service provider shift in their organization. You can see this coming. It’s a polished speaker, usually a CIO or VP. They talk about how, with the help of the vendor on stage with them, they were able to rapidly transform their infrastructure into something modern while at the same time changing processes to accommodate faster IT response and more productive workers, and to increase revenue or transform IT from a cost center to a profit center. The key components are simple:

  1. Buy new infrastructure from $vendor
  2. Transform all processes to be more agile, productive, and better.

Why do those things always happen in concert?

Spring Cleaning

Infrastructure grows old. That’s a fact of life. Outside of some very specialized hardware, no one is using the same desktop they had ten years ago. No enterprise is still running Windows 2000 server on an IBM NetFinity server. No one is still using 10Mbps Ethernet over Thinnet to connect their offices. Hardware marches on. So when we buy new things, we as technology professionals need to find a way to integrate them into our existing technology stack.

Processes, on the other hand, are very slow to change. I can remember dealing with process issues when I was an intern for IBM many, many years ago. The process we had for deploying a new workstation had many, many reboots involved. The deployment team worked out a new strategy to streamline deployments and make things run faster. We brought our plan to the head of deployments. From there, we had to:

  • Run tests to prove that it was faster
  • Verify that the process wasn’t compromised in any way
  • Type up new procedures in formal language to match the existing docs
  • Then submit them for ISO approval

And when all those conditions were met, we could finally start using our process. All in all, with aggressive testing, it still took two months.

Processes are things that are thought to be carved in stone, never to be modified or changed in any way for the rest of time. Unless the stones break or something major causes a process change. Usually, that major change is a whole truckload of new equipment showing up on the back dock attached to a consultant telling IT there is a better way (TM) to do things.

Ceteris Paribus

Ceteris Paribus is a Latin term that means “all else unchanged”. We use it when we talk about having multiple variables in an equation and the need to keep them constant to be able to measure changes appropriately.

The funny thing about all these transformations is that it’s hard to track what actually made improvements when you’re changing so many things at once. If the new hardware is three or four times faster than your old equipment, would it show that much improvement if you just used your old software and processes on it? How much faster could your workloads execute with new CPUs and memory management techniques? How about collapsing your virtual infrastructure onto fewer and fewer physical servers because of advances there? Running old processes on new hardware can give you a very good idea of how good the hardware is. Does it meet the criteria for selection that you wanted when it was purchased? Or, better still, does it seem like you’re not getting the performance you paid for?
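As a back-of-the-napkin sketch of that measurement problem, here’s what isolating the hardware variable looks like with some invented runtimes. The numbers are made up purely for the arithmetic; the useful observation is that without the “old process on new hardware” data point, you can’t say which change bought you what.

```python
# Invented runtimes (hours) for the same workload, used only to show how
# holding one variable constant lets you attribute the improvement.
old_hw_old_process = 10.0   # the baseline you already have
new_hw_old_process = 4.0    # only the hardware changed
new_hw_new_process = 2.5    # hardware and process both changed

hardware_speedup = old_hw_old_process / new_hw_old_process   # 2.5x from hardware alone
process_speedup = new_hw_old_process / new_hw_new_process    # 1.6x from process, on equal hardware
combined = old_hw_old_process / new_hw_new_process           # 4.0x total

print(f"Hardware alone: {hardware_speedup:.1f}x")
print(f"Process alone:  {process_speedup:.1f}x")
print(f"Combined:       {combined:.1f}x")
```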

Likewise, how are you able to know for sure that the organization and process changes you implemented actually did anything? If you’re implementing them on new hardware how can you capture the impact? There’s no rule that says that new processes can only be implemented on new shiny hardware. Take a look at what Walmart is doing with OpenStack. They most certainly aren’t rushing out to buy tons and tons of new servers just for OpenStack integration. Instead, they are taking streamlined processes and implementing them on existing infrastructure to see the benefits. Then it’s easy to measure and say how much hardware you need to expand instead of overbuying for the process changes you make.


Tom’s Take

So, why do these two changes always seem to track with each other? The optimist in me wants to believe that it’s people deciding to make positive changes all at once to pull their organization into the future. Since any installation is disruptive, it’s better to take the huge disruption and retrain for the massive benefits down the road. It’s a rosy picture indeed.

The pessimist in me wonders if all these massive changes aren’t somehow tied to the fact that they always come with massive new hardware purchases from vendors. I would hope there isn’t someone behind the scenes with the ear of the CIO pushing massive changes in organization and processes for the sake of numbers. I would also sincerely hope that the idea isn’t to make huge organizational disruptions for the sake of “reducing overhead” or “helping tell the world your story” or, worse yet, “making our product look good because you did such a great job with all these changes”.

The optimist in me is hoping for the best. But the pessimist in me wonders if reality is a bit less rosy.

Extreme-ly Interesting Times In Networking

If you’re a fan of Extreme Networks, the last few months have been pretty exciting for you. Just yesterday, it was announced that Extreme is buying the data center networking business of Brocade for $55 million once the Broadcom acquisition happens. Combined with the $100 million acquisition of Avaya’s campus networking portfolio on March 7th and the purchase of Zebra Wireless (nee Motorola) last September, Extreme is pushing itself into the market as a major player. How is that going to impact the landscape?

Building A Better Business

Extreme has been a player in the wireless space for a while. Their acquisition of Enterasys helped vault them into the mix with other big wireless players. Now, the rounding out of the portfolio helps them compete across the board. They aren’t just limited to playing with stadium wifi and campus technologies now. The campus networking story that was brought in through Avaya was a must to help them compete with Aruba, a Hewlett Packard Enterprise company. Aruba owns the assets of HPE’s campus networking business and has been leveraging them effectively.

The data center play was an interesting one to say the least. I’ve mused recently that Brocade’s data center business may end up lying fallow once Arris grabbed Ruckus. Brocade had some credibility in very large networks through VCS and the MLX router series, but outside of the education market and specialized SDN deployments it was rare to encounter them. Arista has really dug into Cisco’s market share here and the rest of the players seem to be content to wait out that battle. Juniper is back in the carrier business, and the rest seem to be focusing now on OCP and the pieces that flow logically from that, such as Six Pack, Backpack, and Whatever Facebook Thinks The Next Fast Switch Should Be Called That Ends In “Pack”.

Seeing Extreme come from nowhere to snap up the data center line from Brocade signals a new entrant into the data center crowd. Imagine, if you will, a mosh pit. Lots of people fighting for their own space to do their thing. Two people in the middle have decided to have an all-out fight over their space. Meanwhile, everyone else is standing around watching them. Finally, a new person enters the void of battle to do their thing on the side away from the fistfight that has captured everyone’s attention. This is where Extreme finds itself now.

Not Too Extreme

The key for Extreme now is to tell the “Full Stack” story to customers. Whereas before they had to hand off the high end to another “frenemy” and hope that it didn’t come back to bite them, now Extreme can sell all the way up and down the stack. They have some interesting ideas about SDN that will bear some watching as they begin to build them into their stack. The integration of VCS into their portfolio will take some time, as the way that Brocade does their fabric implementation is a bit different than the rest of the world.

This is also a warning call for the rest of the industry. It’s time to get off the sidelines and choose your position. Arista and Cisco won’t be fighting forever. Cisco is also reportedly looking to create a new OS to bring some functionality to older devices. That means they can continue to innovate while fighting against their competitors. The winner of the Cisco and Arista battle is inconsequential to the rest of the industry right now. Either Arista will be wiped off the map and a stronger Cisco will pick a new enemy, or Arista will hurt Cisco and pull even with them in the data center market, leaving more market share for others to gobble up.

Extreme stands a very good chance of picking up customers with their approach. Customers that wouldn’t have considered them in the past will be lining up to see how Avaya campus gear will integrate with Enterasys wireless and Brocade data center gear. It’s not all that different from the hodge-podge approach that many companies have taken for years to lower costs and avoid having a single-vendor solution. Now, those lower-cost options are available in a single line of purple boxes.


Tom’s Take

Who knew we were going to get a new entrant into the Networking Wars for the tidy sum of $155 million? It feels like it should have cost more than that, but given the number of companies holding fire sales to get rid of things they have to divest before a pending acquisition or dissolution, it really doesn’t come as much of a surprise. Someone had to buy these pieces and put them together. I think Extreme is going to turn some heads and make for some interesting conversations in the next few months. Don’t count them out just yet.

HPE Networking: Past, Present, and Future


I had the chance to attend HPE Discover last week by invitation from their influencer team. I wanted to see how HPE Networking had been getting along since the acquisition of Aruba Networks last year. There have been some moves and changes, including a new partnership with Arista Networks announced in September. What follows is my analysis of HPE’s Networking portfolio after HPE Discover London and where they are headed in the future.

Campus and Data Center Divisions

Recently, HPE reorganized their networking division along two different lines. The first is the Aruba brand that contains all the wireless assets along with the campus networking portfolio. This is where the campus belongs. The edge of the network is an ever-changing area where connectivity is king. Reallocating the campus assets to the capable Aruba team means that they will do the most good there.

The rest of the data center networking assets were loaded into the Data Center Infrastructure Group (DCIG). This group is headed up by Dominick Wilde and contains things like FlexFabric and Altoline. The partnership with Arista rounds out the rest of the switch portfolio. This helps HPE position their offerings across a wide range of potential clients, from existing data center infrastructure to newer cloud-ready shops focusing on DevOps and rapid application development.

After hearing Dom Wilde speak to us about the networking portfolio goals, I think I can see where HPE is headed going forward.

The Past: HPE FlexFabric

As Dom Wilde said during our session, “I have a market for FlexFabric and can sell it for the next ten years.” FlexFabric represents traditional data center networking. There is a huge market of existing infrastructure among customers that have made a big investment in HPE in the past. Dom is absolutely right when he says the market for FlexFabric isn’t going to shrink in the foreseeable future. Even though the migration to the cloud is underway, there are a significant number of existing applications that will never be cloud ready.

FlexFabric represents the market segment that will persist on existing solutions until a rewrite of critical applications can be undertaken to get them moved to the cloud. Think of FlexFabric as the vaunted buggy whip manufacturer. They may be the last one left, but for the people that need their products they are the only option in town. DCIG may have eyes on the future, but that plan will be financed by FlexFabric.

The Present: HPE Altoline

Altoline is where HPE has been pouring their research for the past year. Altoline is a product line that benefits from the latest in software-defined and webscale technologies. It utilizes OpenSwitch as its operating system. HPE initially developed OpenSwitch as an open, vendor-neutral platform before turning it over to the Linux Foundation this summer to continue development with a variety of different partners.

Dom brought up a couple of great use cases for Altoline during our discussion that struck me as brilliant. One of them was using it as an out-of-band monitoring solution. These switches don’t need to be big or redundant. They need to have ports and a management interface. They don’t need complexity. They need simplicity. That’s where Altoline comes into play. It’s never going to be as complex as FlexFabric or as programmable as Arista. But it doesn’t have to be. In a workshop full of table saws and drill presses, Altoline is a basic screwdriver. It’s a tool you can count on to get the easy jobs done in a pinch.

The Future: Arista

The Arista partnership, according to Dom Wilde, is all about getting ready for the cloud. For those customers that are looking at moving workloads to the cloud or creating a hybrid environment, Arista is the perfect choice. All of Arista’s recent solution sets have been focused on providing high-speed, programmable networking that can integrate a number of development tools. EOS is the most extensible operating system on the market and is a favorite for developers. Positioning Arista at the top of the food chain is a great play for customers that don’t have a huge investment in cloud-ready networking right now.

The question that I keep coming back to is…when does this Arista partnership become an acquisition? There is a significant integration between the two companies. Arista has essentially displaced the top of the line for HPE. How long will it take for Arista to make the partnership more permanent? I can easily foresee HPE making a play for the potential revenues produced by Arista and the help they provide moving things to the cloud.


Tom’s Take

I was the only networking person at HPE Discover this year because the HPE networking story has been simplified quite a bit. On the one hand, you have the campus tied up with Aruba. They have their own story to tell in a different area early next year. On the other hand, you have the simplification of the portfolio with DCIG and the inclusion of the Arista partnership. I think that Altoline is going to find a niche for specific use cases but will never really take off as a separate platform. FlexFabric is in maintenance mode as far as development is concerned. It may get faster, but it isn’t likely to get smarter. Not that it really needs to. FlexFabric will support legacy architecture. The real path forward is Arista and all the flexibility it represents. The question is whether HPE will try to make Arista a business unit before Arista takes off and becomes too expensive to buy.

Disclaimer

I was an invited guest of HPE for HPE Discover London. They paid for my travel and lodging costs as well as covering event transportation and meals. They did not ask for nor were they promised any kind of consideration in the coverage provided here. The opinions and analysis contained in this article represent my thoughts alone.

Nutanix and Plexxi – An Affinity to Converge


Nutanix has been lighting the hyperconverged world on fire as of late. Strong sales led to a big IPO for their stock. They are in a lot of conversations about using their solution in place of large traditional virtualization offerings that include things like blade servers or big boxes. And even coming off the recent Nutanix .NEXT conference there were some big announcements in the networking arena to help them complete their total solution. However, I think Nutanix is missing a big opportunity that’s right in front of them.

I think it’s time for Nutanix to buy Plexxi.

Software Says

If you look at the Nutanix announcements around networking from .NEXT, they look very familiar to anyone in the server space. The highlights include service chaining, microsegmentation, and monitoring all accessible through an API. If this sounds an awful lot like VMware NSX, Cisco ACI, or any one of a number of new networking companies then you are in the right mode of thinking as far as Nutanix is concerned.

SDN in the server space is all about overlay networking. Segmentation of flows and service chaining are the reason why security is so hard to do in the networking space today. Trying to get traffic to behave in a certain way drives networking professionals nuts. Monitoring all of that to ensure that you’re actually doing what you say you’re doing just adds complexity. And the API is the way to do all of that without having to walk down to the data center to console into a switch and learn a new non-Linux CLI command set.
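To make the API point a little more concrete, here’s a rough sketch of what declaring a microsegmentation policy through an API might look like. The endpoint, schema, and credentials are hypothetical placeholders, not Nutanix’s (or anyone else’s) actual API; the takeaway is that the policy becomes a declarative object you POST once, rather than ACL lines typed into each switch.

```python
# Hypothetical example of declaring a segmentation policy via REST
# instead of configuring ACLs box by box. Endpoint and schema are invented.
import requests

policy = {
    "name": "web-to-db-only",
    "scope": "demo-app-tier",       # hypothetical application group
    "rules": [
        {"src": "web-vms", "dst": "db-vms", "port": 5432, "action": "allow"},
        {"src": "any", "dst": "db-vms", "port": "any", "action": "deny"},
    ],
}

resp = requests.post(
    "https://mgmt.example.local/api/policies",   # placeholder URL
    json=policy,
    auth=("admin", "EXAMPLE-PASSWORD"),
    verify=False,                                # lab shortcut only
    timeout=10,
)
resp.raise_for_status()
print("Policy accepted:", resp.status_code)
```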

SDN vendors like VMware and Cisco naturally jumped on these complaints and difficulties in the networking world, and both have offered solutions for them in their products. For Nutanix to have bundled solutions like this into their networking offering is no accident. They are looking to battle VMware head-to-head and need to offer the kind of feature parity that it’s going to take to make medium to large shops shift their focus away from the VMware ecosystem and take a long look at what Nutanix is offering.

In a way, Nutanix and VMware are starting to reinforce the idea that the network isn’t a magical realm of protocols and tricks that make applications work. Instead, it’s a simple transport layer between locations. For instance, Amazon doesn’t rely on the magic of the interstate system to get your packages from the distribution center to your home. Instead, the interstate system is just a transport layer for their shipping overlays – UPS, FedEx, and so on. The overlay is where the real magic is happening.

Nutanix doesn’t care what your network looks like. They can do almost everything on top of it with their overlay protocols. That would seem to suggest that the focus going forward should be to marginalize or outright ignore the lower layers of the network in favor of something that Nutanix has visibility into and can offer control and monitoring of. That’s where the Plexxi play comes into focus.


Affinity for Awesome

Plexxi has long been a company in search of a way to sell what they do best. When I first saw them years ago, they were touting their Affinities idea as a way to build fast pathways between endpoints to provide better performance for applications that naturally talked to each other. This was a great idea back then. But it quickly got overshadowed by the other SDN solutions out there. It even caused Plexxi to go down a slightly different path for a while, looking at other options to compete in a market where they didn’t really have a perfect-fit product.

But the Affinities idea is perfect for hyperconverged solutions. Companies like Nutanix are marketing their solutions as the way to create application-focused compute nodes on-site without the need to mess with the cloud. It’s a scalable solution that will eventually lead to having multiple nodes as your needs expand. Hyperconverged was designed to be consumable per compute unit as opposed to massively scaling out in leaps and bounds.

Plexxi Affinities is just the tip of the iceberg. Plexxi’s networking connectivity also gives Nutanix the ability to build out a high-speed interconnect network with one advantage – noninterference. I’m speaking about what happens when a customer needs to add more networking ports to support this architecture. They need to make a call to their Networking Vendor of Choice. In the case of Cisco, HPE, or others, that call will often involve a conversation about what they’re doing with the new network followed by a sales pitch for their hyperconverged solution or a partner solution that benefits both companies. Nutanix has a reputation for being the disruptor in traditional IT. The more they can keep their traditional competitors out of the conversation, the more likely they are to keep the business into the future.


Tom’s Take

Plexxi is very much a company with an interesting solution in need of a friend. They aren’t big enough to really partner with hyperconverged solutions, and most of the hyperconverged market at this point is either cozy with someone else or not looking to make big purchases. Nutanix has the rebel mentality. They move fast and strike quickly to get their deals done. They don’t take prisoners. They look to make a splash and get people talking. The best way to keep that up is to bundle a real non-software networking component alongside a solution that will make the application owners happy and keep the conversation focused on a single source. That’s how Cisco did it back in the day and how VMware has climbed to the top of the virtualization market.

If Nutanix were to spend some of that nice IPO money on a Plexxi Christmas present, I think 2017 would be the year that Nutanix stops being discussed in hushed whispers and becomes a real force to be reckoned with up and down the stack.

Facebook Wedge 100 – The Future of the Data Center?



Facebook is back in the news again. This time, it’s because of the release of their new Wedge 100 switch into the Open Compute Project (OCP). Wedge was already making headlines when Facebook announced it two years ago. A fast, open-source 40Gig Top-of-Rack (ToR) switch was huge. Now, Facebook is letting everyone in on the fun of a faster Wedge that has been deployed into production at Facebook data centers as well as being offered for sale through Edgecore Networks, which is itself a division of Accton. Accton has been leading the way in the whitebox switching market, and Wedge 100 may be one of the ways it climbs to the top.

Holy Hardware!

Wedge 100 is pretty impressive from the spec sheet. They paid special attention to making sure the modules were expandable, especially for faster CPUs and special purpose devices down the road. That’s possible because Wedge is a highly specialized micro server already. Rather than rearchitecting the guts of the whole thing, Facebook kept the CPU and the monitoring stack and just put newer, faster modules on it to ramp to 32x100Gig connectivity.


As you might suspect, Facebook is using Broadcom Tomahawk as the base connectivity in their switch, which isn’t surprising. Tomahawk is the roadmap for all vendors to get to 100Gig. It also means that the downlink connectivity for these switches could conceivably work in 25/50Gig increments. However, given the enormous amount of east/west traffic that Facebook must generate, Facebook has created a server platform they call Yosemite that has 100Gig links as well. Given the probable backplane there, you can imagine the data that’s getting thrown around the data centers.

That’s not all. Omar Baldonado has said that they are looking at going to 400Gig connectivity soon. That’s the kind of mind-blowing speed that you see in places like Google and Facebook. Remember that this hardware is built for a specific purpose. They don’t just have elephant flows. They have flows the size of an elephant herd. That’s why they fret about the operating temperature of optics or the rack design they want to use (standard versus Open Racks). Because every little change matters a thousandfold at that scale.

Software For The People

The other exciting announcement from Facebook was on the software front. Of course, FBOSS has been updated to work with Wedge 100. I found it very interesting in the press release that much of the programming in FBOSS went into interoperability with Wedge 40 and with fixing the hardware side of things. This makes some sense when you realize that Facebook didn’t need to spend a lot of time making Wedge 40 interoperate with anything, since it was a wholesale replacement. But Wedge 100 would need to coexist with Wedge 40 as the rollout happens, so making everything play nice is a huge point on the checklist.

The other software announcement that got the community talking was support for third-party operating systems running on Wedge 100. The first one up was Open Network Linux from Big Switch Networks. ONL ran on the original Wedge 40 and now runs on the Wedge 100. This means that if you’re familiar with running BSN OSes on your devices, you can drop a Wedge 100 into your spine or fabric and be ready to go.

The second exciting announcement about software comes from a new company, Apstra. Apstra announced their entry into OCP and their intent to get their Apstra Operating System (AOS) running on Wedge 100 by next year. That has a big potential impact for Apstra customers that want to deploy these switches down the road. I hope to hear more about this from Apstra during their presentation at Networking Field Day 13 next month.


Tom’s Take

Facebook is blazing a trail for fast ToR switches. They’ve got the technical chops to build what they need and release the designs to the rest of the world to be used for a variety of ideas. Granted, your data center looks nothing like Facebook. But the ideas they are pioneering are having an impact down the line. If Open Rack catches on you may see different ideas in data center standardization. If the Six Pack catches on as a new chassis concept, it’s going to change spines as well.

If you want to get your hands dirty with Wedge, build a new 100Gig pod and buy one from Edgecore. The downlinks can break out into 10Gig and 25Gig links for servers, and knowing it can run ONL or Apstra AOS (eventually) gives you some familiar ground to start from. If it runs as fast as they say it does, it may be a better investment right now than waiting for Tomahawk II to come to your favorite vendor.