We Are Number Two!


In my old IT life I once took a meeting with a networking company. They were trying to sell me on their hardware and get me to partner with them as a reseller. They were telling me how they were the number two switching vendor in the world by port count. I thought that was a pretty bold claim, especially when I didn’t remember seeing their switches at any of my deployments. When I challenged this assertion, I was told, “Well, we’re really big in Europe.” Before I could stop my mouth from working, I sarcastically replied, “So is David Hasselhoff.” Needless to say, we didn’t take this vendor on as a partner.

I tell this story often when I go to conferences, and it always gets laughs. The more I think about it, though, the more it dawns on me that I have never really met the third-best networking vendor in the market. We all know who number one is right now. Cisco has a huge market share, and even though it has eroded somewhat in the past few years, the company still holds a comfortable lead over its competitors. The step down into the next tier of vendors is where the conversation starts getting murky.

Who’s Number Two???

If you’ve watched a professional sporting event in the last few years, you’ve probably seen an unknown player in close-up followed by an oddly specific statistic. Like Bucky has the most home runs by a left-handed batter born in July against pitchers that thought The Matrix should have won an Oscar. When these strange stats are quoted, the subject is almost always the best or second best. While most of these stats are quoted by color announcers trying to fill airtime, it speaks to a larger issue. Especially in networking.

No one wants the third best anything. John Chambers is famous for saying during every keynote, “If Cisco can’t be number one or number two in a market we won’t be in it.” That’s easy to say when you’re the best. But how about the market down from there? Everyone is trying to position themselves as the next best option in some way or another. Number two by port counts. Number two by ports sold (which is somehow a different metric). Number two by units shipped or amount sold or some other way of phrasing things slightly differently in order to be seen as the viable alternative.

I don’t see this problem a lot in other areas. Servers have a clear first, second, and third order. Storage has a lot of tiering. Networking, on the other hand, doesn’t have a third best option. Yet there are way more than three companies doing networking. Brocade, HPE, Extreme, Dell, Cumulus Networks, and many more if you want to count wireless companies with switching gear for the purpose of getting into the Magic Quadrant. No one wants to be seen as the next best, next best thing.

How can we fix this? Well, for one thing, we need impartial rankings. No more magical polygons and reports that take seventeen pages to say “It depends”. There are too many product categories you can slice your solution into just to claim you’re the best. Let’s get back to the easy things. Switches are campus or data center. Routers are campus or service provider. Hardware falls here or there. No unicorns. No single-product categories. If you’re the only product of your kind, you are in the wrong place.


Tom’s Take

Once again, I think it’s time for a networking Consumer Reports type of publication. Or maybe something like Storage Review. We need someone to come in and tell vendors that they are the third or fourth best option out there by a few simple metrics. Yes, it’s very probable that said vendors would just ignore the ranking and continue building their own marketing bubble around the idea of being the second best switching platform for orange switches sold to school districts not named after presidents. Or perhaps finding out they are behind the others will spur people inside the company into actually getting better. It’s a faint hope, but hey. The third time’s a charm.

The Tortoise and the Austin Hare


Dell announced today the release of their newest network operating system, OS10 (note the lack of an X). This is an OS that is slated to build on the success that Dell has had selling third-party solutions from vendors like Cumulus Networks and Big Switch. OS10’s base will be built on an unmodified Debian distro, with a “premium” feature set layering on layer 2 and layer 3 functionality. The aim of having a fully open-source base OS in the networking space is lofty indeed, but the bigger question to me is “what happens to Cumulus?”

Storm Clouds

As of right this moment, before the release of Dell OS10, the only way to buy Linux on a Dell switch is to purchase it with Cumulus. In the coming months, Dell will begin to phase in OS10 as an option in parallel with Cumulus. This is especially attractive to large environments that are running custom-built networking today. If your enterprise is running Quagga or sFlow or some other software that has been tweaked to meet your unique needs, you don’t really need a ton of features wrapped in an OS with a CLI you will barely use.

So why introduce an OS that directly competes with your partners? Why upset that apple cart? It comes down to licenses. Every time someone buys a Dell data center switch, they have to pick an OS to run on it. You can’t just order naked hardware and install your own custom kernel and apps. You have to order some kind of software. When you look at the drop-down menu today, you can choose from FTOS, Cumulus, or Big Switch. For the kinds of environments that are going to erase and reload anyway, the choice is pointless. It boils down to the cheapest option. But what about those customers that choose Cumulus because it’s Linux?

Customers want Linux because they can customize it to their heart’s content. They need access to the switching hardware and other system components. So long as the interface is somewhat familiar, they don’t really care what engine is driving it. But every time a customer orders a switch today with Cumulus as the OS option, Dell has to pay Cumulus for that software license. It costs Dell money to ship a Dell switch running a Linux that Dell didn’t write.

OS10 erases that fee. By ordering a base image that can only boot and address hardware, Dell puts a clean box in the hands of developers that are going to be hacking the system anyway. When the new feature sets are released later in the year that increase the functionality of OS10, you will likely see more customers beginning to experiment with running Linux development environments. You’ll also see Dell beginning to embrace a model that loads features on a switch as software modules instead of purpose-built appliances.

Silver Lining

Dell’s future is in Linux. Rebuilding their OS from the ground up to utilize Linux only makes sense given industry trends. Junos, EOS, and other OSes from upstarts like Pluribus and Big Switch are all based on Linux or BSD. Reinventing the wheel makes very little sense there. But utilizing the Switch Abstraction Interface (SAI) developed for the Open Compute Project gives Dell an edge: they can focus on northbound feature development while leaving the gory details of addressing the hardware to the abstraction layer below.
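To illustrate what that buys Dell, here is a tiny Python sketch of the abstraction idea. The real SAI is a C interface with its own object model, so treat the class and method names below as invented stand-ins rather than actual SAI calls.

```python
from abc import ABC, abstractmethod

class SwitchAsic(ABC):
    """Invented stand-in for a SAI-style hardware abstraction layer."""

    @abstractmethod
    def create_vlan(self, vlan_id: int) -> None: ...

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...

class ExampleAsicBackend(SwitchAsic):
    """One vendor's ASIC driver; another vendor would supply its own."""

    def create_vlan(self, vlan_id: int) -> None:
        print(f"[asic] programming VLAN {vlan_id} into hardware tables")

    def add_route(self, prefix: str, next_hop: str) -> None:
        print(f"[asic] installing {prefix} via {next_hop} in the FIB")

def provision_tenant(asic: SwitchAsic, vlan_id: int, prefix: str, gw: str) -> None:
    # Northbound feature code only ever talks to the abstraction, so it
    # runs unchanged on any silicon that ships a conforming backend.
    asic.create_vlan(vlan_id)
    asic.add_route(prefix, gw)

provision_tenant(ExampleAsicBackend(), vlan_id=100, prefix="10.1.0.0/24", gw="10.0.0.1")
```

The point isn’t the Python. It’s that the OS vendor gets to write the feature code once and let the silicon vendors own everything below the line.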

Dell isn’t going to cannibalize their Cumulus partnership immediately. There are still a large number of shops running Cumulus that are going to want support from their vendor of choice in the coming months. Also, there are a large number of Dell customers that aren’t ready to radically disaggregate hardware from software. Those customers will require some monitoring, as they are likely to buy the cheapest option as opposed to the best fit and wind up with a switch that will boot and do little else to solve network problems.

In the long term, Cumulus will continue to be a fit for Dell as long as OS10 isn’t ported to the Campus LAN. Once that occurs, you will likely see a distancing of these two partners as Dell embraces their own Linux OS options and Cumulus moves on to focus on using whitebox hardware instead of bundling themselves with existing vendors. Once the support contracts expire on the Cumulus systems supported by Dell, I would expect to see a professional services offering to help those users of Cumulus-on-Dell migrate to a “truly open and unmodified kernel”.


Tom’s Take

Dell is making strides in opening up their networking with Linux and open source components. Juniper has been doing it forever, and HP recently jumped into the fray with OpenSwitch. Making yourself open doesn’t solve your problems or conjure customers out of thin air. But it does give you a story to tell about your goals and your direction. Dell needs to keep their Cumulus partnerships going forward until they can achieve feature parity with the OS that currently runs on their data center switches. After that happens and the migration plans are in place, expect to see a bit of jousting between the two partners about which approach is best. Time will tell who wins that argument.


SDN Myths Revisited


I had a great time at TECHunplugged a couple of weeks ago. I learned a lot about emerging topics in technology, including a great talk about the death of disk from Chris Mellor of The Register. All in all, it was a great event. Even with a presentation from the token (ring) networking guy.

I had a great time talking about SDN myths and truths and doing some investigation behind the scenes. What we see and hear about SDN is only a small part of what people think about it.

SDN Myths

Myths emerge because people can’t understand or won’t understand something. Myths perpetuate because they are larger than life. Lumberjacks and blue oxen clearing forests. Cowboys roping tornadoes. That kind of thing. With technology, those myths exist because people don’t want to believe reality.

SDN is going to take the jobs of people that can’t face the reality that technology changes rapidly. There is a segment of the tech worker populace that just moves from new job to new job doing the same old things. We leave technology behind all the time without a care in the world. But we worry when people can’t work on that technology.

I want you to put your hands on a floppy disk. Go on, I’ll wait. Not so easy, is it? Removable disk technology is on the way out the door. Not just magnetic disk either. I had a hard time finding a CD-ROM drive the other day to read an old disc with some pictures. I’ve taken to downloading digital copies of films because my kids don’t like operating a DVD player any longer. We don’t mourn the passing of disks, we celebrate it.

Look at COBOL. It’s a venerable programming language that still runs a large percentage of insurance agency computer systems. It’s safe to say that the amount of money it would cost to migrate away from COBOL to something relatively modern would be in the millions, if not billions, of dollars. Much easier to take a green programmer and teach them an all-but-dead language and pay them several thousand dollars to maintain this out-of-date system.

It’s like the old story of buggy whip manufacturers. There’s still a market for them out there. Not as big as it was before the introduction of the automobile. But it’s there. You probably can’t break into that market, and you had better be very good (or really cheap) at making them if you want to get a job doing it. The job that a new technology replaced is still available for those that need that technology to work. But most of the rest of society has moved on, and the old technology fills a niche role.

SDN Truths

I wasn’t kidding when I said that Gartner not having an SDN quadrant was the smartest thing they ever did (aside from the shot at stretched layer 2 DCI). I say this because it will finally force customers to stop asking for a magic bullet SDN solution and it will force traditional networking vendors to stop packaging a bunch of crap and selling it as a magic bullet.

When SDN becomes a part of the entire solution and not some mystical hammer that fixes all the nails in your environment, then the real transformation can happen. Then people that are obstructing real change can be marginalized and removed. And the technology can be the driver for advancement instead of someone coming down the hall complaining about things not working.

We spend so much time reacting to problems that we’ve forgotten how to solve them for good. We’re not being malicious. We just can’t get past the triage. That’s the heart of the fire fighter problem. Ivan wrote a great response to my fire fighter post, and his points were spot on. Especially the ones about people standing in the way, whether it be through outright obstruction or by taking away the power to effect real change. We can’t hold networking people responsible for the architecture and simultaneously keep them from solving the root issues. That’s the ham-handed kind of organizational roadblock that needs to change to move networking forward.


Tom’s Take

Talks like this don’t happen overnight. They take careful planning and thought, followed by panic when you realize your 45-minute talk is actually 20 minutes. So you cut out the boring stuff and get right to the meat of the issue. In this case, that meat is the continued misperception of SDN no matter how much education we throw at the networking community. We’re not going to end up jobless programmers being lied to by silver-tongued marketing wonks. But we are going to have to face the need for organizational change and process reevaluation on a scale that will take months, if not years, to implement correctly. And then do it all over again as technology evolves to fit the new mold we created when we broke the old one.

I would rather see the easy money flee to a new startup slot machine and all of the fair weather professionals move on to a new career in whatever is the hot new thing. That means those of us left behind in the newly-transformed traditional networking space will be grizzled veterans willing to learn and implement the changes we need to make to stop being blamed for the problems of IT and be a model for how it should be run. That’s a future to look forward to.


The Light On The Fiber Mountain


Fabric switching systems have been a popular solution for many companies in the past few years. Juniper has QFabric and Brocade has VCS. For those not invested in fabrics, the trend has been to collapse the traditional three-tier network model down into a spine-leaf architecture to optimize east-west traffic flows. One must wonder how much more optimized that solution can be. As it turns out, there is a bit more that can be coaxed out of it.

Shine A Light On Me

During Interop, I had a chance to speak with the folks over at Fiber Mountain (@FiberMountain) about what they’ve been up to in their solution space. I had heard about their revolutionary SDN offering for fiber. At first, I was a bit doubtful. SDN gets slapped onto a lot of new technology as a way to sell it to people that buy buzzwords. I wondered how a fiber networking solution could even take advantage of software.

My chat with M. H. Raza started out with a prop. He showed me one of the new Multifiber Push On (MPO) connectors that represent the new wave of high-density fiber. Each cable, which is roughly the size and shape of a SATA cable, contains 12 or 24 fiber connections. These are very small and pre-configured in a standardized connector. This connector can plug into a server network card and provide several light paths to a server. This connector and the fibers it terminates are the building block for Fiber Mountain’s solution.

With so many fibers running to a server, Fiber Mountain can use their software intelligence to start doing interesting things. They can begin to build dedicated traffic lanes for applications and other traffic by isolating that traffic onto fibers already terminated on a server. The connectivity already exists on the server. Fiber Mountain just takes advantage of it. It feels very similar to the way we add additional gigabit network ports when we need to expand things like VMkernel ports or dedicated traffic lanes for other data.

Quilting Circle

Where this solution starts looking more like a fabric is what happens when you put Fiber Mountain Optical Exchange devices in the middle. These switching devices act like aggregation points in the “spine” of the network. They can aggregate fibers from top-of-rack switches or from individual servers. These exchanges tag each incoming fiber and add it to the Alpine Orchestration System (AOS), which keeps track of the connections just like the interconnections in a fabric.

Once AOS knows about all the connections in the system, you can use it to start building pathways for east-west traffic flows. You can ensure that traffic between a web server and a backend database has dedicated connectivity. You can add additional resources between systems that are currently engaged in heavy processing. You can also dedicate traffic lanes for backup jobs. You can do quite a bit from the AOS console.

Now you have a layer 1 switching fabric without any additional pieces in the middle. The exchanges function almost like passthrough devices. The brains of the system exist in AOS. Remember when Ivan Pepelnjak (@IOSHints) spent all his time pulling QFabric apart to find out what made it tick? The Fiber Mountain solution doesn’t use BGP or MPLS or any other magic protocol sauce. It runs at layer 1. The light paths are programmed by AOS and the packets are switched across the dense fiber connections. It’s almost elegant in its simplicity.

Future Illumination

The Fiber Mountain solution has some great promise. Today, most of the operations of the system require manual intervention. You must build out the light paths between servers based on educated guesses. You must manually add additional light paths when extra bandwidth is needed.

Where they can really improve the offering in the future is by adding intelligence to AOS to make those decisions automatically, based on predefined thresholds and inputs. If the system can detect big “elephant” traffic flows and automatically provision more bandwidth, or isolate these high-volume packet generators, it will go a long way toward making things much easier on network admins. It would also be great to feed that “top talker” data into other systems to alert network admins when traffic flows get high and need additional resources.
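If I were sketching that automation myself, it might look like the loop below. Everything here is hypothetical: the REST endpoints, the threshold, and the JSON shape are all invented for illustration, since Fiber Mountain hasn’t published an API like this.

```python
import time
import requests  # talking to an imagined AOS REST API

AOS = "https://aos.example.com/api"   # hypothetical controller endpoint
ELEPHANT_THRESHOLD_GBPS = 8.0         # arbitrary trigger point

def top_flows() -> list[dict]:
    # Imagined call returning [{"src": "...", "dst": "...", "gbps": 9.2}, ...]
    return requests.get(f"{AOS}/flows/top", timeout=5).json()

def add_light_path(src: str, dst: str) -> None:
    # Imagined call asking the orchestrator to light up another fiber pair
    requests.post(f"{AOS}/paths", json={"src": src, "dst": dst}, timeout=5)

while True:
    for flow in top_flows():
        if flow["gbps"] >= ELEPHANT_THRESHOLD_GBPS:
            # Elephant flow detected: provision extra capacity now instead of
            # waiting for someone to notice the congestion tomorrow morning.
            add_light_path(flow["src"], flow["dst"])
    time.sleep(30)
```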


Tom’s Take

I like the Fiber Mountain solution. They’ve built a layer 1 fabric that performs similarly to the ones from Juniper and Brocade. They are taking full advantage of the resources provided by the MPO fiber connectors. By adding a new network card to a server, you can test this system without impacting other traffic flows. Fiber Mountain even told me that they are looking at trial installations for customers to bring their technology in at lower costs as a project to show the value to decision makers.

Fiber Mountain has a great start on building a low-latency fiber fabric with intelligence. I’ll be keeping a close eye on where the technology goes in the future to see how it integrates into the entire network and brings the SDN features we all need in our networks.


Cumulus Networks Could Be The New Microsoft


When I was at HP Discover last December, I noticed a few people running around wearing Cumulus Networks shirts. That had me a bit curious, as Cumulus isn’t usually on the best of terms with traditional networking vendors unless they have a partnership. After some digging, I found out that HP would be announcing a “britebox” branded whitebox switch soon running Cumulus Linux. I wrote a post vaguely hinting about this in as much detail as I dared leak out.

No surprise that HP has formally announced their partnership with Cumulus. This is a great win for HP in the long run, as it gives customers the option to work with an up-and-coming network operating system (NOS) alongside HP support and hardware. Note that the article mentions a hardware manufacturing deal with Accton, but I wouldn’t be at all surprised to learn that Accton had been making a large portion of HP’s switching line already. Just a different sticker on this box.

Written Once, Runs Everywhere

The real winner here is Cumulus. They have partnered with Dell and HP to bring their NOS to some very popular traditional network vendor hardware. Given that they continue to push Cumulus Linux on traditional whitebox hardware, they are positioning themselves the same way that Microsoft did back in the 1980s when the IBM Clone PC market really started to take off.

Microsoft’s master stroke wasn’t building an empire around a GUI. It was creating software that ran on almost every variation of device in the market. That common platform provided consistency for programmers the world over. You never had to worry about what OS was running on an IBM clone. You could be almost certain it was MS-DOS. In fact, that commonality of platform is what enabled Microsoft to build their GUI interface on top. While DOS was eventually phased out in favor of WinNT kernels in Windows, the legacy of DOS still remains on the command line.

Hardware comes and goes every year. Even with device vendors that are very tied to their hardware, like Apple. Look at the hardware differences between the first iPhone and the latest iPhone 6+. They are almost totally alien. Then look at the operating system running on each of them. They are remarkably similar, which is especially amazing given the eight-year gap between them. That consistency of experience has allowed app developers to be comfortable writing apps that will work for more than one generation of hardware.

Bash Brothers

Cumulus is positioning themselves to do something very similar. They are creating a universal NOS interface for switches. Rather than pinning their hopes on the aging Cisco IOS CLI (and avoiding a potential lawsuit in the process), Cumulus has decided to go with Bash. Bash is almost universal for those that work on Linux, and if you’re an old-school UNIX admin, it doesn’t take long to adapt to Bash either. That common platform means that you have a legion of trained engineers and architects that know how to use your system.

More importantly, you have a legion of people that know how to write software to extend your system. You can create Bash scripts and programs to do many things. Cumulus even created ifupdown2 to help network admins simplify network interface administration. If you can extend the interface of a networking device with relative ease, you’ve started unlocking the key to unlimited expansion.
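As a concrete (if hypothetical) example of that kind of extension, here is a short Python sketch that renders an ifupdown2-style stanza into /etc/network/interfaces.d/ and applies it with ifreload -a, which ships with ifupdown2. The helper names and the assumption that the distro sources that directory are mine, not Cumulus documentation.

```python
import subprocess
from pathlib import Path

def interface_stanza(name: str, address: str) -> str:
    """Render a minimal ifupdown2-style stanza for one switch port."""
    return f"auto {name}\niface {name}\n    address {address}\n"

def apply_stanza(name: str, stanza: str) -> None:
    # Drop the config where ifupdown2 can find it, then ask it to reload.
    # Assumes /etc/network/interfaces sources interfaces.d/*.intf files.
    Path(f"/etc/network/interfaces.d/{name}.intf").write_text(stanza)
    subprocess.run(["ifreload", "-a"], check=True)

if __name__ == "__main__":
    apply_stanza("swp1", interface_stanza("swp1", "10.1.1.1/24"))
```

Nothing about that script is special, and that’s the point: it’s the same plumbing any Linux admin already knows.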

Think about the number of appliances you use every day that you never know are running Linux. I said previously that Linux won the server war because it is everywhere now and yet you don’t know it’s Linux. In the same way, I can see Cumulus negotiating to get the software offered as an option for both whitebox and britebox switches in the future. Once that happens, you can start to see the implications. If developers are writing apps and programs to extend Cumulus Linux and not the traditional switch OS, consumers will choose the more extensible option if everything else is equal. That means more demand for Cumulus. Which pours more resources into development. Which is how MS-DOS took over the world and led to Windows domination, while OS/2 died a quiet, protracted death.


Tom’s Take

When I first tweeted my thoughts about Cumulus Networks and their potential rise like the folks in Redmond, there was a lot of pushback. People told me to think of them more like Red Hat instead of Microsoft. While their business model does indeed track more closely with Red Hat, I think much of this pushback comes from the negative connotations we have with Windows. Microsoft has essentially been the only game in the x86 market for a very long time. People forget what it was like to run BeOS or early versions of Slackware. Microsoft had almost total domination outside the hobby market.

Cumulus doesn’t have to unseat Cisco to win. They don’t even have to displace the second or third place vendor. By signing deals with as many people as possible to bring Cumulus Linux to the masses, they will win in the long run by being the foundation for where networking will be going in the future.

Editor Note:  A previous version of this article incorrectly stated that Cumulus created ifupdown, when in fact they created ifupdown2.  Thanks to Matt Stone (@BigMStone) and Todd Craw (@ToddMCraw) for pointing this out to me.

The Packet Flow Duality


Quantum physics is a funny thing. It seeks to solve all the problems in the physical world by breaking everything down into the most basic unit possible. That works for a lot of the observable universe. But when it comes to light, quantum physics has issues. Thanks to experiments and observations, most scientists understand that light isn’t just a wave and it’s not just a collection of particles either. It’s both. This concept is fundamental to understanding how light behaves. But can it also explain how data behaves?

Moving Things Around

We tend to think about data as a series of discrete data units being pushed along a path. While these units might be frames, packets, or datagrams depending on the layer of the OSI model that you are operating at, the result is still the same. A single unit is evaluated for transmission. A brilliant post from Greg Ferro (@EtherealMind) sums up the forwarding thusly:

  • Frames being forwarded by MAC address lookup occur at layer 2 (switching)
  • Packets being forwarded by IP address lookup occur at layer 3 (routing)
  • Data being forwarded at higher levels is a stream of packets (flow forwarding)

It’s simple when you think about it. But what makes it a much deeper idea is that lookups at layers 2 and 3 require a lot more processing. Each of the packets must be evaluated to be properly forwarded. The forwarding device doesn’t assume that the destination is the same for a group of similar packets. Each one must be evaluated to ensure it arrives at the proper location. By focusing on the discrete nature of the data, we are forced to expend a significant amount of energy to make sense of it. As anyone that studied basic packet switching can tell you, several tricks were invented to speed up this process. Anyone remember store-and-forward versus cut-through switching?

Flows behave differently. They contain state. They have information that helps devices make intelligent forwarding decisions. Those decisions don’t have to be limited by destination MAC or IP addresses. They can be labels or VLANs or other pieces of identifying information. They can be anything an application uses to talk to another device, like a DNS entry. It allows us to make a single forwarding decision per flow and implement it quickly and efficiently. Think about a stateful firewall. It works because the information for a given packet stream (or flow) can be programmed into the device. The firewall is no longer examining every individual packet, but instead evaluates the entire group of packets when making decisions.

Consequently, stateful firewalls also give us a peek at how flows are processed. Rather than having a CAM table or an ARP table, we have a group of rules and policies. Those policies can say “given a group of packets in a flow matching these characteristics, execute the following actions”. That’s a far cry from trying to figure out where each individual packet goes.
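A toy Python sketch shows the difference in work. The per-packet path pays for a lookup on every single packet, while the flow path makes the expensive decision once, caches it against the flow’s 5-tuple, and replays it for the rest of the conversation. The table contents are invented, and the “FIB” is simplified to exact /24 matches.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port proto")

# Pretend FIB: longest-prefix match simplified to exact /24 keys.
FIB = {"10.1.1.0/24": "eth1", "10.2.2.0/24": "eth2"}

def per_packet_forward(pkt: Packet) -> str:
    # Every packet pays for a lookup (plus ACLs, QoS, etc. in real hardware).
    key = ".".join(pkt.dst_ip.split(".")[:3]) + ".0/24"
    return FIB[key]

flow_cache: dict[tuple, str] = {}

def per_flow_forward(pkt: Packet) -> str:
    flow = (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.proto)
    if flow not in flow_cache:
        # The expensive decision (routing, policy, firewall state) happens once.
        flow_cache[flow] = per_packet_forward(pkt)
    return flow_cache[flow]

pkt = Packet("10.9.9.5", "10.1.1.20", 49152, 443, "tcp")
print(per_flow_forward(pkt))  # first packet: full lookup, verdict cached
print(per_flow_forward(pkt))  # rest of the flow: cache hit, no per-packet work
```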

It’s All About Scale

A single drop of water is discrete. Just like a single data packet, it represents an atomic unit of water. Taken on its own, a single drop of water does little good. It’s only when those drops start to come together that their usefulness becomes apparent. Think of a river or a firehose. Those groups of droplets have a vector. They can be directed somewhere to accomplish something, like putting out a fire or cutting a channel across the land.

Flows should be the atomic unit that we base our networking decisions upon. Flows don’t require complex processing on a per-unit basis. Flows carry additional information above and beyond a 48-bit hex address or a binary address representing an IP entry. Flows can be manipulated and programmed. They can have policies applied. Flows can scale to great heights. Packets and frames are forever hampered by the behaviors necessary to deliver them to the proper locations.

Data is simultaneously a packet and a flow. We can’t separate the two. What we can do is change our frame of reference for operations. Just like experiments with light, we must choose one aspect of the duality to act until such time as the other aspect is needed. Light can be treated like a wave the majority of the time. It’s only when things like the photoelectric effect happen that our reference must change. In the same way, data should be treated like a flow for the majority of cases. Only when the very basic needs of packet/frame/datagram forwarding are needed should we abandon our flow focus and treat it as a group of discrete packets.


Tom’s Take

The idea of data flows isn’t new. And neither is treating flows as the primary form of forwarding. That’s what OpenFlow has been doing for quite a while now. What makes this exciting is when people with new networking ideas start using the flow as an atomic unit for decisions. When you remove the need to do packet-by-packet forwarding and instead focus on the flow, you gain a huge insight into the world around the packet. It’s not much of a stretch to think that the future of networking isn’t as concerned with the switching of frames or routing of packets. Instead, it’s the forwarding of a flow of packets that will be exciting to watch. As long as you remember that data can be both packet and flow, you will have taken your first step into a larger world of understanding.


More Bang For Your Budget With Whitebox


As whitebox switching starts coming to the forefront of the next buying cycle for enterprises, decision makers are naturally wondering about the advantages of buying cheaper hardware. Is a whitebox switch going to provide more value for me than buying something from an established vendor? Where are the real savings? Is whitebox really for me? One of the answers to this puzzle comes not from the savings in whitebox purchases, but the capability inherent in rapid deployment.

Ten Thousand Spoons

When users are looking at the acquisition cost advantages of buying whitebox switches, they typically don’t see what they would like to see. Ridiculously cheap hardware isn’t the norm. Instead, you see a switch that can be bought for a decent discount. That does take into account that most vendors will give substantial one-time discounts to customers to entice them into more lucrative options like advanced support or professional services.

The purchasing advantage of whitebox doesn’t just come from reduced costs. It comes from additional unit purchases. Purchasing budgets don’t typically spell out that you are allowed to buy ten switches and three firewalls. They more often state that you are allowed to spend a certain dollar amount on devices of a specific type. Savvy shoppers will find deals or discounts to get more for their dollar. The real world of purchasing budgets means that every dollar will be spent, lest the available dollars get reduced next year.

With whitebox, that purchasing power translates into additional units for the same budget amount. If I could buy three switches from Vendor X or five switches from Whitebox Vendor Y, ceteris paribus I would buy the whitebox switches. If the purpose of the purchase was to connect 144 ports, then that means I have two extra switches lying around. Which does seem a bit wasteful.
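To put hypothetical numbers on it: say the budget is $30,000, a 48-port switch from Vendor X lands at $10,000 after discount, and the whitebox equivalent is $6,000. The prices are invented; the arithmetic is the point.

```python
budget = 30_000
ports_needed = 144
ports_per_switch = 48
required = -(-ports_needed // ports_per_switch)  # ceiling division: 3 switches

for name, price in (("Vendor X", 10_000), ("Whitebox Y", 6_000)):
    units = budget // price  # spend the whole budget, as budgets tend to go
    print(f"{name}: {units} switches, {units - required} left over as spares")
# Vendor X: 3 switches, 0 spares.  Whitebox Y: 5 switches, 2 spares.
```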

However, the option of having spares on the shelf becomes very appealing. Networks are supposed to be built in a way to minimize or eliminate downtime from failure. The network must continue to run if a switch dies. But what happens to the dead switch? In most current cases, the switch must be sent in for warranty replacement. Service contracts with large networking vendors give you the option of 4-hour, overnight, or next-business-day replacements. These vendors will even cross-ship you the part. But you are still down the dead switch. If the other half of the redundant pair goes down, you are going to be dead in the water.

With an extra whitebox switch on the shelf you can have a ready replacement. Just slip it into place and let your orchestration and provisioning software do the rest. While the replacement is shipping, you still have redundancy. It also saves you from needing to buy a hugely expensive (and wildly profitable) advanced support contract.

All You Need Is A Knife

Suppose for a moment that we do have these switches sitting around on a shelf doing nothing but waiting for the inevitable failure in the network. From a cost perspective, it’s neutral. I spent the same budget either way, so an unutilized switch is costing me nothing. However, what if I could do something with that switch?

The real advantage of whitebox in this scenario comes from the ability to use non-switching OSes on the hardware. Think for a moment about something like a network packet monitor. In the past, we’ve needed to download specialized software and slip a probing device into the network just for the purposes of packet collection. What if that could be done by a switch? What if the same hardware that is forwarding packets through the network could also be used to monitor them as well?

Imagine creating an operating system that loads through something like ONIE for the purpose of being a network tap. Now, instead of specialized hardware for that purpose, you only need to grab one of the switches you have lying around on the shelf and repurpose it into a sensor. And when it’s served that purpose, you put it back on the shelf and wait until there is a failure before pushing it into production as a replacement. With Chef or Puppet, you could even have the switch boot into a sensor identity for a few days and then provision it back to being a data-forwarding switch afterwards. No need for messy, complicated software images or clever hacks.
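A role flip like that doesn’t need anything exotic underneath. Here is a hedged Python sketch of the glue a Chef or Puppet run might call on the box; the role names, services, and file path are all made up for illustration, and the services are stand-ins for whatever sensor or forwarding software you’d actually run.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical role catalog an orchestration tool might push to the switch.
ROLES = {
    "leaf-switch": ["frr"],      # normal packet forwarding duties
    "packet-sensor": ["zeek"],   # repurposed as a network tap for a few days
}

def assume_role(role: str) -> None:
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    # Record the identity so the next provisioning run knows what this box is.
    Path("/etc/device-role.json").write_text(json.dumps({"role": role}))
    for service in ROLES[role]:
        # systemctl is standard on a systemd-based NOS; the service names
        # above are illustrative, not a claim about what ships on a switch.
        subprocess.run(["systemctl", "restart", service], check=True)

assume_role("packet-sensor")  # borrow the shelf spare as a tap
assume_role("leaf-switch")    # then push it back into forwarding
```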

Now, extend those ideas beyond sensors. Think about generic hardware that could be repurposed for any function. A switch could boot up as an inline firewall. That firewall could be repurposed into a load balancer for the end of the quarter. It could then become a passive IDS during an attack. All without moving. The only limitation is the imagination of the people writing code for the device. It may never top the performance of a device built purely for a given function, but the flexibility of having a device that can serve multiple functions without massive reconfiguration would win out in the long run for many applications. Flexibility matters more than overwhelming performance.


Tom’s Take

Whitebox is still finding a purpose in the enterprise. It’s been embraced by webscale companies, but the value to the enterprise isn’t found in massive capabilities like that. Instead, the additional purchasing power that comes from buying more units for the same dollar amount leads to reduced support contract costs, and even new functionality from existing hardware lying around that can be made to do so many other things. Who could have imagined that a simple switch could be made to do the job of many other purpose-built devices in the data center? Isn’t it ironic, don’t you think?