Plexxi and the Case for Affinity


Our last presentation from Day 2 of Network Field Day 5 came from a relatively new company – Plexxi.  I hadn’t really heard much from them before they signed up to present at NFD5.  All I really knew was that they had been attracting some very high profile talent from the Juniper ranks.  First Dan Backman (@jonahsfo) shortly after NFD4, then Mike Bushong (@mbushong) earlier this year.  One might speculate that when the talent is headed in a certain direction, it might be best to see what’s going on over there.  If only I had known the whole story up front.

Mat Mathews kicked off the presentation with a discussion about Plexxi and what they are doing to differentiate themselves in the SDN space.  It didn’t take long before their first surprise was revealed.  Our old buddy Derick Winkworth (@cloudtoad) emerged from his pond to tell us the story of why he moved from Juniper to Plexxi just the week before.  He’d kept the news of his destination pretty quiet.  I should have guessed he would end up at a cutting edge SDN-focused company like Plexxi.  It’s good to see smart people landing in places that not only make them excited but give them the opportunity to effect lots of change in the emerging market of programmable networking.

Marten Terpstra jumped in next to talk about the gory details of what Plexxi is doing.  In a nutshell, this all boils down to affinity.  Based on a study done by Microsoft in 2009, Plexxi noticed that there are a lot of relationships between applications running in a data center.  Once you’ve identified these relationships, you can start doing things with them.  You can create policies that provide for consistent communications between applications.  You can isolate applications from one another.  You can even ensure which applications get preferential treatment during a network argument.  Now do you see the SDN applications?  Plexxi took the approach that there is more data to be gathered from the applications in the network.  When they looked for it, sure enough it was there.  Now, armed with more information, they could start crafting a response.  What they came up with was the Plexxi Switch.  This is a pretty standard 32-port 10GigE switch with 4 QSFP ports.  Their differentiator is the 40GigE uplinks to the other Plexxi Switches.  Those are used to create a physical ring topology that allows the whole conglomeration to work together to create what looked to me like a virtual mesh network.  Once connected in such a manner, the affinities between the applications running at the edges of the network can begin to be built.

Plexxi has a controller that sits above the bits and bytes and starts constructing the policy-based affinities to allow traffic to go where it needs to go.  It can also set things up so that traffic doesn’t go where it’s not supposed to be, as in the example Simon McCormack gives in the above video.  Even if the machine is moved to a different host in the network via vMotion or Live Migration, the Plexxi controller and network are smart enough to figure out that the machine went somewhere different and that the policy providing for an isolated forwarding path needs to be reimplemented.  That’s one of the nice things about programmatic networking.  The higher-order networking controllers and functions figure out what needs to change in the network and implement the changes either automatically or with a minimum of human effort.  This ensures that the servers don’t come in and muck up the works with things like Dynamic Resource Scheduler (DRS) moves or other unforeseen disasters.  Think about the number of times you’ve seen a VM with an anti-affinity rule that keeps it from being evacuated from a host because there is some sort of dedicated link for compliance or security reasons.  With Plexxi, that can all be done automagically.  Derick even showed off some interesting possibilities around using Python to extend the capabilities of the CLI at the end of the video.
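
To make the affinity idea a little more concrete, here is a rough Python sketch of the model as I understood it.  To be clear, this is not Plexxi’s actual API – every class and method name here is invented for illustration – but it captures the gist: you declare a relationship between workloads, and the controller re-derives the forwarding path whenever one of those workloads moves.

```python
# Hypothetical sketch only - not the Plexxi controller API.
# It models an "affinity" as a named policy between workloads that the
# controller re-applies whenever a workload moves.
from dataclasses import dataclass, field


@dataclass
class Affinity:
    name: str
    members: set              # e.g. {"web-01", "db-01"}
    isolated: bool = False    # keep this traffic on a dedicated path
    priority: int = 0         # preferential treatment under contention


@dataclass
class Controller:
    affinities: list = field(default_factory=list)
    location: dict = field(default_factory=dict)   # vm name -> switch port

    def add_affinity(self, affinity):
        self.affinities.append(affinity)
        self._program_paths(affinity)

    def vm_moved(self, vm, new_port):
        """Called when vMotion/Live Migration lands a VM somewhere new."""
        self.location[vm] = new_port
        for affinity in self.affinities:
            if vm in affinity.members:
                # Re-derive and push the forwarding path for this policy.
                self._program_paths(affinity)

    def _program_paths(self, affinity):
        ports = [self.location.get(m, "unknown") for m in sorted(affinity.members)]
        print(f"programming {affinity.name}: ports={ports} "
              f"isolated={affinity.isolated} priority={affinity.priority}")


ctrl = Controller(location={"web-01": "sw1/eth10", "db-01": "sw2/eth7"})
ctrl.add_affinity(Affinity("web-to-db", {"web-01", "db-01"}, isolated=True, priority=5))
ctrl.vm_moved("db-01", "sw3/eth2")   # the policy follows the workload automatically
```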

If you’d like to learn more about Plexxi, you can check them out at http://www.plexxi.com.  You can also follow them on Twitter as @PlexxiInc.


Tom’s Take

Plexxi has a much different feel than many of the SDN products I’ve seen so far.  That’s probably because they aren’t trying to extend an existing infrastructure with programmability.  Instead, they’ve taken a singular focus around affinity and managed to turn it into something that looks to have some very fascinating applications in today’s data centers.  If you’re going to succeed in the SDN-centric world of today, you either need to be at the front of the race as it is being run today, like Cisco and Juniper, or you need to have a novel approach to the problem.  Plexxi really is looking at this whole thing from the top down.  As I mentioned to a few people afterwards, this feels like someone reimplemented QFabric with a significant amount of flow-based intelligence.  That has some implications for higher order handling that can’t be addressed by a simple fabric forwarding engine.  I will stay tuned to Plexxi down the road.  If nothing else, just for the crazy sock pictures.

Tech Field Day Disclaimer

Plexxi was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, they also gave the delegates a Nerf dart gun and provided us with after-hours refreshments.  At no time did they ask for, nor were they promised any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Additional Coverage of Plexxi and Network Field Day 5

Smart Optical Switching – Your Plexxible Friend – John Herbert

Plexxi Control – Anthony Burke

Cisco Borderless Idol


Day one of Network Field Day 5 (NFD5) included presentations from the Cisco Borderless team. You probably remember their “speed dating” approach at NFD4, which gave us a wealth of information in 15-minute snippets. The only drawback to that lineup is that when you find a product or a technology that interests you, there really isn’t any time to quiz the presenter before they are ushered off stage. Someone must have listened when I said that before, because this time they brought us 20-minute segments – 10 minutes of presentation, 10 minutes of demo. With the switching team, we even got to vote on our favorite to bring back for the next round (hence the title of the post). More on that in a bit.

6500 Quad Supervisor Redundancy

First up on the block was the Catalyst 6500 team. I swear this switch is the Clint Howard of networking, because I see it everywhere. The team wanted to tell us about a new feature available in the ((verify code release)) code on the Supervisor 2T (Sup2T). Previously, the supervisor was capable of performing a couple of unique functions. The first of these was Stateful Switch Over (SSO). During SSO, the redundant supervisor in the chassis can pick up where the primary left off in the event of a failure. All of the traffic sessions can keep on trucking even if the active sup module is rebooting. This gives the switch a tremendous uptime, as well as allowing for things like hitless upgrades in production. The other existing feature of the Sup2T is Virtual Switching System (VSS). VSS allows two Sup2Ts to appear as one giant switch. This is helpful for applications where you don’t want to trust your traffic to just one chassis. VSS allows for two different chassis to terminate Multi-Chassis EtherChannel (MLAG) connections so that distribution layer switches don’t have a single point of failure. Traffic looks like it’s flowing to one switch when in actuality it may be flowing to one or the other. In the event that a Supervisor goes down, the other one can keep forwarding traffic.

Enter the Quad Sup SSO ability. Now, instead of having an RPR-only failover on the members of a VSS cluster, you can set up the redundant Sup2T modules to be ready and waiting in the event of a failure. This is great because you can lose up to three Sup2Ts at once and still keep forwarding while they reboot or get replaced. Granted, anything that can take out 3 Sup2Ts at once is probably going to take down the fourth (like a power failure or power surge), but it’s still nice to know that you have a fair amount of redundancy now. This only works on the Sup2T, so you can’t get this if you are still running the older Sup720. You also need to make sure that your linecards support the newer Distributed Forwarding Card 3 (DFC3), which means you aren’t going to want to do this with anything less than a 6700-series line card. In fact, you really want to be using the 6800 series or better just to be on the safe side. As Josh O’Brien (@joshobrien77) commented, this is a great feature to have. But it should have been there already. I know that there are a lot of technical reasons why this wasn’t available earlier, and I’m sure the increased fabric speeds in the Sup2T, not to mention the increased capability of the DFC3, are necessary components of the solution. Still, I think this is something that probably should have shipped in the Sup2T on the first day. I suppose that given the long road the Sup2T took to get to us that “better late than never” is applicable here.

UCS-E

Next up was the Cisco UCS-E series server for the ISR G2 platform. This was something that we saw at NFD4 as well. The demo was a bit different this time, but for the most part this is similar info to what we saw previously.


Catalyst 3850 Unified Access Switch

The Catalyst 3850 is Cisco’s new entry into the fixed-configuration switch arena. They are touting this as a “Unified Access” solution for clients. That’s because the 3850 is capable of terminating up to 50 access points (APs) per stack of four. This thing can basically function as a wiring closet wireless controller. That’s because it’s using the new IOS wireless controller functionality that’s also featured in the new 5760 controller. This gets away from the old Airespace-like CLI that was so prominent on the 2100, 2500, 4400, and 5500 series controllers. The 3850, which is based on the 3750X, also sports a new 480Gbps Stackwise connector, appropriately called Stackwise480. This means that a stack of 3850s can move some serious bits. All that power does come at a cost – Stackwise480 isn’t backwards compatible with the older Stackwise v1 and v2 from the 3750 line. This is only an issue if you are trying to deploy 3850s into existing 3750X stacks, because Cisco has announced the End of Sale (EOS) and End of Life (EOL) information for those older 3750s. I’m sure the idea is that when you go to rip them out, you’ll be more than happy to replace them with 3850s.

The 3850 wireless setup is a bit different from the old 3750 Access Controller that had a 4400 controller bolted on to it. The 3850 uses Cisco’s IOS-XE model of virtualizing IOS into a sort of VM state that can run on one core of a dual-core processor, leaving the second core available to do other things. Previously at NFD4, we’d seen the Catalyst 4500 team using that other processor core for doing inline Wireshark captures. Here, the 3850 team is using it to run the wireless controller. That’s a pretty awesome idea when you think about it. Since I no longer have to worry about IOS taking up all my processor and I know that I have another one to use, I can start thinking about some interesting ideas.

The 3850 does have a couple of drawbacks. Aside from the above Stackwise limitations, you have to terminate the APs on the 3850 stack itself. Unlike the CAPWAP connections that tunnel all the way back to the Airespace-style controllers, the 3850 needs to have the APs directly connected in order to decapsulate the tunnel. That does provide for some interesting QoS implications and applications, but it doesn’t provide much flexibility from a wiring standpoint. I think the primary use case is to have one 3850 switch (or stack) per wiring closet, which would be supported by the current 50 AP limitation. The other drawback is that the 3850 is currently limited to a stack of four switches, as opposed to the larger six-switch limit on the 3750X. Aside from that, it’s a switch that you probably want to take a look at in your wiring closets now. You can buy it with an IP Base license today and then add on the AP licenses down the road as you want to bring them online. You can even use the 3850s to terminate CAPWAP connections and manage the APs from a central controller without adding the AP license.

Here is the deep dive video that covers a lot of what Cisco is trying to do from a unified wired and wireless access policy standpoint. Also, keep an eye out for the cute Unified Access video in the middle.

Private Data Center Mobility

I found it interesting that this demo was in the Borderless section and not the Data Center presentation. This presentation dives into the world of Overlay Transport Virtualization (OTV). Think of OTV like an extra layer of 802.1Q (Q-in-Q) tunneling with some IS-IS routing mixed in. OTV is Cisco’s answer to extending the layer 2 boundary between data centers to allow VMs to be moved to other sites without breaking their networking. Layer 2 everywhere isn’t the optimal solution, but it’s the best thing we’ve got to work with the current state of VM networking (until Nicira figures out what they’re going to do).

We loved this session so much that we asked Mostafa to come back and talk about it more in depth.

The most exciting part of this deep dive to me was the introduction of LISP. To be honest, I hadn’t really been able to wrap my head around LISP the first couple of times that I saw it. Now, thanks to the Borderless team and Omar Sultan (@omarsultan), I’m going to dig into it a lot more in the coming months. I think there are some very interesting issues that LISP can solve, including my IPv6 Gordian Knot.


Tom’s Take

I have to say that I liked Cisco’s approach to the presentations this time.  Giving us discussion time along with a demo allowed us to understand things before we saw them in action.  The extra five minutes did help quite a bit, as it felt like the presenters weren’t as rushed this time.  The “Borderless Idol” style of voting for a presentation to get more info out of was brilliant.  We got to hear about something we wanted to go into depth about, and I even learned something that I plan on blogging about later down the line.  Sure, there was a bit of repetition in a couple of areas, most notably UCS-E, but I can understand how those product managers have invested time and effort into their wares and want to give them as much exposure as possible.  Borderless hits all over the spectrum, so keeping the discussion focused in a specific area can be difficult.  Overall, I would say that Cisco did a good job, even without Ryan Seacrest hosting.

Tech Field Day Disclaimer

Cisco was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, Cisco provided me with a breakfast and lunch at their offices.  They also provided a Moleskine notebook, a t-shirt, and a flashlight toy.  At no time did they ask for, nor were they promised any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Cisco Data Center Duel


Network Field Day 5 started off with a full day at Cisco. The Data Center group opened and closed the day, with the Borderless team sandwiched in between. Omar Sultan (@omarsultan) greeted us as we settled in for a continental breakfast before getting started.

The opening was a discussion of onePK, a popular topic as of late from Cisco. While the topic du jour in the networking world is software-defined networking (SDN), Cisco steers the conversation toward onePK. This, at its core, is API access to all the flavors of the Internetwork Operating System (IOS). While other vendors discuss how to implement protocols like OpenFlow or how to expose pieces of their underlying systems to developers, Cisco has built a platform to allow access into pieces and parts of the OS. You can write applications in Java or Python to pull data from the system or push configurations to it. The process is slowly being rolled out to the major Cisco platforms. The support for the majority of the Nexus switching line should give the reader a good idea of where Cisco thinks this technology will be of best use.

One of the specific applications that Cisco showed off to us using onePK is the use of Puppet to provision switches from bare metal to functioning with a minimum of human effort. Puppet integration was a big underlying topic at both Cisco and Juniper (more on that in the Juniper NFD5 post). Puppet is gaining steam in the networking industry as a way to get hardware up and running quickly with the least amount of fuss. Server admins have enjoyed the flexibility of Puppet for some time. It’s good to see well-tested and approved software like this being repurposed for similar functionality in the world of routing and switching.
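
For those who haven’t played with Puppet, the core idea is declarative, idempotent provisioning: you describe the state you want, and the tool converges the device toward it on every run.  Here’s a toy Python sketch of that convergence loop – not a Puppet manifest and not any vendor’s actual agent, just an illustration of the concept:

```python
# A toy illustration of declarative, idempotent provisioning (the idea behind
# Puppet), written in Python. The "device" here is just a dictionary.

desired_vlans = {10: "users", 20: "voice", 30: "servers"}


def converge(device_vlans, desired):
    """Bring the device's VLAN table in line with the desired state."""
    for vlan_id, name in desired.items():
        if device_vlans.get(vlan_id) != name:
            print(f"creating/renaming vlan {vlan_id} -> {name}")
            device_vlans[vlan_id] = name
    for vlan_id in list(device_vlans):
        if vlan_id not in desired:
            print(f"removing vlan {vlan_id}")
            del device_vlans[vlan_id]
    return device_vlans


switch = {10: "users", 99: "old-test"}     # what the switch has today
converge(switch, desired_vlans)            # first run makes changes
converge(switch, desired_vlans)            # second run is a no-op (idempotent)
```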

Next up was a discussion about the Cisco ONE network controller. Controllers are a very hot topic in the network world today. OpenFlow allows a central management and policy server to push information and flow data into switches. This allows network admins to get a “big picture” of the network and how the packets are flowing across it. Having the ability to view the network in its entirety also allows admins to start partitioning it in a process called “slicing.” This was one of the first applications that the Stanford wiz kids used OpenFlow to accomplish. It makes sense when you think about how universities wanted to partition off their test networks to prevent this radical OpenFlow idea from crashing the production hardware. Now, we’re looking at using slicing for things like multi-tenancy and security. The building blocks are there to make some pretty interesting leaps. The real key is that the central controller must be able to keep up with the flows being pushed through the network. Cisco’s ONE controller not only speaks OpenFlow, but onePK as well. This means that while the ONE controller can talk to disparate networking devices running OpenFlow, it will be able to speak much more clearly to any Cisco devices you have lying around. That’s a pretty calculated play from Cisco, given that the initial target for their controller will be networks populated primarily by Cisco equipment. The use case that was given to us for the Cisco ONE controller was replacing large network taps with SDN options. Fans of NFD may remember our trip to Gigamon. Cisco hadn’t forgotten, as the network tap they used as an example in their slide looked just like the orange Gigamon switch we saw at a previous NFD.
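
To give a feel for what “pushing flow data into switches” looks like from the outside, here’s a minimal sketch of a flow rule expressed as data and handed to a controller over REST.  The endpoint and JSON layout are placeholders I made up for illustration – this is not the actual Cisco ONE controller northbound API:

```python
# Hypothetical example: the URL and payload format are placeholders, not a
# real controller's northbound API.
import json
import urllib.request

# An OpenFlow-style rule: match on fields, then apply actions.
flow_rule = {
    "switch": "00:00:00:00:00:00:00:01",
    "priority": 100,
    "match": {"in_port": 1, "eth_type": 0x0800, "ipv4_dst": "10.1.1.0/24"},
    "actions": [{"type": "OUTPUT", "port": 2}],
}

def push_flow(controller_url, rule):
    request = urllib.request.Request(
        controller_url + "/flows",                      # hypothetical endpoint
        data=json.dumps(rule).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# push_flow("http://controller.example.com:8080", flow_rule)
```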

After the presentations from the Borderless team, we ended the day with an open discussion around a few topics. This is where the real fun started. Here’s the video:

The first hour or so is a discussion around hybrid switching. I had some points in here about the standoff between hardware and software people not really wanting to get along right now. I termed it a Mexican Standoff because no one really wants to flinch and go down the wrong path. The software people just want to write overlays and things like that and make them run on everything. The entrenched hardware vendors, like Cisco, want to make sure their hardware is providing better performance than anyone else (because that’s where their edge is). Until someone decides to take a chance and push things in different directions, we’re not going to see much movement. Also, around 1:09:00 is where we talked a bit about Cisco jumping into the game with a pure OpenFlow switch without much more on top of it. This concept seemed a bit foreign to some of the Cisco folks, as they can’t understand why people wouldn’t want IOS and onePK. That’s where I chimed in with my “If I want a pickup truck, I don’t take a chainsaw to a school bus.” You shouldn’t have to shed all the extra stuff to get the performance you want. Start with a smaller platform and work your way up instead of starting with the kitchen sink and stripping things away.

Shortly after this is where the fireworks started. One of Cisco’s people started arguing that OpenFlow isn’t the answer. He said that the customer he was talking to didn’t want OpenFlow. He even went so far as to say that “OpenFlow is a fantasy because it promises everything and there’s nothing in production.” (about 1:17:00) Folks, this was one of the most amazing conversations I’ve ever seen at a Network Field Day event. The tension in the room was palpable. Brent and Greg were on this guy the entire time about how OpenFlow was solving real problems for customers today, and in Brent’s case he’s running it in production. I really wonder how the results of this are going to play out. If Cisco hears that their customers don’t care that much about OpenFlow and just want their gear to do SDN like in onePK then that’s what they are going to deliver. The question then becomes whether or not network engineers that believe that OpenFlow has a big place in the networks of tomorrow can convince Cisco to change their ways.

If you’d like to learn more about Cisco, you can find them at http://www.cisco.com/go/dc.  You can follow their data center team on Twitter as @CiscoDC.


Tom’s Take

Cisco’s Data Center group has a lot of interesting things to say about programmability in the network. From discussions about APIs to controllers to knock-down, drag-out arguments about what role OpenFlow is going to play, Cisco has the gamut covered. I think that their position at the top of the network heap gives them a lot of insight into what’s going on. I’m just worried that they are going to use that to push a specific agenda and not embrace useful technologies down the road that solve customer problems. You’re going to hear a lot more from Cisco on software defined networking in the near future as they begin to roll out more and more features to their hardware in the coming months.

Tech Field Day Disclaimer

Cisco was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, Cisco provided me with a breakfast and lunch at their offices.  They also provided a Moleskine notebook, a t-shirt, and a flashlight toy.  At no time did they ask for, nor were they promised any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Additional NFD5 Blog Posts

NFD5: Cisco onePK – Terry Slattery

NFD5: SDN and Unicorn Blood – Omar Sultan

New Wrinkles in the Fabric – Cisco Nexus Updates

There’s no denying that The Cloud is an omnipresent fixture in our modern technological lives.  If we aren’t already talking about moving things there, we’re wondering why it’s crashed.  I don’t have any answers about these kinds of things, but thankfully the people at Cisco have been trying to find them.  They let me join in on a briefing about the announcements that were made today regarding some new additions to their data center switching portfolio more commonly known by the Nexus moniker.

Nexus 6000

The first of the announcements is around a new switch family, the Nexus 6000.  The 6000 is more akin to the 5000 series than the 7000, containing a set of fixed-configuration switches with some modularity.  The Nexus 6001 is the true fixed-config member of the lot.  It’s a 1U 48-port 10GbE switch with 4 40GbE uplinks.  If that’s not enough to get your engines revving, you can look at the bigger brother, the Nexus 6004.  This bad boy is a 4U switch with a fixed config of 48 40GbE ports and 4 expansion modules that can double the total count up to 96 40GbE ports.  That’s a lot of packets flying across the wire.  According to Cisco, those packets can fly at a 1 microsecond latency port-to-port.  The Nexus 6000 is also a Fibre Channel over Ethernet (FCoE) switch, as all Nexus switches are.  This one is a 40GbE-capable FCoE switch.  However, as there are no 40GbE targets available in FCoE right now, it’s going to be on an island until those get developed.  A bit of future proofing, if you will.  The Nexus 6000 also supports FabricPath, Cisco’s TRILL-based fabric technology, along with a large number of multicast entries in the forwarding table.  This is no doubt to support VXLAN and OTV in the immediate future for layer 2 data center interconnect.

The Nexus line also gets a few little added extras.  There is going to be a new FEX, the 2248PQ, that features 10GbE downlink ports and 40GbE uplink ports.  There’s also going to be a 40GbE expansion module for the 5500 soon, so your DC backbone should be able to run 40GbE with a little investment.  Also of interest is the new service module for the Nexus 7000.  That’s right, a real service module.  The NAM-NX1 is a Network Analysis Module (NAM) for the Nexus line of switches.  This will allow spanned traffic to be pumped through for analysis of traffic composition and characteristics without taking a huge hit to performance.  We’ve all known that the 7000 was going to be getting service modules for a while.  This is the first of many to roll off the line.  In keeping with Cisco’s new software strategy, the NAM also has a virtual cousin, not surprisingly named the vNAM.  This version lives entirely in software and is designed to serve the same function that its hardware cousin does, only in the land of virtual network switches.  Now that the Nexus line has service modules, it kind of makes you wonder what the Catalyst 6500 has all to itself now.  We know that the Cat6k is going to be supported in the near term, but is it going to be used as a campus aggregation or core?  Maybe as a service module platform until the SMs can be ported to the Nexus?  Or maybe with the announcement of FabricPath support for the Cat6k this venerable switch will serve as a campus/DC demarcation point?  At this point the future of Cisco’s franchise switch is really anyone’s guess.

Nexus 1000v InterCloud

The next major announcement from Cisco is the Nexus 1000v InterCloud.  This is very similar to what VMware is doing with their stretched data center concept in vSphere 5.1.  The 1000v InterCloud (1kvIC) builds a secure layer 2 GRE tunnel between your private cloud and a provider’s public cloud.  You can now use this tunnel to migrate workloads back and forth between public and private server space.  This opens up a whole new area of interesting possibilities, not the least of which is the Cloud Services Router (CSR).  When I first heard about the CSR last year at Cisco Live, I thought it was a neat idea but had some shortcomings.  The need to be deployed to a place where it was visible to all your traffic was the most worrisome.  Now, with the 1kvIC, you can build a tunnel between yourself and a provider and use CSR to route traffic to the most efficient or cost-effective location.  It’s also a very compelling argument for disaster recovery and business continuity applications.  If you’ve got a category 4 hurricane bearing down on your data center, the ability to flip a switch and cold migrate all your workloads to a safe, secure vault across the country is a big sigh of relief.
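
For a rough picture of what “layer 2 over a GRE tunnel” means on the wire, here’s a Scapy sketch that nests a complete Ethernet frame from the private data center inside an IP/GRE packet between the two tunnel endpoints.  This shows only the encapsulation layering – it ignores the 1kvIC’s security and control plane entirely, and the addresses are made up:

```python
# Illustrative only: shows the L2-over-GRE layering, not the actual 1000v
# InterCloud implementation (which also secures the tunnel).
from scapy.all import Ether, IP, GRE

# The original layer 2 frame from a VM in the private data center.
inner_frame = (
    Ether(src="00:50:56:aa:bb:cc", dst="00:50:56:dd:ee:ff")
    / IP(src="10.1.1.10", dst="10.1.1.20")
)

# Wrap it in an outer IP/GRE header between the two tunnel endpoints, so the
# transit network only ever sees routed IP traffic.
outer_packet = (
    IP(src="192.0.2.1", dst="198.51.100.1")   # tunnel endpoints (documentation IPs)
    / GRE(proto=0x6558)                       # 0x6558 = Transparent Ethernet Bridging
    / inner_frame
)

outer_packet.show()   # prints the nested headers: IP / GRE / Ether / IP
```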

The 1kvIC also has its own management console, the vNMC.  Yes, I know there’s already a vNMC available from Cisco.  The 1kvIC version is a bit special though.  It not only gives you control over your side of the interconnect, but it also integrates with the provider’s management console as well.  This gives you much more visibility into what’s going on inside the provider instances beyond what we already have from simple dashboards or status screens on public web pages.  This is a great help when you think about the kinds of things you would be doing with intercloud mobility.  You don’t want to send your workloads to the provider if an engineer has started an upgrade on their core switches on a Friday night.  When it comes to the cloud, visibility is viability.

CiscoONE

In case you haven’t heard, Cisco wants to become a software company.  Not a bad idea when hardware is becoming a commodity and software is the home of the high margins.  Most of the development that Cisco has been doing along the software front comes from the Open Network Environment (ONE) initiative.  In today’s announcement, CiscoONE will now be the home for an OpenFlow controller.  In this first release, Cisco will be supporting OpenFlow and their own onePK API extensions on the southbound side.  On the northbound side of things, the CiscoONE Controller will expose REST and Java hooks to allow interaction with flows passing through the controller.  While that’s all well and good for most of the enterprise devs out there, I know a lot of homegrown network admins that hack together their own scripts through Perl and Python.  For those of you that want support for your particular flavor of language built into CiscoONE, I highly recommend getting to their website and telling them what you want.  They are looking at adding additional hooks as time goes on, so you can get in on the ground floor now.
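
As a hedged example of the kind of homegrown script that northbound REST access enables, here are a few lines of Python that poll a controller for its flow table.  The URL and response fields are invented for illustration and are not the actual CiscoONE API:

```python
# Hypothetical northbound query; the endpoint and JSON fields are placeholders.
import json
import urllib.request

def list_flows(controller_url):
    with urllib.request.urlopen(controller_url + "/flows") as response:
        flows = json.load(response)
    for flow in flows:
        print(f"switch={flow['switch']} match={flow['match']} "
              f"packets={flow['packet_count']}")

# list_flows("http://controller.example.com:8080")
```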

Cisco is also announcing onePK support for the ISR G2 router platform and the ASR 1000 platform.  There will be OpenFlow support on the Nexus 3000 sometime in the near future, along with support in the Nexus 1000v for Microsoft Hyper-V and KVM.  And somewhere down the line, Cisco will have a VXLAN gateway for all the magical unicorn packet goodness across data centers that stretch via non-1kvIC links.


Tom’s Take

The data center is where the dollars are right now.  I’ve heard people complain that Cisco is leaving the enterprise campus behind as they charge forward into the raised floor garden of the data center.  These are the people driving the data that produces the profits that buy more equipment.  Whether it be massive Hadoop clusters or massive private cloud projects, the accounting department has given the DC team a blank checkbook today.  Cisco is doing its best to drive some of those dollars their way by providing new and improved offerings like the Nexus 6000.  For those that don’t have a huge investment in the Nexus 7000, the 6000 makes a lot of sense as either a high-speed core aggregation switch or an end-of-row solution for a herd of FEXen.  The Nexus 1000v InterCloud is competing against VMware’s stretched data center concept in much the same way that the 1000v itself competes against the standard VMware vSwitch.  With Nicira in the driver’s seat of VMware’s networking from here on out, I wouldn’t be shocked to see more solutions come from Cisco that mirror or augment VMware solutions as a way to show VMware that Cisco can come up with alternatives just as well as anyone else.

New Cisco Data Center Certifications

Last week, Cisco finally plugged a huge hole in their certification offerings.  Cisco has historically required its partner community to study for specific certifications related to technologies before offering them as specialized tracks for all candidates.  It was that way for voice, wireless, and even security.  However, until last week there was no offering for data center networking.  I think this is an area in which Cisco needs to concentrate, especially when you look at their results for the first quarter of their fiscal year that were just released.  Cisco grew its data center networking business by 61% and their UCS success has vaulted them into third place in the server race easily, though some may argue they are a tight contender for second.  What Cisco needs to solidify all that growth is a program that grows data center network engineers from the ground up.

Cisco’s previous path to creating a data center network engineer involved getting a basic CCNA with no specialization and then focusing on the Data Center Networking Infrastructure certifications.  After the networking is taken care of, there is a path for UCS design and support as well.  But that requires a prospective engineer to pick up NX-OS on the fly, not having started with it at the CCNA level.  Thankfully, Cisco has now addressed that little flaw in the program.

CCNA Data Center

Cisco now has a CCNA Data Center certification that consists of non-overlapping material.  640-911 Introduction to Data Center Networking DCICN is square one for new data center hopefuls.  It tests the basics of networking much like the CCNA, but the focus is on NX-OS devices like the Nexus 7k and Nexus 5k.  It’s very much like the ICND1 exam in that it focuses on the basics and theory of general networking.  640-916 Introducing Cisco Data Center Technologies DCICT is the real meat of data center technology.  This is where the various fabric and SAN technologies are tested along with Unified Computing as well as virtualization technology like the Nexus 1000V.  Of these two tests, the DCICT is going to be the really hefty one for most candidates to chew on.  In fact, I’m almost sure that most CCNA-level engineers can go out and pass DCICN without any study beyond their CCNA knowledge.  The DCICT will likely require much more time with the study guides to get past.  Once you’ve gotten through both, you can now proudly display your CCNA: Data Center title.

CCNP Data Center

Once you’ve attained your CCNA Data Center, it’s time to delve into the topics a bit deeper.  Cisco introduced the CCNP Data Center certification track to complement the entry-level offering in the CCNA DC.  Historically, this is where the various partner-focused Data Center specializations have focused.  With the CCNP Data Center, you have to start with the Implementing Data Center Unified Computing DCUCI and Implementing Data Center Unified Fabric DCUFI exams.  Right now, you can take either version 4 or version 5 of these exams, but the version 4 exams will start expiring next year.  Once you’ve passed the implementation exams, you have a choice to make.  You can go down the path of the data center designer with Designing Cisco Data Center Unified Computing DCUCD and Designing Cisco Unified Data Center Fabric DCUFD.  Those two exams also have a choice between version 4 and version 5, with similar expiration dates in 2013 for the version 4 exams.  If you fancy yourself more of a hands-on troubleshooter, you can opt for the Troubleshooting Cisco Unified Data Center Computing DCUCT and Troubleshooting Cisco Unified Data Center Fabric DCUFT exams.  Note that these exams don’t have a version 4 option.  There seems to have been some confusion about which exams count for what.  You must take the Implementation exams.  After that, you can either take the Design exams or the Troubleshooting exams.

Tom’s Take

I’ve spent a lot of time in the last year talking about the CCIE Data Center.  One of the things that struck me about it was how focused it was in its present state on currently trained engineers.  Unless you work with Nexus and UCS every day, you won’t do well on the CCIE DC exam because there isn’t really a training program for it.  Now, with the additions of the CCNA DC and the CCNP DC, aspiring data center rock stars can get started on the road to the CCIE without needing to worry about learning IOS first.  I’m sure that Cisco will eventually retire the data center partner specializations and make the requirement for the Data Center Architecture focused around the CCNA DC and CCNP DC.  There’s no better time to jump out there and get started.  Just remember your jacket.

Brocade – Packet Spraying and SDN Integrating

Brocade kicked off our first double session at Network Field Day 4.  We’d seen them previously at Network Field Day 2 and I’d just been to Brocade’s headquarters for their Tech Day a few weeks before.  I was pretty sure that the discussion that was about to take place was going to revolve around OpenFlow and some of the hot new hardware that Brocade had been showing off recently.  Thankfully, Lisa Caywood (@TheRealLisaC) still had some tricks up her sleeve.

I hereby dub Lisa “Queen of the Mercifully Short Introduction.”  Lisa’s overview of Brocade hit all the high points about what Brocade’s business lines revolve around.  I think by now that most people know that Brocade acquired Foundry for their Ethernet switching line to add to their existing storage business that revolves around Fibre Channel.  With all that out of the way, it was time to launch into the presentations.

Jessica Koh was up first to talk to me about a technology that I hadn’t seen before – HyperEdge.  This really speaks to me because the majority of my customer base isn’t ever going to touch a VDX or an ADX or an MLXe.  HyperEdge technology is Brocade’s drive to keep the campus network infrastructure humming along to keep pace with the explosion of connectivity in the data center.  Add in the fact that you’ve got all manner of things connecting into the campus network, and you can see how things like manageability can be at the forefront of people’s minds.  To that end, Brocade is starting off the HyperEdge discussion early next year with the ability to stack dissimilar ICX switches together.  This may sound like crazy talk to those of you that are used to stacking together Cisco 3750s or 2960s.  On those platforms, every switch has to be identical.  With the HyperEdge stacking, you can take an ICX 6610 and stack it with an ICX 6450 and it all works just fine.  In addition, you can place a layer 3 capable switch into the stack in order to provide a device that will get your packets off the local subnet.  That is a very nice feature that allows the customer base to buy layer 2 today if needed and then add on in the future when they’ve outgrown the single wiring closet or single VLAN.  Once you’ve added the layer 3 switch to the stack, all those features are populated across all the ports of the whole stack.  That helps to get rid of some of the idiosyncrasies of some of the first stacking switch configurations, like not being able to locally switch packets.  Add in the fact that the stacking interfaces on these switches are the integrated 10Gig Ethernet ports, and you can see why I’m kind of excited.  No overpriced stacking kits.  Standard SFP+ interfaces that can be reused in the event I need to break the stack apart.

I’m putting this demo video up to show how a demo during your presentation can be both a boon and a bane.  Clear your cache after you’re done or log in as a different user to be sure you’re getting a clean experience.  The demo can be a really painful part when it doesn’t run correctly.

Kelvin Franklin was up next with an overview of VCS, Brocade’s fabric solution.  This is mostly review material from my Tech Day briefing, but there are some highlights here.  Firstly, Brocade is using yet a third definition for the word “trunk”.  Unlike Cisco and HP, Brocade refers to the multipath connections into a VCS fabric as a trunk.  Now, a trunk isn’t a trunk isn’t a trunk.  You just have to remember the context of which vendor you’re talking about.  This was also the genesis of packet spraying, which I’m sure was a very apt description for what Brocade’s VCS is doing to the packets as they send them out of the bundled links, but it doesn’t sound all that appealing.  Another thing to keep in mind when looking at VCS is that it is heavily based on TRILL for the layer 2 interconnects, but it does use FSPF from Brocade’s heavy fibre channel background to handle the routing of the links instead of IS-IS as the TRILL standard calls for.  Check out Ivan’s post from last year as to why that’s both good and bad.  Brocade also takes time to call out the fact that they’ve done their own ASIC in the new VCS switches as opposed to using merchant silicon like many other competitors.  Only time will tell how effective the move to merchant silicon will be for those that choose to use it, but so long as Brocade can continue to drive higher performance from custom silicon it may be an advantage for them.

The last part of the VCS presentation covers some of the real-world use cases for fabrics and how Brocade is taking an incremental approach to building fabrics.  I’m curious to see how the VCS will begin to co-mingle with the HyperEdge strategy down the road.  Cisco has committed to bringing their fabric protocol (FabricPath) to the campus in the Catalyst 6500 in the near future.  With all the advantages of VCS that Brocade has discussed, I would like to see it extending down into the campus as well.  That would be a huge advantage for some of my customers that need the capability to do a lot of east-west traffic flows without the money to invest in the larger VCS infrastructure until their data usage can provide adequate capital.  There may not be a lot that comes out of it in the long run, but even having the option to integrate the two would be a feather in the marketing cap.

After lunch and a short OpenStack demo, we got an overview of Brocade’s involvement with the Open Networking Foundation (ONF) from Curt Beckmann.  I’m not going to say a lot about this video, but you really do need to watch it if you are at all curious to see where Brocade is going with OpenFlow.  As you’ve no doubt heard before, OpenFlow is really driving the future of networking and how we think about managing data flows.  Seeing what Brocade is doing to implement ideas and drive the direction of ONF development is nice because it’s almost like a crystal ball of networking’s future.

The last two videos really go together to illustrate how Brocade is taking OpenFlow and adopting it into their model for software defined networking (SDN).  By now, I’ve heard almost every imaginable definition of SDN support.  On one end of the spectrum, you’ve got Cisco and Juniper.  A lot of their value is tied up in their software.  IOS and Junos represent huge investments for them.  Getting rid of this software so the hardware can be controlled by a server somewhere isn’t the best solution as they see it.  Their response has been to open APIs into their software and allow programmability into their existing structures.  You can use software to drive your networking, but you’re going to do it our way.  At the other extreme end of the scale, you’ve got NEC.  As I’ve said before, NEC is doubling down on OpenFlow mainly for one reason – survival.  If they don’t adapt their hardware to be fully OpenFlow compliant, they run the risk of being swept off the table by the larger vendors.  Their attachment to their switch OS isn’t as important as making their hardware play nice with everyone else.  In the middle, you’ve got Brocade.  They’ve made some significant investments into their switch software and protocols like VCS.  However, they aren’t married to the idea of their OS being the be all, end all of the conversation.  What they do want, however, is Brocade equipment in place that can take advantage of all the additional features offered from areas that aren’t necessarily OpenFlow specific.  I think their idea around OpenFlow is to push the hybrid model, where you can use a relatively inexpensive Brocade switch to fulfill your OpenFlow needs while at the same time allowing for that switch to perform some additional functionality above and beyond that defined by the ONF when it comes to VCS or other proprietary software.  They aren’t doing it for the reasons of survival like NEC, but it offers them the kind of flexibility they need to get within striking distance of the bigger players in the market.

If you’d like to learn more about Brocade, you can check out their website at http://www.brocade.com.  You can also follow them on Twitter as @BRCDComm.

Tom’s Take

I’ve seen a lot of Brocade in the last couple of months.  I’ve gotten a peek at their strategies and had some good conversations with some really smart people.  I feel pretty comfortable understanding where Brocade is going with their Ethernet business.  Yes, whenever you mention them you still get questions about fibre channel and storage connectivity, but Brocade really is doing what they can to get the word out about that other kind of networking that they do.  From the big iron of the VDX to the ability to stack the ICX switches all the way to the planning in the ONF to run OpenFlow on everything they can, Brocade seems to have started looking at the long-term play in the data networking market.  Yes, they may not be falling all over themselves to go to war with Cisco or even HP right now.  However, a bit of visionary thinking can lead one to be standing on the platform when the train comes rumbling down the track.  That train probably has a whistle that sounds an awful lot like “OpenFlow,” so only time can tell who’s going to be riding on it and who’s going to be underneath it.

Tech Field Day Disclaimer

Brocade was a sponsor of Network Field Day 4.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 4.  In addition, Brocade provided me with a gift bag containing a 2GB USB stick with marketing information and a portable cell phone charger. They did not ask for, nor were they promised any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Is SPB the Betamax of Layer 2?

While I was at Brocade Tech Day, I had the wonderful opportunity to sit down with Jon Hudson (@the_solutioneer) and just talk for about half an hour.  While the rest of the day was a whirlwind of presentations and interviews, Jon and I talked about all manner of things not related to VDX or VCS.  Instead, we had a fascinating discussion about TRILL and SPB.

For those not familiar, TRILL is the IETF standard for layer 2 multipath.  It’s a very elegant solution to the spanning tree problem.  Our data centers today are running at half capacity.  That’s not because we don’t have enough bandwidth, though.  It’s because half our links are shut down, waiting for a link failure.  Thanks to 802.1D spanning tree, we can’t run two links at the same time unless they are bundled into a link aggregation (LAG) solution.  And heaven forbid we want to terminate that LAG bundle on two different switches to prevent a single-switch failure from affecting our traffic.  Transparent Interconnection of Lots of Links (TRILL) fixes this by creating a layer 2 network with link state.  It accomplishes this by running a form of IS-IS, which allows the layer 2 nodes to create an SPF table and determine not only the best path to a node, but other paths that are equally good.  This means that we have a real fabric of interconnections with no blocked links.
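
To make the “more than one best path” point concrete, here’s a small, self-contained Python sketch – not IS-IS itself, just the shortest-path idea behind it – that finds every equal-cost path between two switches in a tiny four-switch fabric instead of blocking the redundant links the way spanning tree would:

```python
# Toy equal-cost path finder: illustrates why a link-state fabric (TRILL/SPB)
# can use every link, unlike spanning tree which blocks redundant ones.
import heapq
from collections import defaultdict

# A small four-switch fabric with equal-cost links (cost 1 each).
links = [("s1", "s2"), ("s1", "s3"), ("s2", "s4"), ("s3", "s4")]
graph = defaultdict(dict)
for a, b in links:
    graph[a][b] = graph[b][a] = 1

def equal_cost_paths(graph, src, dst):
    """Dijkstra, but keep every predecessor that yields the same best cost."""
    dist = {src: 0}
    preds = defaultdict(list)
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue   # stale heap entry
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                preds[neighbor] = [node]
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == dist[neighbor]:
                preds[neighbor].append(node)   # an equally good path

    def unwind(node):
        if node == src:
            return [[src]]
        return [p + [node] for prev in preds[node] for p in unwind(prev)]

    return unwind(dst)

print(equal_cost_paths(graph, "s1", "s4"))
# [['s1', 's2', 's4'], ['s1', 's3', 's4']] - both links stay in use
```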

802.1aq Shortest Path Bridging, or SPB informally, is the IEEE version of a layer 2 multipathing replacement for spanning tree.  It looks a lot like TRILL and even uses IS-IS as the layer 2 protocol.  It does differ in some respects, such as using MAC-in-MAC encapsulation for frames as opposed to rewriting the header like TRILL does.  This makes it very attractive to the service provider market, as they don’t have to buy a bunch of new gear to get everything up and running quickly on SPB.  Looking at the proponents of SPB, such as Avaya and Alcatel-Lucent, that really comes as no surprise.  Those companies are heavily invested in the service provider space and would really love to see SPB adoption take off, as it would protect their initial investments.

The showdown between TRILL and SPB isn’t that far removed from the old showdown between VHS and Betamax.  For those not entirely familiar, this was a case of two competing standards that was eventually settled in the court of the consumer.  While many regard the early Betamax units as technologically superior, there was an issue of tape length (1 hour vs. the 2-hour VHS limit).  As time wore on, there was significant development done on both sides that stretched the formats to their absolute limits.  However, by the end, VHS had won due to simple popularity.  Since VHS had become the most popular format for consumers, even the supposed superiority of Betamax couldn’t save it from being relegated to the junk pile of history.  Another more recent case is the battle between HD-DVD and Blu-ray.  Similarly to the analog format wars decades earlier, the digital disc war erupted from two alliances thinking they had the best solution to the problem.  Blu-ray eventually won out in much the same way that VHS did – by becoming the format that most people wanted to use.  The irony that Sony actually won a format war isn’t lost on a lot of people either.

I believe that we’re going to see something like these showdowns in TRILL vs. SPB.  Right now, the battle lines seem drawn between the data center vendors supporting TRILL and the service provider vendors getting ready to implement SPB.  Whether or not one of the solutions is technically superior to the other is inconsequential at this point.  It’s all going to come down to popularity.  Brocade and Cisco have non-standard TRILL implementations in VCS and FabricPath.  The assumption is that they will be compatible with TRILL when a working solution is finally released.  I’m also guessing that we’re going to see more support for TRILL in the cloud providers to maximize their revenue potential by offering non-blocking paths to increase throughput for those hungry cloud applications.  Brocade showcased some providers moving to VCS at Brocade Tech Day.  If that’s the case, we’re going to see TRILL at the enterprise level and the cloud provider level connected by an SP core running SPB.  Just like Betamax being the favorite of the professional video industry, SPB will be the go-to protocol for providers, as they can put off yet one more round of equipment upgrades.  I think by that point, however, TRILL will have obtained enough critical mass to drive adoption to the point where TRILL silicon will be a very inexpensive option on most new equipment in a few years, perhaps even becoming the default configuration.  If that is indeed the case, then TRILL will indeed become the VHS or Blu-ray of this protocol war.


Tom’s Take

I can still remember going into the video store and seeing the great divide.  On one side, Betamax.  On the other, VHS.  Slowly, the Betamax side of the house shrank away to nothing.  It happened again with HD DVD and Blu-ray.  In the end, both format wars came down to popularity.  VHS was in more households and offered the ability to record two hours’ worth of programming instead of one.  Blu-ray got the popular movie studios on board quickly, like Disney.  Once the top selling movies were on Blu-ray, the outcome was all but guaranteed.  In the big debate of TRILL against SPB, it’s going to come down to popularity.  I think we’re already seeing the beginning of TRILL winning this fight.  Sure, the service providers are going to use SPB as long as they can to avoid upgrading to TRILL-compatible hardware.  I could even make a pretty compelling case that neither of these two layer 2 protocols would make a bunch of sense for a service provider.  At the end of the day, though, I’m pretty sure that we’ll eventually be speaking about SPB in the same hushed nostalgia we reserve for the losers of the format wars so many years ago.

Here are a few posts about TRILL and SPB that generated some ideas for me.  You should check them out too:

Does TRILL Stand A Chance At Wide Adoption – Ethan Banks

Why SPB Doesn’t Get Any Attention – Greg Ferro

TRILL and 802.1aq (SPB) Are Like Apples and Oranges – Ivan Pepelnjak

NANOG 50 TRILL vs. SPB Great Debate – PDF of a huge discussion presentation

Brocade Tech Day – Data Centers Made Simple

When I was just a wee lad, my mom decided one year that she’d had enough of the mass produced Halloween costumes available at the local department store.  I wanted to be a ninja (surprise, I know) and she was determined to make me the best ninja costume around.  My mother knows how to sew, but she isn’t a seamstress by any stretch of the imagination.  What she did do was go to the fabric store and pick up a package of Simplicity patterns.  These are wonderful.  They enable those of us without the gift of textile assembly to create something from fabric that can be astounding.  Simplicity takes all the guesswork out of making a costume by giving you easy-to-follow directions.  You don’t have to think about the process beyond a few cuts and some stitches.  Instead, you can think about the applications of the final product, from ninja to Jedi to superhero.

You may be asking yourself…what does this have to do with Brocade and networking?  At Brocade Tech Day, other analysts and I sat down to hear about the new story from Brocade in the data center.  At the heart of the message was the word “simplicity”.  Simplicity Through Innovation.  The need to radically simplify things in order to achieve the scale and efficiency we need to create huge data centers.  And at the center of it all is Brocade’s VCS Ethernet fabric.  I got a chance to kick the tires on VCS back at Network Field Day 2, but the announcements at Brocade Tech Day were a bit more ambitious.  That’s because the face of VCS is now the Brocade VDX 8770.  This switch is a monster.  It has the capability of learning up to 384,000 MAC addresses in those little CAM tables.  It has the capacity for 384 10GigE and 96 40GigE ports, as well as expandability to 100GigE.  Though I’m a bit unsure of how they arrived at the numbers, they claim it can support up to 320,000 virtual machines on those ports.  Ivan Pepelnjak did a great breakdown of the capabilities of the switch on launch day.  I’m especially keen on the idea that you can create a four-way virtual gateway that shares the same IP and MAC address.  This overcomes the limitations of HSRP/VRRP, as well as some of the quirkiness of GLBP.  That shows that Brocade is at least thinking beyond layer 2, unlike a lot of data center vendors that believe the world is flat (networking-wise).  After speaking with Lisa Caywood (@TheRealLisaC), I found that this huge iron switch is being used by customers not at the core of the network but instead at the edge of the data center, where all those hungry hypervisors and servers live.  All the numbers that I’m seeing from the VDX 8770 point to it as a way to aggregate a huge amount of packets coming from a data center server farm and feed it through the rest of the network via VCS.  That makes total sense when coupled with some of Brocade’s prognostications, such as 80% of server traffic becoming east-west (server-to-server) in the next year or so.

Brocade also had some other interesting pieces on display.  One of them was a new capability for the ADX application delivery controller, or load balancer as 90% of the rest of the world calls it.  The ADX is slated to begin using Field Programmable Gate Arrays (FPGAs) to terminate VXLAN tunnels before they head into the VCS fabric.  I find it very interesting that they chose FPGAs to do this, having seen something similar from Arista just a few months ago.  I also get to chuckle a little bit to myself when one of the cornerstones of the software defined networking (SDN) world is terminated in a hardware construct.  I suppose it brings a bit of order back to my world.  Another interesting thing that came up during the presentations is that Brocade is making all their own silicon in the VDX 8770 and moving forward.  In a day and age where moving to merchant silicon seems to be the flavor of the month, I’m curious as to where Brocade is headed with this.  Obviously, the ability to create your own chips gives you an advantage over other competitors when it comes to designing the chip the way you want it to function, such as putting 38MB of TCAM on it or producing something with a ridiculously low port-to-port latency.  However, the agility afforded from merchant silicon gives other vendors the ability to innovate in the software arena.  That, I believe, is where the battleground is really going to be in the coming months.  On the one side, you’ll have vendors invested in custom silicon that will be doing amazing things with hardware.  On the other side, you’ll have the merchant silicon vendors that are all using very similar reference designs but are really concentrating on innovation in software.  It’s an exciting time to be in networking for sure.

Brocade Tech Day Disclaimer

I was invited to attend Brocade Tech Day by Brocade.  They paid for my airfare and lodging.  I also attended an executive dinner that evening that was provided by Brocade.  At no time during this process was any requirement made of me in regards to posting information about Brocade Tech Day.  Brocade did not ask for nor were they promised any consideration in this post.  The conclusions and analysis herein are mine and mine alone.

SDN and the IT Toolbox

There’s been a *lot* of talk about software-defined networking (SDN) being the next great thing to change networking. Article after article has come out recently claiming that things like the VMware acquisition of Nicira are going to put network engineers out of work. To anyone who’s been around networking for a while, this isn’t much different from the talk that has surrounded any one of a number of technologies over the last decade.

I’m an IT nerd. I can work on a computer with my eyes closed. However, not everything in my house is a computer. Sometimes I have to work on other things, like hanging mini-blinds or fixing a closet door. For those cases, I have to rely on my toolbox. I’ve been building it up over the years to include all the things one might need to do odd jobs around the house. I have a hammer and a big set of screwdrivers. I have sockets and wrenches. I have drills and tape measures. The funny thing about these tools is the “new tool mentality”. Every time I get a new tool, I think of all the new things I can do with it. When I first got my power drill, I was drilling holes in everything. I hung blinds with ease. I moved door knobs. I looked for anything and everything I could find to use my drill for. The problem with that mentality is that after a while, you find that your new tool can’t be used for every little job. I can’t drive a nail with a drill. I can’t measure a board with a drill. In fact, besides drilling holes and driving screws, drills aren’t good for a whole lot of work. With experience, you learn that a drill is a great tool for a narrow range of uses.

This same type of “new tool mentality” is pervasive in IT as well. Once we develop a new tool for a purpose, we tend to use that tool to try to solve almost every problem. In my time in IT, I have seen protocols pressed into service to solve every imaginable problem. Remember ATM? How about LANE? If we can make everything ATM, we can solve every problem. How about QoS? I was told at the beginning of my networking career that QoS is the answer to every problem. You just have to know how to ask the right question. Even MPLS fell into that category at one point. MPLS-ing the entire world just makes it run better, right? Much like my drill analogy above, once the “newness” wore off of these protocols and solutions, we found out that they are really well suited to a much narrower purpose. MPLS and QoS tend to be used for the things they are very good at doing, and maybe for a few corner cases outside of that focus. That’s why we still need to rely on many other protocols and technologies to have a complete toolbox.

SDN has had the “new tool mentality” for the past few months. There’s no denying at this point that it’s a disruptive technology and ripe to change the way that people like me look at networking. However, to say that it will eventually become the de facto standard for everything out there and the only way to accomplish networking in the next three years may be stretching things just a bit. I’m pretty sure that SDN is going to have a big impact on my work as an integrator. I know that many of the higher education institutions that I talk to regularly are not only looking at it, but in the case of things like Internet2, they’re required to have support for SDN (the OpenFlow flavor) in order to continue forward with high speed connections. I’ve purposely avoided launching myself into the SDN fray for the time being because I want to be sure I know what I’m talking about. There are quite a few people out there talking about SDN. Some know what they’re talking about. Others see it as a way to jump into the discussion with a loud voice just to be heard. The latter are usually the ones talking about SDN as a destructive force that will cause us all to be flipping burgers in two years. Rather than giving credence to their outlook on things, I would say to wait a bit. The new shininess of SDN will eventually give way to a more realistic way of looking at its application in the networking world.  Then, it will be the best tool for the jobs that it’s suited for.  Of course, by then we’ll have some other new tool to proclaim as the end-all, be-all of networking, but that’s just the way things are.

Cisco Data Center – Network Field Day 3

Day two of Network Field Day 3 brought us to Tasman Drive in San Jose – the home of a little networking company named Cisco.  I don’t know if you’ve heard of them or not, but they make a couple of things I use regularly.  We had a double session of four hours at the Cisco Cloud Innovation Center (CCIC) covering a lot of different topics.  For the sake of clarity, I’m going to split the coverage into two posts along product lines.  The first will deal with the Cisco Data Center team and their work on emerging standards.

Han Yang, Nexus 1000v Product Manager, started us off with a discussion centered around VXLAN.  VXLAN is an emerging solution to “the problem” (drawing by Tony Bourke):

The Problem - courtesy of Tony Bourke

The specific issue we’re addressing with VXLAN is “lots of VLANs”.  As it turns out, when you try to create multitenant clouds for large customers, you tend to run out of VLANs pretty quickly.  Seems 4096 VLANs ranks right up there with 640k of conventional memory on the “Seemed Like A Good Idea At The Time” scale of computer miscalculations.  VXLAN seeks to remedy this issue by wrapping the original frame in a VXLAN header that carries an additional 24-bit segment identifier on top of the original 802.1q tag.
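
To make that encapsulation a bit more concrete, here’s a minimal Python sketch of the 8-byte VXLAN header itself, with its 24-bit VXLAN Network Identifier (VNI).  The outer Ethernet, IP, and UDP headers that actually carry it across the network are left out, so treat this as an illustration of the layout rather than a full encapsulation.

import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: 8 flag bits, 24 reserved bits,
    a 24-bit VNI, and a final 8 reserved bits."""
    assert 0 <= vni < 2 ** 24, "the VNI is only 24 bits wide"
    flags = 0x08                    # the 'I' bit: a valid VNI is present
    word1 = flags << 24             # flags plus 24 reserved bits
    word2 = vni << 8                # 24-bit VNI plus 8 reserved bits
    return struct.pack("!II", word1, word2)

inner_frame = b"\xde\xad\xbe\xef"   # stand-in for the original Ethernet frame
encapsulated = vxlan_header(vni=5000) + inner_frame
print(encapsulated.hex())           # 0800000000138800deadbeef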

VXLAN allows the packet to be encapsulated by the vSwitch (in this case a Nexus 1000v) and tunneled over the network before arriving at the proper destination, where the VXLAN header is stripped off, leaving the original frame and its tag underneath.  The hypervisor isn’t aware of VXLAN at all.  It merely serves as an overlay.  VXLAN does require multicast to be enabled in your network, but for your PIM troubles you get an additional 16 million subdivisions of your network structure.  That means you shouldn’t run out of VLANs any time soon.
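
For the curious, here’s a toy Python sketch of how a VXLAN endpoint might tie each segment to a multicast group for broadcast, unknown unicast, and multicast (BUM) traffic, which is where that PIM requirement comes from.  The group pool and the hashing scheme are made up for illustration; real deployments assign groups by policy.

import ipaddress

# A made-up pool of admin-scoped multicast groups, for illustration only.
GROUP_POOL = ipaddress.ip_network("239.1.0.0/16")

def bum_group_for_vni(vni: int) -> ipaddress.IPv4Address:
    """Hash a 24-bit segment ID into the pool to pick the multicast group
    used for that segment's BUM traffic."""
    return GROUP_POOL[vni % GROUP_POOL.num_addresses]

print(f"{2 ** 24:,} VXLAN segments versus {2 ** 12:,} VLANs")
print(bum_group_for_vni(5000))      # 239.1.19.136 with this made-up pool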

Han gave us a great overview of VXLAN and how it’s going to be used a bit more extensively in the data center in the coming months as we begin to scale out and break through our VLAN limitations in large clouds.  Here’s hoping that VXLAN really begins to take off and becomes the de facto standard ahead of NVGRE.  Because I still haven’t forgiven Microsoft for Teredo.  I’m not about to give them a chance to screw up the cloud too.

Up next was Victor Moreno, a technical lead in the Data Center Business Unit.  Victor has been a guest on Packet Pushers before, on show 54, talking about the Locator/ID Separation Protocol (LISP).  Victor talked to us about LISP as well as the difficulties in creating large-scale data centers.  One key point of Victor’s talk was about moving servers (or workloads, as he called them).  Victor pointed out that dragging all of the LAN extensions like STP and VTP across sites was totally unnecessary.  The most important part of the move is preservation of IP reachability.  In the video above, this elicited some applause from the delegates because it’s nice to see that people are starting to realize that extending the layer 2 domain everywhere might not be the best way to do things.
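
A toy example makes the IP reachability point clearer.  In LISP terms, the workload keeps its endpoint identifier (EID) when it moves; only the routing locator (RLOC) of the site it sits behind changes in the mapping system.  The Python sketch below is my own simplification, and the addresses are just documentation examples.

# A toy LISP-style mapping: the workload's own IP (the EID) never changes,
# only the locator (RLOC) of the data center edge it currently lives behind.
eid_to_rloc = {
    "10.1.1.50": "192.0.2.1",       # workload -> data center A's edge router
}

def move_workload(eid: str, new_rloc: str) -> None:
    """Re-register the workload's EID against the new site's RLOC.
    Nothing about the workload's own IP address has to change."""
    eid_to_rloc[eid] = new_rloc

move_workload("10.1.1.50", "198.51.100.1")   # vMotion to data center B
print(eid_to_rloc["10.1.1.50"])              # traffic now gets tunneled to B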

Another key point that I took from Victor was about VXLAN headers and LISP headers and even Overlay Transport Virtualization (OTV) headers.  It seems they all have the same 24-bit ID field in the wrapper.  Considering that Cisco is championing OTV and LISP and was an author on the VXLAN draft, this isn’t all that astonishing.  What really caught me was the idea that Victor proposed wherein LISP was used to implement many of the features in VXLAN so that the two protocols could be very interoperable.  This also eliminates the need to continually reinvent the wheel every time a new protocol is needed for VM mobility or long-distance workload migration.  Pay close attention to a slide about 22:50 into the video above.  Victor’s Inter-DC and Intra-DC slide detailing which protocol works best in a given scenario at a specific layer is something that needs to be committed to memory for anyone that wants to be involved in data center networking any time in the next few years.
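
The practical upshot of that shared 24-bit field is that a segment identifier never has to be renumbered or truncated when you gateway between these encapsulations; they all top out at the same number of segments.  A trivial check makes the point (VXLAN calls the field the VNI and LISP calls it the Instance ID; I’m going from the presentation on the rest).

MAX_SEGMENTS = 2 ** 24              # 16,777,216 for VXLAN, LISP, and OTV alike

def fits_all_three(segment_id: int) -> bool:
    """Any ID that fits one of the 24-bit overlay fields fits the others,
    so mapping between the encapsulations needs no renumbering."""
    return 0 <= segment_id < MAX_SEGMENTS

print(fits_all_three(5000))         # True
print(fits_all_three(2 ** 24))      # False - one bit too wide for any of them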

If you’d like to learn more about Cisco’s data center offerings, you can head over to the data center page on Cisco’s website at http://www.cisco.com/en/US/netsol/ns340/ns394/ns224/index.html.  You can also get data center specific information on Twitter by following the Cisco Data Center account, @CiscoDC.

Tom’s Take

I’m happy that Cisco was able to present on a lot of the software and protocols that are going into building the new generation of data center networking.  I keep hearing things like VXLAN, OTV, and LISP being thrown around when discussing how we’re going to address many of the challenges presented to us by the hypervisor crowd.  Cisco seems to be making strides in not only solving these issues but putting the technology at the forefront so that everyone can benefit from it.  That’s not to say that their solutions are going to end up being the de facto standard.  Instead, we can use the collective wisdom behind things like VXLAN to help us drive toward acceptable methods of powering data center networks for tomorrow.  I may not have spent a lot of my time in the data center during my formal networking days, but I have a funny feeling I’m going to be there a lot more in the coming months.

Tech Field Day Disclaimer

Cisco Data Center was a sponsor of Network Field Day 3.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 3. In addition, they provided me a USB drive containing marketing collateral and copies of the presentation, as well as a pirate eyepatch and fake pirate pistol (long story).  They did not ask for, nor were they promised, any kind of consideration in the writing of this review/analysis.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.