DockerCon Thoughts – Secure, Sufficient Applications


I got to spend a couple of days this week at DockerCon and learn a bit more about software containers. I’d always assumed that containers were a slightly different form of virtualization, but thankfully I’ve learned my lesson there. What I did find out about containers gives me a bit of hope about the future of applications and security.

Minimum Viable App

One of the things that made me excited about Docker is that the process isolation idea behind building a container to do one thing has fascinating ramifications for application developers. In the past, we’ve spent our time building servers to do things. We build hardware, boot it with an operating system, and then we install the applications or the components thereof. When we started to virtualize hardware into VMs, the natural progression was to take the hardware resource and turn it into a VM. Thanks to tools that would migrate a physical resource to a virtual one in a single step, most of the first generation VMs were just physical copies of servers. Right down to phantom drivers in the Windows Device Manager.

As we started building infrastructure around the idea of virtualization, we stopped migrating physical boxes and started building completely virtual systems from the ground up. That meant using things like deployment templates, linked clones, and other constructs that couldn’t be done in hardware alone. As time has rolled on, we have developed methods of quickly deploying virtual resources that we could never have managed with purely physical devices. We finally figured out how to use virtual platforms efficiently.

Containers are now at the crossroads we saw early on in virtualization. As explained by Mike Coleman (@MikeGColeman), many application developers are starting their container journey by taking an existing app and importing it directly into a container. It’s a bit more involved than the preferred method, but Mike mentioned that even running the entire resource pool in a container does have some advantages. I’m sure the Docker people see container adoption as the first step toward increased market share. Even if it’s a bit clumsy at the start.

The idea then moves toward the breakdown of containers into the necessary pieces, much as it did with virtual machines years ago. Instead of being forced to think about software as a monolithic construct that has to live on a minimum of one operating system, developers can break the system down into application pieces that can execute one program or thread at a time on a container. Applications can be built using the minimum amount of software constructs needed for an individual process. That means that those processes can be spread out and scaled up or down as needed to accomplish goals.

If your database query function is running as a containerized process instead of running on a query platform in a VM then scaling that query to thousands or tens of thousands of instances only requires spinning up new containers instead of new VMs. Likewise, scaling a web-based app to accommodate new users can be accomplished with an explosion of new containers to meet the need. And when the demand dies back down again, the containers can be destroyed and resources returned to the available pool or turned off to save costs.
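As a concrete sketch of that elasticity, here is what the scale-out/scale-in loop might look like using the Docker SDK for Python (the docker package). The image name, label, and counts are placeholders of my own, not anything from the conference.

```python
import docker

client = docker.from_env()

def scale_workers(image, count, label="role=query-worker"):
    """Converge on `count` running copies of a containerized process."""
    key, value = label.split("=")
    running = client.containers.list(filters={"label": label})
    # Demand spike: spin up new containers instead of new VMs.
    for _ in range(len(running), count):
        client.containers.run(image, detach=True, labels={key: value})
    # Demand dies back down: destroy the surplus and return the resources.
    for container in running[count:]:
        container.stop()
        container.remove()

scale_workers("example/db-query:latest", 50)  # scale out for the rush
scale_workers("example/db-query:latest", 5)   # scale back in afterward
```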

Segment Isolation

The other exciting thing I saw with containers was the opportunity for security. The new buzzword of the day in security and networking is microsegmentation. VMware is selling it heavily with NSX. Cisco has countered with a similar function in ACI. At the heart of things, microsegmentation is simply ensuring that processes that shouldn’t be talking to each other won’t be talking to each other. This prevents exposure from, for instance, having your app database server visible on the public Internet.

Microsegmentation is great in overlay and network virtualization systems where we have to take steps to prevent systems from talking to each other. That means policies and safeguards in place to prevent communications. It’s a great way to work on existing networks where the default mode is to let everything on the same subnet talk to everything else. But what if the default were something different?

With containers, there is a sandbox environment for each container to talk to other containers in the same area. If you create a named container network and attach a container to it, that container gains a network interface on that particular named network. It won’t be able to talk to other containers on different networks without creating an explicit connection between the two networks. That means that the default mode of communications for the containers is restricted out of the box.
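Here is a small sketch of that default isolation using the Docker SDK for Python; the network and container names are just illustrations.

```python
import docker

client = docker.from_env()

# Each named container network is its own sandbox.
frontend = client.networks.create("frontend", driver="bridge")
backend = client.networks.create("backend", driver="bridge")

# web gets an interface on "frontend" only; db on "backend" only.
web = client.containers.run("nginx", detach=True, name="web",
                            network="frontend")
db = client.containers.run("postgres", detach=True, name="db",
                           network="backend",
                           environment={"POSTGRES_PASSWORD": "example"})

# Out of the box, web has no path to db at all. Traffic flows only after
# an explicit connection -- attaching web to the backend network as well.
backend.connect(web)
```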

Imagine how nice it will be to create a network that isn’t insecure by default. Rather than having to disconnect all the things that shouldn’t be speaking, you can spend your time building connections between the things that should be. That means a little mistake or forgotten connection will prevent communications instead of opening them up. That means much less likelihood that you’re going to cause an incident.

There are still some issues with scaling the networking aspect of Docker right now. The key/value store doesn’t provide a lot of visibility and definitely won’t scale up to tens or hundreds of thousands of connections. My hope is that down the road Docker will implement a more visible solution that can perform drag-and-drop connectivity between containers and leave an audit trail so networking pros can figure out who connected what and how that exposed everything. Requiring connections between devices to be explicit also makes it much easier to prove intent or malice after the fact. But those features are likely to come down the road as Docker builds a bigger, better management platform.


Tom’s Take

I think Docker is doing things right. By making developers look at the core pieces they need to build apps and justify why things are being done the way they’ve always been done, containers are allowing for flexibility and new choices to be made. At the same time, those choices are inherently more secure because resources are only shared when necessary. It’s the natural outgrowth of sandboxing and Jails in the OS from so many years ago. Docker has a chance to make application developers better without making them carry the baggage of years of thinking along with them to a new technology.

Should Microsoft Buy Big Switch?


Network virtualization is getting more press than ever.  The current trend seems to be pitting the traditional networking companies, like Cisco and Juniper, against the upstarts in the server virtualization world, like VMware and the OpenStack crowd.  To hear the press and analysts talk about it makes one think that these companies represent all there is in the industry.

Whither Microsoft?

One company that seems to have been left out of the conversation is Microsoft.  The stalwarts of Redmond have been turning heads with their rapid pace of innovation to reach parity with VMware’s offerings.  However, when the conversation turns to networking, Microsoft is usually left out in the cold.  That’s because their efforts at networking in the past have been…problematic.  They are very service oriented and care little for the world outside their comfortable servers.  That won’t last forever.  VMware will be able to easily shift the conversation away from feature parity with Hyper-V and concentrate on all the networking expertise it now has that is missing in the competitor.

Microsoft can fix that problem with a small investment.  If you can’t innovate by building it, you need to buy it.  Microsoft has the cash to buy several startups, even after sinking a load of it into Nokia.  But which SDN-focused company makes the most sense for Microsoft?  I spent a lot of time thinking about this very question and the answer became clear to me:  Microsoft needs to buy Big Switch Networks.

A Window On The Future

Microsoft needs SDN expertise.  They have no current networking experience outside of creating DHCP and DNS services on their platforms.  I mean, did anyone ever use their Network Access Protection solution as a NAC option?  Microsoft has traditionally created bare-bones network constructs to please their server customers.  They think networking is a resource outside their domain, which coincidentally is just how their competitors used to look at it as well.  At least until Martin Casado changed their minds.

Big Switch is a perfect fit for Microsoft.  They have the chops to talk OpenFlow.  Their recent shift away from overlays to software on bare metal would play well as a marketing point against VMware and their “overlays are the best way” message.  They could also help Microsoft do more development on NV-GRE, the also-ran to VxLAN.  Ivan Pepelnjak (@IOSHints) was pretty impressed with NV-GRE last December, but it’s dropped off the radar in the wake of VMware embracing VxLAN in NSX.  I think having a bit more development work from the minds at Big Switch would put it back into the minds of some smaller network virtualization companies looking to support something other than the de facto standard.  I know that Big Switch has moved away from the overlay model, but if NV-GRE can easily be adapted to the work Big Switch was doing a few months ago, it would be a great additional offering to the idea of running everything in an SDN-enabled switch OS.

Microsoft will also benefit from the pile of SDN applications that Big Switch is rumored to have sitting around, festering for lack of attention.  Applications like network taps sell Big Switch products now.  With NSX introducing the ideas of integrated load balancers and firewalls into the base product, Big Switch is going to be hard pressed to charge extra for them.  Instead, they’re going to have to go out on a limb and finish developing them past the alpha stage and hope that they are enough to sell more product and recoup the development costs.  With the deep pockets in Redmond, finishing those applications would be a drop in the bucket if it means that the new product can compete directly on an even field with VMware.

Building A Bigger Switch

Big Switch gains in this partnership as well.  They get to take some pressure off their overworked development team.  It can’t be easy switching horses in mid-stream, especially when it involves changing your entire outlook on how SDN should be done.  Adding a few dozen more people to the project would allow them to branch out and investigate how integrating software into their ideas could be done.  Big Switch has already done a great job developing Project Floodlight.  Why not let some big brains chew on other ideas in the same vein for a while?

Big Switch could also use the stability of working for an established company.  They have a pretty big target on their backs now that everyone is developing an SDN strategy.  Writing an OS for bare metal switches is going to bring them into contention with Cumulus Networks.  Why not let an OS vendor do some of the heavy lifting?  It would also allow Microsoft’s well-established partner program to offer incentives to partners that want to sell white label switches with software from Big Switch to get into networking much more cheaply than before.  Think about the federal or educational discounts that Microsoft already gives to customers.  Do you think they’d be excited to see the same kind of consideration when it comes to networking hardware?

Tom’s Take

Little fish either get eaten by bigger ones or they have to be agile enough to avoid being snapped up.  The smartest little fish in the ocean may be the remora.  It survives by attaching itself to a bigger fish and providing a benefit for them both.  The remora gets the protection of not being eaten while also not taking too much from the host.  Microsoft would do well to set up some kind of similar arrangement with Big Switch.  They could fund future development into NV-GRE compatible options, or they could just buy the company outright.  Both parties get something out of the deal: Microsoft gets the SDN component they need.  Big Switch gets a backer with so much industry clout that they can no longer be dismissed.

Disruption in the New World of Networking

This is one of the most exciting times to be working in networking. New technologies and fresh takes on existing problems are keeping everyone on their toes when it comes to learning new protocols and integration systems. VMworld 2013 served both as an announcement of VMware’s formal entry into the larger networking world as well as putting existing network vendors on notice. What follows is my take on some of these announcements. I’m sure that some aren’t going to like what I say. I’m even more sure a few will debate my points vehemently. All I ask is that you consider my position as we go forward.

Captain Over, Captain Under

VMware, through their Nicira acquisition and development, is now *the* vendor to go to when you want to build an overlay network. Their technology augments existing deployments to provide software features such as load balancing and policy deployment. In order to do this and ensure that these features are utilized, VMware uses VxLAN tunnels between the devices. VMware calls these constructs “virtual wires”. I’m going to call them vWires, since they’ll likely be called that soon anyway. vWires are deployed between hosts to provide a pathway for communications. Think of it like a GRE tunnel or a VPN tunnel between the hosts. This means the traffic rides on the existing physical network but that network has no real visibility into the payload of the transit packets.

Nicira’s brainchild, NSX, has the ability to function as a layer 2 switch and a layer 3 router as well as a load balancer and a firewall. VMware is integrating many existing technologies with NSX to provide consistency when provisioning and deploying a new software-based network. For those devices that can’t be virtualized, VMware is working with HP, Brocade, and Arista to provide NSX agents that can decapsulate the traffic and send it to a physical endpoint that can’t participate in NSX (yet). As of the launch during the keynote, most major networking vendors are participating with NSX. There’s one major exception, but I’ll get to that in a minute.

NSX is a good product. VMware wouldn’t have released it otherwise. It is the vSwitch we’ve needed for a very long time. It also extends the ability of the virtualization/server admin to provision resources quickly. That’s where I’m having my issue with the messaging around NSX. During the second day keynote, the CTOs on stage said that the biggest impediment to application deployment is waiting on the network to be configured. Note that is my paraphrasing of what I took their intent to be. In order to work around the lag in network provisioning, VMware has decided to build a VxLAN/GRE/STT tunnel between the endpoints and eliminate the network admin as a source of delay. NSX turns your network into a fabric for the endpoints connected to it.

Under the Bridge

I also have some issues with NSX and the way it’s supposed to work on existing networks. Network engineers have spent countless hours optimizing paths and reducing delay and jitter to provide applications and servers with the best possible network. Now, that all doesn’t matter. vAdmins just have to click a couple of times and build their vWire to the other server and all that work on the network is for naught. The underlay network exists to provide VxLAN transport. NSX assumes that everything working beneath is running optimally. No loops, no blocked links. NSX doesn’t even participate in spanning tree. Why should it? After all, that vWire ensures that all the traffic ends up in the right location, right? People would never bridge the networking cards on a host server. Like building a VPN server, for instance. All of the things that network admins and engineers think about in regards to keeping the network from blowing up due to excess traffic are handwaved away in the presentations I’ve seen.

The reference architecture for NSX looks pretty. Prettier than any real network I’ve ever seen. I’m afraid that suboptimal networks are going to impact application and server performance now more than ever. And instead of the network using mechanisms like QoS to battle issues, those packets are now invisible bulk traffic. When network folks have no visibility into the content of the network, they can’t help when performance suffers. Who do you think is going to get blamed when that goes on? Right now, it’s the network’s fault when things don’t run right. Do you think that moving the onus for server network provisioning to NSX and vCenter is going to forgive the network people when things go south? Or are the underlay engineers going to take the brunt of the yelling because they are the only ones that still understand the black magic outside the GUI drag-and-drop to create vWires?

NSX is for service enablement. It allows people to build network components without knowing the CLI. It also means that network admins are going to have to work twice as hard to build resilient networks that work at high speed. I’m hoping that means that TRILL-based fabrics are going to take off. Why use spanning tree now? Your application and service network sure isn’t. No sense adding any more bells and whistles to your switches. It’s better to just tie them into spine-and-leaf Clos fabrics and be done with it. It now becomes much more important to concentrate on the user experience. Or maybe the wireless network. As long as at least one link exists between your ESX box and the edge switch, let the new software networking guys worry about it.

The Recumbent Incumbent?

Cisco is the only major networking manufacturer not publicly on board with NSX right now. Their CTO Padmasree Warrior has released a response to NSX that talks about lock-in and vertical integration. Still others have released responses to that response. There’s a lot of talk right now about the war brewing between Cisco and VMware and what that means for VCE. One thing is for sure – the landscape has changed. I’m not sure how this is going to fall out on both sides. Cisco isn’t likely to stop selling switches any time soon. NSX still works just fine with Cisco as an underlay. VCE is still going to make a whole bunch of money selling vBlocks in the next few months. Where this becomes a friction point is in the future.

Cisco has been building APIs into their software for the last year. They want to be able to use those APIs to directly program the network through devices like the forthcoming OpenDaylight controller. Will they allow NSX to program them as well? I’m sure they would – if VMware wrote those instructions into NSX. Will VMware demand that Cisco use the NSX-approved APIs and agents to expose network functionality to their software network? They could. Will Cisco scrap OnePK to implement NSX? I doubt that very much. We’re left with a standoff. Cisco wants VMware to use their tools to program Cisco networks. VMware wants Cisco to use the same tools as everyone else and make the network a commodity compared to the way it is now.

Let’s think about that last part for a moment. Aside from some speed differences, networks are largely going to look identical to NSX. It won’t care if you’re running HP, Brocade, or Cisco. Transport is transport. Someone down the road may build some proprietary features into their hardware to make NSX run better, but that day is far off. What if a manufacturer builds a switch that is twice as fast as the nearest competition? Three times? Ten times? At what point does the underlay become so important that the overlay starts preferring it exclusively?


Tom’s Take

I said a lot during the Tuesday keynote at VMworld. Some of it was rather snarky. I asked about full BGP tables and vMotioning the machines onto the new NSX network. I asked because I tend to obsess over details. Forgotten details have broken more of my networks than grand design disasters. We tend to fuss over the big things. We make more out of someone that can drive a golf ball hundreds of yards than we do about the one that can consistently sink a ten foot putt. I know that a lot of folks were pre-briefed on NSX. I wasn’t, so I’m playing catch up right now. I need to see it work in production to understand what value it brings to me. One thing is for sure – VMware needs to change the messaging around NSX to be less antagonistic towards network folks. Bring us into your solution. Let us use our years of experience to help rather than making us seem like pariahs responsible for all your application woes. Let us help you help everyone.

Plexxi and the Case for Affinity


Our last presentation from Day 2 of Network Field Day 5 came from a relatively new company – Plexxi.  I hadn’t really heard much from them before they signed up to present at NFD5.  All I really knew was that they had been attracting some very high profile talent from the Juniper ranks.  First Dan Backman (@jonahsfo) shortly after NFD4, then Mike Bushong (@mbushong) earlier this year.  One might speculate that when the talent is headed in a certain direction, it might be best to see what’s going on over there.  If only I had known the whole story up front.

Mat Mathews kicked off the presentation with a discussion about Plexxi and what they are doing to differentiate themselves in the SDN space.  It didn’t take long before their first surprise was revealed.  Our old buddy Derick Winkworth (@cloudtoad) emerged from his pond to tell us the story of why he moved from Juniper to Plexxi just the week before.  He’d kept the news of his destination pretty quiet.  I should have guessed he would end up at a cutting edge SDN-focused company like Plexxi.  It’s good to see smart people landing in places that not only make them excited but give them the opportunity to affect lots of change in the emerging market of programmable networking.

Marten Terpstra jumped in next to talk about the gory details of what Plexxi is doing.  In a nutshell, this all boils down to affinity.  Based on a study done by Microsoft in 2009, Plexxi noticed that there are a lot of relationships between applications running in a data center.  Once you’ve identified these relationships, you can start doing things with them.  You can create policies that provide for consistent communications between applications.  You can isolate applications from one another.  You can even ensure which applications get preferential treatment during a network argument.  Now do you see the SDN applications?  Plexxi took the approach that there is more data to be gathered by the applications in the network.  When they looked for it, sure enough it was there.  Now, armed with more information, they could start crafting a response.  What they came up with was the Plexxi Switch.  This is a pretty standard 32-port 10GigE switch with 4 QSFP ports.  Their differentiator is the 40GigE uplinks to the other Plexxi Switches.  Those are used to create a physical ring topology that allows the whole conglomeration to work together to create what looked to me like a virtual mesh network.  Once connected in such a manner, the affinities between the applications running at the edges of the network can now begin to be built.

Plexxi has a controller that sits above the bits and bytes and starts constructing the policy-based affinities to allow traffic to go where it needs to go.  It can also set things up so that things don’t go where they’re not supposed to be, as in the example Simon McCormack gives in the above video.  Even if the machine is moved to a different host in the network via vMotion or Live Migration, the Plexxi controller and network are smart enough to figure out that those hosts went somewhere different and that the policy providing for an isolated forwarding path needs to be reimplemented.  That’s one of the nice things about programmatic networking.  The higher-order networking controllers and functions figure out what needs to change in the network and implement the changes either automatically or with a minimum of human effort.  This ensures that the servers don’t come in and muck up the works with things like Distributed Resource Scheduler (DRS) moves or other unforeseen disasters.  Think about the number of times you’ve seen a VM with an anti-affinity rule that keeps it from being evacuated from a host because there is some sort of dedicated link for compliance or security reasons.  With Plexxi, that can all be done automagically.  Derick even showed off some interesting possibilities around using Python to extend the capabilities of the CLI at the end of the video.
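Plexxi’s actual Python hooks weren’t documented in the demo, so treat the following as a purely hypothetical sketch (the module, class, and method names are all invented). It shows the kind of scripting an extensible CLI makes possible:

```python
# Hypothetical sketch only: Plexxi's real Python API wasn't shown in detail,
# so every name below is invented to illustrate the shape of the idea.
from plexxi import Controller  # hypothetical module

ctrl = Controller("plexxi-controller.example.com")

# Report any affinity whose isolated forwarding path was recomputed in the
# last hour, say after a vMotion moved one of its endpoint VMs.
for affinity in ctrl.affinities(changed_since="1h"):
    print(affinity.name, affinity.endpoints, affinity.current_path)
```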

If you’d like to learn more about Plexxi, you can check them out at http://www.plexxi.com.  You can also follow them on Twitter as @PlexxiInc.


Tom’s Take

Plexxi has a much different feel than many of the SDN products I’ve seen so far.  That’s probably because they aren’t trying to extend an existing infrastructure with programmability.  Instead, they’ve taken a singular focus around affinity and managed to turn it into something that looks to have some very fascinating applications in today’s data centers.  If you’re going to succeed in the SDN-centric world of today, you either need to be at the front of the race as it is being run today, like Cisco and Juniper, or you need to have a novel approach to the problem.  Plexxi really is looking at this whole thing from the top down.  As I mentioned to a few people afterwards, this feels like someone reimplemented QFabric with a significant amount of flow-based intelligence.  That has some implications for higher order handling that can’t be addressed by a simple fabric forwarding engine.  I will stay tuned to Plexxi down the road.  If nothing else, just for the crazy sock pictures.

Tech Field Day Disclaimer

Plexxi was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, they also gave the delegates a Nerf dart gun and provided us with after hours refreshments.  At no time did they ask for, nor were they promised, any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Additional Coverage of Plexxi and Network Field Day 5

Smart Optical Switching – Your Plexxible Friend – John Herbert

Plexxi Control – Anthony Burke

Cisco Borderless Idol


Day one of Network Field Day 5 (NFD5) included presentations from the Cisco Borderless team. You probably remember their “speed dating” approach at NFD4, which gave us a wealth of information in 15 minute snippets. The only drawback to that lineup is when you find a product or a technology that interests you, there really isn’t any time to quiz the presenter before they are ushered off stage. Someone must have listened when I said that before, because this time they brought us 20 minute segments – 10 minutes of presentation, 10 minutes of demo. With the switching team, we even got to vote on our favorite to bring them back for the next round (hence the title of the post). More on that in a bit.

6500 Quad Supervisor Redundancy

First up on the block was the Catalyst 6500 team. I swear this switch is the Clint Howard of networking, because I see it everywhere. The team wanted to tell us about a new feature available in the ((verify code release)) code on the Supervisor 2T (Sup2T). Previously, the supervisor was capable of performing a couple of very unique functions. The first of these was Stateful Switch Over (SSO). During SSO, the redundant supervisor in the chassis can pick up where the primary left off in the event of a failure. All of the traffic sessions can keep on trucking even if the active sup module is rebooting. This gives the switch a tremendous uptime, as well as allowing for things like hitless upgrades in production. The other existing feature of the Sup2T is Virtual Switching System (VSS). VSS allows two Sup2Ts to appear as one giant switch. This is helpful for applications where you don’t want to trust your traffic to just one chassis. VSS allows for two different chassis to terminate Multi-Chassis EtherChannel (MLAG) connections so that distribution layer switches don’t have a single point of failure. Traffic looks like it’s flowing to one switch when in actuality it may be flowing to one or the other. In the event that a Supervisor goes down, the other one can keep forwarding traffic.

Enter the Quad Sup SSO ability. Now, instead of having an RPR-only failover on the members of a VSS cluster, you can set up the redundant Sup2T modules to be ready and waiting in the event of a failure. This is great because you can lose up to three Sup2Ts at once and still keep forwarding while they reboot or get replaced. Granted, anything that can take out 3 Sup2Ts at once is probably going to take down the fourth (like power failure or power surge), but it’s still nice to know that you have a fair amount of redundancy now. This only works on the Sup2T, so you can’t get this if you are still running the older Sup720. You also need to make sure that your linecards support the newer Distributed Forwarding Card 3 (DFC3), which means you aren’t going to want to do this with anything less than a 6700-series line card. In fact, you really want to be using the 6800 series or better just to be on the safe side. As Josh O’Brien (@joshobrien77) commented, this is a great feature to have. But it should have been there already. I know that there are a lot of technical reasons why this wasn’t available earlier, and I’m sure the increased fabric speeds in the Sup2T, not to mention the increased capability of the DFC3, are the necessary components for the solution. Still, I think this is something that probably should have shipped in the Sup2T on the first day. I suppose that given the long road the Sup2T took to get to us that “better late than never” is applicable here.

UCS-E

Next up was the Cisco UCS-E series server for the ISR G2 platform. This was something that we saw at NFD4 as well. The demo was a bit different this time, but for the most part this is similar info to what we saw previously.


Catalyst 3850 Unified Access Switch

The Catalyst 3850 is Cisco’s new entry into the fixed-configuration switch arena. They are touting this as a “Unified Access” solution for clients. That’s because the 3850 is capable of terminating up to 50 access points (APs) per stack of four. This thing can basically function as a wiring closet wireless controller. That’s because it’s using the new IOS wireless controller functionality that’s also featured in the new 5760 controller. This gets away from the old Airespace-like CLI that was so prominent on the 2100, 2500, 4400, and 5500 series controllers. The 3850, which is based on the 3750X, also sports a new 480Gbps Stackwise connector, appropriately called Stackwise480. This means that a stack of 3850s can move some serious bits. All that power does come at a cost – Stackwise480 isn’t backwards compatible with the older Stackwise v1 and v2 from the 3750 line. This is only an issue if you are trying to deploy 3850s into existing 3750X stacks, because Cisco has announced the End of Sale (EOS) and End of Life (EOL) information for those older 3750s. I’m sure the idea is that when you go to rip them out, you’ll be more than happy to replace them with 3850s.

The 3850 wireless setup is a bit different from the old 3750 Access Controller that had a 4400 controller bolted on to it. The 3850 uses Cisco’s IOS-XE model of virtualizing IOS into a sort of VM state that can run on one core of a dual-core processor, leaving the second core available to do other things. Previously at NFD4, we’d seen the Catalyst 4500 team using that other processor core for doing inline Wireshark captures. Here, the 3850 team is using it to run the wireless controller. That’s a pretty awesome idea when you think about it. Since I no longer have to worry about IOS taking up all my processor and I know that I have another one to use, I can start thinking about some interesting ideas.

The 3850 does have a couple of drawbacks. Aside from the above Stackwise limitations, you have to terminate the APs on the 3850 stack itself. Unlike the CAPWAP connections that tunnel all the way back to the Airespace-style controllers, the 3850 needs to have the APs directly connected in order to decapsulate the tunnel. That does provide for some interesting QoS implications and applications, but it doesn’t provide much flexibility from a wiring standpoint. I think the primary use case is to have one 3850 switch (or stack) per wiring closet, which would be supported by the current 50 AP limitation. The other drawback is that the 3850 is currently limited to a stack of four switches, as opposed to the increased six switch limit on the 3750X. Aside from that, it’s a switch that you probably want to take a look at in your wiring closets now. You can buy it with an IP Base license today and then add on the AP licenses down the road as you want to bring them online. You can even use the 3850s to terminate CAPWAP connections and manage the APs from a central controller without adding the AP license.

Here is the deep dive video that covers a lot of what Cisco is trying to do from a unified wired and wireless access policy standpoint. Also, keep an eye out for the cute Unified Access video in the middle.

Private Data Center Mobility

I found it interesting that this demo was in the Borderless section and not the Data Center presentation. This presentation dives into the world of Overlay Transport Virtualization (OTV). Think of OTV like an extra layer of 802.1Q-in-Q tunneling with some IS-IS routing mixed in. OTV is Cisco’s answer to extending the layer 2 boundary between data centers to allow VMs to be moved to other sites without breaking their networking. Layer 2 everywhere isn’t the most optimal solution, but it’s the best thing we’ve got to work with the current state of VM networking (until Nicira figures out what they’re going to do).

We loved this session so much that we asked Mostafa to come back and talk about it more in depth.

The most exciting part of this deep dive to me was the introduction of LISP. To be honest, I haven’t really been able to wrap my head around LISP the first couple of times that I saw it. Now, thanks to the Borderless team and Omar Sultan (@omarsultan), I’m going to dig into a lot more in the coming months. I think there are some very interesting issues that LISP can solve, including my IPv6 Gordian Knot.


Tom’s Take

I have to say that I liked Cisco’s approach to the presentations this time.  Giving us discussion time along with a demo allowed us to understand things before we saw them in action.  The extra five minutes did help quite a bit, as it felt like the presenters weren’t as rushed this time.  The “Borderless Idol” style of voting for a presentation to get more info out of was brilliant.  We got to hear about something we wanted to go into depth about, and I even learned something that I plan on blogging about later down the line.  Sure, there was a bit of repetition in a couple of areas, most notably UCS-E, but I can understand how those product managers have invested time and effort into their wares and want to give them as much exposure as possible.  Borderless hits all over the spectrum, so keeping the discussion focused in a specific area can be difficult.  Overall, I would say that Cisco did a good job, even without Ryan Seacrest hosting.

Tech Field Day Disclaimer

Cisco was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, Cisco provided me with a breakfast and lunch at their offices.  They also provided a Moleskine notebook, a t-shirt, and a flashlight toy.  At no time did they ask for, nor were they promised, any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

VMware Partner Exchange 2013


Having been named a vExpert for 2012, I’ve been trying to find ways to get myself involved with the virtualization community. Besides joining my local VMware Users Group (VMUG), there wasn’t much success. That is, until the end of February. John Mark Troyer (@jtroyer), the godfather of the vExperts, put out a call for people interested in attending the VMware Partner Exchange in Las Vegas. This would be an all-expenses paid trip from a vendor. Besides going to a presentation and having a one-on-one engagement with them, there were no other restrictions about what could or couldn’t be said. I figured I might as well take the chance to join in the festivities. I threw my name into the hat and was lucky enough to get selected!

Most vendors have two distinctly different conferences throughout the year. One is focused on end-users and customers and usually carries much more technical content. For Cisco, this is Cisco Live. For VMware, this is VMworld. The other conference revolves around existing partners and resellers. Instead of going over the gory details of vMotions or EIGRP, it instead focuses on market strategies and feature sets. That is what VMware Partner Exchange (VMwarePEX) was all about for me. Rather than seeing CLI and step-by-step config guides to advanced features, I was treated to a lot of talk about differentiation and product placement. This fit right in with my new-ish role at my VAR that is focused toward architecture and less on post-sales technical work.

The sponsoring vendor for my trip was tried-and-true Hewlett Packard. Now, I know I’ve said some things about HP in the past that might not have been taken as glowing endorsements. Still, I wanted to look at what HP had to offer with an open mind. The Converged Application Systems (CAS) team specifically wanted to engage me, along with Damian Karlson (@sixfootdad), Brian Knudtson (@bknudtson), and Chris Wahl (@chriswahl) to observe and comment on what they had to offer. I had never heard of this group inside of HP, which we’ll get into a bit more here in a second.

My first real day at VMwarePEX was a day-long bootcamp from HP that served as an introduction to their product lines and how they place themselves in the market alongside Cisco, Dell, and IBM. I must admit that this was much more focused on sales and marketing than my usual presentation lineup. I found it tough to concentrate on certain pieces as we went along. I’m not knocking the presenters, as they did a great job of keeping the people in the room as focused as possible. The material was…a bit dry. I don’t think there was much that could have helped it. We covered servers, networking, storage, applications, and even management in the six hours we were in the session. I learned a lot about what HP had to offer. Based on my previous experiences, this was a very good thing. Once you feel like someone has missed on your expectations you tend to regard them with a wary eye. HP did a lot to fix my perception problem by showing they were a lot more than some wireless or switching product issues.

Definition: Software

I attended the VMwarePEX keynote on Tuesday to hear all about the “software defined datacenter.” To be honest, I’m really beginning to take umbrage with all this “software defined <something>” terminology being bandied about by every vendor under the sun. I think of it as the Web 2.0 hype of the 2010s. Since VMware doesn’t manufacture a single piece of hardware to my knowledge, of course their view is that software is the real differentiator in the data center. Their message no longer has anything to do with convincing people that cramming twenty servers into one box is a good idea. Instead, they now find themselves in a dog fight with Amazon, Citrix, and Microsoft on all fronts. They may have pioneered the idea of x86 virtualization, but the rest of the contenders are catching up fast (and surpassing them in some cases).

VMware has to spend a lot of their time now showing the vision for where they want to take their software suites. Note that I said “suite,” because VMware’s message at PEX was loud and clear – don’t just sell the hypervisor any more. VMware wants you to go out and sell the operations management and the vCloud suite instead. Gone are the days when someone could just buy a single license for ESX or download ESXi and put it on a lab system to begin a hypervisor build-out. Instead, we now see VMware pushing the whole package from soup to nuts. They want their user base to get comfortable using the ops management tools and various add-ons to the base hypervisor. While the trend may be to stay hypervisor agnostic for the most part, VMware and their competitors realize that if you feel cozy using one set of tools to run your environment, you’ll be more likely to keep going back to them as you expand.

Another piece that VMware is really driving home is the idea of the hybrid cloud. This makes sense when you consider that the biggest public cloud provider out there isn’t exactly VMware-friendly. Amazon has a huge marketshare among public cloud providers. They offer the ability to convert your VMware workloads to their format. But, there’s no easy way back. According to VMware’s top execs, “When a customer moves a workload to Amazon, they lose. And we lose them forever.” The first part of that statement may be a bit of a stretch, but the second is not. Once a customer moves their data and operations to Amazon, they have no real incentive to bring it back. That’s what VMware is trying to change. They have put out a model that allows a customer to build a private cloud inside their own datacenter and have all the features and functionality that they would have in Reston, VA or any other large data center. However, through the use of magic software, they can “cloudburst” their data to a VMware provider/partner in a public cloud data center to take advantage of processing surplus when needed, such as at tax time or when the NCAA tournament is taxing your servers. That message is also clear to me: Spend your money on in-house clouds first, and burst only if you must. Then, bring it all back until you need to burst again. It’s difficult to say whether or not VMware is going to have a lot of success with this model as the drive toward moving workloads into the public cloud gains momentum.

I also got the chance to sit down with the HP CAS group for about an hour with the other bloggers and talk about some of the things they are doing. The CAS group seems to be focused on taking all the pieces of the puzzle and putting them together for customers. That’s similar to what I do in the VAR space, but HP is trying to do that for their own solutions instead of forcing the customer to pay an integrator to do it. While part of me does worry that other companies doing something similar will eventually lead to the demise of the VAR, I think HP is taking the right tactic in their specific case. HP knows better than anyone else how their systems should play together. By creating a group that can give customers and integrators good reference designs and help us get past the sticky points in installation and configuration, they add a significant amount of value to the equation. I plan to dig into the CAS group a bit more to find out what kind of goodies they have that might make me a better engineer overall.


Tom’s Take

Overall, I think that VMwarePEX is well suited for the market that it’s trying to address. This is an excellent place for solution focused people to get information and roadmaps for all kinds of products. That being said, I don’t think it’s the place for me. I’m still an old CLI jockey. I don’t feel comfortable in a presentation that has almost no code, no live demos, or even a glory shot of a GUI tool. It’s a bit like watching a rugby game. Sure, the action is somewhat familiar and I understand the majority of what’s going on. It still feels like something’s just a bit out of place, though. I think the next VMware event that I attend will be VMworld. With the focus on technical solutions and “nuts and bolts” detail, I think I’ll end up getting more out of it in the long run. I appreciate HP and VMware for taking the time to let me experience Partner Exchange.

Disclaimer

My attendance at VMware Partner Exchange was a result of an all-expenses-paid sponsored trip provided by Hewlett Packard and VMware. My conference attendance, hotel room, meals and incidentals were paid in full. At no time did HP or VMware propose or restrict content to be written on this blog. All opinions and analysis provided herein and on any VMwarePEX-related posts are mine and mine alone.

Is It Time To Remove the VCP Class Requirement?

While I was at VMware Partner Exchange, I attended a keynote address. This in and of itself isn’t a big deal. However, one of the bullet points that came up in the keynote slide deck gave me a bit of pause. VMware is changing some of their VSP and VTSP certifications to be more personal and direct. Being a VCP, this didn’t really impact me a whole lot. But I thought it might be time to tweet out one of my oft-requested changes to the certification program:

Oops. I started getting flooded with mentions. Many were behind me. Still others were vehemently opposed to any changes. They said that dropping the class requirement would devalue the certification. I responded as best I could in many of these cases, but the reply list soon outgrew the words I wanted to write down. After speaking with some people, both officially and unofficially, I figured it was due time I wrote a blog post to cover my thoughts on the matter.

When I took the VMware What’s New class for vSphere 5, I mentioned therein that I thought the requirement for taking a $3,000US class for a $225 test was a bit silly. I myself took and passed the test based on my experience well before I sat the class. Because my previous VCP was on VMware ESX 3 and not on ESX 4, I still had to sit in the What’s New course before my passing score would be accepted. To this day I still consider that a silly requirement.

I now think I understand why VMware does this. Much of the What’s New and Install, Configure, and Manage (ICM) classes are hands-on lab work. VMware has gone to great lengths to build out the infrastructure necessary to allow students to spend their time practicing the lab exercises in the courses. These labs rival all but the CCIE practice lab pods that I’ve seen. That makes the course very useful to all levels of students. The introductory people that have never really touched VMware get to experience it for real instead of just looking at screenshots in a slide deck. The more experienced users that are sitting the class for certification or perhaps to refresh knowledge get to play around on a live system and polish skills.

The problem comes that investment in lab equipment is expensive. When the CCIE Data Center lab specs were released, Jeff Fry calculated the list price of all the proposed equipment and it was staggering. Now think about doing that yourself. With VMware, you’re going to need a robust server and some software. Trial versions can be used to some degree, but to truly practice advanced features (like storage vMotion or tiering) you’re going to need a full setup. That’s a bit out of reach for most users. VMware addressed this issue by creating their own labs. The user gets access to the labs for the cost of the ICM or What’s New class.

How is VMware recovering the costs of the labs? By charging for the course. Yes, training classes aren’t cheap. You have to rent a room and pay for expenses for your instructor and even catering and food depending on the training center. But $3,000US is a bit much for ICM and What’s New. VMware is using those classes to recover the costs of the lab development and operation. In order to be sure that the costs are recovered in the most timely manner, the metrics need to make sense for class attendance. Given the chance, many test takers won’t go to the training class. They’d rather study from online material like the PDFs on VMware’s site or use less expensive training options like TrainSignal. Faced with the possibility that students may elect to forego the expensive labs, VMware did what they had to do to ensure the labs would get used, and therefore the metrics worked out in their favor – they required the course (and labs) in order to be certified.

For those that say that not taking the class devalues the cert, ask yourself one question. Why does VMware only require the class for new VCPs? Why are VCPs in good standing allowed to take the test with no class requirement and get certified on a new version? If all the value is in the class, then all VCPs should be required to take a What’s New class before they can get upgraded. If the value is truly in the class, no one should be exempt from taking it. For most VCPs, this is not a pleasant thought. Many that I talked to said, “But I’ve already paid to go to the class. Why should I pay again?” This just speaks to my point that the value isn’t in the class, it’s in the knowledge. Besides VMware Education, who cares where people acquire the knowledge and experience? Isn’t a home lab just as good as the ones that VMware built?

Thanks to some awesome posts from people like Nick Marus and his guide to building an ESXi cluster on a Mac Mini, we can now acquire a small lab for very little out-of-pocket. It won’t be enough to test everything, but it should be enough to cover a lot of situations. What VMware needs to do is offer an alternate certification requirement that takes a home lab into account. While there may be ways to game the system, you could require a VMware employee or certified instructor or VCP to sign off on the lab equipment before it will be blessed for the alternate requirement. That should keep it above board for those that want to avoid the class and build their own lab for testing.

The other option would be to offer a more “entry level” certification with a less expensive class requirement that would allow people to get their foot in the door without breaking the bank. Most people see the VCP as the first step in getting VMware certified. Many VMware rock stars can’t get employed in larger companies because they aren’t VCPs. But they can’t get their VCP because they either can’t pay for the course or their employer won’t pay for it. Maybe by introducing a VMware Certified Administrator (VCA) certification and class with a smaller barrier to entry, like a course in the $800-$1000US range, VMware can get a lot of entry level people on board with VMware. Then, make the VCA an alternate requirement for becoming a VCP. If the student has already shown the dedication to getting their VCA, VMware won’t need to recoup the costs from them.


Tom’s Take

It’s time to end the VCP class requirement in one form or another. I can name five people off the top of my head that are much better at VMware server administration than I am that don’t have a VCP. I have mine, but only because I convinced my boss to pay for the course. Even when I took the What’s New course to upgrade to a VCP5, I had to pull teeth to get into the last course before the deadline. Employers don’t see the return on investment for a $3,000US class, especially if the person that they are going to send already has the knowledge shared in the class. That barrier to entry is causing VMware to lose out on the visibility that having a lot of VCPs can bring. One can only hope that Microsoft and Citrix don’t beat VMware to the punch by offering low-cost training or alternate certification paths. For those just learning or wanting to take a less expensive route, having a Hyper-V certification in a world of commoditized hypervisors would fit the bill nicely. After that, the reasons for sticking with VMware become less and less important.

New Wrinkles in the Fabric – Cisco Nexus Updates

There’s no denying that The Cloud is an omnipresent fixture in our modern technological lives.  If we aren’t already talking about moving things there, we’re wondering why it’s crashed.  I don’t have any answers about these kinds of things, but thankfully the people at Cisco have been trying to find them.  They let me join in on a briefing about the announcements that were made today regarding some new additions to their data center switching portfolio more commonly known by the Nexus moniker.

Nexus 6000

The first of the announcements is around a new switch family, the Nexus 6000.  The 6000 is more akin to the 5000 series than the 7000, containing a set of fixed-configuration switches with some modularity.  The Nexus 6001 is the true fixed-config member of the lot.  It’s a 1U 48-port 10GbE switch with 4 40GbE uplinks.  If that’s not enough to get your engines revving, you can look at the bigger brother, the Nexus 6004.  This bad boy is a 4U switch with a fixed config of 48 40GbE ports and 4 expansion modules that can double the total count up to 96 40GbE ports.  That’s a lot of packets flying across the wire.  According to Cisco, those packets can fly at a 1 microsecond latency port-to-port.  The Nexus 6000 is also a Fibre Channel over Ethernet (FCoE) switch, as all Nexus switches are.  This one is a 40GbE-capable FCoE switch.  However, as there are no 40GbE FCoE targets available right now, it’s going to be on an island until those get developed.  A bit of future proofing, if you will.  The Nexus 6000 also supports FabricPath, Cisco’s TRILL-based fabric technology, along with a large number of multicast entries in the forwarding table.  This is no doubt to support VXLAN and OTV in the immediate future for layer 2 data center interconnect.

The Nexus line also gets a few little added extras.  There is going to be a new FEX, the 2248PQ, that features 10GbE downlink ports and 40GbE uplink ports.  There’s also going to be a 40GbE expansion module for the 5500 soon, so your DC backbone should be able to run at 40GbE with a little investment.  Also of interest is the new service module for the Nexus 7000.  That’s right, a real service module.  The NAM-NX1 is a Network Analysis Module (NAM) for the Nexus line of switches.  This will allow spanned traffic to be pumped through for analysis of traffic composition and characteristics without taking a huge hit to performance.  We’ve all known that the 7000 was going to be getting service modules for a while.  This is the first of many to roll off the line.  In keeping with Cisco’s new software strategy, the NAM also has a virtual cousin, not surprisingly named the vNAM.  This version lives entirely in software and is designed to serve the same function that its hardware cousin does, only in the land of virtual network switches.  Now that the Nexus line has service modules, kind of makes you wonder what the Catalyst 6500 has all to itself now?  We know that the Cat6k is going to be supported in the near term, but is it going to be used as a campus aggregation or core?  Maybe as a service module platform until the SMs can be ported to the Nexus?  Or maybe with the announcement of FabricPath support for the Cat6k this venerable switch will serve as a campus/DC demarcation point?  At this point the future of Cisco’s franchise switch is really anyone’s guess.

Nexus 1000v InterCloud

The next major announcement from Cisco is the Nexus 1000v InterCloud.  This is very similar to what VMware is doing with their stretched data center concept in vSphere 5.1.  The 1000v InterCloud (1kvIC) builds a secure layer 2 GRE tunnel between your private cloud and a provider’s public cloud.  You can now use this tunnel to migrate workloads back and forth between public and private server space.  This opens up a whole new area of interesting possibilities, not the least of which is the Cloud Services Router (CSR).  When I first heard about the CSR last year at Cisco Live, I thought it was a neat idea but had some shortcomings.  The need to be deployed to a place where it was visible to all your traffic was the most worrisome.  Now, with the 1kvIC, you can build a tunnel between yourself and a provider and use the CSR to route traffic to the most efficient or cost effective location.  It’s also a very compelling argument for disaster recovery and business continuity applications.  If you’ve got a category 4 hurricane bearing down on your data center, the ability to flip a switch and cold migrate all your workloads to a safe, secure vault across the country is a big sigh of relief.

The 1kvIC also has its own management console, the vNMC.  Yes, I know there’s already a vNMC available from Cisco.  The 1kvIC version is a bit special, though.  It not only gives you control over your side of the interconnect, but it also integrates with the provider’s management console.  This gives you much more visibility into what’s going on inside the provider instances than we get today from simple dashboards or status screens on public web pages.  This is a great help when you think about the kinds of things you would be doing with intercloud mobility.  You don’t want to send your workloads to the provider if an engineer has started an upgrade on their core switches on a Friday night.  When it comes to the cloud, visibility is viability.

CiscoONE

In case you haven’t heard, Cisco wants to become a software company.  Not a bad idea when hardware is becoming a commodity and software is the home of the high margins.  Most of the development that Cisco has been doing along the software front comes from the Open Network Environment (ONE) initiative.  In today’s announcement, CiscoONE will now be the home for an OpenFlow controller.  In this first release, Cisco will be supporting OpenFlow and their own OnePK API extensions on the southbound side.  On the northbound side of things, the CiscoONE controller will expose REST and Java hooks to allow interaction with flows passing through the controller.  While that’s all well and good for most of the enterprise devs out there, I know a lot of homegrown network admins that hack together their own scripts in Perl and Python.  For those of you that want support for your particular flavor of language built into CiscoONE, I highly recommend heading to Cisco’s website and telling them what you want.  They are looking at adding additional hooks as time goes on, so you can get in on the ground floor now.
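
The good news for the script hackers is that a REST northbound interface is usable from anything that can speak HTTP, even without a native language hook.  Here’s a minimal sketch of what that might look like from Python.  Fair warning: the controller address, endpoint path, and response shape below are all my assumptions for illustration, not the documented CiscoONE API, so check Cisco’s docs before pointing this at anything real.

```python
# Hypothetical sketch of querying an OpenFlow controller's northbound
# REST API from Python.  The controller address, endpoint path, and
# response format are assumptions, not the documented CiscoONE API.
import requests

CONTROLLER = "https://onec.example.com:8443"   # assumed controller address
FLOWS_URL = CONTROLLER + "/api/v1/flows"       # assumed endpoint

def list_flows(session):
    """Fetch whatever flow entries the controller reports."""
    resp = session.get(FLOWS_URL, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    session = requests.Session()
    session.auth = ("admin", "changeme")  # placeholder credentials
    for flow in list_flows(session):
        print(flow)
```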

Cisco is also announcing OnePK support for the ISR G2 router platform and the ASR 1000 platform.  There will be OpenFlow support on the Nexus 3000 sometime in the near future, along with support in the Nexus 1000v for Microsoft Hyper-V and KVM.  And somewhere down the line, Cisco will have a VXLAN gateway for all the magical unicorn packet goodness across data centers that stretch via non-1kvIC links.


Tom’s Take

The data center is where the dollars are right now.  I’ve heard people complain that Cisco is leaving the enterprise campus behind as they charge forward into the raised floor garden of the data center.  But the data center teams are the people driving the data that produces the profits that buy more equipment.  Whether it be massive Hadoop clusters or massive private cloud projects, the accounting department has given the DC team a blank checkbook today.  Cisco is doing its best to drive some of those dollars their way by providing new and improved offerings like the Nexus 6000.  For those that don’t have a huge investment in the Nexus 7000, the 6000 makes a lot of sense as either a high speed core aggregation switch or an end-of-row solution for a herd of FEXen.  The Nexus 1000v InterCloud is competing against VMware’s stretched data center concept in much the same way that the 1000v itself competes against the standard VMware vSwitch.  With Nicira in the driver’s seat of VMware’s networking from here on out, I wouldn’t be shocked to see more solutions from Cisco that mirror or augment VMware solutions as a way to show VMware that Cisco can come up with alternatives just as well as anyone else.

VMware Certification for Cisco People

During the November 14th vBrownBag, which is an excellent weekly webinar dedicated to many interesting virtualization topics, a question was raised on Twitter about mapping the VMware certification levels to their corresponding counterparts in Cisco certification.  That caught me a bit off guard at first because certification programs among the various vendors tend to be very insular and don’t compare well to one another.  The Novell CNE isn’t the same animal as the JNCIE.  It’s not even in the same zoo.  Even so, the watermark for difficult certifications is the CCIE for most people, due to its longevity and reputation as a tough exam.  Some were wondering how it compared to the VCDX, VMware’s premier architect exam.  So I decided to take it upon myself to write up a little guide for those out there that may be Cisco certification junkies (like me) and are looking to see how their test-taking skills might carry over into the nebulous world of vKernels and port groups.  Note that I’m going to focus on the data center virtualization track of the VMware certification program, as that’s the one I’ve had the most experience with and the other tracks are relatively new at this time.

VCP

The VMware Certified Professional (VCP) is most like the CCNA from Cisco.  It’s a foundational knowledge exam designed to test a candidate’s ability to understand and configure a VMware environment consisting of the ESXi hypervisor and vCenter management server.  The questions on the VCP tend to be of the “Which button do you click?” and “What is the maximum number of x?” variety.  These are the things you will need to know when you find yourself staring at a vCenter window and need to program a vKernel port or turn on LACP on a set of links.  Note that according to the VCP blueprint, there aren’t any of those nasty simulation questions on the VCP, unlike the CCNA.  That means you won’t have to worry about a busted Flash simulation that doesn’t support the question mark key or other crazy restrictions.  However, the VCP does have a prerequisite that I’m none too pleased about.  In order to obtain the VCP, you must attend a VMware-authorized training course.  There’s no getting around it.  Even if you take the exam and pass, you won’t get the credential until you’ve coughed up the $3,000 US for the class.  That creates a ridiculous barrier to entry for many that are starting out in the virtualization industry.  It’s difficult in some cases for candidates to pony up the cost of the exam itself.  Asking them to sell a kidney in order to go to class is crazy.  For reference, that’s two CCIE lab fees.  Just for a class.  Yes, I know that existing VCPs can recertify on the new version without going to class.  But it’s a bit heavy handed to require new candidates to go to class, especially when the material that’s taught in class is readily available from work experience and the VMware website.

VCAP-DCA

The next tier of VMware certifications is the VMware Certified Advanced Professional (VCAP).  This is actually split into two different disciplines – Data Center Administration (DCA) and Data Center Design (DCD).  The VCAP-DCA is very similar to the CCIE.  Yes, I know that’s a pretty big leap from the CCNA-like VCP.  However, the structure of the exam is unlike anything but the CCIE in Ciscoland.  The VCAP-DCA is a 4-hour live practical exam.  You are configuring a set of 30-40 tasks on real servers.  You have access to the official documentation, although just like the CCIE you need to know your stuff and be able to do it quickly or you will run out of time.  Also, just like the CCIE, you are given constraints on some things, such as “Configure this task using the CLI, not the GUI.”  When you leave the secured testing facility, you won’t know your score for up to fifteen days until the exam is graded, likely by a combination of script and live person (just like the CCIE).  David M. Davis of Trainsignal is both a CCIE and a VCAP and has an excellent blog post about his VCAP experience.  He says that while the exam format of the VCAP is very similar to the CCIE, the exam contents themselves aren’t as tricky or complicated.  That makes sense when you think about the mid-range target for this exam.  This is for those people who are the best at administering VMware infrastructure.  They know more than the VCP blueprint and want to show that they are capable of troubleshooting all the wacky things that can happen to a virtual cluster.  Note that while there is a recommended training class available for the VCAP, it isn’t required to sit the test.  Also note that the VCAP is a restricted exam, meaning you must request authorization in order to sit it.  That makes sense when you consider that it’s a 4-hour test that can only be taken at a secured Pearson VUE testing center.

VCAP-DCD

The other VMware Certified Advanced Professional (VCAP) exam is the Data Center Design (DCD) exam.  This is where the line starts to blur between people that spend their time plugging away at configurations and people that spend their time in Visio putting data centers together.  Rather than focusing on purely practical tasks like the VCAP-DCA, the VCAP-DCD instead tests the candidate’s ability to design VMware-focused data centers based on a set of conditions.  The exam consists of a grouping of multiple choice, fill-in-the-blank, and in-exam design sessions.  The latter appears to have some Visio-like design components according to those that have taken the test.  This would put the exam firmly in the territory of the CCDP or even the CCDE.  The material on the DCD may be focused on design specifically, but the exam format seems to speak more to the kind of advanced questions you might see in the higher level Cisco design exams.  Just like the DCA, there are recommended courses for the DCD (like the VMware Design Workshop), but these are not requirements.  You will receive your score as soon as you finish, since there aren’t enough live configuration items on the exam to warrant a live person grading it.

VCDX

The current king of the mountain for VMware certifications is the VMware Certified Design Expert (VCDX).  This is VMware’s premier architecture certification.  It’s also one of the most rigorous.  A lot of people compare this to the CCIE as the showcase cert for a given industry, but based on what I’ve seen the two certifications only mirror each other in the number of attempts per candidate.  The VCDX is actually more akin to the Cisco Certified Architect (CCAr) or Microsoft Certified Master certification.  That’s because rather than having a lab of gear to configure, you have to create a total solution around a given problem and demonstrate your knowledge to a council of people live and in person.  It’s not inexpensive, either in terms of time or cost.  You have to pay a $300 fee just to have your application submitted, roughly on par with the CCIE written exam.  However, even if you submit the proposal, there’s no guarantee you’ll make it to the defense.  Your application has to be scrutinized, and there has to be a reasonable chance of you defending it successfully.  If your submission isn’t up to snuff, you get recycled to the back of the pile with a pat on the head and a “try again later” note.  If you do make the cut, you have to fly out to a pre-determined location to defend.  Unlike Cisco’s policy of having a lab in many different locations all over the world, the defense locations tend to move around.  You may defend at VMWorld in San Francisco and have to try again in Brussels or even Tokyo.  It all really depends on timing.  Once you get in the room for your defense, you have to present your proposal to the council as well as field questions about it.  You’ll probably end up whiteboarding at some point to prove you know what you’re talking about.  And this council doesn’t accept simple answers.  If they ask you why you did something, you’d better have a good answer.  “Because it’s best practice” doesn’t cut it either.  You need to show an in-depth knowledge of all facets of not only the VMware pieces of the solution, but the third party pieces as well.  You need to think about all the things that go into a successful implementation, from environmental impacts to fault tolerance.  Implementation plans and training schedules could also come up.  The idea is that you are working your way through a complete solution that shows you are a true architect, not just a mouse-clicker in the trenches.  That’s why I tend to look at the VCDX as above the CCIE.  It’s more about strategic thinking than brilliant tactical maneuvers.  Read up on my CCAr post from earlier this year to get an idea of what Cisco’s looking for in their architects.  That’s what VMware is looking for too.


That’s VMware certification in a nutshell.  It doesn’t map one-to-one to the existing Cisco certification lineup, but I would argue that’s due more to the VMware emphasis on practical experience versus book learning.  Even the VCAP-DCD, which would appear to be a best practices exam, has a component of live drag-and-drop design in a simlet.  I would argue that if Cisco had to do it all over again, their certification program would look a lot like the VMware version.  Earlier this year I talked about wanting to attempt the VCAP, and I don’t think I’m going to get there.  But knowing what I know now about the program and where I need to focus my studies based on what I’m doing today, I think that the VCAP is a very realistic goal for 2013.  The VCDX may be a bit out of my league for the time being, but who knows?  I said the same thing about the CCIE many years ago.

My First VMUG

If you’re a person that is using VMware or interested in starting, you should be a member of the VMware User Group (VMUG).  This organization is focused on providing a local group that talks about all manner of virtualization-related topics.  It can be a learning resource for you to pick up new techniques or technologies.  It can also serve as a sounding board for those that want to discuss in-depth design challenges or project ideas.  The various regional VMUGs have quite a following, with many quarterly meetings encompassing a full day of breakout sessions and keynote addresses.

I signed up for the Oklahoma City VMUG about six months ago, shortly after confirmation that I had been selected as a vExpert for 2012.  I wanted to gauge interest in VMware locally and hopefully get some ideas about where people were taking it outside my own experiences.  I work mostly with primary education institutions in my day job, and many of them are just now starting to realize the advantages of virtualizing their systems.  In fact, my previous virtualization primer was directed at this group of individuals.  However, I know there are many more organizations that are making effective use of this technology, and I hoped that many of them would be involved in the VMUG.

What I found after I joined was a bit disjointed.  There didn’t seem to be a lot of activity on the discussion boards.  I couldn’t really find the leadership group that was in charge of meetings and such.  As it turned out, there hadn’t even been a VMUG meeting in almost two years.  There were a lot of people that wanted to be involved in some capacity, but no real direction.  Thankfully, that changed at VMWorld this year thanks to Joey Ware.  Joey is an admin at the University of Oklahoma Health Sciences Center.  He jumped in the driver’s seat and started planning a new meeting to get everyone back together and caught up on what had been going on recently.

When I arrived at the meeting on Nov. 12th, I wasn’t really sure what to expect.  I know that organizations like the New England VMUG and the UK VMUG are rather large.  I didn’t know if the OKC VMUG was going to attract a crowd or a basketball team.  Imagine my surprise when there were upwards of 50 people in the room!  There were university administrators, energy company architects, and corporate developers.  There were VMware employees and even an EMC vSpecialist.  After a welcome back introduction, we got a nice overview of the new things in vSphere 5.1.  Much of this was review for me, since I had tuned in during the launch at VMWorld this year and read the great blog articles released thereafter (check out the massive archive here courtesy of Eric Siebert).  It was great to see so many people looking at moving to vSphere 5.1.  Of course, I couldn’t let the whole briefing go without injecting a bit of commentary about one of my least-liked features, the VMware Storage Appliance (VSA).  VSA, to me at least, is a half-baked idea designed to give cost conscious customers access to advanced VMware features without buying a SAN or even taking the time to roll their own NAS from a Linux distro.  It really feels like something someone threw together right before a code freeze deadline just to get it on the checklist of Cool Things You Can Do In vSphere.  If you are at all seriously considering using VSA, save your time and money and just buy a SAN.  Now, during the VMUG session, several people mentioned that VSA does have a place, but purely as a last ditch option.  I’d tend to agree with that assessment, but again: save your resources and get something useful.

We got a good discussion about vCenter Operations Manager (vCOps) from Sean O’Dell (@CloudyChance).  VMware is really pushing vCOps in 5.1 as a way to increase your productivity and reduce the chance for human error in your configuration.  They’re sweetening the deal by making the Foundation edition free in vSphere 5.1.  The Foundation edition helps you get started with some of the alerting capabilities and health monitoring pieces that many admins would find useful.  Once you find that you like what vCOps is telling you and want to start using the more advanced features to manage your environment, you’re ready to move up to the Standard edition, which costs around $125/VM in packs of 25.  If you’re managing that many VMs today without some kind of automation, you should really look at investing in vCOps.  I promise that it’s going to end up saving you more than 25 hours’ worth of work over the course of a year, which means it will more than pay for itself in the long run.
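
That 25-hour figure isn’t pulled out of thin air.  Here’s the back-of-the-envelope math behind it, assuming a loaded admin cost of $125/hour; the per-VM price comes from the paragraph above, but the hourly rate is my assumption, not a VMware number.

```python
# Back-of-the-envelope ROI check for vCOps Standard.  The per-VM price
# and pack size come from the text above; the loaded hourly cost of an
# admin is an assumed figure for illustration.
PRICE_PER_VM = 125      # USD per VM, Standard edition
PACK_SIZE = 25          # licenses sold in packs of 25
HOURLY_COST = 125       # USD/hour, assumed loaded admin cost

pack_price = PRICE_PER_VM * PACK_SIZE        # $3,125 for one pack
break_even = pack_price / HOURLY_COST        # 25.0 hours

print(f"One pack costs ${pack_price:,}; it pays for itself after "
      f"{break_even:.0f} hours of admin time saved per year.")
```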


Tom’s Take

My first VMUG was well worth it.  I was really happy that there were that many people in my area that want to learn more about VMware and want to talk to people that work with it.  Just when I think that I’m the only one trying to do awesome things with virtualization, my peers go out and show me that I don’t really live in a vacuum.  I really hope that Joey can keep the OKC VMUG going far into the future and keep spreading the word about virtualization to anyone that will listen.  Who knows?  Maybe I’ll get brave enough to give a presentation sometime soon.

If you are interested in joining your local VMUG, head over to http://www.vmug.com/l/pw/rs and sign up.  It’s totally free and open to anyone.  For those reading my post that are in the Oklahoma City area, the link to the OKC VMUG workspace is here.  We’re going to try to have quarterly meetings, so I look forward to seeing more new faces after the first of the year.