Short Take – The Present Future of the Net

A few random thoughts from ONS and Networking Field Day 15 this week:

  • Intel is really, really, really pushing their 5th generation (5G) wireless network. Note this is not Gen5 Fibre Channel or 5G 802.11 networking. This is the successor to LTE, capable of pushing a ridiculous amount of data to a very small handset. It’s one of those “sure thing” technologies that is going to have a huge impact on our networks. Carriers and service providers are already struggling to cope with the client rates we have now. What happens when those rates are two or three times faster?
  • PNDA has some huge potential for networking and data analytics. Their presentation included some of the most technical discussion of the event. They’re also the basis for a lot of other projects in the pipeline. Make sure you check them out. The project organizers suggest that you start with the documentation and perhaps even contribute some writing to help get more people on board.
  • VMware hosted a dinner for us with some luminary speakers, including Bruce Davie and James Watters. They talked about the journey from traditional networking to a new paradigm filled with microservices and intelligence in the application layer. While I think this is the gold standard that everyone is looking toward for the future, I also think there is still quite a bit of technical debt to unpack before we can get there.
  • Another fun thought kicking around: When we look at these new agile, paradigm-shifting deployments, why are they always on new hardware? Would you see similar improvements by running existing processes on new hardware? And what would these new processes look like on existing hardware? I think this one is worth investigating.

Nutanix and Plexxi – An Affinity to Converge

Nutanix has been lighting the hyperconverged world on fire as of late. Strong sales led to a big IPO. They are in a lot of conversations about using their solution in place of large traditional virtualization offerings built on blade servers or other big boxes. The recent Nutanix .NEXT conference even brought some big announcements in the networking arena to help them complete their total solution. However, I think Nutanix is missing a big opportunity that’s right in front of them.

I think it’s time for Nutanix to buy Plexxi.

Software Says

If you look at the Nutanix announcements around networking from .NEXT, they look very familiar to anyone in the server space. The highlights include service chaining, microsegmentation, and monitoring, all accessible through an API. If this sounds an awful lot like VMware NSX, Cisco ACI, or any one of a number of new networking companies, then you are in the right mode of thinking as far as Nutanix is concerned.

SDN in the server space is all about overlay networking. Segmenting flows and chaining services together are the reasons why security is so hard to do in the networking space today. Trying to get traffic to behave in a certain way drives networking professionals nuts. Monitoring all of that to ensure that you’re actually doing what you say you’re doing just adds complexity. And the API is the way to do all of that without having to walk down to the data center, console into a switch, and learn a new non-Linux CLI command set.
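
To make the API point concrete, here’s a minimal sketch of what pushing a microsegmentation rule through a REST-style policy API could look like. The endpoint, payload schema, and token below are hypothetical placeholders for illustration, not Nutanix’s actual interface.

```python
# Hypothetical sketch: submitting a microsegmentation/service-chain rule to a
# REST-style policy API. The URL, fields, and token are invented for illustration.
import json
import urllib.request

POLICY_API = "https://hci-mgmt.example.com/api/v1/policies"  # hypothetical endpoint

rule = {
    "name": "web-to-db",
    "source_group": "web-tier",
    "dest_group": "db-tier",
    "ports": [3306],
    "action": "allow",
    "service_chain": ["firewall", "ips"],   # steer the flow through these services first
}

req = urllib.request.Request(
    POLICY_API,
    data=json.dumps(rule).encode(),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer <api-token>"},   # placeholder credential
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```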

SDN vendors like VMware and Cisco naturally jumped on these complaints and difficulties in the networking world, and both have offered solutions for them with their products. For Nutanix to have bundled solutions like this into their networking offering is no accident. They are looking to battle VMware head-to-head and need to offer the kind of feature parity that it’s going to take to make medium and large shops shift their focus away from the VMware ecosystem and take a long look at what Nutanix is offering.

In a way, Nutanix and VMware are starting to reinforce the idea that the network isn’t a magical realm of protocols and tricks that make applications work. Instead, it’s a simple transport layer between locations. For instance, Amazon doesn’t rely on the magic of the interstate system to get your packages from the distribution center to your home. Instead, the interstate system is just a transport layer for their shipping overlays – UPS, FedEx, and so on. The overlay is where the real magic is happening.

Nutanix doesn’t care what your network looks like. They can do almost everything on top of it with their overlay protocols. That would seem to suggest that the focus going forward should be to marginalize or outright ignore the lower layers of the network in favor of something that Nutanix has visibility into and can offer control and monitoring of. That’s where the Plexxi play comes into focus.

Affinity for Awesome

Plexxi has long been a company in search of a way to sell what they do best. When I first saw them years ago, they were touting their Affinities idea as a way to build fast pathways between endpoints to provide better performance for applications that naturally talked to each other. This was a great idea back then, but it quickly got overshadowed by the other SDN solutions out there. It even caused Plexxi to go down a slightly different path for a while, looking at other options to compete in a market where they didn’t really have a perfect-fit product.

But the Affinities idea is perfect for hyperconverged solutions. Companies like Nutanix are marketing their solutions as the way to create application-focused compute nodes on-site without the need to mess with the cloud. It’s a scalable solution that leads to adding more nodes as your needs expand. Hyperconverged was designed to be consumable per compute unit as opposed to massively scaling out in leaps and bounds.

Plexxi Affinities is just the tip of the iceberg. Plexxi’s networking connectivity also gives Nutanix the ability to build out a high-speed interconnect network with one advantage – noninterference. I’m speaking about what happens when a customer needs to add more networking ports to support this architecture. They need to make a call to their Networking Vendor of Choice. In the case of Cisco, HPE, or others, that call will often involve a conversation about what they’re doing with the new network followed by a sales pitch for their hyperconverged solution or a partner solution that benefits both companies. Nutanix has a reputation for being the disruptor in traditional IT. The more they can keep their traditional competitors out of the conversation, the more likely they are to keep the business into the future.


Tom’s Take

Plexxi is very much a company with an interesting solution in need of a friend. They aren’t big enough to really partner with hyperconverged solutions, and most of the hyperconverged market at this point is either cozy with someone else or not looking to make big purchases. Nutanix has the rebel mentality. They move fast and strike quickly to get their deals done. They don’t take prisoners. They look to make a splash and get people talking. The best way to keep that up is to bundle a real non-software networking component alongside a solution that will make the application owners happy and keep the conversation focused on a single source. That’s how Cisco did it back in the day and how VMware has climbed to the top of the virtualization market.

If Nutanix were to spend some of that nice IPO money on a Plexxi Christmas present, I think 2017 would be the year that Nutanix stops being discussed in hushed whispers and becomes a real force to be reckoned with up and down the stack.

The Death of TRILL

Networking has come a long way in the last few years. We’ve realized that hardware and ASICs aren’t the constant that we could rely on to make decisions in the next three to five years. We’ve thrown in with software and the quick development cycles that allow us to iterate and roll out new features weekly or even daily. But the hardware versus software battle has played out a little differently than we all expected. And the primary casualty of that battle was TRILL.

Symbiotic Relationship

Transparent Interconnection of Lots of Links (TRILL) was proposed as a solution to the complexity of spanning tree. Radia Perlman realized that her bridging loop solution wouldn’t scale in modern networks, so she worked with the IETF to solve the problem with TRILL. We also received Shortest Path Bridging (SPB) from the IEEE along the way as an alternative solution to the layer 2 issues with spanning tree. The motive was sound, but the industry has rejected the premise entirely.

Large layer 2 networks have all kinds of issues. ARP traffic, broadcast amplification, and numerous other problems plague layer 2 when it tries to scale to multiple hundreds or a few thousand nodes. The general rule of thumb is that a layer 2 broadcast domain should never grow larger than 250-500 nodes lest problems start occurring. In theory that works rather well. But in practice we have issues at the software level.
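
As a rough back-of-the-envelope sketch, here’s how the broadcast burden every node has to process grows with the size of the domain. The per-host chatter rate is an assumption for illustration, not a measured figure.

```python
# Every broadcast (ARP, DHCP, etc.) is flooded to every host in the layer 2
# domain, so the per-host load grows linearly with segment size.
BCASTS_PER_HOST_PER_SEC = 0.5   # assumed average ARP/DHCP/discovery chatter per host

def per_host_broadcast_load(hosts: int) -> float:
    """Broadcasts per second that every single host must receive and process."""
    return hosts * BCASTS_PER_HOST_PER_SEC

for hosts in (100, 250, 500, 2000, 5000):
    print(f"{hosts:>5} hosts -> {per_host_broadcast_load(hosts):>6.0f} broadcasts/sec per host")
```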

Applications are inherently complicated. Software written in the pre-Netflix era of public cloud adoption doesn’t like it when the underlay changes. So things like IP addresses and ARP entries were assumed to be static. If those data points change you have chaos in the software. That’s why we have vMotion.

At the core, vMotion is a way for software to mitigate hardware instability. As I outlined previously, we’ve been fixing hardware with software for a while now. vMotion could ensure that applications behaved properly when they needed to be moved to a different server or even a different data center. But it also required the network to be flat to overcome limitations in things like ARP and IP. And so we went on a merry journey of making data centers as flat as possible.

The problem came when we realized that data centers could only be so flat before they collapsed in on themselves. ARP and spanning tree limited the amount of traffic in layer 2, and those limits were impossible to overcome. Loops had to be prevented, yet the simplest solution disabled links and the bandwidth needed to make things run smoothly. That caused the IEEE and IETF to come up with layer 2 solutions that used IS-IS, a CLNS-based protocol, to solve loops. And it was a great idea in theory.

The Joining

In reality, hardware can’t be spun that fast. TRILL was used as a reference platform for proprietary protocols like FabricPath and VCS. All the important things were there but they were locked into hardware that couldn’t be easily integrated into other solutions. We found ourselves solving problem after problem in hardware.

Users became fed up. They started exploring other options. They finally decided that hardware wasn’t the answer. And so they looked to software. And that’s where we started seeing the emergence of overlay networking. Protocols like VXLAN and NV-GRE emerged to tunnel layer 2 packets over layer 3 networks. As Ivan Pepelnjak is fond of saying, layer 3 transport solves all of the issues with scaling. And even the most unruly application behaves when it thinks everything is running on layer 2.
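
For the curious, here’s a small sketch (assuming scapy is installed) of what that layering looks like: a tenant’s layer 2 frame rides inside UDP across the layer 3 underlay. The addresses and VNI are made up for illustration.

```python
# L2-over-L3 encapsulation with VXLAN, built layer by layer with scapy.
from scapy.all import Ether, IP, UDP
from scapy.layers.vxlan import VXLAN

inner = (Ether(src="00:00:5e:00:53:01", dst="00:00:5e:00:53:02") /
         IP(src="10.1.1.10", dst="10.1.1.20"))          # the tenant's layer 2 frame

outer = (Ether() /
         IP(src="192.0.2.1", dst="192.0.2.2") /         # VTEP to VTEP, routed by the underlay
         UDP(sport=49152, dport=4789) /                 # VXLAN's well-known UDP port
         VXLAN(vni=5001))                               # 24-bit segment identifier

pkt = outer / inner
pkt.show()   # walk the headers: the fabric only ever routes on the outer IP
```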

Protocols like VXLAN solved an immediate need. They removed limitations in hardware. Tunnels and fabrics used novel software approaches to solve insurmountable hardware problems. An elegant solution for a thorny problem. Now, instead of waiting for a new hardware spin to fix scaling issues, customers could deploy solutions to fix the issues inherent in hardware on their own schedule.

This is the moment where software defined networking (SDN) took hold of the market. Not when words like automation and orchestration started being thrown about. No, SDN became a real thing when it enabled customers to solve problems without buying more physical devices.


Tom’s Take

Looking back, we realize now that building large layer 2 networks wasn’t the best idea. We know that layer 3 scales much better, and given the number of providers and end users running BGP to top-of-rack (ToR) switches, the industry seems to have reached the same conclusion. It took us too long to figure out that the best solution to a problem sometimes takes a bit of thought to implement.

Virtualization is always going to be limited by the infrastructure it’s running on. Applications are only as smart as the programmer. But we’ve reached the point where developers aren’t counting on having access to layer 2 protocols that solve stupid decision making. Instead, we have to understand that the most resilient way to fix problems is in the software. Whether that’s VXLAN, NV-GRE, or a real dev team not relying on the network to solve bad design decisions.

The Marriage of the Ecosystem

A recent discussion with Greg Ferro (@EtherealMind) of Packet Pushers and Nigel Poulton (@NigelPoulton) of In Tech We Trust got me thinking about product ecosystems. Nigel was talking about his new favorite topic of Docker and containers. He mentioned to us that it had him excited because it felt like the good old days of VMware when they were doing great things with the technology. That’s when I realized that ecosystems aren’t all they are cracked up to be.

Courting Technology

Technology is a huge driver for innovation. New ideas are formed into code that runs to accomplish a task. That code is then disseminated to teams and built upon to create toolsets to accomplish even more tasks. That’s how programs happen. Almost every successful shift in technology starts with the courtship of focused code designed to accomplish a simple task or solve a quick problem.

The courtship evolves over time to include other aspects of technology. Development work extends the codebase to accept things like plugins to provide additional functionality. Not core functions though. The separation comes when people want to add additional pieces without compromising the original program. Bolting additional non-core pieces on to existing code causes all kinds of headaches.
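
As a generic illustration of that separation (not tied to any particular product), here’s a tiny sketch of a plugin hook: extensions register themselves against the core rather than being bolted into it.

```python
# Toy plugin registry: the core stays small and focused, and non-core
# features hook in from the outside.
from typing import Callable, Dict

PLUGINS: Dict[str, Callable[[str], str]] = {}

def register(name: str):
    """Decorator that adds a non-core feature without touching core code."""
    def wrapper(func: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = func
        return func
    return wrapper

def core_task(data: str) -> str:
    """The original, focused program: it does exactly one thing."""
    return data.strip().lower()

@register("reverse")            # an ecosystem add-on, not a core function
def reverse_plugin(data: str) -> str:
    return data[::-1]

result = core_task("  Hello World  ")
for name, plugin in PLUGINS.items():
    print(name, "->", plugin(result))
```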

That’s how ecosystems start. People build new functions to address the new problems that crop up around those solved by the original tool. Finding new problems is key to driving the ecosystem forward. Without problems to solve, the environment around a particular program starts to contract and disappear.

The Old Ball And Chain

Ecosystems eventually reach the point of stagnation, however. This usually comes when the ecosystem around a product becomes more important than the actual program itself. Think about the ecosystem around Microsoft Office. Office was originally a word processor. That drove additional programs to solve spreadsheets and presentations. Now, people buy the Office productivity suite for more than the word processor. More than a few buy it for the email program. But very little innovation is going into the word processor any longer. Aside from some UI design changes and a few minor feature additions, the majority of the work is being driven around other programs.

This is also the problem with VMware today. The development around the original hypervisor is mostly moot. That problem has been solved completely. Today, all of the marketing hype around VMware is about other things. Public cloud architectures. Storage virtualization. Network virtualization. None of these things have anything to do with the hypervisor beyond tying into the ecosystem created around it.

Ecosystems can’t exist without recognizing the original problems being solved and why they are so important. If you build an environment around a product and then leave that product to wither on the vine, your ecosystem will eventually collapse. When your company pivots away from what makes it successful in the first place you run the risk of disaster.

Note that this doesn’t include what happens when the technology landscape forces you to shift your focus. Token ring networking doesn’t solve a big problem today. Companies focusing on it needed to pivot away from it to solve new problems. As such, there really isn’t a token ring ecosystem today.

Now, look at tape backup units as a counterpoint. They still solve a problem – backing up large amounts of data at low cost. Quite a few of the old tape backup vendors have moved away from the market and are concentrating on new solutions. A few of the old vendors, such as SpectraLogic, still support tape solutions and are continuing to drive the tape ecosystem with new ideas. But those ideas still manage to come back to tape. That’s how they can keep the ecosystem grounded and relevant.


Tom’s Take

New technology is like dating. You get excited and giddy about where things are going and all the potential you see. You enjoy spending time together just talking or existing. As you start to get more serious, you start to see issues crop up that need to be solved. Eventually you take the plunge and make things super serious. What you don’t want to have happen at this point is the trap that some people fall into. When you concentrate on the issues that crop up around things, you start to lose focus. It’s far too easy to think about bills and schools and other ancillary issues and lose sight of the reason why you’re together in the first place.

Ecosystems are like that. People start focusing on the ecosystem at the expense of the technology that brought everyone together in the first place. When you do that you forget about all the great things that happened in the beginning and you concentrate on the problems that have appeared and not the technology. In order to keep your ecosystem vibrant and relevant, you have to step back and remember the core technology from time to time.

Disruption in the New World of Networking

This is one of the most exciting times to be working in networking. New technologies and fresh takes on existing problems are keeping everyone on their toes when it comes to learning new protocols and integration systems. VMworld 2013 served both as an announcement of VMware’s formal entry into the larger networking world and as notice to existing network vendors. What follows is my take on some of these announcements. I’m sure that some aren’t going to like what I say. I’m even more sure a few will debate my points vehemently. All I ask is that you consider my position as we go forward.

Captain Over, Captain Under

VMware, through their Nicira acquisition and development, is now *the* vendor to go to when you want to build an overlay network. Their technology augments existing deployments to provide software features such as load balancing and policy deployment. In order to do this and ensure that these features are utilized, VMware uses VxLAN tunnels between the devices. VMware calls these constructs “virtual wires”. I’m going to call them vWires, since they’ll likely be called that soon anyway. vWires are deployed between hosts to provide a pathway for communications. Think of it like a GRE tunnel or a VPN tunnel between the hosts. This means the traffic rides on the existing physical network but that network has no real visibility into the payload of the transit packets.
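
To illustrate that last point, here’s a small sketch (scapy assumed installed, addresses and VNI invented) of what a transit switch can and cannot see once traffic rides a vWire.

```python
# From the underlay's point of view, every vWire looks like UDP/4789 between
# two host addresses; the tenant conversation inside is invisible without decapsulation.
from scapy.all import Ether, IP, UDP
from scapy.layers.vxlan import VXLAN

pkt = (Ether() /
       IP(src="192.0.2.1", dst="192.0.2.2") /          # host-to-host tunnel endpoints
       UDP(sport=51234, dport=4789) /
       VXLAN(vni=7001) /
       Ether() / IP(src="10.1.1.10", dst="10.1.1.20")) # the actual tenant traffic

outer = pkt[IP]                  # first IP header = what the physical network classifies on
inner = pkt.getlayer(IP, 2)      # tenant header, hidden inside the tunnel
print("underlay sees :", outer.src, "->", outer.dst, "udp/4789")
print("hidden inside :", inner.src, "->", inner.dst)
```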

Nicira’s brainchild, NSX, has the ability to function as a layer 2 switch and a layer 3 router as well as a load balancer and a firewall. VMware is integrating many existing technologies with NSX to provide consistency when provisioning and deploying a new software-based network. For those devices that can’t be virtualized, VMware is working with HP, Brocade, and Arista to provide NSX agents that can decapsulate the traffic and send it to a physical endpoint that can’t participate in NSX (yet). As of the launch during the keynote, most major networking vendors are participating with NSX. There’s one major exception, but I’ll get to that in a minute.

NSX is a good product. VMware wouldn’t have released it otherwise. It is the vSwitch we’ve needed for a very long time. It also extends the ability of the virtualization/server admin to provision resources quickly. That’s where I’m having my issue with the messaging around NSX. During the second-day keynote, the CTOs on stage said that the biggest impediment to application deployment is waiting on the network to be configured. Note that this is my paraphrasing of their intent. In order to work around the lag in network provisioning, VMware has decided to build a VxLAN/GRE/STT tunnel between the endpoints and eliminate the network admin as a source of delay. NSX turns your network into a fabric for the endpoints connected to it.

Under the Bridge

I also have some issues with NSX and the way it’s supposed to work on existing networks. Network engineers have spent countless hours optimizing paths and reducing delay and jitter to provide applications and servers with the best possible network. Now, none of that matters. vAdmins just have to click a couple of times and build their vWire to the other server, and all that work on the network is for naught. The underlay network exists to provide VxLAN transport. NSX assumes that everything working beneath it is running optimally. No loops, no blocked links. NSX doesn’t even participate in spanning tree. Why should it? After all, that vWire ensures that all the traffic ends up in the right location, right? People would never bridge the networking cards on a host server. Like building a VPN server, for instance. All of the things that network admins and engineers think about with regard to keeping the network from blowing up due to excess traffic are handwaved away in the presentations I’ve seen.

The reference architecture for NSX looks pretty. Prettier than any real network I’ve ever seen. I’m afraid that suboptimal networks are going to impact application and server performance now more than ever. And instead of the network using mechanisms like QoS to battle issues, those packets are now invisible bulk traffic. When network folks have no visibility into the content of the network, they can’t help when performance suffers. Who do you think is going to get blamed when that goes on? Right now, it’s the network’s fault when things don’t run right. Do you think that moving the onus for server network provisioning to NSX and vCenter is going to forgive the network people when things go south? Or are the underlay engineers going to take the brunt of the yelling because they are the only ones that still understand the black magic outside the GUI drag-and-drop used to create vWires?

NSX is for service enablement. It allows people to build network components without knowing the CLI. It also means that network admins are going to have to work twice as hard to build resilient networks that work at high speed. I’m hoping that means TRILL-based fabrics are going to take off. Why use spanning tree now? Your application and service network sure isn’t. No sense adding any more bells and whistles to your switches. It’s better to just tie them into spine-and-leaf Clos fabrics and be done with it. It now becomes much more important to concentrate on the user experience. Or maybe the wireless network. As long as at least one link exists between your ESX box and the edge switch, let the new software networking guys worry about it.
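
For a sense of what that fabric buildout implies, here’s a quick sketch of the leaf-spine math under some assumed port counts and speeds. The numbers are illustrative only, not a recommendation for any product.

```python
# Rough leaf-spine (Clos) sizing: every leaf uplinks to every spine, so any
# two hosts are at most two hops apart.
SPINES = 4
LEAVES = 8
HOST_PORTS_PER_LEAF = 48        # assumed 48 x 10GbE host-facing ports per leaf
HOST_PORT_GBPS = 10
UPLINK_GBPS = 40                # assumed 40GbE uplink from each leaf to each spine

fabric_links = SPINES * LEAVES                       # total leaf-to-spine links
host_ports = LEAVES * HOST_PORTS_PER_LEAF
oversub = (HOST_PORTS_PER_LEAF * HOST_PORT_GBPS) / (SPINES * UPLINK_GBPS)

print(f"leaf-spine links : {fabric_links}")
print(f"host ports       : {host_ports}")
print(f"oversubscription : {oversub:.1f}:1 per leaf")   # 3.0:1 with these numbers
```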

The Recumbent Incumbent?

Cisco is the only major networking manufacturer not publicly on board with NSX right now. Their CTO Padma Warrior has released a response to NSX that talks about lock-in and vertical integration. Still others have released responses to that response. There’s a lot of talk right now about the war brewing between Cisco and VMware and what that means for VCE. One thing is for sure – the landscape has changed. I’m not sure how this is going to fall out on both sides. Cisco isn’t likely to stop selling switches any time soon. NSX still works just fine with Cisco as an underlay. VCE is still going to make a whole bunch of money selling vBlocks in the next few months. Where this becomes a friction point is in the future.

Cisco has been building APIs into their software for the last year. They want to be able to use those APIs to directly program the network through things like the forthcoming OpenDaylight controller. Will they allow NSX to program them as well? I’m sure they would – if VMware wrote those instructions into NSX. Will VMware demand that Cisco use the NSX-approved APIs and agents to expose network functionality to their software network? They could. Will Cisco scrap onePK to implement NSX? I doubt that very much. We’re left with a standoff. Cisco wants VMware to use Cisco’s tools to program Cisco networks. VMware wants Cisco to use the same tools as everyone else and make the network a commodity compared to the way it is now.

Let’s think about that last part for a moment. Aside from some speed differences, networks are largely going to look identical to NSX. It won’t care if you’re running HP, Brocade, or Cisco. Transport is transport. Someone down the road may build some proprietary features into their hardware to make NSX run better, but that day is far off. What if a manufacturer builds a switch that is twice as fast as the nearest competition? Three times? Ten times? At what point does the underlay become so important that the overlay starts preferring it exclusively?


Tom’s Take

I said a lot during the Tuesday keynote at VMworld. Some of it was rather snarky. I asked about full BGP tables and vMotioning the machines onto the new NSX network. I asked because I tend to obsess over details. Forgotten details have broken more of my networks than grand design disasters. We tend to fuss over the big things. We make more of someone that can drive a golf ball hundreds of yards than we do of the one that can consistently sink a ten-foot putt. I know that a lot of folks were pre-briefed on NSX. I wasn’t, so I’m playing catch up right now. I need to see it work in production to understand what value it brings to me. One thing is for sure – VMware needs to change the messaging around NSX to be less antagonistic towards network folks. Bring us into your solution. Let us use our years of experience to help rather than making us seem like pariahs responsible for all your application woes. Let us help you help everyone.

VMware Partner Exchange 2013

Having been named a vExpert for 2012, I’ve been trying to find ways to get myself involved with the virtualization community. Besides joining my local VMware Users Group (VMUG), there wasn’t much success. That is, until the end of February. John Mark Troyer (@jtroyer), the godfather of the vExperts, put out a call for people interested in attending the VMware Partner Exchange in Las Vegas. This would be an all-expenses-paid trip from a vendor. Besides attending a presentation and having a one-on-one engagement with them, there were no other restrictions about what could or couldn’t be said. I figured I might as well take the chance to join in the festivities. I threw my name into the hat and was lucky enough to get selected!

Most vendors have two distinctly different conferences throughout the year. One is focused on end users and customers and usually carries much more technical content. For Cisco, this is Cisco Live. For VMware, this is VMworld. The other conference revolves around existing partners and resellers. Instead of going over the gory details of vMotion or EIGRP, it focuses on market strategies and feature sets. That is what VMware Partner Exchange (VMwarePEX) was all about for me. Rather than seeing CLI and step-by-step config guides to advanced features, I was treated to a lot of talk about differentiation and product placement. This fit right in with my new-ish role at my VAR, which is focused toward architecture and less on post-sales technical work.

The sponsoring vendor for my trip was tried-and-true Hewlett Packard. Now, I know I’ve said some things about HP in the past that might not have been taken as glowing endorsements. Still, I wanted to look at what HP had to offer with an open mind. The Converged Application Systems (CAS) team specifically wanted to engage me, along with Damian Karlson (@sixfootdad), Brian Knudtson (@bknudtson), and Chris Wahl (@chriswahl), to observe and comment on what they had to offer. I had never heard of this group inside of HP, which we’ll get into a bit more in a second.

My first real day at VMwarePEX was a day-long bootcamp from HP that served as an introduction to their product lines and how they place themselves in the market alongside Cisco, Dell, and IBM. I must admit that this was much more focused on sales and marketing than my usual presentation lineup. I found it tough to concentrate on certain pieces as we went along. I’m not knocking the presenters, as they did a great job of keeping the people in the room as focused as possible. The material was…a bit dry. I don’t think there was much that could have helped it. We covered servers, networking, storage, applications, and even management in the six hours we were in the session. I learned a lot about what HP had to offer. Based on my previous experiences, this was a very good thing. Once you feel like someone has missed your expectations, you tend to regard them with a wary eye. HP did a lot to fix my perception problem by showing they were a lot more than some wireless or switching product issues.

Definition: Software

I attended the VMwarePEX keynote on Tuesday to hear all about the “software defined datacenter.” To be honest, I’m really beginning to take umbrage at all this “software defined <something>” terminology being bandied about by every vendor under the sun. I think of it as the Web 2.0 hype of the 2010s. Since VMware doesn’t manufacture a single piece of hardware to my knowledge, of course their view is that software is the real differentiator in the data center. Their message no longer has anything to do with convincing people that cramming twenty servers into one box is a good idea. Instead, they now find themselves in a dog fight with Amazon, Citrix, and Microsoft on all fronts. They may have pioneered the idea of x86 virtualization, but the rest of the contenders are catching up fast (and surpassing them in some cases).

VMware has to spend a lot of their time now showing the vision for where they want to take their software suites. Note that I said “suite,” because VMware’s message at PEX was loud and clear – don’t just sell the hypervisor any more. VMware wants you to go out and sell the operations management tools and the vCloud suite instead. Gone are the days when someone could just buy a single license for ESX or download ESXi and put it on a lab system to begin a hypervisor build-out. Instead, we now see VMware pushing the whole package from soup to nuts. They want their user base to get comfortable using the ops management tools and various add-ons to the base hypervisor. While the trend may be to stay hypervisor agnostic for the most part, VMware and their competitors realize that if you feel cozy using one set of tools to run your environment, you’ll be more likely to keep going back to them as you expand.

Another piece that VMware is really driving home is the idea of the hybrid cloud. This makes sense when you consider that the biggest public cloud provider out there isn’t exactly VMware-friendly. Amazon has a huge market share among public cloud providers. They offer the ability to convert your VMware workloads to their format. But there’s no easy way back. According to VMware’s top execs, “When a customer moves a workload to Amazon, they lose. And we lose them forever.” The first part of that statement may be a bit of a stretch, but the second is not. Once a customer moves their data and operations to Amazon, they have no real incentive to bring it back. That’s what VMware is trying to change. They have put out a model that allows a customer to build a private cloud inside their own datacenter and have all the features and functionality that they would have in Reston, VA or any other large data center. However, through the use of magic software, they can “cloudburst” their data to a VMware provider/partner in a public cloud data center to take advantage of processing surplus when needed, such as at tax time or when the NCAA tournament is taxing your servers. That message is also clear to me: Spend your money on in-house clouds first, and burst only if you must. Then, bring it all back until you need to burst again. It’s difficult to say whether or not VMware is going to have a lot of success with this model as the drive toward moving workloads into the public cloud gains momentum.

I also got the chance to sit down with the HP CAS group for about an hour with the other bloggers and talk about some of the things they are doing. The CAS group seems to be focused on taking all the pieces of the puzzle and putting them together for customers. That’s similar to what I do in the VAR space, but HP is trying to do that for their own solutions instead of forcing the customer to pay an integrator to do it. While part of me does worry that other companies doing something similar will eventually lead to the demise of the VAR, I think HP is taking the right tactic in their specific case. HP knows better than anyone else how their systems should play together. By creating a group that can give customers and integrators good reference designs and help us get past the sticky points in installation and configuration, they add a significant amount of value to the equation. I plan to dig into the CAS group a bit more to find out what kind of goodies they have that might make me a better engineer overall.


Tom’s Take

Overall, I think that VMwarePEX is well suited for the market that it’s trying to address. This is an excellent place for solution-focused people to get information and roadmaps for all kinds of products. That being said, I don’t think it’s the place for me. I’m still an old CLI jockey. I don’t feel comfortable in a presentation that has almost no code, no live demos, or even a glory shot of a GUI tool. It’s a bit like watching a rugby game. Sure, the action is somewhat familiar and I understand the majority of what’s going on. It still feels like something’s just a bit out of place, though. I think the next VMware event that I attend will be VMworld. With the focus on technical solutions and “nuts and bolts” detail, I think I’ll end up getting more out of it in the long run. I appreciate HP and VMware for taking the time to let me experience Partner Exchange.

Disclaimer

My attendance at VMware Partner Exchange was the result of an all-expenses-paid sponsored trip provided by Hewlett Packard and VMware. My conference attendance, hotel room, meals, and incidentals were paid in full. At no time did HP or VMware propose or restrict content to be written on this blog. All opinions and analysis provided herein and in any VMwarePEX-related posts are mine and mine alone.

Is It Time To Remove the VCP Class Requirement?

While I was at VMware Partner Exchange, I attended a keynote address. This in and of itself isn’t a big deal. However, one of the bullet points that came up in the keynote slide deck gave me a bit of pause. VMware is changing some of their VSP and VTSP certifications to be more personal and direct. Being a VCP, I wasn’t really impacted a whole lot. But I thought it might be time to tweet out one of my oft-requested changes to the certification program:

Oops. I started getting flooded with mentions. Many were behind me. Still others were vehemently opposed to any changes. They said that dropping the class requirement would devalue the certification. I responded as best I could in many of these cases, but the reply list soon outgrew the words I wanted to write down. After speaking with some people, both officially and unofficially, I figured it was high time I wrote a blog post to cover my thoughts on the matter.

When I took the VMware What’s New class for vSphere 5, I mentioned therein that I thought the requirement for taking a $3,000US class for a $225 test was a bit silly. I myself took and passed the test based on my experience well before I sat the class. Because my previous VCP was on VMware ESX 3 and not on ESX 4, I still had to sit in the What’s New course before my passing score would be accepted. To this day I still consider that a silly requirement.

I now think I understand why VMware does this. Much of the What’s New and Install, Configure, and Manage (ICM) coursework is hands-on lab work. VMware has gone to great lengths to build out the infrastructure necessary to allow students to spend their time practicing the lab exercises in the courses. These labs rival everything short of the CCIE practice lab pods I’ve seen. That makes the course very useful to all levels of students. The introductory people that have never really touched VMware get to experience it for real instead of just looking at screenshots in a slide deck. The more experienced users that are sitting the class for certification or perhaps to refresh knowledge get to play around on a live system and polish skills.

The problem is that investment in lab equipment is expensive. When the CCIE Data Center lab specs were released, Jeff Fry calculated the list price of all the proposed equipment, and it was staggering. Now think about doing that yourself. With VMware, you’re going to need a robust server and some software. Trial versions can be used to some degree, but to truly practice advanced features (like storage vMotion or tiering) you’re going to need a full setup. That’s a bit out of reach for most users. VMware addressed this issue by creating their own labs. The user gets access to the labs for the cost of the ICM or What’s New class.

How is VMware recovering the costs of the labs? By charging for the course. Yes, training classes aren’t cheap. You have to rent a room and pay expenses for your instructor, and even catering and food depending on the training center. But $3,000US is a bit much for ICM and What’s New. VMware is using those classes to recover the costs of lab development and operation. In order to be sure that the costs are recovered in the most timely manner, the metrics need to make sense for class attendance. Given the chance, many test takers won’t go to the training class. They’d rather study from online material like the PDFs on VMware’s site or use less expensive training options like TrainSignal. Faced with the possibility that students may elect to forego the expensive labs, VMware did what they had to do to ensure the labs would get used and the metrics worked out in their favor – they required the course (and labs) in order to be certified.

For those that say that not taking the class devalues the cert, ask yourself one question: why does VMware only require the class for new VCPs? Why are VCPs in good standing allowed to take the test with no class requirement and get certified on a new version? If all the value is in the class, then all VCPs should be required to take a What’s New class before they can get upgraded. If the value is truly in the class, no one should be exempt from taking it. For most VCPs, this is not a pleasant thought. Many that I talked to said, “But I’ve already paid to go to the class. Why should I pay again?” This just speaks to my point that the value isn’t in the class, it’s in the knowledge. Besides VMware Education, who cares where people acquire the knowledge and experience? Isn’t a home lab just as good as the ones that VMware built?

Thanks to some awesome posts from people like Nick Marus and his guide to building an ESXi cluster on a Mac Mini, we can now acquire a small lab for very little out-of-pocket. It won’t be enough to test everything, but it should be enough to cover a lot of situations. What VMware needs to do is offer an alternate certification requirement that takes a home lab into account. While there may be ways to game the system, you could require a VMware employee or certified instructor or VCP to sign off on the lab equipment before it will be blessed for the alternate requirement. That should keep it above board for those that want to avoid the class and build their own lab for testing.

The other option would be to offer a more “entry level” certification with a less expensive class requirement that would allow people to get their foot in the door without breaking the bank. Most people see the VCP as the first step in getting VMware certified. Many VMware rock stars can’t get employed in larger companies because they aren’t VCPs. But they can’t get their VCP because they either can’t pay for the course or their employer won’t pay for it. Maybe by introducing a VMware Certified Administration (VCA) certification and class with a smaller barrier to entry, like a course in the $800-$1000US range, VMware can get a lot of entry level people on board with VMware. Then, make the VCA an alternate requirement for becoming a VCP. If the student has already shown the dedication to getting their VCA, VMware won’t need to recoup the costs from them.


Tom’s Take

It’s time to end the VCP class requirement in one form or another. I can name five people off the top of my head who are much better at VMware server administration than I am and don’t have a VCP. I have mine, but only because I convinced my boss to pay for the course. Even when I took the What’s New course to upgrade to VCP5, I had to pull teeth to get into the last course before the deadline. Employers don’t see the return on investment for a $3,000US class, especially if the person they are going to send already has the knowledge shared in the class. That barrier to entry is causing VMware to lose out on the visibility that having a lot of VCPs can bring. One can only hope that Microsoft and Citrix don’t beat VMware to the punch by offering low-cost training or alternate certification paths. For those just learning or wanting to take a less expensive route, a Hyper-V certification in a world of commoditized hypervisors would fit the bill nicely. After that, the reasons for sticking with VMware become less and less important.