With version 8.x of Cisco’s Communications Manager (CallManager or CUCM) software, the capability to virtualize the OS in VMware is the most touted feature. Many people I talk to are happy to have this option, as VMware is quickly becoming an indispensable tool in the modern datacenter. The ability to put CUCM on a VM gives the server admins a lot more flexibility in supporting the software. However, some people I talk to about virtual CUCM say “So what?” Their arguments are that it’s only supported on Cisco hardware at the moment, or that it only supports ESXi, or even that they don’t see the utility of putting an appliance server on a VM. I’ve been thinking about the tangible reasons for virtualizing CUCM beyond the marketing stuff I keep seeing floating around that involves words like flexibility, application agility, and so on.
1. Platform Independence – A key feature of putting CUCM in a VM is the ability to divorce the OS/Application from a specific hardware platform. Anyone who has tried to install CUCM on a non-MCS server knows the pain of figuring out the supported HP/IBM hardware. Cisco certified only certain server models to run CUCM. This means that if the processor in your IBM-purchased server is 200 MHz faster than the one quoted on the specs, your CUCM installation will fail. It also means that Cisco has a hard time buying servers when they OEM them from IBM or HP. Cisco has to buy a LOT of servers with the exact same specifications. Same processor, same RAM, same hard disk configurations. Moving to new technology when it’s available becomes difficult, as the hardware must be certified for use with the software and then moved into the supply chain. Look at how long it has taken to get an upgraded version of the 7835 and 7845 servers. Those are the workhorses of large CUCM deployments, and they have only been revised 3 times since their introduction years ago.
Now, think about virtualization. Since you’ll be using the same OVA/OVF templates every time to create your virtual machines, you don’t need to worry about ensuring the same processor and RAM in each batch of hardware purchases. You get that from the VM itself. All you need to do is define what virtual hardware you are going to need. After that, the only thing left to worry about is certifying the hardware underneath the VM. Luckily, VMware has taken care of that for you. They certify hardware to run their ESX/ESXi software, so all you need to do as a vendor like Cisco is tell the users what their minimum supported specs are supposed to be.
For those of you who claim that this is garbage since vCUCM is only supported on Cisco hardware right now, think about the support scenario from Cisco’s perspective. Would you rather have your TAC people troubleshooting software issues on a small set of mostly-similar hardware while they work out the virtualization bugs? Or do you want to slam your TAC people with every conceivable MacGyver-esque config slapped together for a lab setup? Amusingly, one of those sounds a whole lot more like Apple’s hardware approach, and the other sounds a lot like Microsoft’s approach. Which support system do you like better? I have no doubt that the ability to virtualize CUCM on non-Cisco hardware will be coming sooner rather than later. And when it does, it will give Cisco a great opportunity to position CUCM to quickly adapt to changing infrastructures and eliminate some of the supply chain and ordering issues that have plagued the platform for the last year or so. It also makes it much easier to redeploy your assets quickly in case of strategic alliance dissolution.
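To make the difference between those two certification models concrete, here is a minimal sketch in Python. It is purely illustrative and is not how the CUCM installer or VMware actually validates hardware. The `HardwareSpec` type, both check functions, and every number are hypothetical, chosen only to mirror the “200 MHz faster” example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareSpec:
    """Hypothetical description of a server or a VM template."""
    cpu_mhz: int
    ram_gb: int
    disk_gb: int


def exact_match_ok(certified: HardwareSpec, actual: HardwareSpec) -> bool:
    """Bare-metal style check: every value must equal the certified model."""
    return certified == actual


def minimum_spec_ok(template: HardwareSpec, host: HardwareSpec) -> bool:
    """Virtualized style check: the host only needs enough capacity
    to provide the virtual hardware the template asks for."""
    return (
        host.cpu_mhz >= template.cpu_mhz
        and host.ram_gb >= template.ram_gb
        and host.disk_gb >= template.disk_gb
    )


# Made-up numbers: a certified spec and a server that is 200 MHz faster.
certified = HardwareSpec(cpu_mhz=2400, ram_gb=4, disk_gb=146)
faster_server = HardwareSpec(cpu_mhz=2600, ram_gb=4, disk_gb=146)

bare_metal_result = exact_match_ok(certified, faster_server)
virtual_result = minimum_spec_ok(certified, faster_server)
print(bare_metal_result, virtual_result)
```

Under the exact-match rule the slightly faster server is rejected, while under the minimum-spec rule it passes. That is the whole appeal of letting the VM template define the hardware.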
2. Failover / Fault Tolerance – Firstly, vMotion is NOT supported on vCUCM installations today. Part of the reason is that the call quality of a cluster can’t be confirmed to be 100% reliable when a CUCM server has 100 calls going out of an MGCP gateway and suddenly vMotions to a cluster on the other side of a datacenter WAN link. My own informal lab testing says that you CAN vMotion a CUCM VM. It’s just not supported or recommended. Now, once the bugs have been worked out of that particular piece of technology, think about the ramifications. I’ve heard some people tell me they would really like to use CUCM in their environments, but because the Publisher / Subscriber model doesn’t support 100% uptime in a failover scenario, they just can’t do it. With vMotion and HA handling the VMs, hardware failures become far less of an issue. If an ESXi server is about to go down for maintenance or because of a faulty hard disk, the publisher can be moved without triggering a subscriber failover. Likewise, if the ESXi system housing the publisher gets hosed, the publisher can be failed over to another system with minimal impact. I don’t see a change to the Pub/Sub model coming any time soon, but the impact of having an offline publisher is greatly reduced when you can rely on other mechanisms to ensure that the system is up.
Another thing to think about is the fault tolerance of the hardware itself. Normally, we have an MCS server with two power supplies and a RAID 1 setup, along with one or two NICs. Now, think about the typical server used for virtualization in a datacenter. Multiple power supplies, multiple NICs, and if there is onboard storage, it’s usually RAID 5 or better. In many cases, the VMs are stored on a very fault-tolerant SAN. Those hardware specs are worlds better than any you’re ever going to be able to achieve with MCS hardware. I’d feel more comfortable having my CUCM servers virtualized on that kind of hardware even without vMotion and HA.
3. True appliance behavior – A long time ago, CallManager used to be a set of software services running on top of an operating system. Of course, that OS was Windows 2000, and it was CallManager version 3.x and 4.x. Eventually, Cisco moved away from the Services-on-OS model and went to an appliance solution. Around the 6.x release time frame, I heard some strong rumors that Cisco was going to look at abstracting the services portion of CUCM from the OS and allowing that package to run on just about anything. Alas, that plan never really came to fruition. The appliance model works well for things like CUCM and Unity Connection, so the hassle of porting all those services to run on Windows and Solaris and MacOS was not really worth it. Flash forward to the present day. By allowing CUCM to run in a VM, we’ve essentially created a service platform divorced from a customer’s OS preference. In CUCM, the OS really acts as a hardware controller and a way to access the database. As far as server admins and voice people are concerned, the OS might as well not exist. All we’re concerned about is the web interface to configure our phones and gateways. There has been grousing in the past from the server people when the VoIP guys want to walk in and drop a new server down that consumes power and generates heat in their perfectly designed datacenter. Now that CUCM can be entirely virtualized, the only cost is creating a new VM from an OVF template and letting the VoIP people load their software. After that, it simply serves as an application running in the VMware cloud. This is what Cisco was really going after when they said they wanted to make CUCM run as a service. Little to no impact, and able to be deployed quickly.
Those are my thoughts about CUCM virtualization. I think this is a bold step forward for Cisco. Once they get up to speed by allowing us to do the things we take for granted with virtualization, like running on any supported hardware and using vMotion/HA, the power of a virtualized CUCM model will allow us to do some interesting things going forward. No longer will we be bound by old hardware or application loading limitations. Instead, we can concentrate on the applications themselves and all the things they can do for us.