Changing CallManager’s IP Address

Network renumbering happens from time to time.  You outgrow a segment or buy a company and need to readdress things.  Or you start doing that new IPv6-thingy and need to renumber IPv4 to make more sense.  In any case, you go through everything that is attached to the network and make your changes.  The routers act just fine.  The switches couldn’t care less.  Even the toaster is still churning out perfect slices.  And then you get to CallManager.  As soon as you type in your password to the web administration page, ominous organ music starts playing in the background after a thunderclap.  You start to get the feeling that maybe this isn’t such a good idea.

There’s a good reason for that.  CallManager is very dependent on using IP addresses to communicate with the rest of the CUCM cluster (you did cluster your call processors, didn’t you???).  In fact, Cisco’s best practice is to use IP for communication with the cluster as opposed to relying on DNS.  You have to make the choice during the platform installation, and the only way to change is to completely reload the system.  For the purposes of this post, I’m going to assume you followed best practice and are using IP addresses for your communications.  When the time comes to change the IP addresses of your cluster members, you shouldn’t fret about the complexities.  That is, provided you have some patience and some familiarity with the CUCM command line interface.

1.  Make sure your cluster is healthy. This can’t be stressed enough.  If you’ve got database replication issues or network instability, this whole process is going to suck like a graviton-powered Hoover.  SSH to the publisher using your favorite SSH client and log in using the platform administrator login.  Once there, run this command:

show perf query class "Number of Replicates Created and State of Replication"

It should come back with the number of replicates created and what the replicate state is.  The state should say “2”.  This means your cluster is healthy and replicating like it should.  You could also see a “0” here if you only have one publisher and no subscribers.  This would be the case if you’re running a Business Edition (CUCMBE) all-in-one system.  Next, you should check the network connectivity with this command:

utils diagnose module validate_network

This command should return a status of “passed”.  If both of those commands succeed with no errors, your cluster is healthy and ready for the next step.
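If you end up doing this on more than one cluster, the two checks above boil down to a simple go/no-go decision.  Here is a minimal Python sketch of that decision.  It isn't anything Cisco ships: the function name and its arguments are made up for illustration, and you would read the values off the CLI output yourself.  It only encodes what is described above: a replicate state of 2 is healthy, a 0 is acceptable only on a single-node system, and the network validation has to come back as “passed”.

```python
def cluster_ready(replicate_states, network_status):
    """Return True when the cluster looks healthy enough to renumber.

    replicate_states: one replicate-state integer per node, read by hand
                      from the 'show perf query class' output (hypothetical input).
    network_status:   the result word reported by
                      'utils diagnose module validate_network'.
    """
    if not replicate_states:
        return False  # nothing was checked, so don't proceed
    single_node = len(replicate_states) == 1
    for state in replicate_states:
        if state == 2:
            continue  # replicating like it should
        if state == 0 and single_node:
            continue  # lone publisher with no subscribers
        return False
    return network_status.strip().lower() == "passed"


print(cluster_ready([2, 2, 2], "passed"))
print(cluster_ready([2, 0], "passed"))
```

The first call describes a healthy three-node cluster.  The second has a subscriber reporting 0, which is a reason to stop and fix replication first.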

2.  Change the Subscriber Server IP in CUCM Administration. This is where my first IP change attempt failed spectacularly.  You need to change the IP address of the server in the CUCM Admin page before you do anything at the OS level.  This is due to the fact that the database services rely on the IP for a great number of things, and changing the IP at the OS level without changing the database first will cause the DB services to fail to start when you reboot. You need to change the subscribers before the publisher in order to maintain consistency.  First, log in to the administration page, then go to System –> Server and select the node name for the subscriber you want to start with.  Change the IP address on this page to reflect the subscriber’s new IP address.  At this point, stop what you are doing and go get some coffee.  Just walk away from the PC for a few minutes.  While you’re up, I like my coffee with sugar and cream.

Back already?  And, I see you didn’t bring me any coffee.  Oh well.  We need to verify that the database has replicated the changes across the cluster before we start monkeying around with the OS configuration.  SSH into the publisher and run this command:

run sql select name,nodeid from ProcessNode

You should see the new IP address in the output table.  Now we need to actually change the subscriber’s IP.
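If you save the output of that query to a text file, the “is the new address there yet?” check is easy to script.  The Python sketch below is only an illustration.  The helper name is made up, and the sample output is invented to show the idea, using documentation-range addresses; the layout on your system may differ.  It matches whole fields, so an address like 192.0.2.5 doesn’t accidentally match 192.0.2.50.

```python
def ip_in_output(cli_output, new_ip):
    """True if new_ip appears as a whole whitespace-separated field."""
    for line in cli_output.splitlines():
        if new_ip in line.split():
            return True
    return False


# Invented sample output, for illustration only.
sample = """
name        nodeid
=========== ======
192.0.2.10  1
192.0.2.50  2
"""

print(ip_in_output(sample, "192.0.2.50"))
print(ip_in_output(sample, "192.0.2.5"))
```

If the check comes back false, go get another coffee and run the query again before touching the OS.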

3.  Change the Subscriber IP address in the OS. Fire up that SSH client again and connect to the subscriber you just changed in the publisher admin page.  Don’t worry, the address is still the same at the OS level, so you’ll be able to connect.  Once on the command line, you need to change the IP.  I recommend doing it this way to be sure the changes stick and make it a little easier to reboot the system afterwards.  If you’re moving the server to a different subnet, make sure to change the default gateway first like this:

set network gateway x.x.x.x

After that, or if you didn’t need to change the gateway, you need to change the IP address for the server.

set network ip eth0 x.x.x.x y.y.y.y

When you complete this command, you’re going to get a warning popup:

***   W A R N I N G   ***

If there are IP addresses (not hostnames)
configured in CallManager Administration
under System -> Servers
then you must change the IP address there BEFORE
changing it here or call processing will fail.
This will cause the system to restart
=======================================================
Note: To recognize the new IP address all nodes within
the cluster will have to be manually rebooted.
=======================================================

You can safely ignore this warning because you’re following these steps and you’ve already done this.  Type “Yes” at the prompt to force the subscriber to reboot.  Now would be a great time to enjoy that coffee.  It’s going to take a little longer than usual for the subscriber to come back up due to the changes that need to be made to the Tomcat server and the DB server.  At this point, if this is the only change you are making, Cisco recommends you reboot all the other cluster servers to change the name resolution and hosts file entries.  If you’ve still got more servers to do, go ahead and keep changing the subscribers as above.  Make sure the database replicates with the new IP before changing the OS IP address.  If your cluster is configured correctly, the phones should fail over to the active subscribers as you take down the other subscribers for the IP changes.  Once you’ve completed the subscriber IP changes, you’re almost done.

4.  Change the Publisher Server IP Addresses. The above steps are still correct for the publisher server, except for one added step.  After you’ve returned with your fifth cup of coffee following the DB replication check, you need to change the publisher IP on all the subscribers.  Go to the OS admin page of each one and go to Settings –> IP –> Publisher.  Once there, change the IP address of the Publisher server to the new IP you are going to set on the publisher.  If you happen to be a CLI junkie, you can do this from the comfort of your SSH program provided you’re on CUCM 6.1(2) or later with this command:

set network cluster publisher ip x.x.x.x

Be warned that you’ll need to reboot immediately after typing in that command.  Once that’s taken care of on all the subscribers, you can proceed to change the IP address in the OS admin of the publisher and then reboot.  After the publisher comes back up, communication should be restored and all the ominous organ music and thunderclapping in the background should stop.
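Since the ordering is the part that bit me, here is a rough Python sketch that turns the steps above into a printable checklist: subscribers first, publisher last, and the database change always ahead of the OS change.  The function and the sample names and addresses are hypothetical.  The commands in the strings are the ones used in this post, and the optional gateway change is left out to keep it short.  It doesn’t run anything against CUCM; it just prints the order.

```python
def renumber_plan(subscribers, publisher):
    """Build an ordered checklist of renumbering steps.

    subscribers: list of (name, new_ip, new_mask) tuples
    publisher:   a single (name, new_ip, new_mask) tuple
    """
    check = "run sql select name,nodeid from ProcessNode"
    steps = []
    # Subscribers go first, one at a time: database, verify, then OS.
    for name, ip, mask in subscribers:
        steps.append(f"{name}: change IP to {ip} under System -> Server in CUCM Administration")
        steps.append(f"{name}: on the publisher, confirm replication with: {check}")
        steps.append(f"{name}: set network ip eth0 {ip} {mask}  (node reboots)")
    # The publisher follows the same pattern, plus one extra step on every subscriber.
    pub_name, pub_ip, pub_mask = publisher
    steps.append(f"{pub_name}: change IP to {pub_ip} under System -> Server in CUCM Administration")
    steps.append(f"{pub_name}: confirm replication with: {check}")
    for name, _ip, _mask in subscribers:
        steps.append(f"{name}: set network cluster publisher ip {pub_ip}  (reboot afterwards)")
    steps.append(f"{pub_name}: set network ip eth0 {pub_ip} {pub_mask}  (node reboots)")
    return steps


plan = renumber_plan(
    [("sub1", "192.0.2.11", "255.255.255.0")],
    ("pub", "192.0.2.10", "255.255.255.0"),
)
for number, step in enumerate(plan, start=1):
    print(number, step)
```

Print it, tape it to the monitor, and cross the steps off as you go.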

Should you find yourself unlucky enough to use DNS for cluster identification, you’re going to want to start putting some Irish Crème in that coffee.  You’ll also want to refer to this page to get some information about the additional steps required to sort out the DNS mess on your system as you start changing IPs.

And there you have it.  With a little patience, a few cups of coffee, and a little CLI wizardry, even the mean, nasty CallManager can get a new IP address.  Just be sure to check the database replication after changing the IP in the system administration page.  Because if the DB changes haven’t replicated before you start changing OS settings, you’re going to have a bunch of fun getting things back to a good running state before you try again.  Good luck, and may the voice be with you.

The Recertification Treadmill

I like tests.  Probably a lot more than I should.  Oh, it wasn’t always like this.  I dreaded test days in college.  Cramming chapters worth of information into my brain so that it could just be regurgitated later and forgotten shortly after that.  In fact, I can distinctly remember studying the OSI model for one of my IT infrastructure classes and thinking, “I only need to remember this for the exam.  After that, I’ll never see it again.”  Of course, that same OSI model is now permanently tattooed on the insides of my eyelids.

Then I entered the Real World.  I found out about certification tests and all they entail.  You mean I can take one test proving my mastery of a subject and you guys send me a certificate and a little wallet card?  Sign me up!  It also helped that my employer is a partner with multiple vendors and needed me to take as many tests as I could to keep their partner status up-to-date.  So I set off on my odyssey of test taking.  I’ve got certifications from Novell, Microsoft, CompTIA, Cisco, HP, (ISC)2, and many more.  I’ve taken enough tests that the test administrator at my local testing center recognizes me in the street.  I know more about the ins and outs of testing procedure than most people should.  And, I’ve been handsomely rewarded for my test taking prowess.  And, for the most part, I’ve enjoyed every second of my learning.  Except for recert day.

Yes, every once in a while one of the vendors sends me a note that says I’m due for renewal.  My professional title is now in jeopardy if I don’t study some new information and go see my local Pearson/Prometric guru.  So I start poring over material in an effort to not need new business cards.  I cram all that new information in my stuffed head and run out to take the test again.  And I pass.  And for a while, I’m a golden boy again.  Until recert day comes up again.

Some vendors tell you that you can keep your certification for ever and ever.  Like my MCSE.  Of course, I’m not technically “current” with that one, especially now that the new title is MCITP (or something like that).  So, while I’m a whiz when it comes to Windows 2000, I’m not really authorized on the new hotness of Server 2008.  Oh well.  Other vendors, like Cisco, keep the same certification title, but they change the tests around from time to time.  Like my CCVP.  I originally certified on CUCM 4.1.  Back when there was a separate test for those gatekeeper thingies.  And then Cisco went and released a new CCVP track about CUCM 6.x.  I didn’t have to recertify because my CCVP was still good.  But now, they have eliminated the CCVP and changed the voice certification track to the CCNP: Voice.  You can still take the CCVP tests and get grandfathered in before the change to the new CUCM 8.x material if you want.  And that’s what I found myself doing about 2 months ago.  I figured since I worked with voice every day it shouldn’t be too rough to just jump in and take the tests.  My reasoning was that the partner requirements for Advanced Unified Communications would change after the CCVP –> CCNP: Voice move, so I wanted to get out in front of this change before I was forced to.  I managed to stumble through the troubleshooting test and both CallManager tests in fairly short order.  As I brushed up on my CVOICE basics, I remembered that a previous visit to the Certification Tracker showed that I hadn’t taken the QoS exam, even though I distinctly remembered the pain and agony of that one.  I wrote in to Cisco Cert Support, hoping that I didn’t have to go through it all over again.  While I kept studying for my CVOICE test, I got the response.  It seems that those tests expire after 3 years, and I would need to retake it for it to be valid.  However, according to Cisco, I was already a CCNP: Voice, so I wouldn’t need to retake it.  Huh?  When did that happen?

Cisco’s recertification policy for professional level exams says that taking any professional test with a ‘642’ prefix will recertify your CCxP.  Little did I know at the time that my first test, Troubleshooting Unified Communications, had recertified my CCVP and triggered the upgrade to a CCNP:Voice.  So, the CUCM tests were for naught.  The CVOICE test did give me a CCNA: Voice tag, so I’ve got that going for me now.  The Cisco recert cycle is nothing new to me.  I’ve been taking the CCIE written exam every year because it’s the only way to keep my specialist designations current.  In order to keep my employer in the good partner graces, I have to keep remembering OSPF and MPLS trivia and take the CCIE written at least every two years.  It’s the only way for me to keep my certifications current without devoting all my time to studying and taking tests instead of doing the job I was hired for.  I was confused in this particular instance with the CCNP: Voice because the certification website never said anything about there being an upgrade path from my 4.2 CCVP to the 8.x CCNP: Voice.  I’m happy nonetheless, but I started thinking about the whole recertification process and why it bothers me somewhat.

I can take any 642 level Cisco exam and recertify all my CCxA and CCxP titles.  I can take the CCIE written and do the same, including my specialist tags.  VMware makes me take a new test and sit through 5 days of training to get a VCP4.  Microsoft wants me to take a whole new set of tests to become a new MCSE/MCITP.  Novell just keeps certifying me on Linux stuff even though I haven’t taken a Novell test in years.  And we won’t talk about HP.  Ethan has a great post about recerting his CCIE that hits on a lot of good points.  Normally, we have to either shut down our productivity for a few weeks to get into the recertification groove, or try and find time outside of work to study.  Either way, it seems like a colossal waste of time. It’s almost like being elected to the House of Representatives.  You need to start campaigning for re-election right after you’ve been elected.  It’s just annoying that I have to take time out of my schedule to relearn things I’m already doing.  Is there any way to fix this?

Find a lawyer.  Any lawyer.  If you’re having trouble, check behind the nearest ambulance.  Now, ask them how many times they’ve retaken the bar exam.  Odds are good they’ll stare at you and tell you that you’ve lost your mind.  Lawyers don’t have to resit the bar exam every time they need to renew their fancy degree.  They are allowed to use Continuing Professional Education credits.  All they have to do is take a class or attend a conference and they can count that learning toward recertifying their degree and certification requirements.  IT people are the same.  We spend a lot of our time watching webcasts and going to trade shows.  I go to Cisco Live Networkers almost every year.  When I’m there, I take the opportunity to learn about technologies I don’t encounter in my everyday job, like TRiLL or FabricPath.  I’m doing an awful lot to keep current with trends and technology in the industry, and it feels like it’s all for my own edification.  It doesn’t really count toward anything.  Except in one case – my CISSP.  Because (ISC)2 uses CPEs too.

The vendor-neutral certification bodies have it right, in my opinion.  (ISC)2, BICSI, and CWNP all have a CPE policy.  They say that you can go to conferences or read books and count that learning toward your certification.  They want you to prove that you’re staying current, and in return they’ll make sure you are current when it comes to certifications.  Sure, in the case of the CISSP, most of the learning needs to be focused on security, but that’s how it should be.  I can count some amount of general education credits toward my CISSP, but the bulk of the education needs to be focused on the subject matter of the certification.  I think something like this would be a great addition to Cisco’s arsenal.  Give your certified professionals a chance to apply the learning they do every day toward recertification.  You’d sell more Cisco Press books if I knew I could read one and count 5 points toward my CCSP.  There’d be even more attendees at Networkers if it counted for 40 CPEs every year.  But, there also need to be some restrictions.

Some vendors don’t like the idea that one test can recertify all your titles.  Juniper doesn’t.  So make sure that the education credits only count toward a specific area of knowledge.  The Migrating CUCM class from Brandon Ta that I go to every year could count toward my CCVP, but not my CCSP.  My TRiLL webcasts could count for points to recertify my CCIE R&S or SP, but not the CCIE Wireless.  If you marry the education to a specific certification, you’ll see much higher attendance for those kinds of things.  For people like us that spend time writing about things on the Interwebs, authoring articles for places like Network World or Information Week could count as well, since you are disseminating the knowledge you’ve obtained to the masses.  Even teaching could count toward recertifying.

This idea is not without issue, though.  The first argument is that allowing certified individuals to use CPEs might cause problems with the cottage industry that has sprung up around teaching these subjects to people.  Ask yourself, how many people would go to VMware classroom learning if it wasn’t required to obtain the VCP?  I’m sure the answer would be “A whole lot less.”  It’s no secret that Cisco and HP and Microsoft make a lot of money offering classes to people in order to get them certified on technology.  Companies can specialize in just teaching certification coursework and turn a tidy profit.  And these same companies might not be too keen on the idea of a revenue stream drying up because Cisco or Novell decided to be noble and not require everyone to take a new test every 2 years.

Another consequence, though one for the better, would be the contraction of the “braindump” market.  A lot of people talk about the braindump market catering to those who want a fast track to the CCNA or other entry-level cert.  I’m of the opinion that a larger portion of the dumping population consists of already-certified individuals that have neither the time nor the energy to study for a recertification exam.  These people are facing a deadline of needing to stay current with whatever alphabet soup comes after their name, except now that they have a steady job they don’t have the time to devote to studying all night to pass.  Faced with the option of letting their certification expire, or paying money to someone for the answers to the test, they swallow their pride and take the easy way out.  In their mind, no harm is done because they were already a CCxA in the first place.  They know the material, they just don’t have time to remember what the “vendor answer” is on the test.  Now, give these same people the opportunity to apply a webcast or vendor presentation that they’d sit through anyway to that CCxA.  I bet that more than half the dumping sites would go away within a year.  When the market starts drying up, it’s time to move on.

I really hope that the vendors out there take the time in 2011 to reassess their recertification strategies.  Giving certified professionals more options when it comes to proving they know their material can only build goodwill in the future.  Because the current method feels way too much like a treadmill right now.  I keep running in place as fast as I can just to stay where I’m at.  I think things need to change in order to make the education and learning that I do have a tangible impact on my certification progress.  Because sooner or later I’m not going to be able to keep up with the recertification treadmill.  And we all know what the result is when that happens…

COBRAS!

If you are a voice networking professional, and you are tasked with working on any Cisco voice mail product, do yourself a favor and go download the Cisco (Unified) Backup and Restore Application Suite (COBRAS).  You’ll thank me later.

If you’ve ever tried to upgrade Unity, you know the pain that comes from the good old Disaster Recovery Tool (DiRT).  Cisco’s best practice for upgrading Unity involves using DiRT to back up your existing database, installing a new server, installing the old version of Unity and performing a restore, then attempting to upgrade the new server to the new version of Unity.  Any one of those steps is fraught with danger and terror.  DiRT was never really designed to do upgrades.  It was only ever meant to restore your Unity configuration and data in the event of a meteor strike or alien invasion.  Other than those two corner cases, it pretty much sucks.  In case you couldn’t tell, I’m not a fan of DiRT.  It’s like magnetic tape.  It serves one purpose that it’s good for, but if you start getting creative you are asking for trouble.

When Cisco started pushing Unity Connection as a viable alternative to Unity, the need arose to find a way to get the data out of Unity and import it into Unity Connection.  This is not a job for DiRT.  The worst-case-scenario is that you need to pop up two web browsers and input the information from one system into the other.  Not a fun job if you have even 50 mailboxes.  A nightmare if you have a few thousand.  And, you can’t move user passwords and PINs.  A better solution needed to be found.  And, thanks to some enterprising TAC engineers, we have COBRAS.

COBRAS started its life as a very unsupported tool on the Cisco Unity Tools website.  Anyone that has worked on Unity for more than five minutes has been to this website.  At various points in its life, it has been referred to as AnswerMonkey.net and LindborgLabs.com.  My best guess is that it started as a repository run by some TAC engineers for the purpose of giving the long-suffering Unity support people a place to download tools and scripts to help get Unity working as properly as you can for a program that requires you to speak in tongues to fix it.  About three years ago, I had the good fortune to take a class at Cisco Live Networkers that dealt with the problem of migrating CallManager and Unity to newer versions.  I specifically took the class because I was about to face an unpleasant customer upgrade from CUCM 4.1 to 6.1.  At the same time, the customer wanted to move from Unity 4 to 5.  All of the official documentation I read said that the DiRT migration was the best way to move to new hardware.  Luckily, the class from Brandon Ta at Cisco Networkers pointed me in the direction of COBRAS.  Of course, when I went to download it from the Unity Tools website, the warning messages and dire predictions told me why I hadn’t seen it before – it wasn’t quite supported.  As in, it wasn’t supported at all.  Still, faced with the choice of something that I knew I wouldn’t like or the idea of trying something new that had little support, I bit the bullet and went with the new idea.  And boy did I like it.  COBRAS pulled all the Unity information from the old system in short order.  When I brought the new system online, a quick import into Unity 5 got the voicemails flowing again in no time.  Rather than spend hours waiting for the inevitable issues with DiRT restores, I could instead concentrate on cursing the Data Migration Assistant.  But that’s another story entirely.

Flash forward to this past month.  We’d been running Unity at our office for several years, and my users had become very dependent on unified messaging.  When the Windows admins decided to upgrade to Exchange 2007, they forgot to warn me about when they had planned on doing this.  So, when I walked in the next morning, the voicemail integration was offline.  It took a couple of hours before I was able to install the correct engineering special and repair all the Unity permissions with the scripts from the Grand Unity Grimoire.  I’ve known that an upgrade to Exchange 2010 is in the works at some point.  I’ve also grown tired of the difficulties that Unity presents.  I can only administer it from Internet Explorer.  The need to keep Active Directory healthy just for the sake of Unity was annoying.  The need to know intimate details about Exchange 2003 made me cringe.  Even my conversations with product managers from Cisco didn’t leave me all that excited about Unity.  But, I couldn’t move to Unity Connection just yet.  Because of the unified messaging issue.  My coworkers couldn’t fathom the idea of NOT getting voicemails in one inbox.  IMAP wasn’t going to cut it.  So I bided my time and plotted the demise of Unity.  When Cisco formally announced the feature set for Unity Connection 8.5, I jumped for joy.  Unity Connection 8.5 contains a unified messaging agent that allows you to synchronize your Unity Connection mailstore with Exchange 2003, 2007, and 2010.  My users would receive the same benefits as they had now with Unity, and I would get a unified management platform on the back end that was no longer soulbound with Exchange.

Once I got Unity Connection installed on a spare server, I needed to export 50 mailboxes worth of data from Unity.  I checked Cisco Unity Tools and found they had updated COBRAS to support the latest versions of Unity Connection, and in fact COBRAS was now supported by TAC!  It felt like this angel finally got its wings!  I downloaded the Unity Export and Connection Import programs and installed them.  The Unity Export program needs to be installed on Unity.  In fact, it’s only supported on Windows Server platforms.  The Connection Import program can be installed on any Windows system, but you also need to install the IBM Informix Database drivers to allow communication with the Unity Connection database directly.  I resorted to installing the program and database drivers in a Windows XP virtual machine, as my 64-bit Windows 7 installation has already shown an intolerance for drivers in general, which my USB-to-serial adapter will attest to.  Once I installed the programs, I exported all the configuration data from Unity in one shot.  It took all of 20 minutes.  I was shocked.  I had fully expected this whole export to eat up my entire afternoon.  When I went to export the voicemail messages from the database, I found that I needed to be logged in as the “UnityMsgStoreSvc” account to have access to that particular database.  Hopefully, you’ve got that account’s password jotted down somewhere in the deep, dark recesses of the documentation black hole.  The message export process took a little longer, mainly because there are some users on my system that have never deleted a voicemail.  In all, I exported 168 MB of WAV files into a backup folder, along with a database of the account configuration.

Now, to import into Unity Connection.  When you first fire up the Import program, you’ll be asked to pick the backup you wish to restore from.  You then have to navigate through a 68-step wizard.  Don’t worry, it’s not as bad as it sounds.  Many of the steps are verification of the Unity Connection configuration.  And there are many steps that get skipped depending on what kind of data you are importing.  It took me a couple of tries to get the messages imported due to some configuration issues (click here for those steps).  Once that was accomplished, everything went great.  I was able to mirror the Unity database onto the Unity Connection server.  I set up a separate voicemail profile in CUCM and awaited my cutover.  Just like expected, the actual cutover took about 10 minutes.  Once the call handlers and voicemail greetings were verified, everything was done.

I’m now ready to shut down the old Unity server and remove all the old voice mail profile information.  Once that’s done, Unity and I have a date in the parking lot.  I’m hoping to recreate something like this scene.  Fast forward to 1:09 for the good stuff, and be warned the audio has NSFW words.

In the end, I don’t think any of this would have been possible without the help of COBRAS.  It may not be a ruthless terrorist organization determined to rule the world, but I’m sure it would help them migrate their voicemail server.  And now you know, and as always knowing is half the battle.

Twelve Days of Christmas Networking

In the spirit of Christmas, and because my wife has made me listen to the song about 400 times so far this year, I present the Twelve Days of Christmas, Networking Nerd style.  To save you all the trouble of singing the whole song, we’ll just skip to day twelve.  On the Twelfth Day of Christmas, the Networking Nerd gave to me:

- Twelve character passwords

- Eleven 802.11n Access Points

- Ten Gigabit Ethernet

- Nine 9971 phones

- Eight-port switch blades

- Seven CCIE Tracks

- Six Hours in the CCIE Lab

- Five Magic Digits! (I hope…)

- Four-port FXOs

- Three Packet Pushers

- Two L2MP options

- And One Goal: To get my CCIE!

Special Thanks to JT (@WannabeCCIE) for giving me the idea for this.

Merry Christmas to all the folks out there.  May your holidays be filled with joy and caring.  May your families not drive you insane, and may your Christmas stocking be filled with all the goodies you asked Santa for.

 

What’s in a Title?

“A title by any other name would stink as bad.” –Okay, it’s not Shakespeare, but it’s close.

After my little engineering diatribe, I’ve been thinking of new titles that I can use besides engineer or rock star.  Because rock star makes you sound pretentious.  And I got really tired of waking up at 5 in the morning to feather my hair with a case of Aquanet.  I shy away from terms like “architect” and “champion” because they may sound cool, but they convey absolutely nothing about what I do.  So, I started making a list:

  • Director of Bailing People’s Asses Out of the Fire
  • Chief Google Search Officer (CGSO)
  • Vice President of Explaining Things to People that Don’t Understand Me
  • Executive Chairman of Just Buy What I Tell You and Don’t Ask Questions
  • President of Throwing Salesmen Under the Bus
  • Head of Deciphering TLAs
  • High Priest of Unity/Exchange Voodoo
  • Sergeant-at-Arms of Explaining Why Your Hare-Brained Idea Won’t Work
  • Chief Caffeine Consumer
  • Vice Regent of Solving Executive Problems RIGHT NOW!
  • Owner/Operator of I Told You So, INC

I think I’m going to need to get bigger business cards.

Feel free to leave some of your favorite titles in the comments.  Just make sure they are descriptive about what your job title is.  And for the love of all that’s holy, DON’T put “engineer”.

They Hackin Everybody Out Here

I’ve learned a couple of important lessons in my time as an Internet citizen.  First, don’t taunt the Internet Hate Machine known more colloquially as “Anonymous”.  Second, keep your passwords secure and complex and don’t use them for every website.  Should you do #1 and neglect #2, be certain that #1 will bite you in the ass.  As the people at Gawker Media learned this past week.

A group known as Gnosis posted a 500MB torrent containing various data pulled from a variety of Gawker Media websites.  They claimed the hack was due to Gawker’s hubris and their mocking of previous hacks.  There is also evidence to support the idea that some in Gawker may have taken a stance against the actions of Anonymous in their crusade against those that were involved in the Wikileaks debacle in early December.  While the file contains things like chat logs and FTP servers for various sites that probably don’t want them published, there was a singular gem amongst the chaff.  The most critical piece of this file is the dump of the Gawker MySQL database.  Gnosis was able to access the database and pull the table containing the list of user IDs and passwords.  According to the README.TXT contained in the torrent (and reposted across several websites), they decided to stop dumping the database after about 1.3 million users.  Gnosis then turned to using John the Ripper to crack the passwords, which were stored in the table as DES-based crypt hashes.  The good news is that Gawker decided to store the passwords in a non-plaintext format.  The bad news?  The traditional DES-based crypt scheme only uses the first eight characters of a password as its key (Check this out for more information).  That means that only the first eight characters of each password actually made it into the stored hash.  So, if you were diligent and created a super hard password like “passwordc4n7b3|-|4ck3d”, it would be stored exactly as if you had typed “password”.  So, armed with a password database, a sophisticated cracking tool, and a weak encryption algorithm, Gnosis set out to see what they could see.
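To see why the eight-character limit hurts so much, here is a small Python sketch of just the truncation step.  It does not compute a real hash, and the function name is made up for illustration.  It only shows which part of the password a traditional DES-based crypt() ever looks at: the first eight bytes, and only the low seven bits of each.

```python
def des_crypt_key_material(password):
    """Return the bytes a traditional DES-based crypt() builds its key from.

    Only the first eight bytes of the password are used (padded with zero
    bytes if it is shorter), and only the low seven bits of each byte.
    Everything after the eighth byte is ignored.
    """
    raw = password.encode("utf-8")[:8].ljust(8, b"\0")
    return bytes(b & 0x7F for b in raw)


short = des_crypt_key_material("password")
long_ = des_crypt_key_material("passwordc4n7b3|-|4ck3d")

print(short == long_)  # True: the extra characters never reach the hash
```

With the same salt, those two passwords produce the same stored value, so cracking one cracks the other.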

What did they find?  Well, for one, people violated my second rule by making some pretty easy-to-guess passwords.  Like “password”.  No kidding.  It was the second most popular password out of the bunch, with about 2,100 people out of the 300,000 released hashes using it.  What was more popular than that one?  How about “123456”?  More than 3,000 people used that one.  And the third most popular one was “12345678”.  For a full list of the most popular passwords, check out the Wall Street Journal Blog.

Guess what?  Those passwords SUCK!  Yes, they are easy to remember.  Yes, they’re slightly more secure than not having a password.  Guess what?  They’re also quite easy to guess.  Thanks to rainbow tables, it’s not hard to find the DES hash for “password”.  In fact, just so you know, it’s “uDGdyZA2EBdWk”.  Just search for that string in the database and you’ll find tons of accounts with unsecured passwords.  Because I know that everyone reading this knows how to make a secure password, I won’t patronize you with password policy.  But, just in case my mom ever decides to read this, a proper password includes ALL of these things:

  • At least EIGHT characters (the more, the better)
  • A number
  • A capital letter
  • A symbol
  • Non-obvious (see above for a list of some obvious stuff)
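
For the script-minded, the checklist above can be sketched as a small Python function.  This is only an illustration: the function name and the tiny OBVIOUS_PASSWORDS set are my own assumptions, and a real check would use a much longer list of common passwords.

```python
import string

# Hypothetical stand-in for a real list of common passwords.
OBVIOUS_PASSWORDS = {"password", "123456", "12345678"}


def password_problems(password: str) -> list[str]:
    """Return the checklist items a password fails (empty list = passes)."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than eight characters")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if not any(c.isupper() for c in password):
        problems.append("no capital letter")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    if password.lower() in OBVIOUS_PASSWORDS:
        problems.append("too obvious")
    return problems


print(password_problems("password"))
print(password_problems("Tr1cky!Horse"))
```

An empty list means the password clears every item on the checklist, which still says nothing about whether you have reused it elsewhere.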

If your password doesn’t meet those guidelines, it’s probably not that secure.  The longer and more complex the password, the more likely it is to stand up to a dictionary attack or brute force attempt.  However, even if you have a nice, complicated password, reuse of it all over the place can still get you in trouble, as the Gawker people found out on Monday.

Once the Gnosis people got finished having their way with the Gawker MySQL database, they took their hack to the next level.  They thought to themselves, “I wonder if these people use the same password everywhere?”  So, armed with a list of e-mail addresses and usernames and passwords, they started checking around.  Getting into Gmail and Yahoo mail accounts.  Logging into Twitter and Facebook.  Causing general chaos.  Like Twitter accounts randomly tweeting about acai berry products.  The first thought was a new URL-exploiting worm.  Then came the realization that a lot of people that were singing the praises of the lowly acai berry were victims of a hijack attack from people that had downloaded the torrent from the Gnosis hack.  Because these users had utilized the same password across multiple accounts, a security breach in one had exposed all of them.

In my opinion, Gawker’s response to the hack wasn’t quite as effective as it could have been.  They posted banners on all their websites advising users to change their passwords.  Except they had taken down the database for some time to patch the holes in it.  Which left their password reset mechanism offline.  What should have happened was an immediate, blanket password reset of EVERY account in the Gawker database.  Gawker already had their e-mail addresses, which would be used to mail the password after a manual reset.  It should be a simple matter to reset the password automatically and send off the new temporary password to the account in the database.  Instead, the users were forced to take the steps themselves or risk further exposure.  A little forethought and perhaps some heavy-handed security admin 101 might have gone a long way to restoring user faith in Gawker.
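
As a rough idea of how little code a blanket reset needs, here is a minimal Python sketch that generates a fresh temporary password for every address on file.  Everything here is an assumption for illustration: the function name, the sample addresses, and the 16-character length are mine, and storing the new hashes and actually sending the mail are left out.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def blanket_reset(emails, length=16):
    """Return a fresh random temporary password for every e-mail address.

    Hypothetical helper: a real system would also store a hash of each
    new password and mail the temporary one to the account owner.
    """
    return {
        email: "".join(secrets.choice(ALPHABET) for _ in range(length))
        for email in emails
    }


# Made-up addresses, purely for illustration.
resets = blanket_reset(["alice@example.com", "bob@example.com"])
for email, temp_password in resets.items():
    print(email, temp_password)
```

The secrets module is used instead of random because it draws from the operating system’s cryptographic random source, which is what you want for anything password-shaped.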

What we have here is a case of the perfect storm of an older system riddled with insecure passwords that was compromised by a determined foe and then exploited far beyond what anyone except the most pessimistic security expert could have imagined.  Hacks of this magnitude are becoming more and more common, and as we spend more and more time online the information exposure becomes worse each time.  It is quickly reaching the point where it will be necessary to start compartmentalizing our lives in order to keep ourselves secure.  Many people I know have instituted something like this already.  Sites like Facebook and LinkedIn get one type of password.  E-mail and banking sites get a totally different password that is more secure.  For IT professionals, keeping track of multiple passwords isn’t that difficult, especially with password management tools such as 1Password to help us keep our lives straight.  But, to be fair, IT professionals aren’t the true targets of these kinds of hacks.

IT professionals and technology-savvy people are hard targets.  We rotate passwords.  We make secure logins.  We’re always conscious of what information is being stored and shared.  We make lousy hack targets.  But, people like my mom that use the Internet for Facebook and e-mail and shopping are prime targets.  They make accounts on websites like the ones run by Gawker to make a comment on a story.  They use the same password that they use for their Yahoo Mail account and Facebook.  And when something like this comes along and upsets everyone’s apple cart, those people are the ones that suffer.  They aren’t walled off and sure of what information may have leaked.  And they aren’t sure of what passwords to change or when to do it.  And so they might find themselves on the news talking about getting hacked and all the doom and dismay that it has caused.  And who knows?  Maybe someone will autotune my mom into an Internet meme.  Let’s hope not.  Because if there’s anything worse in this world than password database leaks or FBI backdoors into IPSec, it’s listening to my mom sing, autotuned or not.

Stuxnet: Be Afraid

“Doesn’t that bother any of you? Because it scares the living piss outta me!” – Lloyd Bridges as Admiral Tug Benson

That pretty much sums up my feelings about the Stuxnet worm the more and more I read about it.  It seems like every week brings more and more dastardly information about this worm and its consequences for cyber warfare in general for the foreseeable future.  First, a refresher course for those that might not be totally familiar with this little gem.

Anatomy of a Scary Virus

A Belarusian security firm got its hands on a sample of a new worm in mid-June of 2010.  It was a Windows-based attack that seemed to be quite virulent from the very beginning.  More disturbing, however, was the complexity that lay just beneath the surface upon further examination.  Stuxnet targeted four separate zero-day exploits in Windows.  In the security arena, this is the equivalent of showing your hand too early in a poker game.  Zero-day exploits have great value on the black market for virus writers, so they tend to be hoarded and exploited only when a significant advantage can be had.  For a virus to use four of them at once meant that it was serious about infecting things.  Secondly, it installed a rootkit on the target system.  While this isn’t necessarily new in and of itself, the way it succeeded was brilliant.  The writers of the virus hijacked two signed security certificates from trusted manufacturers JMicron and Realtek.  This meant that the kernel mode drivers necessary for rootkit operation could be installed without so much as a blip of a warning.  Also disturbing was the method in which the virus was constructed, a mish-mash of C and C++ code.  This is quite odd for a trade that typically uses simple coding techniques.

After digging into the payload and operation of the virus, the malicious intent cranked up two or three more notches.  The virus used a data cable connection between the PC and a Siemens Programmable Logic Controller (PLC) to hop into the PLC where it really started its nefarious work.  Firstly, a rootkit was installed to hide the infection.  Then, using the PLC it started messing with variable frequency drives that were slaved to the unit.  Specifically, it was looking for drives that spin between frequencies of 807 Hz and 1210 Hz.  Why so specific, you ask?  Because drives that run at those frequencies just happen to be of the same kind that are used in centrifuges, which are critical to the process needed to enrich uranium for nuclear power plants.  Once it found the target, it didn’t make itself obvious by disabling the drive.  Instead, it varied the rotational speed of the unit, ramping it up to 1400 Hz then back down to 2 Hz then back up again.  To the outside observer, it would just look like the device was going haywire or having mechanical difficulties.  At worst, you might think to pull the drive out and replace it with another unit.  Of course, as soon as that unit was connected to the PLC, it would be infected by the Stuxnet worm and the whole process would begin all over again.
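
To put those numbers in perspective, here is a small Python sketch of what the swings would look like from the monitoring side: given a list of drive frequency readings, it flags anything outside the 807 Hz to 1210 Hz band quoted above.  The function name and the sample readings (including the 1064 Hz “normal” value) are made up for illustration and don’t come from any real controller log.

```python
NORMAL_BAND_HZ = (807, 1210)  # operating range quoted in the text above


def out_of_band(readings_hz, band=NORMAL_BAND_HZ):
    """Return (index, reading) pairs that fall outside the normal band."""
    low, high = band
    return [
        (i, hz) for i, hz in enumerate(readings_hz)
        if not low <= hz <= high
    ]


# Hypothetical log: normal running, a spike to 1400 Hz, a drop to 2 Hz.
sample = [1064, 1064, 1400, 2, 1064]
print(out_of_band(sample))
```

Of course, a check like this only helps if the readings are honest, and hiding what the drives were really doing was exactly what the rootkit was there for.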

A New Chapter in Warfare

Once the security firm started tracing the command and control centers for the virus, the trail started going cold as servers were shut down and erased from the face of the Internet.  Usually, those kinds of disappearing acts are perpetrated by the kind of three-letter agencies that don’t like to make the headlines.  And so it was that a large number of security researchers started speculating about the nature and purpose of Stuxnet.  Symantec believes that a well-coordinated team of 5 to 10 individuals spent several months writing the virus.  As well, the largest number of infected systems appears to be located in Iran.  Based on the specific target of the virus (industrial equipment known to be purchased by Iran), it seems quite plausible to assume that someone or something wanted to make sure that the equipment didn’t function correctly.  But, rather than take it out completely, the idea behind Stuxnet was to mask the damage done and make it look like mechanical failure.  Indeed, since it was looking for such specific target criteria, it might have lain dormant for months before unmasking itself.  The speculation currently is that the worm was designed to do one thing with brutal efficiency – cripple the Iranian nuclear program.  Not by airstrikes or conventional means, but with cyber warfare.

When you think back on many of the malware programs that have sprung up and been quite irritating over the last few years, realize that the authors wanted to make a statement with them.  Whether it was the theft of personal information or the hijacking of your PC for less-than-honorable purposes, each author left a stamp or calling card.  These are the kinds of people that do things for fame and fortune.  They want the exposure.  If someone finds out who wrote Code Red or Nimda, all the better for them.  Exposure gives credibility and prestige in that community.  Even something like the SQL Slammer worm was an attempt to exploit a known vulnerability, perhaps for use by someone at a later date.  Only the ham-handedness of the coding caused it to race out of control and be fought back so quickly.  And so security professionals see these viruses and malware infections and combat them as best we can.  But we only catch them because we can see the tell-tale signs.

Stuxnet appears to have been coded by a person or persons who don’t ever intend to be known.  Their job succeeds when no one knows they did anything.  These kinds of people don’t leave marks or traces of any kind when they are done.  They are professional.  They pick a target and pursue it relentlessly until it is neutralized.  And when all is said and done, no one would ever suspect that the misfortune was man-made or inflicted.

Imagine if this had happened in America.  Infected USB drives are scattered around a parking lot at a facility that services nuclear power plants.  Or mailed to key individuals that have access to sensitive areas.  Imagine the chaos that could ensue if the payload hadn’t been designed to subtly cripple, but instead was crafted to cause mayhem and disorder.  Imagine what might happen if it were to occur on the scale of something that we can’t live without, like the GPS constellation.  The idea that agencies and organizations that have made careers out of the kind of malicious and nasty tricks that mark intelligence and spying are now beginning to focus on cyber warfare is frightening.  Think about what could happen if the most prolific and successful malware creators were hired for a job that would pay a fortune, provided the attack was successful and left zero trace.  Would it be worth several million dollars if a country could cripple the military command and control functions of their enemy with a moment’s notice?  What would happen if an invading army had no fear about its ability to render any and all resistance moot with the press of a button from some previous malware infection that went totally undetected until it was too late?

Granted, this is all pie-in-the-sky rambling, but the directions that these types of programs can be taken in boggle even the most die-hard security researchers.  Think about how many information system breaches we’ve seen.  Now think about what would happen if one was targeted at, say, the Department of Defense.  Or the Social Security Administration?  And no amount of money or threat of prosecution could deter the people doing it.  State-sponsored terrorism is bad enough today.  What happens when state-sponsored cyber terrorism becomes more prevalent?  And before you answer that question too quickly, look at what happened with Gmail just a few months ago.  And realize that many in the security realm are starting to believe that those attacks were state-sponsored.

For those of you science fiction fans out there, my thought exercises may sound eerily similar to the reimagined Battlestar Galactica mini-series, where the Cylons were able to cripple the entire military effectiveness of the Colonials with a few well-placed programs.  We all laughed at it and said that it made for great storytelling, but it was still just fiction.  Well, with the rise of Stuxnet and inevitably more programs like it, we can only hope that the escalation of cyber warfare doesn’t lead us to some kind of horrible conclusion.  Because it’s something like that which makes me truly afraid.