Meraki Will Never Be A Large Enterprise Solution

Thanks to a couple of recent conversations, I thought it was time to stir the wireless pot a little. First was my retweet of an excellent DNS workaround post from Justin Cohen (@CanTechIt). One of the responses I got from wireless luminary Andrew von Nagy (@RevolutionWifi):

This echoed some of the comments that I heard from Sam Clements (@Samuel_Clements) and Blake Krone (@BlakeKrone) during this video from Cisco Live Milan in January:

During that video, you can hear Sam and Blake asking for a few features that aren’t really supported on Meraki just yet. And it all comes down to a simple issue.

Should It Just Work?

Meraki has had a very simple guiding philosophy since the very beginning. Things should be easy to configure and work without hassle for their customers. It’s something we see over and over again in technology. From Apple to Microsoft, the focus has shifted away from complexity and toward simplicity. Gone are the fields of radio buttons and obscure text fields. In their place we find simple binary choices. “Do You Want To Do This Thing? YES/NO”.

Meraki believes that the more complicated configuration items confuse users and lead to support issues down the road. And in many ways they are absolutely right. If you’ve ever seen someone freeze up in front of a Coke Freestyle machine, you know how easy it is to be overwhelmed by the power of choice.

In a small business or small enterprise environment, you just need things to work. A business without a dedicated IT department doesn’t need to spend hours figuring out how to disable 802.11b data rates to increase performance. That SMB/SME market has historically been the one that Meraki sells into better than anyone else. The times are changing though.

Exceptions Are Rules?

Meraki’s acquisition by Cisco has raised their profile and provided a huge new sales force to bring their hardware and software to the masses. The software in particular is a tipping point for a lot of medium and large enterprises. Meraki makes it easy to configure and manage large access point deployments. And nine times out of ten their user interface provides everything a person could need for configuration.

Notice that was “nine times out of ten”. In an SME, that one time out of ten that something more was needed could happen once or twice in the lifetime of a deployment. In a large enterprise, that one time out of ten could happen once a month or even once a week. With a huge number of clients accessing the system for long periods of time, the statistical probability that an advanced feature will need to be configured does approach certainty quickly.
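To put rough numbers on that, here’s a quick sketch. It assumes (my simplification, not a measured figure) that every configuration change independently has a 10% chance of needing something the dashboard doesn’t expose. The function name and the change counts are made up for illustration.

```python
def chance_of_hitting_a_gap(changes: int, gap_rate: float = 0.10) -> float:
    """Probability that at least one of `changes` independent config
    changes needs a feature the UI doesn't offer."""
    return 1.0 - (1.0 - gap_rate) ** changes


# Hypothetical change counts: a small shop vs. a large enterprise.
for label, changes in [("SME, 2 changes", 2), ("Enterprise, 50 changes", 50)]:
    print(f"{label}: {chance_of_hitting_a_gap(changes):.1%}")
```

The exact rate matters less than the shape of the curve. The more changes you make, the faster that one-in-ten exception stops being an exception.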

Meraki doesn’t have a way to handle these exceptions currently. They have an excellent feature request system in their “Make A Wish” feedback system, but the tipping point required for a feature to be implemented in a new release doesn’t have a way to be weighted for impact. If two hundred people ask for a feature and the average number of access points in their networks is less than five, it reflects differently than if ten people ask for a feature with an average of one thousand access points per network. It is important to realize that enterprises can scale up rapidly and they should carry a heavier weight when feature requests come in.
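Here’s a minimal sketch of what weighting by impact could look like, using the numbers from the example above. I have no idea how “Make A Wish” actually tallies requests, so the Wish class and the ap_weight function are purely hypothetical, and I’ve used five APs as the upper bound for the small networks.

```python
from dataclasses import dataclass


@dataclass
class Wish:
    feature: str
    requesters: int   # how many customers asked for it
    avg_aps: float    # average access points per requesting network


def ap_weight(wish: Wish) -> float:
    """Weight a request by the total number of APs behind it."""
    return wish.requesters * wish.avg_aps


wishes = [
    Wish("feature A", requesters=200, avg_aps=5),
    Wish("feature B", requesters=10, avg_aps=1000),
]

# Same requests, two different winners depending on how you count.
by_votes = max(wishes, key=lambda w: w.requesters)
by_aps = max(wishes, key=ap_weight)
print(by_votes.feature, by_aps.feature)
```

Counting heads picks feature A. Counting access points picks feature B by a factor of ten.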

That’s not to say that Meraki should go the same route as Cisco Unified Communications Manager (CUCM). Several years ago, I wrote about CSCsb42763 which is a bug ID that enables a feature by typing that code into an obscure text field. It does enable the feature, but you have no idea what or how or why. In fact, if it weren’t for Google or a random call to TAC, you’d never even know about the feature. This is most definitely not the way to enable advanced features.

Making It Work For Me

Okay, the criticism part is over. Now for the constructive part. Because complaining without offering a solution is just whining.

Meraki can fix their issues with large enterprises by offering a “super config mode” to users that have been trained. It’s actually not that far away from how they validate licenses today. If you are listed as an admin on the system and you have a Meraki Master ID under your profile then you get access to the extra config mode. This would benefit both enterprise admins as well as partners that have admin accounts on customer systems.

This would also be a boon for the Meraki training program. Sure, having another piece of paper is nice. But what if all that hard work actually paid off with better configuration access to the system? What if it meant less need to call support, instead of just slightly better access to engineers? If you can give people what they need to fix their problems without calling for support, they will line up outside your door to get it.

If Meraki isn’t willing to take that giant leap just yet, another solution would be to weight the “Make A Wish” suggestions based on the number of APs covered by the user. They might even do this now. But it would be nice to know as a large enterprise end user that my feature requests are being taken under more critical advisement than a few people with less than a dozen APs. Scale matters.


Tom’s Take

Yes, the headline is a bit of clickbait. I don’t think it would have had quite the same impact if I’d titled it “How Meraki Can Fix Their Enterprise Problems”. You, the gentle reader, would have looked at the article either way. But the people that need to see this wouldn’t have cared unless it looked like the sky was falling. So I beg your forgiveness for an indulgence to get things fixed for everyone.

I use Meraki gear at home. It works. I haven’t configured even 10% of what it’s capable of doing. But there are times when I go looking for a feature that I’ve seen on other enterprise wireless systems that’s just not there. And I know that it’s not there on purpose. Meraki does a very good job reaching the customer base that they have targeted for years. But as Cisco starts pushing their solutions further up the stack and selling Meraki into bigger and more complex environments, Meraki needs to understand how important it is to give those large enterprise users more control over their systems. Or “It Just Works” will quickly become “It Doesn’t Work For Me”.

Cisco Just Killed The CLI

Gallons of virtual ink have been committed to virtual paper in the last few days with regards to Cisco’s lawsuit against Arista Networks.  Some of it is speculating on the posturing by both companies.  Other writers talk about the old market vs. the new market.  Still others look at SDN as a driver.

I didn’t just want to talk about the lawsuit.  Given that Arista has marketed EOS as a “better IOS than IOS” for a while now, I figured Cisco finally decided to bite back.  They are fiercely protective of IOS and they have to be because of the way the trademark laws in the US work.  If you don’t go after people that infringe you lose your standing to do so and invite others to do it as well.  Is Cisco’s timing suspect? One does have to wonder.  Is this about knocking out a competitor? It’s tough to say.  But one thing is sure to me.  Cisco has effectively killed the command line interface (CLI).

“Industry Standards”

EOS is certainly IOS-like.  While it does introduce some unique features (see the NFD3 video here), the command syntax is very much IOS.  That is purposeful.  There are two broad categories of CLIs in the market:

  • IOS-like – EOS, HP Procurve, Brocade, FTOS, etc
  • Not IOS-like – Junos, FortiOS, D-Link OS, etc

What’s funny is that the IOS-like interfaces have always been marketed as such.  Sure, there’s the famous “industry standard” CLI comment, followed by a wink and a nudge.  Everyone knows what OS is being discussed.  It is a plus point for both sides.

The non-Cisco vendors can sell to networking teams by saying that their CLI won’t change.  Everything will be just as easy to configure with just a few minor syntax changes.  Almost like speaking a different dialect of a language.  Cisco gains because more and more engineers become familiar with the IOS syntax.  Down the line, those engineers may choose to buy Cisco based on familiarity with the product.

If you don’t believe that being IOS-like is a strong selling point, take a look at PIX and Airespace.  The old PIX OS was transformed into something that looked a lot more like traditional IOS.  In ASA 8.3 they even changed the NAT code to look like IOS.  With Airespace it took a little longer to transform the alien CLI into something IOS-like.  They even lost functionality in doing so, simply to give networking teams an interface that is more friendly to them.  Cisco wants all their devices to run a CLI that is IOS-like.  Junos fans are probably snickering right now.

In calling out Arista for infringing on the “generic command line interface” in patent #7,047,526, Cisco has effectively said that they will start going after companies that copy the IOS interface too well.  This leaves companies in a bit of a conundrum.  How can you continue to produce an OS with an “industry standard” CLI and hope that you don’t become popular enough to get noticed by Cisco?  Granted, it seems that all network switching vendors are #2 in the market somehow.  But at what point does being a big enough #2 get the legal hammer brought to bear?  Do you have to be snarky in marketing messages? Attack the 800-pound gorilla enough that you anger them?  Or do you just have to have a wildly successful quarter?

Laid To REST

Instead, what will happen is a tough choice.  Either continue to produce the same CLI year after year and hope that you don’t get noticed or overhaul the whole system.  Those that choose not to play Russian Roulette with the legal system have a further choice to make.  Should we create a new, non-infringing CLI from the ground up? Or scrap the whole idea of a CLI moving forward?  Both of those second choices are going to involve a lot of pain and effort.  One of them has a future.

Rewriting the CLI is a dead-end road.  By the time you’ve finished your Herculean task you’ll find the market has moved on to bigger and better things.  The SDN revolution is about making complex networks easier to program and manage.  Is that going to be accomplished via yet another syntax?  Or will it happen because of REST APIs and programming interfaces?  Given an equal amount of time and effort on both sides, the smart networking company will focus their efforts on scrapping the CLI and building programmability into their devices.  Sure, the 1.0 release is going to sting a little.  It’s going to require a controller and some rough interface conventions.  But building the seeds of a programmable system now means it will be growing while other CLIs are withering on the vine.
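To make the contrast concrete, here’s a small sketch that expresses the same intent two ways: once as CLI text and once as a structured REST-style request.  Everything in it is hypothetical.  The endpoint path, the payload fields, and the function names are mine and don’t describe any vendor’s actual API.

```python
import json


def vlan_as_cli(vlan_id: int, name: str) -> str:
    """Render the intent as text a human (or a screen scraper) would type."""
    return f"vlan {vlan_id}\n name {name}"


def vlan_as_rest(vlan_id: int, name: str) -> dict:
    """Render the same intent as a structured request for a made-up API."""
    return {
        "method": "PUT",
        "path": f"/api/v1/vlans/{vlan_id}",  # hypothetical endpoint
        "body": json.dumps({"id": vlan_id, "name": name}),
    }


request = vlan_as_rest(10, "USERS")
# A program can read the structured body back without parsing any syntax.
print(json.loads(request["body"])["name"])
print(vlan_as_cli(10, "USERS"))
```

The CLI version only means something to a parser that knows the syntax.  The structured version can be produced and consumed by any program, no matter whose syntax the box happens to imitate.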

It won’t be easy.  It won’t be fun.  And it’s a risk to alienate your existing customer base.  But if your options are to get sued or spend all your effort on a project that will eventually go the way of the dodo your options don’t look all that appealing anyway.  If you’re going to have to go through the upheaval of rewriting something from the ground up, why not choose to do it with an eye to the future?


Tom’s Take

Cisco and Arista won’t be finished for a while.  There will probably be a settlement or a licensing agreement or some kind of capitulation on both sides in a few years’ time.  But by that point, the fallout from the legal action will have finally finished off the CLI for good.  There’s no sense in gambling that you won’t be the next target of a process server.  The solution will involve innovative thinking, blood, sweat, and tears on the part of your entire development team.  But in the end you’ll have a modern system that works with the new wave of the network.  If nothing else, you can stop relying on the “industry standard” ploy when selling your interface and start telling your customers that you are setting the new standard.

 

CCIE Version 5: Out With The Old

Cisco announced this week that they are upgrading the venerable CCIE certification to version five.  It’s been about three years since Cisco last refreshed the exam and several thousand people have gotten their digits.  However, technology marches on.  Cisco talked to several subject matter experts (SMEs) and decided that some changes were in order.  Here are a few of the ones that I found the most interesting.

CCIEv5 Lab Schedule

Time Is On My Side

The v5 lab exam has two pacing changes that reflect reality a bit better.  The first is the ability to take some extra time on the troubleshooting section.  One of my biggest peeves about the TS section was the hard 2-hour time limit.  One of my failing attempts had me right on the verge of solving an issue when the time limit slammed shut on me.  If I only had five more minutes, I could have solved that problem.  Now, I can take those five minutes.

The TS section has an available 30 minute overflow window that can be used to extend your time.  Be aware that time has to come from somewhere, since the overall exam is still eight hours.  You’re borrowing time from the configuration section.  Be sure you aren’t doing yourself a disservice at the beginning.  In many cases, the candidates know the lab config cold.  It’s the troubleshooting they need a little more time with.  This is a welcome change in my eyes.
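The borrowing works out like this.  The sketch below assumes the eight hours are split only between troubleshooting, the 30-minute Diagnostic section, and configuration, so whatever isn’t spent on the first two is left for config.  That split is my reading of the schedule, not an official formula, and the names are made up.

```python
TOTAL_MIN = 8 * 60       # overall exam length
TS_BASE_MIN = 2 * 60     # nominal troubleshooting time
TS_OVERFLOW_MIN = 30     # optional overflow window
DIAG_MIN = 30            # fixed Diagnostic section


def config_minutes_left(ts_used: int) -> int:
    """Minutes remaining for configuration after troubleshooting."""
    if not 0 <= ts_used <= TS_BASE_MIN + TS_OVERFLOW_MIN:
        raise ValueError("troubleshooting time out of range")
    return TOTAL_MIN - DIAG_MIN - ts_used


for ts_used in (120, 125, 150):
    print(f"TS {ts_used} min -> config {config_minutes_left(ts_used)} min")
```

Every minute of overflow you use in TS is a minute you don’t get back in the config section.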

Diagnostics

The biggest addition is the new 30-minute Diagnostic section.  Rather than focusing on problem solving, this section is more about problem determination.  There’s no CLI.  Only a set of artifacts from a system with a problem: emails, log files, etc.  The idea is that the CCIE candidate should be an expert at figuring out what is wrong, not just how to fix it.  This is more in line with the troubleshooting sections in the Voice and Security labs.  Parsing log files for errors is a much larger part of my time than implementing routing.  Teaching candidates what to look for will prevent problems in the future by producing newly minted CCIEs that can diagnose issues in front of customers.

Some are wondering if the Diagnostic section is going to be the new “weed out” addition, like the Open Ended Questions (OEQs) from v3 and early v4.  I see the Diagnostic section as an attempt to temper the CCIE with more real world needs.  While the exam has never been a test of ideal design, knowing how to fix a non-ideal design when problems occur is important.  Knowing how to find out what’s screwed up is the first step.  It’s high time people learned how to do that.

Be Careful What You Wish For

The CCIE v5 is seeing a lot of technology changes.  The written exam is getting a new section, Network Principles.  This serves to refocus candidates away from Cisco specific solutions and more toward making sure they are experts in networking.  There’s a lot of opportunity to reinforce networking here and not idle trivia about config minimums and maximums.  Let’s hope this pays off.

The content of the written is also being updated.  Cisco is going to make sure candidates know the difference between IOS and IOS XE.  Cisco Express Forwarding is going to get a focus, as is ISIS (again).  Given that ISIS is important in TRILL this could be an indication of where FabricPath development is headed.  The written is also getting more IPv6 topics.  I’ll cover IPv6 in just a bit.

The biggest change in content is the complete removal of frame relay.  It’s been banished to the same pile as ATM and ISDN.  No written, no lab.  In its place, we get Dynamic Multipoint VPN (DMVPN).  I’ve talked about why Frame Relay is on the lab before.  People still complained about it.  Now, you get your wish.  DMVPN with OSPF serves the same purpose as Frame Relay with OSPF.  It’s all about Stupid Router Tricks.  Using OSPF with DMVPN requires use of mGRE, which is a Non-Broadcast Multi-Access (NBMA) network.  Just like Frame Relay.  The fact that almost every guide today recommends you use EIGRP with DMVPN should tell you how hard it is to do.  And now you’re forced to use OSPF to simulate NBMA instead of Frame Relay.  Hope all you candidates are happy now.

vCCIE

The lab is also 100% virtual now.  No physical equipment in either the TS or lab config sections.  This is a big change.  Cisco wants to reduce the amount of equipment that needs to be physically present to build a lab.  They also want to be able to offer the lab in more places than San Jose and RTP.  Now, with everything being software, they could offer the lab at any secured PearsonVUE testing center.  They’ve tried in the past, but the access requirements caused some disaster.  Now, it’s all delivered in a browser window.  This will make remote labs possible.  I can see a huge expansion of the testing sites around the time of the launch.

This also means that hardware-specific questions are out.  Like layer 2 QoS on switches.  The last reason to have a physical switch (WRR and SRR queueing) is gone.  Now, all you are going to get quizzed on is software functionality.  Which probably means the loss of a few easy points.  With the removal of Frame Relay and L2 QoS, I bet that services section of the lab is going to be really fun now.

IPv6 Is Real

Now, for my favorite part.  The JNCIE has had a robust IPv6 section for years.  All routing protocols need to be configured for IPv4 and IPv6.  The CCIE has always had a separate IPv6 section.  Not any more.  Going forward in version 5, all routing tasks will be configured for v4 and v6.  Given that RIPng has been retired to the written exam only (finally), it’s a safe bet that you’re going to love working with OSPFv3 and EIGRP for IPv6.

I think it’s great that Cisco has finally caught up to the reality of the world.  If CCIEs are well versed in IPv6, we should start seeing adoption numbers rise significantly.  Ensuring that engineers know to configure v4 and v6 simultaneously means dual stack is going to be the preferred transition method.  The only IPv6-related thing that worries me is the inclusion of an item on the written exam: IPv6 Network Address Translation.  You all know I’m a huge fan of NAT.  Especially NAT66, which is what I’ve been told will be the tested knowledge.

Um, why?!? 

You’ve removed RIPng to the trivia section.  You collapsed multicast into the main routing portions.  You’re moving forward with IPv6 and making it a critical topic on the test.  And now you’re dredging up NAT?!? We don’t NAT IPv6.  Especially to another IPv6 address.  Unique Local Addresses (ULA) is about the only thing I could see using NAT66.  Ed Horley (@EHorley) thinks it’s a bad idea.  Ivan Pepelnjak (@IOSHints) doesn’t think fondly of it either, but admits it may have a use in SMBs.  And you want CCIEs and enterprise network engineers to understand it?  Why not use LISP instead?  Or maybe a better network design for enterprises that doesn’t need NAT66?  Next time you need an IPv6 SME to tell you how bad this idea is, call me.  I’ve got a list of people.
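For anyone who hasn’t seen it, here’s roughly what translating one IPv6 prefix to another looks like.  This is a minimal sketch of a stateless prefix swap using Python’s standard ipaddress module.  It is one simple way to picture NAT66, not a description of what Cisco will test or how any real implementation works, and the prefixes and function name are examples I picked.

```python
import ipaddress

ULA_RANGE = ipaddress.ip_network("fc00::/7")  # Unique Local Address space


def swap_prefix(addr: str, inside: str, outside: str) -> str:
    """Replace the inside prefix with the outside prefix, keeping host bits."""
    inside_net = ipaddress.ip_network(inside)
    outside_net = ipaddress.ip_network(outside)
    if inside_net.prefixlen != outside_net.prefixlen:
        raise ValueError("prefixes must be the same length")
    ip = ipaddress.ip_address(addr)
    if ip not in inside_net:
        raise ValueError("address is not in the inside prefix")
    host_bits = int(ip) & int(inside_net.hostmask)
    return str(ipaddress.ip_address(int(outside_net.network_address) | host_bits))


inside_host = "fd12:3456:789a:1::10"
print(ipaddress.ip_address(inside_host) in ULA_RANGE)
print(swap_prefix(inside_host, "fd12:3456:789a::/48", "2001:db8:1::/48"))
```

Notice that the host bits never change.  All you’ve done is hide one perfectly routable-looking address behind another one.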


Tom’s Take

I’m glad to see the CCIE update.  Getting rid of Frame Relay and adding more IPv6 is a great thing.  I’m curious to see how the Diagnostic section will play out.  The flexible time for the TS section is way overdue.  The CCIE v5 looks to be pretty solid on paper.  People are going to start complaining about DMVPN.  Or the lack of SDN-related content.  Or the fact that EIGRP is still tested.  But overall, this update should carry the CCIE far enough into the future that we’ll see CCIE 60,000 before it’s refreshed again.

More CCIE v5 Coverage:

Bob McCouch (@BobMcCouch) – Some Thoughts on CCIE R&S v5

Anthony Burke (@Pandom_) – Cisco CCIE v5

Daniel Dib (@DanielDibSWE) – RS v5 – My Thoughts

INE – CCIE R&S Version 5 Updates Now Official

IPExpert – The CCIE Routing and Switching (R&S) 5.0 Lab Is FINALLY Here!

Cisco CMX – Marketing Magic? Or Big Brother?

The first roundtable presenter at Interop New York was Cisco. Their Enterprise group always brings interesting technology to the table. This time, the one that caught my eye was the Connected Mobile Experience (CMX). CMX is a wireless mobility technology that allows a company to do some advanced marketing wizardry.

CMX uses your Cisco wireless network to monitor devices coming into the air space. They don’t necessarily have to connect to your wireless network for CMX to work. They just have to be beaconing for a network, which all devices do. CMX can then push a message to the device. This message can be a simple “thank you” for coming or something more advanced like a coupon or notification to download a store specific app. CMX can then store the information about that device, such as whether or not they joined the network, where they went, and how long they were there. This gives the company the ability to pull some interesting statistics about their customer base. Even if they never hop on the wireless network.

I have to be honest here. This kind of technology gives me a bit of the creeps. I understand that user tracking is the hot new thing in retail. Stores want to know where you went, how long you stayed there, and whether or not you saw an advertisement or a featured item. They want to know your habits so as to better sell to you. The accumulation of that data over time allows for some patterns to emerge that can drive a retail operation’s decision making process.

A Thought Exercise

Think about an average person. We’ll call him Mike. Mike walks four blocks from his office to the subway station every day after work. He stops at the corner about halfway between to cross a street. On that street just happens to be a coffee shop using something like CMX. Mike has a brand new phone that uses wifi and bluetooth and Mike keeps them on all the time. CMX can detect when the device comes into range. It knows that Mike stays there for about 2 minutes but never joins the network. It then moves out of the WLAN area. The data cruncher for the store wants to drive new customers to the store. They analyze the data and find that lots of people stay in the area for a couple of minutes. They equate this to people stopping to decide if they want to have a cup of coffee from the shop. They decide to create a CMX coupon push notification that pops up after one minute on devices that have been seen in the database for the last month. Mike will see a coupon for $1 off a cup of coffee the next time he waits for the light in front of the coffee shop.
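The rule the coffee shop builds in that exercise is simple enough to sketch. This is not how CMX is implemented and none of these names come from Cisco. The Sighting record, the hashed device ID, and the 30-day reading of “the last month” are all my assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sighting:
    device_id: str        # e.g. a hashed MAC address (assumption)
    first_seen: datetime  # when this visit started
    last_seen: datetime   # most recent probe heard during this visit
    joined_network: bool


def should_push_coupon(current: Sighting, history: list[Sighting],
                       now: datetime) -> bool:
    """Push after one minute of dwell time, only to devices that were
    also seen at some point in the last 30 days."""
    dwell = now - current.first_seen
    seen_recently = any(
        s.device_id == current.device_id
        and now - s.last_seen <= timedelta(days=30)
        for s in history
    )
    return dwell >= timedelta(minutes=1) and seen_recently


now = datetime(2013, 10, 1, 17, 30)
mike_today = Sighting("mike-phone", now - timedelta(seconds=90), now, False)
mike_last_week = Sighting("mike-phone", now - timedelta(days=7, minutes=2),
                          now - timedelta(days=7), False)
print(should_push_coupon(mike_today, [mike_last_week], now))
```

Mike never joined the network and never gave anyone his name. A device ID and a couple of timestamps were enough.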

That kind of reach is crazy. I keep thinking back to the scenes in Minority Report where the eye scanners would detect you looking at an advertisement and then target a specific ad based on your retina scan. You may say that’s science fiction. But with products like CMX, I can build a pretty complete profile of your behavior even if I don’t have a retina scan. Correlating information provides a clear picture of who you are without any real identity information. Knowing that someone likes to spend their time in the supermarket in the snack aisles and frozen food aisles and less time in the infants section says a lot. Knowing the route a given device takes through the store can help designers place high volume items in the back and force shoppers to take longer routes past featured items.


Tom’s Take

I’m not saying that CMX is a bad product. It’s providing functionality that can be of great use to retail companies. But, just like VHS recorders and BitTorrent, good ideas can often be used to facilitate things that aren’t as noble. I suggested to the CMX developers that they could implement some kind of “opt out” message that popped up if I hadn’t joined the wireless network in a certain period of time. I look at that as a way of saying to shoppers “We know you aren’t going to join. Press the button and we’ll wipe out your device info.” It puts people at ease to know they aren’t being tracked. Even just showing them what you’re collecting is a good start. With the future of advertising and marketing focusing on instant delivery and data gathering for better targeting, I think products like CMX will be powerful additions. But, great power requires even greater responsibility.

Tech Field Day Disclaimer

Cisco was a presenter at the Tech Field Day Interop Roundtable.  They did not ask for any consideration in the writing of this review nor were they promised any.  The conclusions and analysis contained in this post are mine and mine alone.

Disruption in the New World of Networking

This is one of the most exciting times to be working in networking. New technologies and fresh takes on existing problems are keeping everyone on their toes when it comes to learning new protocols and integration systems. VMworld 2013 served both as an announcement of VMware’s formal entry into the larger networking world as well as putting existing network vendors on notice. What follows is my take on some of these announcements. I’m sure that some aren’t going to like what I say. I’m even more sure a few will debate my points vehemently. All I ask is that you consider my position as we go forward.

Captain Over, Captain Under

VMware, through their Nicira acquisition and development, is now *the* vendor to go to when you want to build an overlay network. Their technology augments existing deployments to provide software features such as load balancing and policy deployment. In order to do this and ensure that these features are utilized, VMware uses VxLAN tunnels between the devices. VMware calls these constructs “virtual wires”. I’m going to call them vWires, since they’ll likely be called that soon anyway. vWires are deployed between hosts to provide a pathway for communications. Think of it like a GRE tunnel or a VPN tunnel between the hosts. This means the traffic rides on the existing physical network but that network has no real visibility into the payload of the transit packets.
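If you want to see why the physical network can’t tell what’s riding inside a vWire, here’s a sketch of the VxLAN wrapper itself. The 8-byte header layout (a flags byte, reserved bits, and a 24-bit network identifier) follows the published VXLAN format. The function names are mine, and the outer Ethernet, IP, and UDP headers are left out to keep it short.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32 bits: flags byte + 24 reserved bits.
    # Second 32 bits: 24-bit VNI + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header; the outer Ethernet/IP/UDP headers that
    the underlay actually forwards on are omitted here."""
    return vxlan_header(vni) + inner_frame


packet = encapsulate(b"original ethernet frame", vni=5000)
print(packet[:8].hex())
```

The underlay forwards on the outer headers only. The original frame is just payload, which is exactly why those transit switches have no real visibility into it.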

Nicira’s brainchild, NSX, has the ability to function as a layer 2 switch and a layer 3 router as well as a load balancer and a firewall. VMware is integrating many existing technologies with NSX to provide consistency when provisioning and deploying a new software-based network. For those devices that can’t be virtualized, VMware is working with HP, Brocade, and Arista to provide NSX agents that can decapsulate the traffic and send it to a physical endpoint that can’t participate in NSX (yet). As of the launch during the keynote, most major networking vendors are participating with NSX. There’s one major exception, but I’ll get to that in a minute.

NSX is a good product. VMware wouldn’t have released it otherwise. It is the vSwitch we’ve needed for a very long time. It also extends the ability of the virtualization/server admin to provision resources quickly. That’s where I’m having my issue with the messaging around NSX. During the second day keynote, the CTOs on stage said that the biggest impediment to application deployment is waiting on the network to be configured. Note that is my paraphrasing of what I took their intent to be. In order to work around the lag in network provisioning, VMware has decided to build a VxLAN/GRE/STT tunnel between the endpoints and eliminate the network admin as a source of delay. NSX turns your network into a fabric for the endpoints connected to it.

Under the Bridge

I also have some issues with NSX and the way it’s supposed to work on existing networks. Network engineers have spent countless hours optimizing paths and reducing delay and jitter to provide applications and servers with the best possible network. Now, that all doesn’t matter. vAdmins just have to click a couple of times and build their vWire to the other server and all that work on the network is for naught. The underlay network exists to provide VxLAN transport. NSX assumes that everything working beneath is running optimally. No loops, no blocked links. NSX doesn’t even participate in spanning tree. Why should it? After all, that vWire ensures that all the traffic ends up in the right location, right? People would never bridge the networking cards on a host server. Like building a VPN server, for instance. All of the things that network admins and engineers think about in regards to keeping the network from blowing up due to excess traffic are handwaved away in the presentations I’ve seen.

The reference architecture for NSX looks pretty. Prettier than any real network I’ve ever seen. I’m afraid that suboptimal networks are going to impact application and server performance now more than ever. And instead of the network using mechanisms like QoS to battle issues, those packets are now invisible bulk traffic. When network folks have no visibility into the content of the network, they can’t help when performance suffers. Who do you think is going to get blamed when that goes on? Right now, it’s the network’s fault when things don’t run right. Do you think that moving the onus for server network provisioning to NSX and vCenter is going to forgive the network people when things go south? Or are the underlay engineers going to take the brunt of the yelling because they are the only ones that still understand the black magic outside the GUI drag-and-drop to create vWires?

NSX is for service enablement. It allows people to build network components without knowing the CLI. It also means that network admins are going to have to work twice as hard to build resilient networks that work at high speed. I’m hoping that means that TRILL-based fabrics are going to take off. Why use spanning tree now? Your application and service network sure isn’t. No sense adding any more bells and whistles to your switches. It’s better to just tie them into spine-and-leaf Clos fabrics and be done with it. It now becomes much more important to concentrate on the user experience. Or maybe the wireless network. As long as at least one link exists between your ESX box and the edge switch let the new software networking guys worry about it.

The Recumbent Incumbent?

Cisco is the only major networking manufacturer not publicly on board with NSX right now. Their CTO Padma Warrior has released a response to NSX that talks about lock-in and vertical integration. Still others have released responses to that response. There’s a lot of talk right now about the war brewing between Cisco and VMware and what that means for VCE. One thing is for sure – the landscape has changed. I’m not sure how this is going to fall out on both sides. Cisco isn’t likely to stop selling switches any time soon. NSX still works just fine with Cisco as an underlay. VCE is still going to make a whole bunch of money selling vBlocks in the next few months. Where this becomes a friction point is in the future.

Cisco has been building APIs into their software for the last year. They want to be able to use those APIs to directly program the network through devices like the forthcoming OpenDaylight controller. Will they allow NSX to program them as well? I’m sure they would – if VMware wrote those instructions into NSX. Will VMware demand that Cisco use the NSX-approved APIs and agents to expose network functionality to their software network? They could. Will Cisco scrap OnePK to implement NSX? I doubt that very much. We’re left with a standoff. Cisco wants VMware to use their tools to program Cisco networks. VMware wants Cisco to use the same tools as everyone else and make the network a commodity compared to the way it is now.

Let’s think about that last part for a moment. Aside from some speed differences, networks are largely going to be identical to NSX. It won’t care if you’re running HP, Brocade, or Cisco. Transport is transport. Someone down the road may build some proprietary features into their hardware to make NSX run better but that day is far off. What if a manufacturer builds a switch that is twice as fast as the nearest competition? Three times? Ten times? At what point does the underlay become so important that the overlay starts preferring it exclusively?


Tom’s Take

I said a lot during the Tuesday keynote at VMworld. Some of it was rather snarky. I asked about full BGP tables and vMotioning the machines onto the new NSX network. I asked because I tend to obsess over details. Forgotten details have broken more of my networks than grand design disasters. We tend to fuss over the big things. We make more out of someone that can drive a golf ball hundreds of yards than we do about the one that can consistently sink a ten-foot putt. I know that a lot of folks were pre-briefed on NSX. I wasn’t, so I’m playing catch-up right now. I need to see it work in production to understand what value it brings to me. One thing is for sure – VMware needs to change the messaging around NSX to be less antagonistic towards network folks. Bring us into your solution. Let us use our years of experience to help rather than making us seem like pariahs responsible for all your application woes. Let us help you help everyone.

Poaching CCIEs

During the CCIE Netvet Reception at Cisco Live 2013, a curious question came up during our Q&A session with CEO John Chambers. Paul Borghese asked if it was time for the partner restriction on CCIE tenure to be lifted in order to increase the value of a CCIE in the larger market. For those not familiar, when a CCIE is hired by a Cisco partner, they need to attach their number to the company in order for the company to receive the benefits of having hired a CCIE. Right now, that means counting toward the CCIE threshold for Silver and Gold status. When a CCIE leaves the first company and moves to another partner, their number stays associated with the original company for one year and cannot be counted with the new company until that year expires.

There are a multitude of reasons why that might be the case. It encourages companies to pay for CCIE training and certification if the company knows that the newly minted CCIE’s number will be sticking around for at least a year past their departure. It also provides a lifeline to a Cisco partner in the event a CCIE decides to move on. By keeping the number attached to the company for a specific time period, the original company has the time necessary to hire or train new resources to take over the departed CCIE’s job role. If the original partner is up for any contracts or RFPs that require a CCIE on staff, that grace period could be the difference between picking up or losing that contract.

As indicated above, Paul asked if maybe that policy needed to change. In his mind, the restriction of the CCIE number was causing CCIEs to stay at their current companies because their inability to move their number to the new company in a timely manner made them less valuable. I know now that the question came on behalf of Eman Conde, the CCIE Agent, who is very active in making sure the rights and privileges of CCIEs everywhere are well represented. I remember meeting Eman for the first time back at Cisco Live 2008 at an IPExpert party, long before I was a CCIE. In that time, Eman has worked very hard to make sure that CCIEs are well represented in the job market.  It is also in Eman’s best interests to ensure that CCIEs can move freely between companies without restriction.

My biggest fear is that removing the one-year association restriction for Cisco Partners will cause partners to stop funding CCIE development. I was very fortunate to have my employer pay the entire cost of my CCIE from beginning to end. In return, I agreed in principle to stay with them for a period of time and not seek employment from anyone else. There was no agreement in place. There was no contract. Just a handshake. Even after I left to go work with Gestalt IT, my number is locked to them for the next year. This doesn’t really bother me. It does make them feel better about me moving to a competitor. What would happen if I could move my number freely to the next business without penalty?

Could you imagine a world where CCIEs were being paid top dollar to work at a company not for their knowledge but because it was cheaper to buy CCIEs than it was to build them? Think of a sports team that doesn’t have a good minor league system but instead buys their talent for absurd amounts of money. If you had pictures of the New York Yankees in your head, you probably aren’t far removed from my line of thinking. When the only value of a CCIE is associating the number to your company then you’ve missed the whole point of the program.

CCIEs are more valuable than their number. With the exception of the Gold/Silver partner status their number is virtually useless. What is more important is the partner specializations they can bring in. My CCIE was pointless to my old employer since I was the only one. What was a greater boon was all the partner certifications that I brought for unified communications, UCS implementation, and even project management. Those certifications aren’t bound to a company. In fact, I would probably be more marketable by going to a small partner with one CCIE or going to a silver partner with 3 CCIEs and telling them that I can bring in new lines of partner business while they are waiting for my number to clear escrow. The smart partners will realize the advantage and hire me on and wait. Only an impatient partner that wants to build a gold-level practice today would want to avoid number lock-in.

I don’t think we need to worry about removing the CCIE association restriction right now. It serves to entice partners to fund CCIEs without worrying about them moving on as soon as they get certified. Termination results in the number being freed up upon mutual agreement. Most CCIEs that I’ve heard of that left their jobs soon after certification did it because their company told them it couldn’t afford to pay a CCIE. Forcing small employers to let CCIEs walk away to bigger competitors with no penalty will prevent them from funding any more CCIE training. They’ll say, “If the big partners want CCIEs so badly that they’ll pay bounties then let the big partners do all the training too.” I don’t even think an employer non-compete would fix the issue as those aren’t enforceable in many states. I think the program exists the way it does for a reason. With all due deference to Eman and Paul, I don’t think we’ve reached the point where CCIE free agency is ready for prime time.

IOS X-Treme!

As a nerd, I’m a huge fan of science fiction. One of my favorite shows was Stargate SG-1. Inside the show, there was a joke involving an in-universe TV program called “Wormhole X-Treme” that a writer unintentionally created based on knowledge of the fictional Stargate program. Essentially, it’s a story that’s almost the same as the one we’re watching, with just enough differences to be a totally unique experience. In many ways, that’s how I feel about the new versions of Cisco’s Internetwork Operating System (IOS) that have been coming out in recent months. They may look very similar to IOS. They may behave similarly to IOS. But to mistake them for IOS isn’t right. In this post, I’m going to talk about the three most popular IOS-like variants – IOS XE, IOS XR, and NX-OS.

IOS XE

IOS XE is the most IOS-like of all the new IOS builds that have been released. That’s because the entire point of the IOS XE project was to rebuild IOS to future-proof the technology. Right now, the IOS that runs on routers (which will henceforth be called IOS Classic) is a monolithic kernel that runs all of the necessary modules in the same memory space. This means that if something happens to the routing engine or the LED indicator, it can cause the whole IOS kernel to crash if it runs out of memory. That may have been okay years ago but today’s mission-critical networks can’t afford to have a rogue process bringing down an entire chassis switch. Cisco’s software engineers set out on a mission to rebuild the IOS CLI on a more robust platform.

IOS XE runs as a system daemon on a “modern Linux platform.” Which one is anyone’s guess. Cisco also abstracted the system functions out of the main kernel and into separate processes. That means that if one of them goes belly up it won’t take the core kernel with it. One of the other benefits of running the kernel as a system daemon is that you can now balance the workload of the processes across multiple processor cores. This was one of the more exciting things to me when I saw IOS XE for the first time. Thanks to the many folks that pointed out to me that the ASR 1000 was the first device to run IOS XE. The Catalyst 4500 (the first switch to get IOS XE) is using a multicore processor to do very interesting things, like the ability to run inline Wireshark on a processor core while still letting IOS have all the processor power it needs. Here’s a video describing that:

Because you can abstract the whole operation of the IOS feature set, you can begin to do things like offer a true virtual router like the CSR 1000V. As many people have recently discovered, the CSR 1000V is built on IOS XE and can be booted and operated in a virtualized environment (like VMware Fusion or ESXi). The RAM requirements are fairly high for a desktop virtualization platform (CSR requires 4GB of RAM to run), but the promise is there for those that don’t want to keep using GNS3/Dynamips or Cisco’s IOU to emulate IOS-like features. IOS XE is the future of IOS development. It won’t be long until the next generation of supervisor engines and devices will be using it exclusively instead of relying on IOS Classic.

IOS XR

In keeping with the sci-fi theme of this post, IOS XR is what the Mirror Universe version of IOS would look like. Much like IOS XE, IOS XR does away with the monolithic kernel and shared memory space of IOS Classic. XR uses an OS from QNX to serve as the base for the IOS functions. XR also segments the ancillary processes in IOS into separate memory spaces to prevent system crashes from an errant bug. XR is aimed at the larger service provider platforms like the ASR 9000 and CRS series of routers. You can see that in the way that XR can allow multiple routing protocol processes to be executed at the same time in different memory spaces. That’s a big key to the service provider.

What makes IOS XR so different from IOS Classic? That lies in the configuration method. While the CLI may resemble the IOS that you’re used to, the change methodology is totally foreign to Cisco people. Instead of making live config changes on a live system, the running configuration is forked into a separate memory space. Once you have created all the changes that you need to make, you have to perform a sanity check on the config before it can be moved into live production. That keeps you from screwing something up accidentally. Once you have performed a sanity check, you have to explicitly apply the configuration via a commit command. In the event that the config you applied to the router does indeed contain errors that weren’t caught by the sanity checker (like the wrong IP), you can issue a command to revert to a previous working config in a process known as rollback. All of the previous configuration sets are retained in NVRAM and remain available for reversion.

If you’re keeping track at home, this sounds an awful lot like Junos. Hence my Mirror Universe analogy. IOS XR is aimed at service providers, which is a market dominated by Juniper. SPs have gotten very used to the sanity checking and rollback capabilities provided by Junos. Cisco decided to offer those features in an SP-specific IOS package. There are many that want to see IOS XR ported from the ASR/CRS lines down into more common SP platforms. Only time will tell if that will happen. Jeff Fry has an excellent series of posts on IOS XR that I highly recommend if you want to learn more about the specifics of configuration on that platform.
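
If it helps to see the commit-and-rollback workflow described above in miniature, here is a rough Python sketch. It is a toy model and not IOS XR or Junos code: the class name, the dictionary-based config, and the sanity check (every value must be a non-empty string) are all assumptions made up for illustration.

```python
class CandidateConfigRouter:
    """Toy model of a commit-based config workflow (hypothetical, not IOS XR code)."""

    def __init__(self, running):
        self.running = dict(running)    # the live configuration
        self.candidate = dict(running)  # forked working copy that edits land in
        self.history = []               # previously committed configurations

    def set(self, key, value):
        # Edits touch only the candidate, never the live configuration.
        self.candidate[key] = value

    def validate(self):
        # Made-up sanity check: every value must be a non-empty string.
        return [k for k, v in self.candidate.items()
                if not isinstance(v, str) or not v]

    def commit(self):
        # Nothing goes live until the sanity check passes.
        errors = self.validate()
        if errors:
            raise ValueError(f"sanity check failed for: {errors}")
        self.history.append(dict(self.running))
        self.running = dict(self.candidate)

    def rollback(self, steps=1):
        # Restore an earlier committed configuration.
        if not 1 <= steps <= len(self.history):
            raise ValueError("no such configuration to roll back to")
        for _ in range(steps):
            previous = self.history.pop()
        self.running = dict(previous)
        self.candidate = dict(previous)


if __name__ == "__main__":
    router = CandidateConfigRouter({"hostname": "edge1", "loopback0": "10.0.0.1"})
    router.set("loopback0", "10.0.0.11")  # wrong IP, but it passes the sanity check
    router.commit()                       # the mistake is now live
    router.rollback()                     # so revert to the last good config
    print(router.running)
```

The wrong-IP case is the interesting one. The sanity check can only catch what it knows how to check, which is why the rollback history matters as much as the commit step.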

NX-OS

NX-OS is the odd man out from the IOS family. It originally started life as Cisco’s SAN-OS, which was responsible for running the MDS line of fibre channel switches. Once Cisco started developing the Nexus switching platform, they decided to use SAN-OS as the basis for the operating system, as it already contained much of the code that would be needed to allow networking and storage protocols to interoperate on the device, a necessity for a converged data center switch. Eventually, the new OS became known as NX-OS.

NX-OS looks similar to the IOS Classic interface that most engineers have become accustomed to. However, the underlying OS is very different from what you’re used to. First off, not every feature of classic IOS is available on demand. Yes, a lot of the more esoteric feature sets (like the DHCP server) are just plain unavailable. But even the feature sets that are listed as available in the OS may not be in the actual running code. You need to activate each of these with the feature keyword when you want to enable them. This “opt in” methodology ensures that the running code only contains essential modules as well as the features you want. That should make the security people happy from an exploit perspective, as it lowers the available attack surface of your OS.
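
The opt-in idea is easy to model. Here is a small Python sketch of it, with the caveat that this is only an illustration of the concept and not how NX-OS is built: the class, the command-to-feature table, and the matching logic are all assumptions of mine.

```python
class FeatureGatedCli:
    """Toy model of opt-in features (hypothetical, not NX-OS internals)."""

    # Illustrative mapping of command prefixes to the feature that provides them.
    COMMAND_FEATURES = {
        "router ospf": "ospf",
        "router bgp": "bgp",
        "interface vlan": "interface-vlan",
    }

    def __init__(self):
        self.enabled = set()  # nothing optional is loaded by default

    def feature(self, name):
        # Explicitly opt in to a feature, loosely mirroring the feature keyword.
        if name not in self.COMMAND_FEATURES.values():
            raise ValueError(f"unknown feature: {name}")
        self.enabled.add(name)

    def accepts(self, command):
        # A gated command is accepted only once its feature has been enabled.
        for prefix, feature in self.COMMAND_FEATURES.items():
            if command.startswith(prefix):
                return feature in self.enabled
        return True  # commands not tied to an optional feature always work


if __name__ == "__main__":
    cli = FeatureGatedCli()
    print(cli.accepts("router ospf 1"))  # gated until the feature is enabled
    cli.feature("ospf")
    print(cli.accepts("router ospf 1"))
```

The attack surface argument falls out of the `enabled` set: code for a feature nobody turned on never has to be reachable.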

Another unique feature of NX-OS is the interface naming convention. In IOS Classic, each interface is named via the speed. You can have Ethernet, FastEthernet, GigabitEthernet, TenGigabit, and even FortyGigabit interfaces. In NX-OS, you have one – Ethernet. NX-OS treats all interfaces as Ethernet regardless of the underlying speed. That’s great for a modular switch because it allows you to keep the same configuration no matter which line cards are running in the device. It also allows you to easily port the configuration to a newer device, say from Nexus 5500 to Nexus 6000, without needing to do a find/replace operation on the config and risk changing a line you weren’t supposed to. Besides, most of the time the engineer doesn’t care about whether an interface is gigabit or ten gigabit. They just want to program the second port on the third line card.
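
To show the find/replace risk that speed-neutral naming avoids, here is a rough Python sketch of what a careful rename looks like on a config with speed-based names. The function name and the sample config lines are made up for illustration, and a real platform migration involves far more than renaming interfaces.

```python
import re

# Match a speed-based prefix only on an interface declaration line, so that
# mentions of interface names elsewhere (descriptions, remarks) are left alone.
SPEED_PREFIX = re.compile(
    r"^(\s*interface\s+)"
    r"(?:FortyGigabitEthernet|TenGigabitEthernet|GigabitEthernet|FastEthernet)"
    r"(?=\d)",
    re.IGNORECASE,
)


def to_speed_neutral(config_text):
    """Rewrite speed-based interface declarations to plain 'Ethernet' names."""
    lines = [SPEED_PREFIX.sub(r"\1Ethernet", line)
             for line in config_text.splitlines()]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "\n".join([
        "interface GigabitEthernet3/2",
        " description link to TenGigabitEthernet1/1 on core",
        "interface TenGigabitEthernet1/1",
    ])
    print(to_speed_neutral(sample))
```

A blind search and replace on "GigabitEthernet" would also have rewritten the description line, turning "TenGigabitEthernet1/1" into "TenEthernet1/1". That is the kind of line you weren’t supposed to change.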


Tom’s Take

No software program can survive without updates. Especially if it is an operating system. The hardware designed to run version 1.0 is never the same as the hardware that version 5.0 or even 10.0 utilizes. Everything evolves to become more efficient and useful. Think of it like seasons of sci-fi shows. Every season tells a story. There may be some similarities, but people overall want the consistency of the characters they’ve come to love coupled with new stories and opportunities to increase character development. Network operating systems like IOS are no different. Engineers want the IOS-like interface but they also want separated control planes, robust sanity checking, and modularized feature insertion. Much like the writers of sci-fi, Cisco will continue to provide new features and functionality while still retaining the things to which we’ve grown accustomed. However, if Cisco ever comes up with a hare-brained idea like the Ori, I can promise there’s no way I’ll ever run IOS-Origin.