CCIE Version 5: Out With The Old

Cisco announced this week that they are upgrading the venerable CCIE certification to version five.  It’s been about three years since Cisco last refreshed the exam and several thousand people have gotten their digits.  However, technology marches on.  Cisco talked to several subject matter experts (SMEs) and decided that some changes were in order.  Here are a few of the ones that I found the most interesting.

CCIEv5 Lab Schedule

Time Is On My Side

The v5 lab exam has two pacing changes that reflect reality a bit better.  The first is the ability to take some extra time on the troubleshooting section.  One of my biggest peeves about the TS section was the hard 2-hour time limit.  One of my failing attempts had me right on the verge of solving an issue when the time limit slammed shut on me.  If I only had five more minutes, I could have solved that problem.  Now, I can take those five minutes.

The TS section has an available 30-minute overflow window that can be used to extend your time.  Be aware that time has to come from somewhere, since the overall exam is still eight hours.  You’re borrowing time from the configuration section.  Be sure you aren’t doing yourself a disservice at the beginning.  In many cases, candidates know the lab config cold.  It’s the troubleshooting they need a little more time with.  This is a welcome change in my eyes.

Diagnostics

The biggest addition is the new 30-minute Diagnostic section.  Rather than focusing on problem solving, this section is more about problem determination.  There’s no CLI.  Only a set of artifacts from a system with a problem: emails, log files, etc.  The idea is that the CCIE candidate should be an expert at figuring out what is wrong, not just how to fix it.  This is more in line with the troubleshooting sections in the Voice and Security labs.  Parsing log files for errors is a much larger part of my time than implementing routing.  Teaching candidates what to look for will prevent future problems with newly minted CCIEs that can’t diagnose issues in front of customers.

Some are wondering if the Diagnostic section is going to be the new “weed out” addition, like the Open Ended Questions (OEQs) from v3 and early v4.  I see the Diagnostic section as an attempt to temper the CCIE with more real world needs.  While the exam has never been a test of ideal design, knowing how to fix a non-ideal design when problems occur is important.  Knowing how to find out what’s screwed up is the first step.  It’s high time people learned how to do that.

Be Careful What You Wish For

The CCIE v5 is seeing a lot of technology changes.  The written exam is getting a new section, Network Principles.  This serves to refocus candidates away from Cisco-specific solutions and more toward making sure they are experts in networking.  There’s a lot of opportunity here to reinforce networking fundamentals rather than idle trivia about config minimums and maximums.  Let’s hope this pays off.

The content of the written is also being updated.  Cisco is going to make sure candidates know the difference between IOS and IOS XE.  Cisco Express Forwarding is going to get a focus, as is ISIS (again).  Given that ISIS is important in TRILL, this could be an indication of where FabricPath development is headed.  The written is also getting more IPv6 topics.  I’ll cover IPv6 in just a bit.

The biggest change in content is the complete removal of Frame Relay.  It’s been banished to the same pile as ATM and ISDN.  No written, no lab.  In its place, we get Dynamic Multipoint VPN (DMVPN).  I’ve talked about why Frame Relay is on the lab before.  People still complained about it.  Now, you get your wish.  DMVPN with OSPF serves the same purpose as Frame Relay with OSPF.  It’s all about Stupid Router Tricks.  Using OSPF with DMVPN means dealing with mGRE, which presents a Non-Broadcast Multi-Access (NBMA) network to the routing protocol.  Just like Frame Relay.  The fact that almost every guide today recommends you use EIGRP with DMVPN should tell you how hard OSPF over DMVPN is to get right.  And now you’re forced to use OSPF to simulate NBMA instead of Frame Relay.  Hope all you candidates are happy now.
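
To make the comparison concrete, here’s a rough hub-side sketch of the mGRE tunnel that makes DMVPN look like NBMA to OSPF.  The addressing and interface names are made up, and there’s more than one way to handle the OSPF network type (broadcast with a forced DR on the hub, point-to-multipoint, or non-broadcast with static neighbors), so treat this as an illustration rather than a lab answer key:

interface Tunnel0
 description DMVPN hub - example addressing only
 ip address 10.0.0.1 255.255.255.0
 ip nhrp map multicast dynamic
 ip nhrp network-id 1
 ip ospf network broadcast
 ip ospf priority 255
 tunnel source GigabitEthernet0/0
 tunnel mode gre multipoint
!
router ospf 1
 network 10.0.0.0 0.0.0.255 area 0

The spokes need their own NHRP mappings pointing at the hub and an OSPF priority of 0 so they can never win the DR election.  Getting all of that right is the new Stupid Router Trick.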

vCCIE

The lab is also 100% virtual now.  No physical equipment in either the TS or lab config sections.  This is a big change.  Cisco wants to reduce the amount of equipment that needs to be physically present to build a lab.  They also want to be able to offer the lab in more places than San Jose and RTP.  Now, with everything being software, they could offer the lab at any secured PearsonVUE testing center.  They’ve tried remote delivery in the past, but the access requirements made it something of a disaster.  Now, it’s all delivered in a browser window.  This will make remote labs possible.  I can see a huge expansion of the testing sites around the time of the launch.

This also means that hardware-specific questions are out.  Like layer 2 QoS on switches.  The last reason to have a physical switch (WRR and SRR queueing) is gone.  Now, all you are going to get quizzed on is software functionality.  Which probably means the loss of a few easy points.  With the removal of Frame Relay and L2 QoS, I bet the services section of the lab is going to be really fun now.
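
For anyone who never had the pleasure, this is roughly the flavor of switch-only queuing that just walked out the door.  A 3750-style sketch with made-up weights, shown purely for nostalgia rather than as a tuning recommendation:

mls qos
!
interface GigabitEthernet1/0/1
 mls qos trust dscp
 srr-queue bandwidth share 10 10 60 20
 priority-queue out

None of that maps onto a purely virtual lab device, which is exactly why it had to go.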

IPv6 Is Real

Now, for my favorite part.  The JNCIE has had a robust IPv6 section for years.  All routing protocols need to be configured for IPv4 and IPv6.  The CCIE has always had a separate IPv6 section.  Not any more.  Going forward in version 5, all routing tasks will be configured for v4 and v6.  Given that RIPng has been retired to the written exam only (finally), it’s a safe bet that you’re going to love working with OSPFv3 and EIGRP for IPv6.
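
If you want a picture of what “configure it for both” means in practice, here’s a minimal dual-stack sketch with example addressing.  OSPF and EIGRP for IPv6 both get enabled per interface, and EIGRP for IPv6 has the extra gotcha of starting in shutdown:

ipv6 unicast-routing
!
interface GigabitEthernet0/1
 ip address 192.0.2.1 255.255.255.0
 ipv6 address 2001:db8:0:1::1/64
 ip ospf 1 area 0
 ipv6 ospf 1 area 0
 ipv6 eigrp 100
!
ipv6 router ospf 1
 router-id 192.0.2.1
!
ipv6 router eigrp 100
 no shutdown

Every routing task now comes with that second half attached.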

I think it’s great that Cisco has finally caught up to the reality of the world.  If CCIEs are well versed in IPv6, we should start seeing adoption numbers rise significantly.  Ensuring that engineers know to configure v4 and v6 simultaneously means dual stack is going to be the preferred transition method.  The only IPv6-related thing that worries me is the inclusion of an item on the written exam: IPv6 Network Address Translation.  You all know I’m a huge fan of NAT.  Especially NAT66, which is what I’ve been told will be the tested knowledge.

Um, why?!? 

You’ve relegated RIPng to the trivia section.  You collapsed multicast into the main routing portions.  You’re moving forward with IPv6 and making it a critical topic on the test.  And now you’re dredging up NAT?!? We don’t NAT IPv6.  Especially to another IPv6 address.  Unique Local Addresses (ULA) are about the only place I could see NAT66 being used.  Ed Horley (@EHorley) thinks it’s a bad idea.  Ivan Pepelnjak (@IOSHints) doesn’t think fondly of it either, but admits it may have a use in SMBs.  And you want CCIEs and enterprise network engineers to understand it?  Why not use LISP instead?  Or maybe a better network design for enterprises that doesn’t need NAT66?  Next time you need an IPv6 SME to tell you how bad this idea is, call me.  I’ve got a list of people.


Tom’s Take

I’m glad to see the CCIE update.  Getting rid of Frame Relay and adding more IPv6 is a great thing.  I’m curious to see how the Diagnostic section will play out.  The flexible time for the TS section is way overdue.  The CCIE v5 looks to be pretty solid on paper.  People are going to start complaining about DMVPN.  Or the lack of SDN-related content.  Or the fact that EIGRP is still tested.  But overall, this update should carry the CCIE far enough into the future that we’ll see CCIE 60,000 before it’s refreshed again.

More CCIE v5 Coverage:

Bob McCouch (@BobMcCouch) – Some Thoughts on CCIE R&S v5

Anthony Burke (@Pandom_) – Cisco CCIE v5

Daniel Dib (@DanielDibSWE) – RS v5 – My Thoughts

INE – CCIE R&S Version 5 Updates Now Official

IPExpert – The CCIE Routing and Switching (R&S) 5.0 Lab Is FINALLY Here!

Cisco CMX – Marketing Magic? Or Big Brother?

The first roundtable presenter at Interop New York was Cisco. Their Enterprise group always brings interesting technology to the table. This time, the one that caught my eye was the Connected Mobile Experience (CMX). CMX is a wireless mobility technology that allows a company to do some advanced marketing wizardry.

CMX uses your Cisco wireless network to monitor devices coming into the air space. They don’t necessarily have to connect to your wireless network for CMX to work. They just have to be probing for a network, which all devices do. CMX can then push a message to the device. This message can be a simple “thank you” for coming or something more advanced like a coupon or notification to download a store-specific app. CMX can then store the information about that device, such as whether or not they joined the network, where they went, and how long they were there. This allows the company to pull some interesting statistics about their customer base. Even if they never hop on the wireless network.

I have to be honest here. This kind of technology gives me a bit of the creeps. I understand that user tracking is the hot new thing in retail. Stores want to know where you went, how long you stayed there, and whether or not you saw an advertisement or a featured item. They want to know your habits so as to better sell to you. The accumulation of that data over time allows for some patterns to emerge that can drive a retail operation’s decision making process.

A Thought Exercise

Think about an average person. We’ll call him Mike. Mike walks four blocks from his office to the subway station every day after work. He stops at the corner about halfway between the two to cross a street. On that street just happens to be a coffee shop using something like CMX. Mike has a brand new phone that uses wifi and bluetooth, and he keeps both on all the time. CMX can detect when the device comes into range. It knows that Mike stays there for about 2 minutes but never joins the network. The device then moves out of the WLAN area. The data cruncher for the store wants to drive new customers to the store. They analyze the data and find that lots of people stay in the area for a couple of minutes. They equate this to people stopping to decide if they want to have a cup of coffee from the shop. They decide to create a CMX coupon push notification that pops up after one minute on devices that have been seen in the database for the last month. Mike will see a coupon for $1 off a cup of coffee the next time he waits for the light in front of the coffee shop.

That kind of reach is crazy. I keep thinking back to the scenes in Minority Report where the eye scanners would detect you looking at an advertisement and then target a specific ad based on your retina scan. You may say that’s science fiction. But with products like CMX, I can build a pretty complete profile of your behavior even if I don’t have a retina scan. Correlating information provides a clear picture of who you are without any real identity information. Knowing that someone likes to spend their time in the supermarket in the snack aisles and frozen food aisles and less time in the infants section says a lot. Knowing the route a given device takes through the store can help designers place high volume items in the back and force shoppers to take longer routes past featured items.


Tom’s Take

I’m not saying that CMX is a bad product. It’s providing functionality that can be of great use to retail companies. But, just like VHS recorders and BitTorrent, good ideas can often be used to facilitate things that aren’t as noble. I suggested to the CMX developers that they could implement some kind of “opt out” message that popped up if I hadn’t joined the wireless network in a certain period of time. I look at that as a way of saying to shoppers “We know you aren’t going to join. Press the button and we’ll wipe out your device info.” It puts people at ease to know they aren’t being tracked. Even just showing them what you’re collecting is a good start. With the future of advertising and marketing focusing on instant delivery and data gathering for better targeting, I think products like CMX will be powerful additions. But, great power requires even greater responsibility.

Tech Field Day Disclaimer

Cisco was a presenter at the Tech Field Day Interop Roundtable.  They did not ask for any consideration in the writing of this review nor were they promised any.  The conclusions and analysis contained in this post are mine and mine alone.

Disruption in the New World of Networking

This is one of the most exciting times to be working in networking. New technologies and fresh takes on existing problems are keeping everyone on their toes when it comes to learning new protocols and integration systems. VMworld 2013 served both as an announcement of VMware’s formal entry into the larger networking world as well as putting existing network vendors on notice. What follows is my take on some of these announcements. I’m sure that some aren’t going to like what I say. I’m even more sure a few will debate my points vehemently. All I ask is that you consider my position as we go forward.

Captain Over, Captain Under

VMware, through their Nicira acquisition and development, is now *the* vendor to go to when you want to build an overlay network. Their technology augments existing deployments to provide software features such as load balancing and policy deployment. In order to do this and ensure that these features are utilized, VMware uses VxLAN tunnels between the devices. VMware calls these constructs “virtual wires”. I’m going to call them vWires, since they’ll likely be called that soon anyway. vWires are deployed between hosts to provide a pathway for communications. Think of it like a GRE tunnel or a VPN tunnel between the hosts. This means the traffic rides on the existing physical network but that network has no real visibility into the payload of the transit packets.

Nicira’s brainchild, NSX, has the ability to function as a layer 2 switch and a layer 3 router as well as a load balancer and a firewall. VMware is integrating many existing technologies with NSX to provide consistency when provisioning and deploying a new software-based network. For those devices that can’t be virtualized, VMware is working with HP, Brocade, and Arista to provide NSX agents that can decapsulate the traffic and send it to a physical endpoint that can’t participate in NSX (yet). As of the launch during the keynote, most major networking vendors are participating with NSX. There’s one major exception, but I’ll get to that in a minute.

NSX is a good product. VMware wouldn’t have released it otherwise. It is the vSwitch we’ve needed for a very long time. It also extends the ability of the virtualization/server admin to provision resources quickly. That’s where I’m having my issue with the messaging around NSX. During the second day keynote, the CTOs on stage said that the biggest impediment to application deployment is waiting on the network to be configured. Note that this is my paraphrasing of what I took their intent to be. In order to work around the lag in network provisioning, VMware has decided to build a VxLAN/GRE/STT tunnel between the endpoints and eliminate the network admin as a source of delay. NSX turns your network into a fabric for the endpoints connected to it.

Under the Bridge

I also have some issues with NSX and the way it’s supposed to work on existing networks. Network engineers have spent countless hours optimizing paths and reducing delay and jitter to provide applications and servers with the best possible network. Now, none of that matters. vAdmins just have to click a couple of times and build their vWire to the other server and all that work on the network is for naught. The underlay network exists to provide VxLAN transport. NSX assumes that everything working beneath it is running optimally. No loops, no blocked links. NSX doesn’t even participate in spanning tree. Why should it? After all, that vWire ensures that all the traffic ends up in the right location, right? People would never bridge the networking cards on a host server. Like building a VPN server, for instance. All of the things that network admins and engineers think about with regard to keeping the network from blowing up due to excess traffic are handwaved away in the presentations I’ve seen.

The reference architecture for NSX looks pretty. Prettier than any real network I’ve ever seen. I’m afraid that suboptimal networks are going to impact application and server performance now more than ever. And instead of the network using mechanisms like QoS to battle issues, those packets are now invisible bulk traffic. When network folks have no visibility into the content of the network, they can’t help when performance suffers. Who do you think is going to get blamed when that goes on? Right now, it’s the network’s fault when things don’t run right. Do you think that moving the onus for server network provisioning to NSX and vCenter is going to forgive the network people when things go south? Or are the underlay engineers going to take the brunt of the yelling because they are the only ones that still understand the black magic outside the GUI drag-and-drop to create vWires?

NSX is for service enablement. It allows people to build network components without knowing the CLI. It also means that network admins are going to have to work twice as hard to build resilient networks that work at high speed. I’m hoping that means that TRILL-based fabrics are going to take off. Why use spanning tree now? Your application and service network sure isn’t. No sense adding any more bells and whistles to your switches. It’s better to just tie them into spine-and-leaf Clos fabrics and be done with it. It now becomes much more important to concentrate on the user experience. Or maybe the wireless network. As long as at least one link exists between your ESX box and the edge switch, let the new software networking guys worry about it.

The Recumbent Incumbent?

Cisco is the only major networking manufacturer not publicly on board with NSX right now. Their CTO Padmasree Warrior has released a response to NSX that talks about lock-in and vertical integration. Still others have released responses to that response. There’s a lot of talk right now about the war brewing between Cisco and VMware and what that means for VCE. One thing is for sure – the landscape has changed. I’m not sure how this is going to fall out on both sides. Cisco isn’t likely to stop selling switches any time soon. NSX still works just fine with Cisco as an underlay. VCE is still going to make a whole bunch of money selling vBlocks in the next few months. Where this becomes a friction point is in the future.

Cisco has been building APIs into their software for the last year. They want to be able to use those APIs to directly program the network through devices like the forthcoming OpenDaylight controller. Will they allow NSX to program them as well? I’m sure they would – if VMware wrote those instructions into NSX. Will VMware demand that Cisco use the NSX-approved APIs and agents to expose network functionality to their software network? They could. Will Cisco scrap onePK to implement NSX? I doubt that very much. We’re left with a standoff. Cisco wants VMware to use their tools to program Cisco networks. VMware wants Cisco to use the same tools as everyone else and make the network a commodity compared to the way it is now.

Let’s think about that last part for a moment. Aside from some speed differences, networks are largely going to look identical to NSX. It won’t care if you’re running HP, Brocade, or Cisco. Transport is transport. Someone down the road may build some proprietary features into their hardware to make NSX run better but that day is far off. What if a manufacturer builds a switch that is twice as fast as the nearest competition? Three times? Ten times? At what point does the underlay become so important that the overlay starts preferring it exclusively?


Tom’s Take

I said a lot during the Tuesday keynote at VMworld. Some of it was rather snarky. I asked about full BGP tables and vMotioning the machines onto the new NSX network. I asked because I tend to obsess over details. Forgotten details have broken more of my networks than grand design disasters. We tend to fuss over the big things. We make more out of someone that can drive a golf ball hundreds of yards than we do about the one that can consistently sink a ten foot putt. I know that a lot of folks were pre-briefed on NSX. I wasn’t, so I’m playing catch up right now. I need to see it work in production to understand what value it brings to me. One thing is for sure – VMware needs to change the messaging around NSX to be less antagonistic towards network folks. Bring us into your solution. Let us use our years of experience to help rather than making us seem like pariahs responsible for all your application woes. Let us help you help everyone.

Poaching CCIEs

During the CCIE Netvet Reception at Cisco Live 2013, a curious question came up during our Q&A session with CEO John Chambers. Paul Borghese asked if it was time for the partner restriction on CCIE tenure to be lifted in order to increase the value of a CCIE in the larger market. For those not familiar, when a CCIE is hired by a Cisco partner, they need to attach their number to the company in order for the company to receive the benefits of having hired a CCIE. Right now, that means counting toward the CCIE threshold for Silver and Gold status. When a CCIE leaves the first company and moves to another partner, their number stays associated with the original company for one year and cannot be counted with the new company until the expiration of that year.

There are a multitude of reasons why that might be the case. It encourages companies to pay for CCIE training and certification, since the company knows that the newly-minted CCIE’s number will keep counting for them for at least a year even after a departure. It also provides a lifeline to a Cisco partner in the event a CCIE decides to move on. By keeping the number attached to the company for a specific time period, the original company has the time necessary to hire or train new resources to take over for the departed CCIE’s job role. If the original partner is up for any contracts or RFPs that require a CCIE on staff, that grace period could be the difference between picking up or losing that contract.

As indicated above, Paul asked if maybe that policy needed to change. In his mind, the restriction of the CCIE number was causing CCIEs to stay at their current companies because their inability to move their number to the new company in a timely manner made them less valuable. I know now that the question came on behalf of Eman Conde, the CCIE Agent, who is very active in making sure the rights and privileges of CCIEs everywhere are well represented. I remember meeting Eman for the first time back at Cisco Live 2008 at an IPExpert party, long before I was a CCIE. In that time, Eman has worked very hard to make sure that CCIEs are well represented in the job market.  It is also in Eman’s best interests to ensure that CCIEs can move freely between companies without restriction.

My biggest fear is that removing the one-year association restriction for Cisco Partners will cause partners to stop funding CCIE development.  I was very fortunate to have my employer pay the entire cost of my CCIE from beginning to end.  In return, I agreed in principle to stay with them for a period of time and not seek employment from anyone else.  There was no formal agreement in place.  There was no contract.  Just a handshake.  Even after I left to go work with Gestalt IT, my number is locked to them for the next year.  This doesn’t really bother me.  It does make them feel better about the possibility of me moving to a competitor.  What would happen if I could move my number freely to the next business without penalty?

Could you imagine a world where CCIEs were being paid top dollar to work at a company not for their knowledge but because it was cheaper to buy CCIEs than it was to build them?  Think of a sports team that doesn’t have a good minor league system but instead buys their talent for absurd amounts of money.  If you had pictures of the New York Yankees in your head, you probably aren’t far removed from my line of thinking.  When the only value of a CCIE is associating the number to your company, then you’ve missed the whole point of the program.

CCIEs are more valuable than their number.  With the exception of the Gold/Silver partner status, their number is virtually useless.  What is more important is the partner specializations they can bring with them.  My CCIE was pointless to my old employer since I was the only one.  What was a greater boon was all the partner certifications that I brought for unified communications, UCS implementation, and even project management.  Those certifications aren’t bound to a company.  In fact, I would probably be more marketable by going to a small partner with one CCIE or going to a silver partner with 3 CCIEs and telling them that I can bring in new lines of partner business while they are waiting for my number to clear escrow.  The smart partners will realize the advantage and hire me on and wait.  Only an impatient partner that wants to build a gold-level practice today would want to avoid number lock-in.

I don’t think we need to worry about removing the CCIE association restriction right now.  It serves to entice partners to fund CCIEs without worrying about them moving on as soon as they get certified.  Termination results in the number being freed up upon mutual agreement.  Most CCIEs that I’ve heard of that left their jobs soon after certification did it because their company told them they couldn’t afford to pay a CCIE.  Forcing small employers to let CCIEs walk away to bigger competitors with no penalty will prevent them from funding any more CCIE training.  They’ll say, “If the big partners want CCIEs so badly that they’ll pay bounties, then let the big partners do all the training too.”  I don’t even think an employer non-compete would fix the issue as those aren’t enforceable in many states.  I think the program exists the way it does for a reason.  With all due deference to Eman and Paul, I don’t think we’ve reached the point where CCIE free agency is ready for prime time.

IOS X-Treme!

As a nerd, I’m a huge fan of science fiction. One of my favorite shows was Stargate SG-1. Inside the show, there was a joke involving an in-universe TV program called “Wormhole X-Treme” that a writer unintentionally created based on knowledge of the fictional Stargate program. Essentially, it’s a story that’s almost the same as the one we’re watching, with just enough differences to be a totally unique experience. In many ways, that’s how I feel about the new versions of Cisco’s Internetwork Operating System (IOS) that have been coming out in recent months. They may look very similar to IOS. They may behave similarly to IOS. But to mistake them for IOS isn’t right. In this post, I’m going to talk about the three most popular IOS-like variants – IOS XE, IOS XR, and NX-OS.

IOS XE

IOS XE is the most IOS-like of all the new IOS builds that have been released. That’s because the entire point of the IOS XE project was to rebuild IOS to future proof the technology. Right now, the IOS that runs on routers (which will henceforth be called IOS Classic) is a monolithic kernel that runs all of the necessary modules in the same memory space. This means that if something happens to the routing engine or the LED indicator, it can cause the whole IOS kernel to crash if it runs out of memory. That may have been okay years ago but today’s mission critical networks can’t afford to have a rogue process bringing down an entire chassis switch. Cisco’s software engineers set out on a mission to rebuild the IOS CLI on a more robust platform.

IOS XE runs as a system daemon on a “modern Linux platform.” Which one is anyone’s guess. Cisco also abstracted the system functions out of the main kernel and into separate processes. That means that if one of them goes belly up it won’t take the core kernel with it. One of the other benefits of running the kernel as a system daemon is that you can now balance the workload of the processes across multiple processor cores. This was one of the more exciting things to me when I saw IOS XE for the first time. Thanks to the many folks that pointed out to me that the ASR 1000 was the first device to run IOS XE. The Catalyst 4500 (the first switch to get IOS XE) is using a multi core processor to do very interesting things, like the ability to run inline Wireshark on a processor core while still letting IOS have all the processor power it needs. Here’s a video describing that:

Because you can abstract the whole operation of the IOS feature set, you can begin to do things like offer a true virtual router like the CSR 1000V. As many people have recently discovered, the CSR 1000V is built on IOS XE and can be booted and operated in a virtualized environment (like VMware Fusion or ESXi). The RAM requirements are fairly high for a desktop virtualization platform (the CSR requires 4GB of RAM to run), but the promise is there for those that don’t want to keep using GNS3/Dynamips or Cisco’s IOU to emulate IOS-like features. IOS XE is the future of IOS development. It won’t be long until the next generation of supervisor engines and devices is using it exclusively instead of relying on IOS Classic.

IOS XR

In keeping with the sci-fi theme of this post, IOS XR is what the Mirror Universe version of IOS would look like. Much like IOS XE, IOS XR does away with the monolithic kernel and shared memory space of IOS Classic. XR uses an OS from QNX to serve as the base for the IOS functions. XR also segments the ancillary processes in IOS into separate memory spaces to prevent system crashes from an errant bug. XR is aimed at the larger service provider platforms like the ASR 9000 and CRS series of routers. You can see that in the way that XR can allow multiple routing protocol processes to be executed at the same time in different memory spaces. That’s a big deal for service providers.

What makes IOS XR so different from IOS Classic? That lies in the configuration method. While the CLI may resemble the IOS that you’re used to, the change methodology is totally foreign to Cisco people. Instead of making config changes directly on a live system, a candidate copy of the configuration is forked into a separate memory space. Once you have created all the changes that you need to make, you have to perform a sanity check on the config before it can be moved into live production. That keeps you from screwing something up accidentally. Once you have performed a sanity check, you have to explicitly apply the configuration via a commit command. In the event that the config you applied to the router does indeed contain errors that weren’t caught by the sanity checker (like the wrong IP), you can issue a command to revert to a previous working config in a process known as rollback. All of the previous configuration sets are retained in NVRAM and remain available for reversion.
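
For those who haven’t touched XR, the workflow looks something like this (the prompt and interface names are just examples):

RP/0/RSP0/CPU0:router# configure
RP/0/RSP0/CPU0:router(config)# router ospf 1
RP/0/RSP0/CPU0:router(config-ospf)# area 0
RP/0/RSP0/CPU0:router(config-ospf-ar)# interface GigabitEthernet0/0/0/0
RP/0/RSP0/CPU0:router(config-ospf-ar-if)# show configuration
RP/0/RSP0/CPU0:router(config-ospf-ar-if)# commit
RP/0/RSP0/CPU0:router(config-ospf-ar-if)# end
RP/0/RSP0/CPU0:router# show configuration commit list
RP/0/RSP0/CPU0:router# rollback configuration last 1

Nothing touches the running router until the commit, and that last command undoes the most recent commit if it turns out you fat-fingered something.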

If you’re keeping track at home, this sounds an awful lot like Junos. Hence my Mirror Universe analogy. IOS XR is aimed at service providers, which is a market dominated by Juniper. SPs have gotten very used to the sanity checking and rollback capabilities provided by Junos. Cisco decided to offer those features in an SP-specific IOS package. There are many that want to see IOS XR ported from the ASR/CRS lines down into more common SP platforms. Only time will tell if that will happen. Jeff Fry has an excellent series of posts on IOS XR that I highly recommend if you want to learn more about the specifics of configuration on that platform.

NX-OS

NX-OS is the odd man out from the IOS family. It originally started life as Cisco’s SAN-OS, which was responsible for running the MDS line of fibre channel switches. Once Cisco started developing the Nexus switching platform, they decided to use SAN-OS as the basis for the operating system, as it already contained much of the code that would be needed to allow networking and storage protocols to interoperate on the device, a necessity for a converged data center switch. Eventually, the new OS became known as NX-OS.

NX-OS looks similar to the IOS Classic interface that most engineers have become accustomed to. However, the underlying OS is very different from what you’re used to. First off, not every feature of classic IOS is available on demand. Yes, a lot of the more esoteric feature sets (like the DHCP server) are just plain unavailable. But even the feature sets that are listed as available in the OS may not be in the actual running code. You need to activate each of these via the feature keyword when you want to enable them. This “opt in” methodology ensures that the running code only contains essential modules as well as the features you want. That should make the security people happy from an exploit perspective, as it lowers the available attack surface of your OS.
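
Here’s a quick sketch of what that looks like in practice. Until a feature is enabled, its commands don’t even show up in the CLI:

switch# show feature | include enabled
switch# configure terminal
switch(config)# feature ospf
switch(config)# feature interface-vlan
switch(config)# router ospf 1
switch(config-router)# router-id 192.0.2.1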

Another unique feature of NX-OS is the interface naming convention. In IOS Classic, each interface is named according to its speed. You can have Ethernet, FastEthernet, GigabitEthernet, TenGigabit, and even FortyGigabit interfaces. In NX-OS, you have one – Ethernet. NX-OS treats all interfaces as Ethernet regardless of the underlying speed. That’s great for a modular switch because it allows you to keep the same configuration no matter which line cards are running in the device. It also allows you to easily port the configuration to a newer device, say from Nexus 5500 to Nexus 6000, without needing to do a find/replace operation on the config and risk changing a line you weren’t supposed to. Besides, most of the time the engineer doesn’t care about whether an interface is gigabit or ten gigabit. They just want to program the second port on the third line card.
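
As a quick illustration, the second port on the third line card is just this, no matter what the card’s speed happens to be:

interface Ethernet3/2
 description server uplink - identical stanza on a 1G or 10G card
 switchport
 switchport mode trunk

Move the config to a different chassis or a faster line card and that stanza doesn’t change at all.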


Tom’s Take

No software program can survive without updates. Especially if it is an operating system. The hardware designed to run version 1.0 is never the same as the hardware that version 5.0 or even 10.0 utilizes. Everything evolves to become more efficient and useful. Think of it like seasons of sci-fi shows. Every season tells a story. There may be some similarities, but people overall want the consistency of the characters they’ve come to love coupled with new stories and opportunities to increase character development. Network operating systems like IOS are no different. Engineers want the IOS-like interface but they also want separated control planes, robust sanity checking, and modularized feature insertion. Much like the writers of sci-fi, Cisco will continue to provide new features and functionality while still retaining the things to which we’ve grown accustomed. However, if Cisco ever comes up with a hare-brained idea like the Ori, I can promise there’s no way I’ll ever run IOS-Origin.

Cisco Borderless Idol

Day one of Network Field Day 5 (NFD5) included presentations from the Cisco Borderless team. You probably remember their “speed dating” approach at NFD4 which gave us a wealth of information in 15-minute snippets. The only drawback to that lineup is that when you find a product or a technology that interests you, there really isn’t any time to quiz the presenter before they are ushered off stage. Someone must have listened when I said that before, because this time they brought us 20-minute segments – 10 minutes of presentation, 10 minutes of demo. With the switching team, we even got to vote on our favorite to bring back for the next round (hence the title of the post). More on that in a bit.

6500 Quad Supervisor Redundancy

First up on the block was the Catalyst 6500 team. I swear this switch is the Clint Howard of networking, because I see it everywhere. The team wanted to tell us about a new feature available in the ((verify code release)) code on the Supervisor 2T (Sup2T). Previously, the supervisor was capable of performing a couple of very unique functions. The first of these was Stateful Switch Over (SSO). During SSO, the redundant supervisor in the chassis can pick up where the primary left off in the event of a failure. All of the traffic sessions can keep on trucking even if the active sup module is rebooting. This gives the switch a tremendous uptime, as well as allowing for things like hitless upgrades in production. The other existing feature of the Sup2T is Virtual Switching System (VSS). VSS allows two chassis, each with its own Sup2T, to appear as one giant switch. This is helpful for applications where you don’t want to trust your traffic to just one chassis. VSS allows for two different chassis to terminate Multi-Chassis EtherChannel (MLAG) connections so that distribution layer switches don’t have a single point of failure. Traffic looks like it’s flowing to one switch when in actuality it may be flowing to one or the other. In the event that a Supervisor goes down, the other one can keep forwarding traffic.

Enter the Quad Sup SSO ability. Now, instead of having an RPR-only failover on the members of a VSS cluster, you can set up the redundant Sup2T modules to be ready and waiting in the event of a failure. This is great because you can lose up to three Sup2Ts at once and still keep forwarding while they reboot or get replaced. Granted, anything that can take out 3 Sup2Ts at once is probably going to take down the fourth (like power failure or power surge), but it’s still nice to know that you have a fair amount of redundancy now. This only works on the Sup2T, so you can’t get this if you are still running the older Sup720. You also need to make sure that your linecards support the newer Distributed Forwarding Card 3 (DFC3), which means you aren’t going to want to do this with anything less than a 6700-series line card. In fact, you really want to be using the 6800 series or better just to be on the safe side. As Josh O’Brien (@joshobrien77) commented, this is a great feature to have. But it should have been there already. I know that there are a lot of technical reasons why this wasn’t available earlier, and I’m sure the increased fabric speeds in the Sup2T, not to mention the increased capability of the DFC3, are necessary components of the solution. Still, I think this is something that probably should have shipped in the Sup2T on the first day. I suppose that given the long road the Sup2T took to get to us that “better late than never” is applicable here.
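
For reference, the building blocks being stacked here are the ones that already existed. A skeleton sketch with an arbitrary domain number follows; the quad-sup specifics are release-dependent, so check the documentation for your code before trusting any of it:

redundancy
 mode sso
!
switch virtual domain 100
 switch 1
 switch 1 priority 110
!
interface Port-channel1
 switch virtual link 1

The chassis conversion itself happens with switch convert mode virtual from the exec prompt, and the quad-sup capability layers in-chassis standby supervisors on top of that existing SSO/VSS pairing.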

UCS-E

Next up was the Cisco UCS-E series server for the ISR G2 platform. This was something that we saw at NFD4 as well. The demo was a bit different this time, but for the most part this is similar info to what we saw previously.


Catalyst 3850 Unified Access Switch

The Catalyst 3850 is Cisco’s new entry into the fixed-configuration switch arena. They are touting this as a “Unified Access” solution for clients. That’s because the 3850 is capable of terminating up to 50 access points (APs) per stack of four. This thing can basically function as a wiring closet wireless controller. That’s because it’s using the new IOS wireless controller functionality that’s also featured in the new 5760 controller. This gets away from the old Airespace-like CLI that was so prominent on the 2100, 2500, 4400, and 5500 series controllers. The 3850, which is based on the 3750X, also sports a new 480Gbps Stackwise connector, appropriately called Stackwise480. This means that a stack of 3850s can move some serious bits. All that power does come at a cost – Stackwise480 isn’t backwards compatible with the older Stackwise v1 and v2 from the 3750 line. This is only an issue if you are trying to deploy 3850s into existing 3750X stacks, because Cisco has announced the End of Sale (EOS) and End of Life (EOL) information for those older 3750s. I’m sure the idea is that when you go to rip them out, you’ll be more than happy to replace them with 3850s.

The 3850 wireless setup is a bit different from the old 3750 Access Controller that had a 4400 controller bolted on to it. The 3850 uses Cisco’s IOS-XE model of virtualizing IOS into a sort of VM state that can run on one core of a dual-core processor, leaving the second core available to do other things. Previously at NFD4, we’d seen the Catalyst 4500 team using that other processor core for doing inline Wireshark captures. Here, the 3850 team is using it to run the wireless controller. That’s a pretty awesome idea when you think about it. Since I no longer have to worry about IOS taking up all my processor and I know that I have another one to use, I can start thinking about some interesting ideas.

The 3850 does have a couple of drawbacks. Aside from the above Stackwise limitations, you have to terminate the APs on the 3850 stack itself. Unlike the CAPWAP connections that tunnel all the way back to the Airespace-style controllers, the 3850 needs to have the APs directly connected in order to decapsulate the tunnel. That does provide for some interesting QoS implications and applications, but it doesn’t provide much flexibility from a wiring standpoint. I think the primary use case is to have one 3850 switch (or stack) per wiring closet, which would be supported by the current 50 AP limitation. The other drawback is that the 3850 is currently limited to a stack of four switches, as opposed to the nine-switch limit on the 3750X. Aside from that, it’s a switch that you probably want to take a look at in your wiring closets now. You can buy it with an IP Base license today and then add on the AP licenses down the road as you want to bring them online. You can even use the 3850s to terminate CAPWAP connections and manage the APs from a central controller without adding the AP license.

Here is the deep dive video that covers a lot of what Cisco is trying to do from a unified wired and wireless access policy standpoint. Also, keep an eye out for the cute Unified Access video in the middle.

Private Data Center Mobility

I found it interesting that this demo was in the Borderless section and not the Data Center presentation. This presentation dives into the world of Overlay Transport Virtualization (OTV). Think of OTV like an extra layer of 802.1Q-in-Q tunneling with some IS-IS routing mixed in. OTV is Cisco’s answer to extending the layer 2 boundary between data centers to allow VMs to be moved to other sites without breaking their networking. Layer 2 everywhere isn’t the most optimal solution, but it’s the best thing we’ve got to work with the current state of VM networking (until Nicira figures out what they’re going to do).
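
For the curious, the NX-OS side of a basic OTV setup looks roughly like this. The multicast groups and VLAN ranges are made up, and depending on the release there are a couple of extra knobs (like a site identifier) that I’m glossing over:

feature otv
otv site-vlan 99
!
interface Overlay1
 otv join-interface Ethernet1/1
 otv control-group 239.1.1.1
 otv data-group 232.1.1.0/28
 otv extend-vlan 100-110
 no shutdown

The join interface rides the routed core between sites, the control group carries the IS-IS adjacencies, and the extended VLANs are the ones that get stretched.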

We loved this session so much that we asked Mostafa to come back and talk about it more in depth.

The most exciting part of this deep dive to me was the introduction of LISP. To be honest, I haven’t really been able to wrap my head around LISP the first couple of times that I saw it. Now, thanks to the Borderless team and Omar Sultan (@omarsultan), I’m going to dig into it a lot more in the coming months. I think there are some very interesting issues that LISP can solve, including my IPv6 Gordian Knot.


Tom’s Take

I have to say that I liked Cisco’s approach to the presentations this time.  Giving us discussion time along with a demo allowed us to understand things before we saw them in action.  The extra five minutes did help quite a bit, as it felt like the presenters weren’t as rushed this time.  The “Borderless Idol” style of voting for a presentation to get more info out of was brilliant.  We got to hear about something we wanted to go into depth about, and I even learned something that I plan on blogging about later down the line.  Sure, there was a bit of repetition in a couple of areas, most notably UCS-E, but I can understand how those product managers have invested time and effort into their wares and want to give them as much exposure as possible.  Borderless hits all over the spectrum, so keeping the discussion focused in a specific area can be difficult.  Overall, I would say that Cisco did a good job, even without Ryan Seacrest hosting.

Tech Field Day Disclaimer

Cisco was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, Cisco provided me with a breakfast and lunch at their offices.  They also provided a Moleskine notebook, a t-shirt, and a flashlight toy.  At no time did they ask for, nor were they promised, any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Cisco Data Center Duel

Network Field Day 5 started off with a full day at Cisco. The Data Center group opened and closed the day, with the Borderless team sandwiched in between. Omar Sultan (@omarsultan) greeted us as we settled in for a continental breakfast before getting started.

The opening was a discussion of onePK, a popular topic as of late from Cisco. While the topic du jour in the networking world is software-defined networking (SDN), Cisco steers the conversation toward onePK. This, at its core, is API access to all the flavors of the Internetwork Operating System (IOS). While other vendors discuss how to implement protocols like OpenFlow or how to expose pieces of their underlying systems to developers, Cisco has built a platform to allow access into pieces and parts of the OS. You can write applications in Java or Python to pull data from the system or push configurations to it. The process is slowly being rolled out to the major Cisco platforms. The support for the majority of the Nexus switching line should give the reader a good idea of where Cisco thinks this technology will be of best use.

One of the specific applications that Cisco showed off to us using onePK is the use of Puppet to provision switches from bare metal to functioning with a minimum of human effort. Puppet integration was a big underlying topic at both Cisco and Juniper (more on that in the Juniper NFD5 post). Puppet is gaining steam in the networking industry as a way to get hardware up and running quickly with the least amount of fuss. Server admins have enjoyed the flexibility of Puppet for some time. It’s good to see well-tested and approved software like this being repurposed for similar functionality in the world of routing and switching.
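
To give a flavor of that model, here’s a tiny manifest sketch. The resource types below are hypothetical stand-ins rather than a specific Cisco-shipped module; the point is simply that you declare the end state and let the agent converge the switch to it:

# Illustrative only - 'demo_vlan' and 'demo_interface' are made-up resource types
node 'tor-switch-01.example.com' {
  demo_vlan { '100':
    ensure    => present,
    vlan_name => 'web-tier',
  }
  demo_interface { 'Ethernet1/1':
    description => 'uplink to spine-01',
    mode        => 'trunk',
  }
}

That declarative, idempotent style is exactly what the server admins have been enjoying, and it translates surprisingly well to ports and VLANs.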

Next up was a discussion about the Cisco ONE network controller. Controllers are a very hot topic in the network world today. OpenFlow allows a central management and policy server to push information and flow data into switches. This allows network admins to get a “big picture” of the network and how the packets are flowing across it. Having the ability to view the network in its entirety also allows admins to start partitioning it in a process called “slicing.” This was one of the first applications that the Stanford wiz kids used OpenFlow to accomplish. It makes sense when you think about how universities wanted to partition off their test networks to prevent this radical OpenFlow idea from crashing the production hardware. Now, we’re looking at using slicing for things like multi-tenancy and security. The building blocks are there to make some pretty interesting leaps. The real key is that the central controller has to be able to keep up with the flows being pushed through the network. Cisco’s ONE controller not only speaks OpenFlow, but onePK as well. This means that while the ONE controller can talk to disparate networking devices running OpenFlow, it will be able to speak much more clearly to any Cisco devices you have lying around. That’s a pretty calculated play from Cisco, given that the initial target for their controller will be networks populated primarily by Cisco equipment. The use case that was given to us for the Cisco ONE controller was replacing large network taps with SDN options. Fans of NFD may remember our trip to Gigamon. Cisco hadn’t forgotten, as the network tap they used as an example in their slide looked just like the orange Gigamon switch we saw at a previous NFD.

After the presentations from the Borderless team, we ended the day with an open discussion around a few topics. This is where the real fun started. Here’s the video:

The first hour or so is a discussion around hybrid switching. I had some points in here about the standoff between hardware and software people not really wanting to get along right now. I termed it a Mexican Standoff because no one really wants to flinch and go down the wrong path. The software people just want to write overlays and things like that and make them run on everything. The entrenched hardware vendors, like Cisco, want to make sure their hardware is providing better performance than anyone else (because that’s where their edge is). Until someone decides to take a chance and push things in different directions, we’re not going to see much movement. Also, around 1:09:00 is where we talked a bit about Cisco jumping into the game with a pure OpenFlow switch without much more on top of it. This concept seemed a bit foreign to some of the Cisco folks, as they can’t understand why people wouldn’t want IOS and onePK. That’s where I chimed in with my “If I want a pickup truck, I don’t take a chainsaw to a school bus.” You shouldn’t have to shed all the extra stuff to get the performance you want. Start with a smaller platform and work your way up instead of starting with the kitchen sink and stripping things away.

Shortly after this is where the fireworks started. One of Cisco’s people started arguing that OpenFlow isn’t the answer. He said that the customer he was talking to didn’t want OpenFlow. He even went so far as to say that “OpenFlow is a fantasy because it promises everything and there’s nothing in production.” (about 1:17:00) Folks, this was one of the most amazing conversations I’ve ever seen at a Network Field Day event. The tension in the room was palpable. Brent and Greg were on this guy the entire time about how OpenFlow was solving real problems for customers today, and in Brent’s case he’s running it in production. I really wonder how the results of this are going to play out. If Cisco hears that their customers don’t care that much about OpenFlow and just want their gear to do SDN like in onePK then that’s what they are going to deliver. The question then becomes whether or not network engineers that believe that OpenFlow has a big place in the networks of tomorrow can convince Cisco to change their ways.

If you’d like to learn more about Cisco, you can find them at http://www.cisco.com/go/dc.  You can follow their data center team on Twitter as @CiscoDC.


Tom’s Take

Cisco’s Data Center group has a lot of interesting things to say about programmability in the network. From discussions about APIs to controllers to knock-down, drag-out arguments about what role OpenFlow is going to play, Cisco has the gamut covered. I think that their position at the top of the network heap gives them a lot of insight into what’s going on. I’m just worried that they are going to use that to push a specific agenda and not embrace useful technologies down the road that solve customer problems. You’re going to hear a lot more from Cisco on software defined networking in the near future as they begin to roll out more and more features to their hardware in the coming months.

Tech Field Day Disclaimer

Cisco was a sponsor of Network Field Day 5.  As such, they were responsible for covering a portion of my travel and lodging expenses while attending Network Field Day 5.  In addition, Cisco provided me with a breakfast and lunch at their offices.  They also provided a Moleskine notebook, a t-shirt, and a flashlight toy.  At no time did they ask for, nor were they promised, any kind of consideration in the writing of this review.  The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

Additional NFD5 Blog Posts

NFD5: Cisco onePK – Terry Slattery

NFD5: SDN and Unicorn Blood – Omar Sultan