This WAN Is Your WAN, This WAN Is My WAN

Straw Bales on Hill Landscape, Tuscany, Italy

Ideas coalesce all the time in every vertical. You don’t really notice it until you wake up one day and suddenly everything around you looks identical. Wireless becoming the new access layer. Flash storage taking hold of the high-end performance crown. And in networking we have the dominance of all things software defined. One recent development has come along much faster than anyone could have predicted: Software Defined Wide Area Networking (SD-WAN).

Automatic For The People

SD-WAN is a force in modern networking because people want simplicity. While Ivan does a great job of decoupling marketing from reality, people still believe that SD-WAN is the silver bullet that will fix all of their WAN woes. Even during the original discussions of SD-WAN technology at conferences like ONUG, the overriding idea wasn’t around tying sites together or driving down costs to the point of feasibility. It was all about making life easier.

How does SD-WAN manage to accomplish this? It’s all black box networking. Just like the fuel injector in your car. There’s no crying about interoperability or standards-based protocols. You just plug things in and it all works, even if you can’t exactly plug one vendor solution into a competitor. Lock in wins again.

The ideas behind SD-WAN aren’t exactly new. Cisco talked about SD-WAN quite a bit at Networking Field Day 10. Here’s Jeff Reed on it:

The rest of the two hour session details how Cisco is using their Intelligent WAN (IWAN) product to drive SD-WAN. The names of the components all sound very familiar to networkers: DMVPN, NBAR, PfR, and so on. That’s because SD-WAN uses a lot of tried-and-true techniques to tie the concept together. There’s nothing earth-shattering about SD-WAN under the hood. In fact, a fair number of people that work at the “pioneering” SD-WAN startups all seem to have their roots in one or more traditional networking companies.

Fables of Reconstruction

Look at the other presenters at Networking Field Day 10. Two of them announced SD-WAN solutions even though they aren’t really known for expertise in SD-WAN. One of them wasn’t even known as a branch office acceleration solution. So why the SD-WAN land rush all of a sudden? What’s behind the need to have a solution?

You probably wouldn’t be surprised to learn that a lot of investors are backing expansion into SD-WAN technologies. It’s a hot property. But why? As above, customers aren’t interested in the technical wizardry that goes into SD-WAN. They aren’t clamoring for it to supplant their current WAN solution and offer a Rosetta Stone of inter-vendor WAN cooperation. What’s behind the push?

It probably goes something like this:

  1. Technologist needs to implement WAN architecture. Is dismayed that things are so difficult.
  2. Technologist starts searching for solutions about WAN. They probably start asking friends about it.
  3. Analyst firm hears that technologists are asking about WAN solutions. Releases a questionnaire asking which technologies you’d like to learn more about.
  4. Responses to questionnaires are loaded into a graph or report that people buy because they don’t know who to talk to.
  5. Companies realize customers want WAN solutions. They break their necks to offer those solutions to keep up with demand.
  6. Investors see companies beginning to offer WAN solutions and think there’s a huge untapped market. They start funding anyone that mentions WAN in a meeting.

By the way, you can replace “WAN” with any technology above and it still works.

Thanks to customers needing a solution for something they can’t configure easily, they are going to be inundated with SD-WAN options by the time they turn around. And the biggest concern no longer becomes “Who has the easiest solution?” but instead, “Who is still going to be here in six months?”

Collapse Into Now

The reckoning is coming in the SD-WAN market. If a company doesn’t already have an SD-WAN solution in development or if their solution won’t see daylight for another nine months, they are going to exercise the second “B” of innovation and buy it. And they have a lot of prime targets to choose from.

Investors get cagey without an exit strategy. How are they going to win at this game? They either have to get paid with an IPO, with a later round of funding, or by having someone buy out the investment. If an investor thinks they can get their money back (plus a bit of interest) by having this little startup bought by a traditional networking vendor, you had better believe they will be advising the startup to sell.

The customers are the real losers in the case of a buyout, or worse a bankruptcy. Those highly proprietary solutions become dead weight if there isn’t any support for them any longer. Black box networking falls apart when the little magical creatures inside the box go away. Which means customers will be skittish of supporting a solution that is likely to go away any time soon.

Who will you support? An established vendor slow to roll out a solution? Or an up-and-coming company with new ideas but at risk of being snapped up by a big bank account?

Tom’s Take

I loved seeing all the SD-WAN discussion at Networking Field Day 10. SD-WAN is no longer magic sauce that aggregates DSL and MPLS circuits with encryption. Nuage Networks showed off deploying Docker apps to remote sites. Riverbed talked about using their WAN optimization experience to deploy SaaS solutions through SD-WAN.

We’ve heard from SD-WAN companies in the past at Networking Field Day. It’s interesting to hear the comparisons between the upstarts and the old geezers. It’s clear there is a ton of money that is being invested in SD-WAN. The trick is to find out your needs and pick the best solution for you. Otherwise you may find yourself losing your SD-WAN religion.


The Packet Flow Duality


Quantum physics is a funny thing. It seeks to solve all the problems in the physical world by breaking everything down into the most basic unit possible. That works for a lot of the observable universe. But when it comes to light, quantum physics has issues. Thanks to experiments and observations, most scientists understand that light isn’t just a wave and it’s not just a collection of particles either. It’s both. This concept is fundamental to understanding how light behaves. But can it also explain how data behaves?

Moving Things Around

We tend to think about data as a series of discrete data units being pushed along a path. While these units might be frames, packets, or datagrams depending on the layer of the OSI model that you are operating at, the result is still the same. A single unit is evaluated for transmission. A brilliant post from Greg Ferro (@EtherealMind) sums up the forwarding thusly:

  • Frames being forwarded by MAC address lookup occur at layer 2 (switching)
  • Packets being forwarded by IP address lookup occur at layer 3 (routing)
  • Data being forwarded at higher levels is a stream of packets (flow forwarding)

It’s simple when you think about it. But what makes it a much deeper idea is that lookup at layer 2 and 3 requires a lot more processing. Each of the packets must be evaluated to be properly forwarded. The forwarding device doesn’t assume that the destination is the same for a group of similar packets. Each one must be evaluated to ensure it arrives at the proper location. By focusing on the discrete nature of the data, we are forced to expend a significant amount of energy to make sense of it. As anyone that studied basic packet switching can tell you, several tricks were invented to speed up this process. Anyone remember store-and-forward versus cut-through switching?

Flows behave differently. They contain state. They have information that helps devices make intelligent forwarding decisions. Those decisions don’t have to be limited by destination MAC or IP addresses. They can be labels or VLANs or other pieces of identifying information. They can be anything an application uses to talk to another device, like a DNS entry. It allows us to make a single forwarding decision per flow and implement it quickly and efficiently. Think about a stateful firewall. It works because the information for a given packet stream (or flow) can be programmed into the device. The firewall is no longer examining every individual packet, but instead evaluates the entire group of packets when making decisions.

Consequently, stateful firewalls also give us a peek at how flows are processed. Rather than having a CAM table or an ARP table, we have a group of rules and policies. Those policies can say “given a group of packets in a flow matching these characteristics, execute the following actions”. That’s a far cry from trying to figure out where each one goes.
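
Here is a minimal sketch of that idea in Python. It assumes a toy five-tuple as the flow key and a hypothetical classify() policy function; neither comes from any real firewall. The point is only that the expensive policy evaluation runs once per flow, and every later packet in the same flow reuses the cached action.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str
    proto: str
    sport: int
    dport: int


def classify(pkt):
    """Hypothetical policy: the expensive rule evaluation, run once per flow."""
    if pkt.proto == "udp" and 16384 <= pkt.dport <= 32767:
        return "expedite"  # assumed port range for voice media
    if pkt.dport == 443:
        return "permit"
    return "drop"


class FlowTable:
    def __init__(self):
        self.flows = {}              # flow key -> cached action
        self.policy_evaluations = 0  # how often the full policy ran

    def forward(self, pkt):
        # In this toy model the five-tuple itself is the flow key.
        action = self.flows.get(pkt)
        if action is None:
            self.policy_evaluations += 1
            action = classify(pkt)
            self.flows[pkt] = action
        return action


table = FlowTable()
voice = Packet("10.0.0.5", "10.1.0.9", "udp", 20000, 20002)
actions = [table.forward(voice) for _ in range(1000)]
print(actions[0], table.policy_evaluations)  # expedite 1
```

A thousand packets go through, but the policy is consulted exactly once. A real device would also age flows out and track connection state, which this sketch leaves out.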

It’s All About Scale

A single drop of water is discrete. Just like a single data packet, it represents an atomic unit of water. Taken in this measurement, a single drop of water does little good. It’s only when those drops start to form together that their usefulness becomes apparent. Think of a river or a firehose. Those groups of droplets have a vector. They can be directed somewhere to accomplish something, like putting out a fire or cutting a channel across the land.

Flows should be the atomic unit that we base our networking decisions upon. Flows don’t require complex processing on a per-unit basis. Flows carry additional information above and beyond a 48-bit hex address or a binary address representing an IP entry. Flows can be manipulated and programmed. They can have policies applied. Flows can scale to great heights. Packets and frames are forever hampered by the behaviors necessary to deliver them to the proper locations.

Data is simultaneously a packet and a flow. We can’t separate the two. What we can do is change our frame of reference for operations. Just like experiments with light, we must choose one aspect of the duality to act until such time as the other aspect is needed. Light can be treated like a wave the majority of the time. It’s only when things like the photoelectric effect happen that our reference must change. In the same way, data should be treated like a flow for the majority of cases. Only when the very basic needs of packet/frame/datagram forwarding are needed should we abandon our flow focus and treat it as a group of discrete packets.

Tom’s Take

The idea of data flows isn’t new. And neither is treating flows as the primary form of forwarding. That’s what OpenFlow has been doing for quite a while now. What makes this exciting is when people with new networking ideas start using the flow as an atomic unit for decisions. When you remove the need to do packet-by-packet forwarding and instead focus on the flow, you gain a huge insight into the world around the packet. It’s not much of a stretch to think that the future of networking isn’t as concerned with the switching of frames or routing of packets. Instead, it’s the forwarding of a flow of packets that will be exciting to watch. As long as you remember that data can be both packet and flow you will have taken your first step into a larger world of understanding.


Cisco Just Killed The CLI


Gallons of virtual ink have been committed to virtual paper in the last few days with regards to Cisco’s lawsuit against Arista Networks.  Some of it is speculating on the posturing by both companies.  Other writers talk about the old market vs. the new market.  Still others look at SDN as a driver.

I didn’t just want to talk about the lawsuit.  Given that Arista has marketed EOS as a “better IOS than IOS” for a while now, I figured Cisco finally decided to bite back.  They are fiercely protective of IOS and they have to be because of the way the trademark laws in the US work.  If you don’t go after people that infringe you lose your standing to do so and invite others to do it as well.  Is Cisco’s timing suspect? One does have to wonder.  Is this about knocking out a competitor? It’s tough to say.  But one thing is sure to me.  Cisco has effectively killed the command line interface (CLI).

“Industry Standards”

EOS is certainly IOS-like.  While it does introduce some unique features (see the NFD3 video here), the command syntax is very much IOS.  That is purposeful.  There are two broad categories of CLIs in the market:

  • IOS-like – EOS, HP Procurve, Brocade, FTOS, etc
  • Not IOS-like – Junos, FortiOS, D-Link OS, etc

What’s funny is that the IOS-like interfaces have always been marketed as such.  Sure, there’s the famous “industry standard” CLI comment, followed by a wink and a nudge.  Everyone knows what OS is being discussed.  It is a plus point for both sides.

The non-Cisco vendors can sell to networking teams by saying that their CLI won’t change.  Everything will be just as easy to configure with just a few minor syntax changes.  Almost like speaking a different dialect of a language.  Cisco gains because more and more engineers become familiar with the IOS syntax.  Down the line, those engineers may choose to buy Cisco based on familiarity with the product.

If you don’t believe that being IOS-like is a strong selling point, take a look at PIX and Airespace.  The old PIX OS was transformed into something that looked a lot more like traditional IOS.  In ASA 8.3 they even changed the NAT code to look like IOS.  With Airespace it took a little longer to transform the alien CLI into something IOS-like.  They even lost functionality in doing so, simply to give networking teams an interface that is more friendly to them.  Cisco wants all their devices to run a CLI that is IOS-like.  Junos fans are probably snickering right now.

In calling out Arista for infringing on the “generic command line interface” in patent #7,047,526, Cisco has effectively said that they will start going after companies that copy the IOS interface too well.  This leaves companies in a bit of a conundrum.  How can you continue to produce an OS with an “industry standard” CLI and hope that you don’t become popular enough to get noticed by Cisco?  Granted, it seems that all network switching vendors are #2 in the market somehow.  But at what point does being a big enough #2 get the legal hammer brought to bear?  Do you have to be snarky in marketing messages? Attack the 800-pound gorilla enough that you anger them?  Or do you just have to have a wildly successful quarter?

Laid To REST

Instead, what will happen is a tough choice.  Either continue to produce the same CLI year after year and hope that you don’t get noticed or overhaul the whole system.  Those that choose not to play Russian Roulette with the legal system have a further choice to make.  Should we create a new, non-infringing CLI from the ground up? Or scrap the whole idea of a CLI moving forward?  Both of those second choices are going to involve a lot of pain and effort.  One of them has a future.

Rewriting the CLI is a dead-end road.  By the time you’ve finished your Herculean task you’ll find the market has moved on to bigger and better things.  The SDN revolution is about making complex networks easier to program and manage.  Is that going to be accomplished via yet another syntax?  Or will it happen because of REST APIs and programming interfaces?  Given an equal amount of time and effort on both sides, the smart networking company will focus their efforts on scrapping the CLI and building programmability into their devices.  Sure, the 1.0 release is going to sting a little.  It’s going to require a controller and some rough interface conventions.  But building the seeds of a programmable system now means it will be growing while other CLIs are withering on the vine.

It won’t be easy.  It won’t be fun.  And it’s a risk to alienate your existing customer base.  But if your options are to get sued or spend all your effort on a project that will eventually go the way of the dodo your options don’t look all that appealing anyway.  If you’re going to have to go through the upheaval of rewriting something from the ground up, why not choose to do it with an eye to the future?

Tom’s Take

Cisco and Arista won’t be finished for a while.  There will probably be a settlement or a licensing agreement or some kind of capitulation on both sides in a few years’ time.  But by that point, the fallout from the legal action will have finally finished off the CLI for good.  There’s no sense in gambling that you won’t be the next target of a process server.  The solution will involve innovative thinking, blood, sweat, and tears on the part of your entire development team.  But in the end you’ll have a modern system that works with the new wave of the network.  If nothing else, you can stop relying on the “industry standard” ploy when selling your interface and start telling your customers that you are setting the new standard.


The Trap of Net Neutrality


The President recently released a video and statement urging the Federal Communications Commission (FCC) to support net neutrality and ensure that there will be no “pay for play” access to websites or punishment for sites that compete against a provider’s interests.  I wholeheartedly support the idea of net neutrality.  However, I do like to stand on my Devil’s Advocate soapbox every once in a while.  Today, I want to show you why a truly neutral Internet may not be in our best interests.

Lawful Neutral

If the FCC mandates a law that the Internet must remain neutral, it will mean that all traffic must be treated equally.  That’s good, right?  It means that a provider can’t slow my Netflix stream or make their own webmail service load faster than Google or Yahoo.  It also means that the provider can’t legally prioritize packets either.

Think about that for a moment.  We, as network and voice engineers, have spent many an hour configuring our networks to be as unfair as possible.  Low-latency queues for voice traffic.  Weighted fair queues for video and critical applications.  Scavenger traffic classes and VLANs for file sharers and other undesirable bulk noise.  These plans take weeks to draw up and even longer to implement properly.  It helps us make sense out of the chaos in the network.

By mandating a truly neutral net, we are saying that those carefully marked packets can’t escape from the local network with their markings intact.  We can’t prioritize voice packets once they escape the edge routers.  And if we move applications to the public cloud, we can’t ensure priority access.  Legally, the providers will be forced to remark all CoS and DSCP values at the edge and wash their hands of the whole thing.
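
To make “remarking at the edge” concrete, here is a small Python sketch of what washing a DSCP value looks like at the bit level. The DSCP occupies the upper six bits of the IP ToS (or IPv6 Traffic Class) byte and the lower two bits carry ECN. The function names are my own illustration and not any vendor’s API.

```python
DSCP_EF = 46           # Expedited Forwarding, commonly used for voice
DSCP_BEST_EFFORT = 0   # what a "neutral" edge would rewrite everything to


def dscp_of(tos):
    """DSCP is the upper six bits of the ToS / Traffic Class byte."""
    return (tos & 0xFF) >> 2


def remark(tos, new_dscp):
    """Rewrite the DSCP bits while leaving the two ECN bits untouched."""
    return ((new_dscp & 0x3F) << 2) | (tos & 0x03)


voice_tos = (DSCP_EF << 2) | 0b10   # EF marking plus an ECN-capable codepoint
washed = remark(voice_tos, DSCP_BEST_EFFORT)

print(hex(voice_tos), dscp_of(voice_tos))  # 0xba 46
print(hex(washed), dscp_of(washed))        # 0x2 0
```

Once the packet leaves the edge with a DSCP of zero, nothing downstream can tell it was ever a voice packet.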

And what about provider MPLS circuits?  If the legally mandated neutral provider is administering your MPLS circuits (as they do in small and medium enterprise), can they copy the DSCP values to the MPLS EXP field before forwarding the packet?  Where does the law stand on prioritizing private traffic transiting a semi-public link?

Chaotic Neutral

The idea of net neutrality is that no provider should have the right to decide how your traffic should be handled.  But providers will extend that idea to say they can’t deal with any kind of marking.  They won’t legally be able to offer you differentiated service even if you were willing to pay for it.  That’s the double-edged sword of neutrality.

You can be sure that the providers will already have found a “solution” to the problem.  Today, quality of service (QoS) only becomes an issue when the link becomes congested.  Packets don’t queue up if there’s bandwidth available to use.  So the provider solution is simple.  If you need differentiated service, you need to buy a bigger pipe.  Overprovision your WAN circuits!  We can’t guarantee delivery unless you have more bandwidth than you need!  Who cares what the packets are marked?  Which, of course, leads to a little gem from everyone’s favorite super villain:
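
The claim that QoS only matters under congestion is easy to see in a toy simulation. The Python sketch below is purely illustrative and every number in it is an assumption: each tick, four bulk packets and one voice packet arrive on a link that can transmit a fixed number of packets per tick. It compares plain FIFO with a strict priority queue for voice.

```python
from collections import deque


def max_voice_wait(capacity, prioritize, ticks=50):
    """Worst queueing delay (in ticks) seen by voice packets on a toy link.

    Each tick, 4 bulk packets arrive followed by 1 voice packet, then the
    link transmits up to `capacity` packets.
    """
    voice, bulk, fifo = deque(), deque(), deque()
    worst = 0
    for now in range(ticks):
        for kind in ("bulk",) * 4 + ("voice",):
            if prioritize:
                (voice if kind == "voice" else bulk).append(now)
            else:
                fifo.append((kind, now))
        for _ in range(capacity):
            if prioritize:
                if voice:  # strict priority: voice always goes first
                    worst = max(worst, now - voice.popleft())
                elif bulk:
                    bulk.popleft()
            elif fifo:
                kind, arrived = fifo.popleft()
                if kind == "voice":
                    worst = max(worst, now - arrived)
    return worst


for capacity in (5, 4):
    print(capacity, max_voice_wait(capacity, False), max_voice_wait(capacity, True))
```

With a capacity of five the link keeps up and voice never waits, with or without prioritization. Drop the capacity to four and the FIFO delay for voice keeps growing while the priority queue still delivers voice with no wait. That is exactly why “buy a bigger pipe” works as an answer, and why it is the expensive one.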


Of course, the increased profits from these services will line the pockets of the providers instead of going to build out the infrastructure necessary to support these overbuilt networks.  The only way to force providers to pony up the money to build out networks is to make it so expensive to fail that the alternative is better.  That requires complex negotiation and penalty-laden, iron-clad service level agreements (SLAs).

The solution to the issue of no prioritized traffic is to provide a list of traffic that should be prioritized.  Critical traffic like VoIP should be allowed to be expedited, as the traffic characteristics and protections we afford it make sense.  Additionally, traffic destined for a public cloud site that functions as internal traffic of a company should be able to be prioritized across the provider network.  Tunneling or other forms of traffic protection may be necessary to ensure this doesn’t interfere with other users.  Exempt traffic should definitely be the exception, not the rule.  And it should never fall on the providers to determine which traffic should be exempted from neutrality rules.

Tom’s Take

Net neutrality is key to the future of society.  The Internet can’t function properly if someone else with a vested interest in profits decides how we consume content.  It’s like the filter bubble of Google.  A blind blanket policy doesn’t do us any good, either.  Everyone involved in networking knows there are types of traffic that can be prioritized without having a detrimental effect.  We need to make smart decisions about net neutrality and know when to make exceptions.  But that power needs to be in the hands of the users and customers.  They will make decisions in their best interest.  The providers should have the capability to implement the needs of their customers.  Only then will the Internet be truly neutral.

Overlay Transport and Delivery


The difference between overlay networks and underlay networks still causes issues with engineers everywhere.  I keep trying to find a visualization that boils everything down to the basics that everyone can understand.  Thanks to the magic of online ordering, I think I’ve finally found it.

Candygram for Mongo

Everyone on the planet has ordered something from Amazon (I hope).  It’s a very easy experience.  You click a few buttons, type in a credit card number, and a few days later a box of awesome shows up on your doorstep.  No fuss, no muss.  Things you want show up with no effort on your part.

Amazon is the world’s most well-known overlay network.  When you place an order, a point-to-point connection is created between you and Amazon.  Your item is tagged for delivery to your location.  It’s addressed properly and finds its way to you almost by magic.  You don’t have to worry about your location.  You can have things shipped to a home, a business, or a hotel lobby halfway across the country.  The magic of an overlay is that the packets are going to get delivered to the right spot no matter what.  You don’t need to worry about the addressing.

That’s not to say there isn’t some issue with the delivery.  With Amazon, you can pay for expedited delivery.  Amazon Prime members can get two-day shipping for a flat fee.  In overlays, your packets can take random paths depending on how the point-to-point connection is built.  You can pay to have a direct path provided the underlay cooperates with your wishes.  But unless a full mesh exists, your packet delivery is going to be at the mercy of the most optimized path.

Mongo Only Pawn In Game Of Life

Amazon only works because of the network of transports that live below it.  When you place an order, your package could be delivered any number of ways.  UPS, FedEx, DHL, and even the US Postal Service can be the final carrier for your package.  It’s all a matter of who can get your package there the fastest and the cheapest.  In many ways, the transport network is the underlay of physical shipping.

Routes are optimized for best forwarding.  So are UPS trucks.  Network conditions matter a lot to both packets and packages.  FedEx trucks stuck in traffic jams at rush hour don’t do much good.  Packets that traverse slow data center interconnects during heavy traffic volumes risk slow packet delivery.  And if the road conditions or cables are substandard?  The whole thing can fall apart in an instant.

Underlays are the foundation that higher order services are built on.  Amazon doesn’t care about roads.  But if their shipping times get drastically increased due to deteriorating roadways you can bet they’re going to get to the bottom of it.  Likewise, overlay networks don’t directly interact with the underlay but if packet delivery is impacted people are going to take a long hard look at what’s going on down below.
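
If it helps to see the box-inside-a-box relationship in code, here is a rough Python sketch. The class names, the ENDPOINT_FOR table, and the addresses are all made up for illustration and don’t correspond to any particular overlay protocol. The overlay packet is wrapped in an outer header, and the underlay only ever reads the outer destination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inner:
    src: str       # overlay addresses, the ones the application sees
    dst: str
    payload: str


@dataclass(frozen=True)
class Outer:
    src: str       # underlay addresses of the tunnel endpoints
    dst: str
    inner: Inner


# Hypothetical mapping of overlay hosts to the tunnel endpoint in front of them.
ENDPOINT_FOR = {"vm-a": "192.0.2.1", "vm-b": "198.51.100.7"}


def encapsulate(pkt):
    """Put the overlay packet in a box addressed to the remote endpoint."""
    return Outer(ENDPOINT_FOR[pkt.src], ENDPOINT_FOR[pkt.dst], pkt)


def underlay_next_hop(pkt, routes):
    """The underlay looks only at the outer destination, never the contents."""
    return routes[pkt.dst]


def decapsulate(pkt):
    """Open the box at the far endpoint and hand over the original packet."""
    return pkt.inner


order = Inner("vm-a", "vm-b", "box of awesome")
boxed = encapsulate(order)
hop = underlay_next_hop(boxed, {"198.51.100.7": "core-2"})
print(boxed.dst, hop, decapsulate(boxed) == order)  # 198.51.100.7 core-2 True
```

The carrier (the underlay) never opens the box, and the customer (the overlay) never sees the truck route. If the route to the outer destination is slow or broken, nothing in the inner packet can fix it.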

Tom’s Take

I love Amazon.  It beats shopping in big box stores and overpaying for things I use frequently.  But I realize that the infrastructure in place to support the behemoth that is Amazon is impressive.  Amazon only works because the transport system in place is optimized to the fullest.  UPS has a computer system that eliminates left turns from driver routes.  This saves fuel even if it means the routes are a bit longer.

Network overlays work the same way.  They have to rely on an optimized underlay or the whole system crashes in on itself.  Instead of worrying about the complexity of introducing an overlay on top of things, we need to work on optimizing the underlay to perform as quickly as possible.  When the underlay is optimized, the whole thing works better.

Is LISP The Answer to Multihoming?


One of the biggest use cases for Locator/Identifier Separation Protocol (LISP) that will benefit small and medium enterprises is the ability to multihome to different service providers without needing to run Border Gateway Protocol (BGP). It’s the answer to a difficult and costly problem. But is it really the best solution?

Current SMB users may find themselves in a situation where they can’t run BGP. Perhaps their upstream ISP blocks the ability to establish a connection. In many cases, business class service is required with additional fees necessary to multihome. In order to take full advantage of independent links to different ISPs, two (or more) NAT configurations are required to send and receive packets correctly across the balanced connections. While technically feasible, it’s a mess to troubleshoot. It also doesn’t scale when multiple egress connections are configured. And more often than not, the configuration to make everything work correctly exists on a single router in the network, eliminating the advantages of multihoming.

LISP seeks to solve this by using a mapping database to send packets to the correct Egress Tunnel Router (ETR) without the need for BGP. The diagram of a LISP packet looks a lot like an overlay. That’s because it is in many ways. The LISP packets are tunneled from an Ingress Tunnel Router (ITR) to a LISP speaking decapsulation point. Depending on the deployment policies of LISP for a given ISP, it could be the next hop router on a connection. It could also be a router several hops upstream. LISP is capable of operating over non-LISP speaking connections, but it does eventually need decapsulation.
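
A rough Python sketch of what the mapping lookup does, under heavy assumptions: the MAPPINGS table, the addresses, and the lookup_rloc() helper are invented for illustration, and none of this resembles the real LISP control-plane messages. An endpoint identifier (EID) is matched against the longest covering prefix, and the reachable routing locator (RLOC) with the lowest priority value is chosen as the tunnel destination.

```python
import ipaddress

# Hypothetical mapping database: EID prefix -> list of (RLOC, priority).
# A multihomed site simply registers one RLOC per upstream provider.
MAPPINGS = {
    "10.1.0.0/16": [("203.0.113.1", 1), ("198.51.100.1", 2)],
    "10.1.5.0/24": [("192.0.2.9", 1)],
}


def lookup_rloc(eid, unreachable=()):
    """Return the preferred RLOC for an EID, or None if nothing matches."""
    addr = ipaddress.ip_address(eid)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in MAPPINGS
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    candidates = [
        (priority, rloc)
        for rloc, priority in MAPPINGS[str(best)]
        if rloc not in unreachable
    ]
    # Lower priority value is preferred in this sketch.
    return min(candidates)[1] if candidates else None


print(lookup_rloc("10.1.9.1"))                                # 203.0.113.1
print(lookup_rloc("10.1.9.1", unreachable={"203.0.113.1"}))   # 198.51.100.1
print(lookup_rloc("10.1.5.20"))                               # 192.0.2.9
```

The second call is the multihoming case: when the primary locator is marked unreachable, the same EID resolves to the backup provider’s locator with no BGP involved. Detecting that the primary is actually down is the hard part, and this sketch waves it away.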

Where’s the Achilles’ heel in this design? LISP may solve the issue without BGP, but it does introduce the need for the LISP session to terminate on a single device (or perhaps a group of devices). This creates issues in the event the link goes down and the backup link needs to be brought online. That tunnel state won’t be preserved across the failover. It’s also a gamble to assume your ISP will support LISP. Many large ISPs should give you options to terminate LISP connections. But what about the smaller ISP that services many SMB companies? Does the local telephone company have the technical ability to configure a LISP connection? Let alone making it redundant and highly available?

Right Tool For The Job

I think back to a lesson my father taught me about tools. He told me, “Son, you can use a screwdriver as a chisel if you try hard enough. But you’re better off spending the money to buy a chisel.” The argument against using BGP to multihome ISP connections has always come down to cost. I’ve gotten into heated discussions with people that always come back to the expense of upgrading to a business-class connection to run BGP or ensure availability. NAT may allow you to multihome across two residential cable modems, but why do you need 99.999% uptime across those two if you’re not willing to pay for it?

LISP solves one issue only to introduce more. I see LISP being misused the same way NAT has been. LISP was proposed by David Meyer to solve the exploding IPv4 routing table and the specter of an out-of-control IPv6 routing table.  While multihoming is certainly another function that it can serve, I don’t think that was Meyer’s original idea.  BGP might not be perfect, but it’s what we’ve got.  We’ve been using it for a while and it seems to get the job done.  LISP isn’t going to replace BGP by a long shot.  All you have to do is look at LISP Alternative Topology (LISP-ALT), which was the first iteration of the mapping database before the current LISP-TREE.  Guess what LISP-ALT used for mapping?  That’s right, BGP.

Tom’s Take

LISP multihoming for IPv4 or IPv6 in SMEs isn’t going to fix the problem we have today with trying to create redundancy from consumer-grade connections.  It is another overlay that will create some complexity and eventually not be adopted because there are still enough people out there that are willing to forgo an interesting idea simply because it came from Cisco.  IPv6 multihoming can be fixed at the protocol level.  Tuning router advertisements or configuring routes at the edge with BGP will get the job done, even if it isn’t as elegant as LISP.  Using the right tool for the right job is the way to make multihoming happen.

CCIE Version 5: Out With The Old

Cisco announced this week that they are upgrading the venerable CCIE certification to version five.  It’s been about three years since Cisco last refreshed the exam and several thousand people have gotten their digits.  However, technology marches on.  Cisco talked to several subject matter experts (SMEs) and decided that some changes were in order.  Here are a few of the ones that I found the most interesting.

CCIEv5 Lab Schedule

Time Is On My Side

The v5 lab exam has two pacing changes that reflect reality a bit better.  The first is the ability to take some extra time on the troubleshooting section.  One of my biggest peeves about the TS section was the hard 2-hour time limit.  One of my failing attempts had me right on the verge of solving an issue when the time limit slammed shut on me.  If I only had five more minutes, I could have solved that problem.  Now, I can take those five minutes.

The TS section has an available 30 minute overflow window that can be used to extend your time.  Be aware that time has to come from somewhere, since the overall exam is still eight hours.  You’re borrowing time from the configuration section.  Be sure you aren’t doing yourself a disservice at the beginning.  In many cases, the candidates know the lab config cold.  It’s the troubleshooting they need a little more time with.  This is a welcome change in my eyes.


The biggest addition is the new 30-minute Diagnostic section.  Rather than focusing on problem solving, this section is more about problem determination.  There’s no CLI.  Only a set of artifacts from a system with a problem: emails, log files, etc.  The idea is that the CCIE candidate should be an expert at figuring out what is wrong, not just how to fix it.  This is more in line with the troubleshooting sections in the Voice and Security labs.  Parsing log files for errors is a much larger part of my time than implementing routing.  Teaching candidates what to look for will prevent problems in the future with newly minted CCIEs that can’t diagnose issues in front of customers.

Some are wondering if the Diagnostic section is going to be the new “weed out” addition, like the Open Ended Questions (OEQs) from v3 and early v4.  I see the Diagnostic section as an attempt to temper the CCIE with more real world needs.  While the exam has never been a test of ideal design, knowing how to fix a non-ideal design when problems occur is important.  Knowing how to find out what’s screwed up is the first step.  It’s high time people learned how to do that.
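The pacing trade-off is easy to put into numbers.  Here is a rough sketch of the time budget using only the figures above: an eight-hour exam, a two-hour TS section with up to 30 minutes of overflow, and a fixed 30-minute Diagnostic section.  It assumes the configuration section simply gets whatever is left over; the function name and constants are mine, not Cisco’s.

```python
# Figures taken from the post: 8-hour exam, 2-hour TS section with an
# optional 30-minute overflow, and a fixed 30-minute Diagnostic section.
# Assumption: the configuration section receives all remaining time.
TOTAL_MIN = 8 * 60
TS_BASE_MIN = 120
TS_OVERFLOW_MAX_MIN = 30
DIAG_MIN = 30


def config_minutes(ts_overflow_used: int) -> int:
    """Minutes left for the configuration section after TS and Diagnostic."""
    if not 0 <= ts_overflow_used <= TS_OVERFLOW_MAX_MIN:
        raise ValueError("TS overflow must be between 0 and 30 minutes")
    return TOTAL_MIN - (TS_BASE_MIN + ts_overflow_used) - DIAG_MIN


for used in (0, 5, 30):
    print(f"TS overflow {used:>2} min -> {config_minutes(used)} min for config")
```

Under those assumptions, those five extra minutes in TS leave 325 minutes for configuration instead of 330.  Burning the whole overflow window leaves an even five hours.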

Be Careful What You Wish For

The CCIE v5 is seeing a lot of technology changes.  The written exam is getting a new section, Network Principles.  This serves to refocus candidates away from Cisco specific solutions and more toward making sure they are experts in networking.  There’s a lot of opportunity to reinforce networking here and not idle trivia about config minimums and maximums.  Let’s hope this pays off.

The content of the written is also being updated.  Cisco is going to make sure candidates know the difference between IOS and IOS XE.  Cisco Express Forwarding is going to get a focus, as is IS-IS (again).  Given that IS-IS is important in TRILL, this could be an indication of where FabricPath development is headed.  The written is also getting more IPv6 topics.  I’ll cover IPv6 in just a bit.

The biggest change in content is the complete removal of frame relay.  It’s been banished to the same pile as ATM and ISDN.  No written, no lab.  In its place, we get Dynamic Multipoint VPN (DMVPN).  I’ve talked about why Frame Relay is on the lab before.  People still complained about it.  Now, you get your wish.  DMVPN with OSPF serves the same purpose as Frame Relay with OSPF.  It’s all about Stupid Router Tricks.  Using OSPF with DMVPN requires use of mGRE, which is a Non-Broadcast Multi-Access (NBMA) network.  Just like Frame Relay.  The fact that almost every guide today recommends you use EIGRP with DMVPN should tell you how hard it is to do.  And now you’re forced to use OSPF to simulate NBMA instead of Frame Relay.  Hope all you candidates are happy now.
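A lot of the Stupid Router Tricks come from how OSPF treats a multi-access segment: it elects a designated router (DR), and every other router on the segment has to form an adjacency with it.  On a hub-and-spoke NBMA cloud where the spokes only have a path to the hub, that only works if the hub wins the election.  Here is a toy sketch of that constraint.  It is not OSPF, just a reachability check, and it assumes a static hub-and-spoke cloud with no dynamic spoke-to-spoke tunnels.  The router names and the `viable_drs` helper are hypothetical.

```python
def viable_drs(reachability: dict[str, set[str]]) -> list[str]:
    """Return routers that can directly reach every other router on the segment."""
    routers = set(reachability)
    return sorted(r for r in routers if routers - {r} <= reachability[r])


# Hypothetical DMVPN cloud: each spoke has a tunnel to the hub only.
hub_and_spoke = {
    "hub": {"spoke1", "spoke2", "spoke3"},
    "spoke1": {"hub"},
    "spoke2": {"hub"},
    "spoke3": {"hub"},
}

print(viable_drs(hub_and_spoke))
```

Only the hub comes back as a viable DR.  That is why you commonly see spokes set to OSPF priority 0, or the tunnel changed to a point-to-multipoint network type so there is no DR election at all.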


The lab is also 100% virtual now.  No physical equipment in either the TS or lab config sections.  This is a big change.  Cisco wants to reduce the amount of equipment that needs to be physically present to build a lab.  They also want to be able to offer the lab in more places than San Jose and RTP.  Now, with everything being software, they could offer the lab at any secured PearsonVUE testing center.  They’ve tried in the past, but the access requirements caused some disaster.  Now, it’s all delivered in a browser window.  This will make remote labs possible.  I can see a huge expansion of the testing sites around the time of the launch.

This also means that hardware-specific questions are out.  Like layer 2 QoS on switches.  The last reason to have a physical switch (WRR and SRR queueing) is gone.  Now, all you are going to get quizzed on is software functionality.  Which probably means the loss of a few easy points.  With the removal of Frame Relay and L2 QoS, I bet that services section of the lab is going to be really fun now.

IPv6 Is Real

Now, for my favorite part.  The JNCIE has had a robust IPv6 section for years.  All routing protocols need to be configured for IPv4 and IPv6.  The CCIE has always had a separate IPv6 section.  Not any more.  Going forward in version 5, all routing tasks will be configured for v4 and v6.  Given that RIPng has been retired to the written exam only (finally), it’s a safe bet that you’re going to love working with OSPFv3 and EIGRP for IPv6.

I think it’s great that Cisco has finally caught up to the reality of the world.  If CCIEs are well versed in IPv6, we should start seeing adoption numbers rise significantly.  Ensuring that engineers know how to configure v4 and v6 simultaneously means dual stack is going to be the preferred transition method.  The only IPv6-related thing that worries me is the inclusion of an item on the written exam: IPv6 Network Address Translation.  You all know I’m a huge fan of NAT.  Especially NAT66, which is what I’ve been told will be the tested knowledge.

Um, why?!? 

You’ve relegated RIPng to the trivia section.  You collapsed multicast into the main routing portions.  You’re moving forward with IPv6 and making it a critical topic on the test.  And now you’re dredging up NAT?!? We don’t NAT IPv6.  Especially to another IPv6 address.  Unique Local Addresses (ULA) are about the only thing I could see using NAT66.  Ed Horley (@EHorley) thinks it’s a bad idea.  Ivan Pepelnjak (@IOSHints) doesn’t think fondly of it either, but admits it may have a use in SMBs.  And you want CCIEs and enterprise network engineers to understand it?  Why not use LISP instead?  Or maybe a better network design for enterprises that doesn’t need NAT66?  Next time you need an IPv6 SME to tell you how bad this idea is, call me.  I’ve got a list of people.
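If you want to see how much of an address plan would even be a candidate for NAT66, spotting the ULA space is simple enough.  This is a minimal sketch using Python’s standard `ipaddress` module and the fc00::/7 range that RFC 4193 sets aside for Unique Local Addresses.  The `is_ula` helper and the sample addresses are just illustrations.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")  # Unique Local Address range (RFC 4193)


def is_ula(address: str) -> bool:
    """True if the address is an IPv6 Unique Local Address."""
    addr = ipaddress.ip_address(address)
    return addr.version == 6 and addr in ULA


for candidate in ("fd12:3456:789a::1", "2001:db8::1", "192.168.1.1"):
    print(candidate, is_ula(candidate))
```

Anything that comes back False is either global IPv6 space or not IPv6 at all, and translating it to another IPv6 address makes even less sense.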

Tom’s Take

I’m glad to see the CCIE update.  Getting rid of Frame Relay and adding more IPv6 is a great thing.  I’m curious to see how the Diagnostic section will play out.  The flexible time for the TS section is way overdue.  The CCIE v5 looks to be pretty solid on paper.  People are going to start complaining about DMVPN.  Or the lack of SDN-related content.  Or the fact that EIGRP is still tested.  But overall, this update should carry the CCIE far enough into the future that we’ll see CCIE 60,000 before it’s refreshed again.

More CCIE v5 Coverage:

Bob McCouch (@BobMcCouch) – Some Thoughts on CCIE R&S v5

Anthony Burke (@Pandom_) – Cisco CCIE v5

Daniel Dib (@DanielDibSWE) – RS v5 – My Thoughts

INE – CCIE R&S Version 5 Updates Now Official

IPExpert – The CCIE Routing and Switching (R&S) 5.0 Lab Is FINALLY Here!