There Won’t Be A CCIE: SDN. Here’s Why

There’s a lot of work that’s been done recently to bring the CCIE up to modern network standards. Yusuf and his team are working hard to incorporate new concepts into the written exam. Candidates are broadening their horizons and picking up new ideas as they learn about industry stalwarts like OSPF and spanning tree. But the biggest challenge out there is incorporating the ideas behind software defined networking (SDN) into the exam. I don’t believe that this will ever happen. Here’s why.

Take This Broken Network

If you look at the CCIE and what it’s really testing, the exam is really about troubleshooting and existing network integration. The CCIE introduces and tests on concepts like link aggregation, routing protocol redistribution, and network service implementation. These are things that professionals are expected to do when they walk in the door, either as a consultant or as someone advising on the incorporation of a new network.

The CCIE doesn’t deal with the design of a network from the ground up. It doesn’t task someone with coming up with the implementation of a greenfield network from scratch. The CCIE exam, especially the lab component, only tests a candidate on their ability to work on something that already exists. That’s been one of the biggest criticisms of the CCIE for a very long time. Since the knowledge level of a CCIE is at the highest level, they are often drafted to design networks rather than implement them.

That’s the reason why the CCDE was created. CCDEs create networks from nothing. Their coursework focuses on taking requirements and making a network out of them. That’s why their practical exam focuses less on command lines and more on product knowledge and implementation details. The CCDE is where people that build networks prove they know their trade.

The Road You Must Design For

When you look at the concepts behind SDN, it’s not really built for troubleshooting and implementation without thought. Yes, automation does help implementation. Orchestration helps new devices configure themselves on the fly. API access allows us to pull all kinds of useful information out of the network for the purposes of troubleshooting and management. But each and every one of these things is not in the domain of the CCIE.

Can SDN solve the thorny issues behind redistributing EIGRP into OSPF? How about creating Multiple Spanning Tree instances for odd numbered VLANs? Will SDN finally help me figure out how to implement Frame Relay Traffic Shaping without screwing up the QoS policies? The answer to almost every one of these questions is no.

SDN’s major advantages can only be realized with forethought and guidelines. Orchestration and automation make sense when implemented in pods or with new greenfield deployments. Once they have been tested and proven, these concepts can be spread across the entire network and used to ease design woes.

Does it make more sense to start using Ansible and Jinja at the beginning? Or halfway through a deployment? Would you prefer to create Python scripts to poll against APIs after you’ve implemented a different network monitoring system (NMS)? Or would it make more sense to do it right from the start?
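To make the templating idea concrete, here is a minimal sketch of rendering per-port configuration from a single template. It uses Python’s standard-library string.Template as a stand-in for Jinja, and the template text, port names, and VLAN numbers are all hypothetical. The point is only that a template decided on at design time lets every new pod be stamped out the same way.

```python
from string import Template

# Hypothetical interface template; a real deployment would more likely use Jinja2.
INTERFACE_TEMPLATE = Template(
    "interface $name\n"
    " description $description\n"
    " switchport access vlan $vlan\n"
)


def render_interfaces(ports):
    """Render one config stanza per port dictionary."""
    return "".join(INTERFACE_TEMPLATE.substitute(port) for port in ports)


# Hypothetical port data for a single greenfield pod.
POD_PORTS = [
    {"name": "GigabitEthernet1/0/1", "description": "pod1-server-a", "vlan": 10},
    {"name": "GigabitEthernet1/0/2", "description": "pod1-server-b", "vlan": 20},
]

if __name__ == "__main__":
    print(render_interfaces(POD_PORTS))
```

Retrofitting this halfway through a deployment means reconciling the template with whatever was typed by hand before it existed, which is the cost the questions above are hinting at.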

CCIEs may see SDN in practice as they start using things like APIC-EM to roll out policies in the network, but CCDEs are the real SDN gatekeepers. They alone can make the decisions to incorporate these ideas from the very beginning to leverage capabilities to ease deployment and make troubleshooting easier. Even though CCIEs won’t see SDN, they will reap the benefits from it being baked into everything they do.


Tom’s Take

Rather than asking when the CCIE is going to get SDN-ified, a better question would be “Should the CCIE worry?” The answer, as explained above, is no. SDN isn’t something that a CCIE needs to study for. CCDEs, on the other hand, will be hugely impacted by SDN and it will make a big difference to them in the long run. Rather than forcing CCIEs into a niche role that they aren’t necessarily suited for, we should instead let them do what they do best. We should incorporate SDN concepts into the CCDE so that CCDEs can make the network a better place for CCIEs. Everyone will be better off in the long run.

Building Reliability

Systems are inherently reliable. Until they aren’t. On a long enough timeline, even the most reliable system will eventually fail. How you manage that failure says a lot about the way you build your system or application. So, why is it then that we’re so focused on failing?

Ten Feet Tall And Bulletproof

No system is infallible. Networks go down. Cloud services get knocked offline. Even Facebook, which represents “the Internet” for a large number of people, has days when it’s unreachable. When we examine these outages, we often find issues at the core of the system that cause services to be unreachable. In the most recent case of Amazon’s cloud system, it was a typo in a script that executed faster than it could be stopped.

It could also be a failure of the system to anticipate increased loads when minor failures happen. If systems aren’t built to take on additional load when the worst happens, you’re going to see bigger outages. That is a particular thorn in the side of large cloud providers like Amazon and Google. It’s also something that network architects need to be aware of when building redundant pathways to handle problems.

Take, for example, a recent demo during Aruba Atmosphere 2017. During the Day 2 keynote, CTO Partha Narasimhan wowed the crowd in the room when he disclosed that they had been doing a controller upgrade during the morning talk. Users had been tweeting, surfing, and using the Internet without much notice from anyone aside from the most technical wireless minds in the room. Even they could only see some strange AP roaming behavior as an indicator of the controllers upgrading the APs.

Aruba showed that they built a resilient network that could survive a simulated major outage caused by a rolling upgrade. They’ve done everything they can to ensure uptime no matter what happens. But the bigger question for architects and engineers is “why are we solving the problem for others?”

Why Dodge Bullets When You Don’t Have To?

As amazing as it is to build a system that can survive production upgrades with no impact on users, what are we really building when we create these networks? Are we encouraging our users to respect our technology advantage in the network or other systems? Are we telling our application developers that they can count on us to keep the lights on when anything goes wrong? Or are we instead sending the message that we will keep scrambling to prevent issues in applications from being noticeable?

Building a resilient network is easy. Making something reliable isn’t rocket science. But creating a network that is going to stay up for a long, long time without any outages is very expensive and process intensive. Engineering something to never be down requires layers of exception handling and backup systems that are as reliable as their primary counterparts.

A favorite story from the storage world involves recovery. When you initially ask a customer what their recovery point objective (RPO) is in a system, the answer is almost always “zero” or “as low as you can make it”. When the numbers are put together to include redundant or dual-active systems with replication and data assurance, the price tag of the solution is usually enough to start a new round of discussion along the lines of “how reliable can you make it for this budget?”

In the networking and systems world, we don’t have the luxury of sticker shock when it comes to creating reliability. Storage systems can have longer RPOs because lost data is gone forever. Taking the time to ensure it is properly recovered is important. But data in transmission can be retransmitted. That’s at the heart of TCP. So it’s expected that networks have near-instantaneous RPOs for no extra costs. If you don’t believe that, ask yourself what happens when you tell someone the network went down because there’s only one router or switch connecting devices together.

Instead of making systems ultra-reliable and absolving users and developers from thought, networks and systems should be as reliable as they can be made without huge additional costs. That reliability should be stated emphatically without wiggle room. These constraints should inform developers writing code so that exception handling can be built in to prevent issues when the inevitable outage occurs. Knowing your limitations is the first step to creating an atmosphere to overcome them.
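As an illustration of what built-in exception handling can look like on the application side, here is a minimal sketch of retrying an operation with exponential backoff. The exception class, function names, and delay values are all hypothetical, and the flaky operation is a stand-in for a real network call.

```python
import time


class TransientNetworkError(Exception):
    """Hypothetical error raised when the network is briefly unavailable."""


def call_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation, retrying on transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == attempts - 1:
                raise  # Out of retries: surface the failure to the caller.
            sleep(base_delay * (2 ** attempt))


def make_flaky_operation(failures):
    """Build a stand-in operation that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def operation():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientNetworkError("simulated outage")
        return "ok"

    return operation


if __name__ == "__main__":
    print(call_with_retry(make_flaky_operation(2), base_delay=0.01))
```

The retry budget is where a stated reliability figure becomes useful: if the network team says an outage may last a few seconds, the developer can size the attempts and delays to ride through it rather than assuming the network never blinks.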

A lesson comes from the programmers of old. When you have a limited amount of RAM, storage, or compute cycles, you can write very tight code. DOS programs didn’t need access to a cloud worth of compute. Mainframes could execute programs written on punch cards. The limitations were simple and could be overcome with proper problem solving. As compute and memory resources have exploded, so too have code bases. Rather than giving developers the limitless capabilities of the cloud without restraint, perhaps creating some limits is the proper way to ensure that reliability stays in the app instead of being bolted on to the network.


Tom’s Take

We had a lot of fun recording this roundtable. We talked about Aruba’s controller upgrade and building reliable wireless networks. But I think we also need to make sure we’re aware that continually creating protocols and other constructs in the underlay won’t solve application programming problems. Things like vMotion set networking and application development back a decade. Giving developers a magic solution to avoid building proper exception handling doesn’t make better developers. Instead, it puts the burden of uptime back on the networking team. And we would rather build the best network we can instead of building something that can solve every problem that could ever possibly be created.

Can I Question You An Ask?

Words mean things! — Justin Warren (@JPWarren)

 

As a reader of my blog, you know that words are my tradecraft. Picking the right word to describe a topic or a technical idea is very important. Using incorrect grammar can cause misunderstandings and lead to issues later on. You’re probably all familiar with my dissection of the Premise vs. Premises issue in IT, but today’s post is all about interrogatives.

A Question, You Say?

One would think that the basic question is something that doesn’t need to be explained. It is one of the four basic types of sentences that we learn in grade school. It’s the easiest one of the bunch to pick out because it ends in a question mark. Other languages, like Japanese, have similar signals for making a statement into an interrogative declaration.

Asking a question is important because it allows us to understand our world. We learn when we ask questions. We grow as people and as professionals. Kids learn to question everything around them at an early age to figure out how the world works. Questions are a cornerstone of society.

However, how do you come up with a question? In what manner do you call for an answer to an interrogative statement? How do you make a request? Or seek information? How do we know how to relay a question to someone at all?

We. Ask.

Note that ask is a verb. It can be transitive or intransitive. It’s something that we do so transparently that it never even crosses our minds. We ask for directions. We ask for help. We ask for a lunch suggestion. But every time we do, we are using the word to perform an action. Until we aren’t.

Ask Not

A trend in IT that dates all the way back to at least 2004 is the use of ask as a noun. Note that this would take the form of the following:

What’s the ask here?

That’s a mighty big ask of the engineering department.

I’m still looking for the ask here.

Even though this practice has roots that stretch back even further, the primary use of ask as a noun is in the IT space. The same group that thinks on-premise refers to a location believes that asks are really questions or requests. Are they using it in the same way that they shorten premises by one syllable? Do they need to save time by using a one-syllable word in place of a two-syllable one?

Raymond Chen’s article linked above does have a bit of insight from even a decade ago. The idea behind using ask as a noun really comes from trying to wrap a demand in a more palatable coating. Think back to the number of times that someone has an ask and substitute the word request or demand and see if it is really appropriate there. Odds are good that it fits seamlessly.

If we go back to the idea that words still mean things and that precision is the key to saving time instead of shortening words, then why are we using ask instead of the other words? Is it, as Raymond says, because the speaker is trying to be passive-aggressive? Are they trying to avoid using a better, more inflammatory word? Or do they truly believe that using ask is a better way to convey things? Maybe they just hope it makes them sound cool and futuristic?


Tom’s Take

Hearing ask as a noun makes my ears crawl. Do we question asks? Or do we ask questions? Do we make requests? Or do we request makes? Despite the fact that the use of ask as a noun comes back time after time from history, it quickly goes away as being awkward and non-specific in conversation. I think it’s due time for this generation of ask as a noun to disappear and be relegated to the less important questions of history.

Networking Grows To Invisibility

Networking is done. The way you have done things before is finished. The writing has been on the wall for quite a while now. But it’s going to be a good thing.

The Old Standard

Networking purchase models look much different today than they have in the past. Enterprises no longer buy a switch or a router. Instead, they buy solution packages. The minimum purchase unit is a networking pod or rack. Perhaps your proof-of-concept minimum is a leaf-spine of no less than 3 switches. Firewalls are purchased in pairs. Nowhere in networking is something simple any longer.

With the advent of software, even the deployment of these devices is different. Automation and orchestration systems provide provisioning as the devices are brought online. Network Monitoring Systems ensure the devices are operating correctly via API call instead of relying on SNMP. Analytics and telemetry systems can pull statistics on the fly and create datasets that give you insight into all manner of network traffic. The intelligence built into the platform supporting the hardware is more apparent than ever before.
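As a rough illustration of API-driven monitoring, here is a minimal sketch that checks interface health from a JSON payload. The payload shape, field names, and error threshold are all hypothetical; a real controller or device API will have its own schema, and the sample string stands in for the body of an actual API response.

```python
import json

# Hypothetical JSON body, standing in for what a device or controller API might return.
SAMPLE_RESPONSE = """
{
  "hostname": "leaf-01",
  "interfaces": [
    {"name": "Ethernet1", "oper_status": "up", "in_errors": 0},
    {"name": "Ethernet2", "oper_status": "down", "in_errors": 0},
    {"name": "Ethernet3", "oper_status": "up", "in_errors": 412}
  ]
}
"""


def unhealthy_interfaces(payload, max_errors=100):
    """Return names of interfaces that are down or exceed the error threshold."""
    data = json.loads(payload)
    return [
        intf["name"]
        for intf in data["interfaces"]
        if intf["oper_status"] != "up" or intf["in_errors"] > max_errors
    ]


if __name__ == "__main__":
    print(unhealthy_interfaces(SAMPLE_RESPONSE))
```

Structured data like this is what lets analytics and telemetry systems build datasets on the fly instead of parsing counters one poll at a time.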

Networking is no longer about fast connectivity speed. Instead, networking is about stability. Providing a transport network that stays healthy instead of growing by leaps and bounds every few years. Organizations looking to model their IT departments after service providers and cloud providers care more about having a reliable system than the most cutting edge technology.

This is nothing new in IT. Both storage and virtualization have moved in this direction for a while. Hardware wizardry has been replaced by software intelligence. Custom hardware is now merchant-based and easy to replace and build. The expertise in deployment and operations has more to do with integration and architecture than in simple day-to-day setup.

The New Normal

Where does that leave networkers? Are we a dying breed, soon to join the Unix admins of the world and telco experts on a beach in retirement? The reality is that things aren’t as dire for us as one might believe.

It is true that we have shifted our thinking away from operations and more toward system building. Rather than worry if the switch ports have been provisioned, we instead look at creating resilient constructs that can survive outages and traffic spikes. Networks are becoming the utility service we’ve always hoped they would be.

This is not the end. It’s the beginning. As networks join storage and compute as utilities in the data center, the responsibilities for our sphere of wizardry are significantly reduced. Rather than spending our time solving crazy user or developer problems, we can instead focus on the key points of stability and availability.

This is going to be a huge shift for the consumers of IT as well. As cloud models have already shown us, people really want to get their IT on their schedules. They want to “buy” storage and networking when it’s needed without interruption. Creating a utility resource is the best way to accomplish that. No longer will the blame for delays be laid at the feet of IT.

But at the same time, the safety net of IT will be gone as well. Unlike Chief Engineer Scott, IT can’t save the day when a developer needs to solve a problem outside of their development environment. Things like First Hop Redundancy Protocols (FHRP), multipathing, and even vMotion contribute to bad developer behavior. Without these being available in a utility IT setup, application writers are going to have to solve their own problems with their own tools. While the network team will end up being leaner and smarter, it’s going to make everything run much more smoothly.


Tom’s Take

I live for the day when networking is no different than the electrical grid. I would rather have a “dumb” network that provides connectivity rather than hoping against hope that my “smart” network has all the tricks it needs to solve everyone’s problem. When the simplicity of the network is the feature and we don’t solve problems outside the application stack, stability and reliability will rule the day.

The Rising Tide of CCIE Written Costs

In CCIE news this week, Cisco has raised the price of their exams across the board. The CCNA has moved up to $325, and the CCIE Written moves from $400 to $450. It goes without saying that there is quite a bit of outcry in the community. Why is the price of the CCIE Written exam surging so high?

No Such Thing As A Free Test

The most obvious answer is that the amount of work going in to development of the exam has increased. The number of people working behind the scenes to create a better exam has caused the amount of outlay to go up, hence the need to recover those costs. This is the simplest explanation of all the cost increases.

As Cisco pours more and more technology into the tests, the number of hands and fingers touching them has gone down. At the same time, the quality of the eyeballs that do look at the exam has gone up. It’s a lot like going to a specialist doctor. The quality of the care you receive for your condition is high, but the costs associated with that doctor are higher than a regular general practice doctor. Cisco’s headcount is now focused on keeping exam quality high. That kind of expertise is always more expensive per capita, even if the number of those people is fewer.

The odd thing here is that even if the costs of the people doing the work are going up, the amount that the test is increasing doesn’t seem to correlate. It’s been less than two years since the formal introduction of the current version of the CCIE written exam at the then-unheard-of price point of $400. We’re two and a half years removed from the CCIE 4.0 Written exam and its lofty $350 price point. Has the technology changed so much in less than three years?

The Great Barrier Test

Going back to the introduction of the 5.0 version of the CCIE Written, there was also a retake policy change introduced. Cisco wanted to create a “backoff timer” to reduce the number of times that a person could take the exam before needing to wait. The change still allowed you to take the second attempt after 30 days, but then the third attempt must wait an additional 90 days after that. So, instead of being able to get three exam attempts in 60 days, those same three attempts would have taken 120 days.
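The retake arithmetic can be checked with a short sketch. The 30-day and 90-day waits come from the policy described above; the assumption that the earlier policy used a flat 30-day wait before every retake is inferred from the “three attempts in 60 days” figure, and the function and constant names are made up for illustration.

```python
from itertools import accumulate


def attempt_days(waits):
    """Return the day number of each attempt, given the waits between attempts."""
    return list(accumulate([0] + list(waits)))


# Waiting periods in days between consecutive attempts.
OLD_POLICY_WAITS = [30, 30]  # assumed: 30 days before every retake
NEW_POLICY_WAITS = [30, 90]  # 30 days, then an additional 90 days

if __name__ == "__main__":
    print(attempt_days(OLD_POLICY_WAITS))
    print(attempt_days(NEW_POLICY_WAITS))
```

Counting the first attempt as day 0, the third attempt lands on day 60 under the assumed old policy and on day 120 under the backoff timer.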

This change was rolled back about six months ago due to outcry from the community. CCIEs trying to recertify were stymied by the exam and forced to wait longer and longer to pass it, with their certification hanging in the balance. With the increased timeouts and limit of four retakes per year, some long time CCIEs were in danger of exhausting their attempts and watching their certification slide away without any recourse to fix it.

Now, the increased price behind the CCIE Written could indeed be attributed to the increased overhead. But it could also be an attempt to keep people from rushing in to take the test every 30 days. Making a policy change to keep people out of the exam is one way to do it. But making the exam financially painful to continually fail is another. If you’re willing to drop $1350 in three months to try and pass then you either have money to burn or you’re desperate to pass.

In addition, a higher exam fee would cause test takers to be absolutely certain of their knowledge level before attempting the exam. Creating an initial barrier to entry that will make people think twice before scheduling an exam on a whim does create a situation where the first-time pass rate will improve significantly. This will also help drive funding to certification materials and classes, as candidates will want to know that they will pass before stepping into a certification exam center.


Tom’s Take

I’d really like to think that Cisco is just trying to cover their overhead with the recent price increases. Everything goes up in price. Some things go up faster than others. But the conspiracy theorist in me wonders if Cisco isn’t trying to use the increased price of the exam to help raise the pass rates and discourage folks from rushing the test repeatedly to see the exam question pool. $450 is a tough pill to swallow even if you pass. I think we’re going to see a lot more people taking advantage of the free Cisco Live exam as well as the half price cert exams there. And I sincerely hope the rumored options for recertification take flight soon. Because I don’t know how ready I am to go all out to study when there’s that much money on the line.

Intel’s Ticking Atom Bomb

It started somewhat innocently. Cisco released a field notice that there was an issue with some signal clocks on a range of their networking devices. This by itself was a huge issue. There had been rumblings about this issue for a few months. Some proactive replacement of affected devices to test things. Followed by panicked customer visits when the news broke on February 2nd. Cisco looked like they were about to get a black eye.

The big question that arose was whether or not this issue was specific to Cisco devices or if it was an issue that was much bigger. Some investigative work from enterprising folks like Tony Mattke (@tonhe) found that there was a spec document from Intel that listed a specific issue with the Intel Atom C2000 System on Chip (SoC) that caused it to fail to provide clock signal for onboard chips. The more digging that was done, the more dire this issue turned out to be.

Tick, Tick, Tick

Clock signaling is very important in modern electronics. It ensures that all the chips on the board are using the correct timing to process electronic impulses. If the clock signal starts drifting, you start getting “glitches” in the system. Those glitches are unpredictable results in the outputs. The further out of phase the clock drifts, the more unpredictable the results in the output. Clock signals are set up to keep things on task.

In this case, when the Intel Atom C2000 is installed in a system, it’s providing the clock signal for the Low Pin Count bus. This is the bus that tends to contain simple connections, like serial ports or console ports. But, one of the other things that often connects to the LPC bus is the boot ROM for a system. If the C2000 dies, it denies access to the boot ROM of the device. That’s why the problem is usually not apparent until the device reboots. The boot ROM wouldn’t be accessed otherwise.

The other problem here is that the issue doesn’t have any telltale markers. It just happens one day. When the C2000 dies, it’s gone for good. Intel is working on a way to try and fix the chips already in the wild, but even they are admitting that the fix is only temporary and the real solution is getting new chips and boards in place. Cisco has already embarked on a replacement program, as have the numerous other manufacturers that have used the C2000 chips.

Laying Landmines

This does cast some shade on the future of merchant silicon usage in devices. Back when this appeared to be a simple issue with Cisco devices only, people were irritated at Cisco’s insistence that custom fabricated silicon was the way to go. But now that the real culprit appears to be Intel, it should give switch vendors pause about standardizing on a specific platform.

What if this issue had been present in Broadcom Trident or Tomahawk chips? What if a component of OCP or Wedge had a hidden fault? The larger the install base of a particular chip, the more impactful the outage could be. Imagine a huge Internet of Things (IoT) deployment that relies on a weak link in the chain that can fail at any point and brick the device in question. The recall and replacement costs would be astronomical. Even for Cisco in this case, replacing the subset of affected devices is going to be very costly.


Tom’s Take

Reliance on single components for these kinds of applications is a huge risk, but it’s also good business. Intel provides the C2000 at low cost because it’s designed to be widely deployed. Just like any of the other components you’d find at a Fry’s or old Radio Shack, they are mass produced to serve a purpose. As we move more toward merchant silicon and whitebox as the starting point for delivering value in switching, we have to realize that we need to stay on top of the components themselves instead of just taking the hardware for granted. Because one little glitch here and there can lead to a lot of trouble down the road. And the clock is always ticking on things like this.

Connecting SMBs The Easy Way With Aerohive Connect

Wireless is hard. When you’re putting together large deployments of access points in challenging environments with tons of security on top of it all, you realize the difficulty. That’s why most major wireless deployments require a lot of time, planning, and documentation to pull off correctly. But what if things are on the small side?

A Small World Without Wires

The average small business (SMB) is stuck in a wireless limbo. They have requirements that far exceed the performance profile of standard consumer wireless devices. Most SMBs have more than three or four devices connecting at a time. They have reliability issues that need to be dealt with. And they need it all in a package that doesn’t need constant minding to work appropriately.

When you look at the market for consumer wireless today, the real push is to get rid of any configuration at all. Even the old Apple Airport, which was simplistic in its day, is too “complicated” for modern users. Solutions like Google Wifi aim to be the kind of solution that just requires a cable plugged in. No additional configuration beyond that. Which works wonders if you’re a consumer at home that needs to enable some tablets and a smart TV. But for businesses, there needs to be a level of control above that.

At the same time, wireless solutions for SMBs need to offer a limited choice of options. When you give someone a huge list of choices with no real direction on how to use them, you get something I’ve started calling Freestyle Syndrome, after the infamous Coke Freestyle machines. Too many choices cause indecision. Even Coke has finally figured this out by creating guides on the first page of the machine to guide people to Low Calorie options or Fruit Flavored drinks. They realize that the best way to give people tons of choices is to artificially limit those choices in such a way as to give the average user more direction on how to use them.

Buzzing With Opportunity

Enter the newest offering from Aerohive. Announced yesterday, Aerohive has a new 2×2:2 AP on the market, the AP 122. They are combining this new AP with a unique software offering, Aerohive Connect. Aerohive Connect solves the above issues by providing enhanced capabilities for SMBs without overwhelming them with pointless options.

Aerohive Connect is a version of the HiveManager software that is optimized to deliver the features that most SMBs need. Included is basic RF planning to find the best place to put your APs, guided deployment and configuration to ensure that you set those APs up correctly, and health monitoring to make sure they are working correctly into the future. You also get features to help create guest access networks to keep your traffic segmented between employees and customers.

What you don’t get with Aerohive Connect is some of the more advanced features of deploying multiple branch sites, advanced security profiles, and other advanced enterprise features of HiveManager. That’s how Aerohive is able to provide these features at a lower price point to stay attractive for SMBs.

Another thing that you won’t see from Aerohive is something common to other solutions like this. Instead of pushing you into “upgrading” to a full-featured version of the software by limiting the number of APs that can be connected, Aerohive Connect does not have a limit on the number of connected APs. You can use it with 2 APs or 25 APs with no limits. If the basic feature set is all you ever need, that’s all you’ll ever pay for. There’s no hidden uplift to recover costs, which essentially turns the SMB solution into an extended trial.


Tom’s Take

As far as solutions for SMBs go, I think Aerohive is on track with Aerohive Connect. They are giving a reduced feature offering that’s perfect for the target market with none of the traditional “gotchas” that I see from other solutions that are simply trying to upsell users into a more expensive and more useless solution. Rather than trying to get the mom-and-pop convenience store chain on a full-blown enterprise wireless control system, why not target them with the best solution for them rather than a one-size-fits-all-but-not-really offering?

I think Aerohive is going to get a lot of traction with Aerohive Connect in the market. I will be curious to get an update from them in the coming months to see just how popular things have become.

Sorting Through SD-WAN

SD-WAN has finally arrived. We’re no longer talking about it in terms of whether or not it is a thing that’s going to happen, but a thing that will happen provided the budgets are right. But while the concept of SD-WAN is certain, one must start to wonder about what’s going to happen to the providers of SD-WAN services.

Any Which Way You Can

I’ve written a lot about SDN and SD-WAN. SD-WAN is the best example of how SDN should be marketed to people. Instead of talking about features like APIs, orchestration, and programmability, you need to focus on the right hook. Do you sell a food processor by talking about how many attachments it has? Or do you sell a Swiss Army knife by talking about all the crazy screwdrivers it holds? Or do you simply boil it down to “This thing makes your life easier”?

The most successful companies have made the “easier” pitch the way forward. Throwing a kitchen sink at people doesn’t make them buy a whole kitchen. But showing them how easy and automated you can make installation and management will sell boxes by the truckload. You have to appeal to the opposite of the problem that SD-WAN was created to solve. WANs are hard, SD-WANs make them easy.

But that only works if your SD-WAN solution is easy in the first place. The biggest, most obvious target is Cisco IWAN. I will be the first to argue that the reason that Cisco hasn’t captured the SD-WAN market is because IWAN isn’t SD-WAN. It’s a series of existing technologies that were brought together to try and make an SD-WAN competitor. IWAN has all the technical credibility of a laboratory full of parts of amazing machines. What it lacks is any kind of ability to tie all that together easily.

IWAN is a moving target. Which platform should I use? Do I need this software to make it run correctly? How do I do zero-touch deployments? Or traffic control? How do I plug a 4G/LTE modem into the router? The answer to each of these questions involves typing commands or buying additional software features. That’s not the way to attack the complexity of WANs. In fact, it feeds into that complexity even more.

Cisco needs to look at a true SD-WAN technology. That likely means acquisition. Sure, it’s going to be a huge pain to integrate an acquisition with other components like APIC-EM, but given the lead that other competitors have right now, it’s time for Cisco to come up with a solution that knocks the socks off their longtime customers. Or face the very real possibility of not having longtime customers any longer.

Every Which Way But Loose

The first-generation providers of SD-WAN bounced onto the scene to pick up the pieces from IWAN. Names like Viptela, VeloCloud, CloudGenix, Versa Networks, and more. But, aside from all managing to build roughly the same platform with very similar features, they’ve hit a mighty big wall. They need to start making money in order for these gambles to pay off. Some have customers. Others are managing the migration into other services, like catering their offerings toward service providers. Still others are ripe acquisition targets for companies that lack an SD-WAN strategy, like HPE or Dell. I expect to see some fallout from the first-generation providers consolidating this year.

The second generation providers, like Riverbed and Silver Peak, all have something in common. They are building on a business they’ve already proven. It’s no coincidence that both Riverbed and Silver Peak are the most well-known names in WAN optimization. How well known? Even major Cisco partners will argue that they sell these two “best of breed” offerings over Cisco’s own WAAS solution. Riverbed and Silver Peak have a definite advantage because they have a lot of existing customers that rely on WAN optimization. That market alone is going to net them a significant number of customers over the next few years. They can easily sell SD-WAN as the perfect addition to make WAN optimization even easier.

The third category of SD-WAN providers is the latecomers. I still can’t believe it, but I’ve been reading about providers that aren’t traditional companies trying to get into the space. Talk about being the ninth horse in an eight-horse race. Honestly, at this point you’re better off plowing your investment money into something else, like Internet of Things or Virtual Reality. There’s precious little room among the existing first-generation providers and the second-generation stalwarts. At best, all you can hope for is a quick exit. At worst, your “novel” technology will be snapped up for pennies after you’re bankrupt and liquidating everything but the standing desks.


Tom’s Take

Why am I excited about the arrival of SD-WAN? Because now I can finally stop talking about it! In all seriousness, when the boardroom starts talking about something, that means it’s past the point of being a hobby project and has become a real debate. SD-WAN is going to change one of the most irritating aspects of networking technology for us. I can remember trying to study for my CCNP and cramming all the DSL and T1 knowledge a person could fit into my head. Now, it’s all point-and-click and done. IPSec VPNs, traffic analytics, and application identification are so easy it’s scary. That’s the power of SD-WAN to me. Easy to use and easy to extend. I think that the landscape of providers of SD-WAN technologies is going to look vastly different by the end of 2017. But SD-WAN is going to be here for the long haul.

Two Takes On ASIC Design

Making ASICs is a tough task. We learned this last year at Cisco Live Berlin from this conversation with Dave Zacks:

Cisco spent 6 years building the UADP ASIC that powers their next generation switches. They solved a lot of the issues with ASIC design and re-spins by creating some programmability in the development process.

Now, watch this video from Nick McKeown at Barefoot Networks:

Nick says many of the same things that Dave said in his video. But Nick and Barefoot took a totally different approach from Cisco. Instead of creating programmable elements in the ASIC design, they abstracted the entire language of function definition from the ASIC. By using P4 as the high-level language and making the system compile the instruction sets down to run in the ASIC, they reduced the complexity, increased the speed, and managed to make the system flexible and capable of implementing new technologies even after the ASIC design is set in stone.

Oh, and they managed to do it in 3 years.

Sometimes, you have to think outside the box in order to come up with some new ideas. Even if that means you have to pull everything out of the box. By abstracting the language from the ASIC, Barefoot not only managed to find a way to increase performance but also to add feature sets to the switch quickly without huge engineering costs.
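If the idea of pulling the language out of the ASIC feels abstract, here is a rough sketch of the general concept in Python. To be clear, this is not P4 and it says nothing about how Barefoot’s compiler or silicon actually work. The class names (Table, Pipeline), the actions (set_port, drop), and the header fields (dst_mac, tenant) are all made up for illustration. The only point is that the engine stays fixed while the forwarding behavior is described separately and loaded into it.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration: a packet is just a dict of header fields.
Packet = Dict[str, object]
Action = Callable[[Packet], None]


def set_port(port: int) -> Action:
    """Build an action that forwards the packet out of a given port."""
    def action(pkt: Packet) -> None:
        pkt["out_port"] = port
    return action


def drop(pkt: Packet) -> None:
    """Action that marks the packet as dropped."""
    pkt["out_port"] = None
    pkt["dropped"] = True


class Table:
    """One match-action stage: exact match on a single header field."""

    def __init__(self, key_field: str, default: Optional[Action] = None) -> None:
        self.key_field = key_field
        self.entries: Dict[object, Action] = {}
        self.default = default

    def add_entry(self, value: object, action: Action) -> None:
        self.entries[value] = action

    def apply(self, pkt: Packet) -> None:
        # Look up the field value; fall back to the default action if any.
        action = self.entries.get(pkt.get(self.key_field), self.default)
        if action is not None:
            action(pkt)


class Pipeline:
    """Stand-in for the fixed engine: it only knows how to run tables in order."""

    def __init__(self) -> None:
        self.stages: List[Table] = []

    def load(self, program: List[Table]) -> None:
        # "Reprogramming" swaps the table list; the engine code never changes.
        self.stages = list(program)

    def process(self, pkt: Packet) -> Packet:
        for table in self.stages:
            if pkt.get("dropped"):
                break
            table.apply(pkt)
        return pkt


# Day one: plain L2 forwarding.
l2 = Table("dst_mac", default=drop)
l2.add_entry("aa:aa", set_port(1))
switch = Pipeline()
switch.load([l2])

# Later: a new feature (a filter on a made-up "tenant" field) is added
# by loading a new description, not by redesigning the engine.
acl = Table("tenant")
acl.add_entry("blocked", drop)
switch.load([acl, l2])
```

Notice that adding the filter never touched Pipeline. That is the flavor of the argument in the video: describe behavior in a language above the hardware, and new features become a matter of loading a new description rather than respinning the design.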

Some food for thought.

Culling The Community

By now, you may have seen some bit of drama in the VMUG community around the apparent policy change that disqualified some VMUG leaders based on their employer. Eric Shanks (@Eric_Shanks) did a great job of covering it on his blog, as did Matt Crape (@MattThatITGuy) with his post. While the VMUG situation has its own unique aspects, the question for me boils down to something simple: How do you remove people from an external community?

Babies And Bathwater

Removing unauthorized people from a community is nothing new under the sun. I was a Cisco Champion once upon a time. During the program’s second year I participated in briefings and events with the rest of the group, including my good friend Amy Arnold (@AmyEngineer). When the time came to reapply to the program for Year 3, I declined to apply again for my own reasons. Amy, however, was told that she couldn’t reapply. She and several other folks in the program were being disqualified for “reasons”. It actually took us a while to figure out why, and the answer still wasn’t 100% clear. To this day the best we can figure out is that there is some kind of conflict between anyone working with the public sector or government and the terms and conditions of the Champions program.

The lack of communication about the rules was the biggest issue by far with the whole transition. People don’t like being excluded. They especially don’t like being excluded from a group they were previously a member of. It takes time and careful explanation to help them understand why they are no longer able to be a part of a community. Hiding behind vague statements and pointing to rule sections doesn’t really help.

In the case of the VMUG issue above, the answer as to why the dismissed leaders were disqualified still isn’t clear. At least, it isn’t clear according to the official rules. There is still some debate as to the real reasoning behind everything, as the comments on Matt’s blog indicate. However, the community has unofficially settled on the reasoning being that those leaders were employed by someone that VMware, who is more-than-loosely affiliated with VMUG, has deemed a direct competitor.

I’m no stranger to watching companies go from friends to frenemies to competitors in the blink of an eye. VMware and Cisco. VMware and Scale Computing. Cisco and HP. All of these transitions took two aligned companies and put them on opposite sides of the firing line. And in a lot of cases, the shift in messaging was swift. Last week they were both great partners. The next week shifted to “We have always been at war with Eurasia.” Which didn’t bode well for people that were caught in the middle.

Correcting The Position

How do you correctly go about effecting changes in membership? How can you realistically make things work when a rule change suddenly excludes people? It’s not an easy path, but here are some helpful hints:

  • COMMUNICATION! – Above all else, it is absolutely critical to communicate at every step of the process. Don’t leave people guessing as to your reasoning. If you are contemplating a rule change, let everyone know. If you are looking to enforce a rule that was previously not enforced, warn everyone well in advance. Don’t let people come up with their own theories. Don’t make people write blogs asking for clarification on a situation.
  • If a person is being excluded because of a rule change, give them a bit of a grace period to exit on their own terms. If that person is a community leader, they will need time to transition a new person into their role. If that person is a well-liked member of the community, give them a chance to say goodbye instead of being forced out. That grace period doesn’t need to be months long. Usually, until the next official meeting or briefing is enough. Giving someone the chance to say goodbye is much better than telling everyone they left. It provides closure and gives everyone a chance to discuss what the next steps will be.
  • If a rule change is in order that excludes members of the community, weigh it carefully. Ask yourself what you are gaining from it. Is it a legal reason? Does it need to be made to comply with some kind of regulation? Those are valid reasons and should be communicated with enough warning. People will understand. But if the reasoning behind your rule change is spite or retaliation for something, carefully consider your next steps. Realize that every rank-and-file member of the community has their own opinions and vision. Just because Evil CEO made your CEO mad doesn’t mean that his Local SE has the same feelings. And it absolutely doesn’t mean that Local SE is going to subvert your community for their own ends. These are the kinds of decisions that divide people in the name of keeping your community free of “influences”.

It can’t be said enough that you need to talk to the community before you even begin debating action. There are no community organizations that blindly follow orders from on high. These are places where thinking people interact and share. And if they are suddenly told how things are going to be without any discussion or debate, you can better believe they are going to try and get to the bottom of it. Whether you want them to or not.


Tom’s Take

Kicking people out of something is never easy. Tech Field Day has rules about delegates being employed by presenting vendors. More than once I’ve had conversations with people about being disqualified from being a delegate. Most of them understand why that’s the case beforehand because our policy is straightforward. But if it’s ever changed, you can better believe that we’re going to let everyone know well in advance.

Communities run on communication. Discussion, debate, and ultimately acceptance are all driven by knowing what’s happening at all times. If you make rules under the cloak of secrecy for reasons which aren’t readily apparent, you risk alienating more than just the people you’re looking to exclude.