SDN 101 at ONUG Academy


Software defined networking is king of the hill these days in the greater networking world.  Vendors are contemplating strategies.  Users are demanding functionality.  And engineers are trying to figure out what it all means.  What’s needed is a way for vendor-neutral parties to get together and talk about what SDN represents and how best to implement it.  Most of the talk so far has been at vendor-specific conferences like Cisco Live or at other conferences like Interop.  I think a third option has just presented itself.

Nick Lippis (@NickLippis) has put together a group of SDN-focused people to address concerns about implementation and usage.  The Open Networking User Group (ONUG) was assembled to allow large companies using SDN to have a semi-annual meeting to discuss strategy and results.  It allows Facebook to talk to JP Morgan about what they are doing to simplify networking through use of things like OpenFlow.

This year, ONUG is taking it a step further by putting on the ONUG Academy, a day-long look at SDN through the eyes of those that implement it.  They have assembled a group of amazing people, including the founder of Cumulus Networks and Tech Field Day’s own Brent Salisbury (@NetworkStatic).  There will be classes about optimizing networks for SDN as well as writing SDN applications for the most popular controllers on the market.  Nick shares more details about the ONUG academy here:

If you’re interested in attending ONUG either for the academy or for the customer-focused meetings, you need to register today.  As a special bonus, if you use the code TFD10 when you sign up, you can take 10% off the cost of registration.  Use that extra cash to go out and buy a cannoli or two.

I’ll be at ONUG with Tech Field Day interviewing customers and attendees about their SDN strategies as well as where they think the state of the industry is headed.  If you’re there, stop by and say hello.  And be sure to bring me one of those cannolis.

Know the Process, Not the Tool


If there is one thing that amuses me as of late, it’s the “death of CLI” talk that I’m starting to see coming from many proponents of software defined networking. They like to talk about programmatic APIs and GUI-based provisioning and how everything that network engineers have learned is going to fall by the wayside.  Like this Network World article. I think reports of the death of CLI are a bit exaggerated.

Firstly, the CLI will never go away. I learned this when I started working with an Aerohive access point I got at Wireless Field Day 2. I already had a HiveManager account provisioned thanks to Devin Akin (@DevinAkin), so all I needed to do was add the device to my account and I would be good to go. Except it never showed up. I could see it on my local network, but it never showed up in the online database. I rebooted and reset several times before flipping the device over and finding a curious port labeled “CONSOLE”. Why would a cloud-based device need a console port? In the next hour, I learned a lot about the way Aerohive APs are provisioned and how there were just some commands that I couldn’t enter in the GUI that helped me narrow down the problem. After fixing a provisioning glitch in HiveManager the next day, I was ready to go. The CLI didn’t fix my problem, but I did learn quite a bit from it.

Basic interfaces give people a great way to see what’s going on under the hood. Given that most folks in networking are from the mold of “take it apart to see why it works,” the CLI is great for them. I agree that memorizing a 10-argument command to configure something like route redistribution is a pain in the neck, but that difficulty doesn’t come from networking itself. Instead, the difficulty lies in speaking the language.

I’ve traveled to a foreign country once or twice in my life. I barely have a grasp of the English language at times. I can usually figure out some Spanish. My foreign language skills have pretty much left me at this point. However, when I want to make myself understood to people that speak another language, I don’t focus on syntax. Instead, I focus on ideas. Pointing at an object and making gestures for money usually gets the point across that I want to buy something. Pantomiming a drinking gesture will get me to a restaurant.

Networking is no different. When I started trying to learn CLI terminology for Brocade, Arista, and HP, I found they were similar in some respects but very different in others. When you try to take your Cisco CLI skills to a Juniper router, you’ll find that you aren’t even in the neighborhood when it comes to syntax. What becomes important is *what* you’re trying to do. If you can think through what you’re trying to accomplish, there’s usually a help file or a Google search that can pull up the right way to do things.
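As a toy illustration of that idea (my own sketch, not any vendor’s tool), here’s the same intent, creating a VLAN, rendered into a few different dialects. The command strings are from memory and only indicative, so check the vendor documentation for exact syntax:

```python
# Hypothetical intent-to-syntax table. The *what* (a VLAN with an ID and
# a name) never changes; only the per-vendor *how* does.
VLAN_RECIPES = {
    "cisco_ios":  ["vlan {id}", "name {name}"],
    "arista_eos": ["vlan {id}", "name {name}"],
    "junos":      ["set vlans {name} vlan-id {id}"],
}

def render_vlan(platform, vlan_id, name):
    """Translate one intent -- 'create this VLAN' -- into platform syntax."""
    return [cmd.format(id=vlan_id, name=name) for cmd in VLAN_RECIPES[platform]]

print(render_vlan("cisco_ios", 100, "USERS"))  # ['vlan 100', 'name USERS']
print(render_vlan("junos", 100, "USERS"))      # ['set vlans USERS vlan-id 100']
```

Swap the dictionary for a real driver layer and you have the seed of every multi-vendor automation tool: the intent is the constant, the syntax is a lookup.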

This extends its way into a GUI/API-driven programming interface as well. Rather than trying to intuit the interface, just think about what you want to do instead. If you want two hosts to talk to each other through a low-cost link with basic security, you just have to figure out what the drag-and-drop is for that. If you want to force application-specific traffic to transit a host running an intrusion prevention system, you already know what you want to do. It’s just a matter of finding the right combination of interface programming to accomplish it. If you’re working on an API call using Python or Java, you probably have to define the constraints of the system anyway. The hard part is writing the code against that interface to accomplish the task.
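To make that concrete, here’s a minimal sketch of what a declarative API call might look like. Everything in it, the controller URL, the endpoint, and the payload shape, is hypothetical; the point is that the code states the *what* and leaves the *how* to the controller:

```python
import json
import urllib.request

# Hypothetical intent payload: two hosts, a low-cost path, basic security.
INTENT = {
    "endpoints": ["host-a", "host-b"],
    "path_cost": "low",
    "security": {"allow": ["tcp/443"], "default": "deny"},
}

req = urllib.request.Request(
    "http://controller.example.com:8181/intents",  # placeholder controller URL
    data=json.dumps(INTENT).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```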


Tom’s Take

Learning the process is the key to making it in networking. So many entry level folks are worried about *how* to do something. Configuring a route or provisioning a VLAN becomes the end goal. It’s only when those folks take a step back and think about their task without the commands that they begin to become real engineers. When you can visualize what you want to do without thinking about the commands you need to enter to do it, you are taking the logical step beyond being tied to a platform. Some of the smartest people I know break a task down into component parts and steps. When you spend more time on *what* you are doing and less on *how* you are doing it, you don’t need to concern yourself with radical shifts in networking, whether they be SDN, NFV, or the next big thing. Because the process will never change even if the tools might.

Disruption in the New World of Networking

This is one of the most exciting times to be working in networking. New technologies and fresh takes on existing problems are keeping everyone on their toes when it comes to learning new protocols and integration systems. VMworld 2013 served both as an announcement of VMware’s formal entry into the larger networking world and as a notice to existing network vendors. What follows is my take on some of these announcements. I’m sure that some aren’t going to like what I say. I’m even more sure a few will debate my points vehemently. All I ask is that you consider my position as we go forward.

Captain Over, Captain Under

VMware, through their Nicira acquisition and development, is now *the* vendor to go to when you want to build an overlay network. Their technology augments existing deployments to provide software features such as load balancing and policy deployment. In order to do this and ensure that these features are utilized, VMware uses VxLAN tunnels between the devices. VMware calls these constructs “virtual wires”. I’m going to call them vWires, since they’ll likely be called that soon anyway. vWires are deployed between hosts to provide a pathway for communications. Think of it like a GRE tunnel or a VPN tunnel between the hosts. This means the traffic rides on the existing physical network but that network has no real visibility into the payload of the transit packets.
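If you want to see what that encapsulation looks like on the wire, here’s a quick sketch using Scapy, assuming a recent release that includes the VXLAN layer. All of the addresses and the VNI are invented for illustration:

```python
from scapy.all import Ether, IP, UDP, VXLAN  # VXLAN ships with recent Scapy

# Inner frame: what the two VMs believe they are sending to each other.
inner = Ether(src="00:00:00:aa:00:01", dst="00:00:00:aa:00:02") / \
        IP(src="10.0.0.1", dst="10.0.0.2") / UDP(dport=53)

# Outer frame: all the physical underlay ever sees is a UDP packet on port
# 4789 between two hypervisor endpoints. The inner payload is opaque to it.
frame = Ether() / IP(src="192.168.1.10", dst="192.168.1.20") / \
        UDP(sport=49152, dport=4789) / VXLAN(vni=5001) / inner

frame.show()  # the underlay's forwarding decision stops at the outer headers
```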

Nicira’s brainchild, NSX, has the ability to function as a layer 2 switch and a layer 3 router as well as a load balancer and a firewall. VMware is integrating many existing technologies with NSX to provide consistency when provisioning and deploying a new software-based network. For those devices that can’t be virtualized, VMware is working with HP, Brocade, and Arista to provide NSX agents that can decapsulate the traffic and send it to a physical endpoint that can’t participate in NSX (yet). As of the launch during the keynote, most major networking vendors are participating with NSX. There’s one major exception, but I’ll get to that in a minute.

NSX is a good product. VMware wouldn’t have released it otherwise. It is the vSwitch we’ve needed for a very long time. It also extends the ability of the virtualization/server admin to provision resources quickly. That’s where I’m having my issue with the messaging around NSX. During the second day keynote, the CTOs on stage said that the biggest impediment to application deployment is waiting on the network to be configured. Note that this is my paraphrasing of what I took their intent to be. In order to work around the lag in network provisioning, VMware has decided to build a VxLAN/GRE/STT tunnel between the endpoints and eliminate the network admin as a source of delay. NSX turns your network into a fabric for the endpoints connected to it.

Under the Bridge

I also have some issues with NSX and the way it’s supposed to work on existing networks. Network engineers have spent countless hours optimizing paths and reducing delay and jitter to provide applications and servers with the best possible network. Now none of that matters. vAdmins just have to click a couple of times and build their vWire to the other server, and all that work on the network is for naught. The underlay network exists to provide VxLAN transport. NSX assumes that everything running beneath it is operating optimally. No loops, no blocked links. NSX doesn’t even participate in spanning tree. Why should it? After all, that vWire ensures that all the traffic ends up in the right location, right? People would never bridge the network cards on a host server, right? Say, by building a VPN server. All of the things that network admins and engineers think about with regard to keeping the network from blowing up due to excess traffic are handwaved away in the presentations I’ve seen.

The reference architecture for NSX looks pretty. Prettier than any real network I’ve ever seen. I’m afraid that suboptimal networks are going to impact application and server performance now more than ever. And instead of the network using mechanisms like QoS to battle issues, those packets are now invisible bulk traffic. When network folks have no visibility into the content of the network, they can’t help when performance suffers. Who do you think is going to get blamed when that goes on? Right now, it’s the network’s fault when things don’t run right. Do you think that moving the onus for server network provisioning to NSX and vCenter is going to forgive the network people when things go south? Or are the underlay engineers going to take the brunt of the yelling because they are the only ones that still understand the black magic outside the GUI drag-and-drop to create vWires?

NSX is for service enablement. It allows people to build network components without knowing the CLI. It also means that network admins are going to have to work twice as hard to build resilient networks that work at high speed. I’m hoping that means that TRILL-based fabrics are going to take off. Why use spanning tree now? Your application and service network sure isn’t. No sense adding any more bells and whistles to your switches. It’s better to just tie them into spine-and-leaf Clos fabrics and be done with it. It now becomes much more important to concentrate on the user experience. Or maybe the wireless network. As long as at least one link exists between your ESX box and the edge switch, let the new software networking guys worry about it.

The Recumbent Incumbent?

Cisco is the only major networking manufacturer not publicly on board with NSX right now. Their CTO Padmasree Warrior has released a response to NSX that talks about lock-in and vertical integration. Still others have released responses to that response. There’s a lot of talk right now about the war brewing between Cisco and VMware and what that means for VCE. One thing is for sure – the landscape has changed. I’m not sure how this is going to fall out on both sides. Cisco isn’t likely to stop selling switches any time soon. NSX still works just fine with Cisco as an underlay. VCE is still going to make a whole bunch of money selling vBlocks in the next few months. Where this becomes a friction point is in the future.

Cisco has been building APIs into their software for the last year. They want to be able to use those APIs to directly program the network through software like the forthcoming OpenDaylight controller. Will they allow NSX to program them as well? I’m sure they would – if VMware wrote those instructions into NSX. Will VMware demand that Cisco use the NSX-approved APIs and agents to expose network functionality to their software network? They could. Will Cisco scrap onePK to implement NSX? I doubt that very much. We’re left with a standoff. Cisco wants VMware to use Cisco’s tools to program Cisco networks. VMware wants Cisco to use the same tools as everyone else and make the network a commodity compared to the way it is now.

Let’s think about that last part for a moment. Aside from some speed differences, networks are largely going to look identical to NSX. It won’t care if you’re running HP, Brocade, or Cisco. Transport is transport. Someone down the road may build some proprietary features into their hardware to make NSX run better, but that day is far off. What if a manufacturer builds a switch that is twice as fast as the nearest competition? Three times? Ten times? At what point does the underlay become so important that the overlay starts preferring it exclusively?


Tom’s Take

I said a lot during the Tuesday keynote at VMworld. Some of it was rather snarky. I asked about full BGP tables and vMotioning the machines onto the new NSX network. I asked because I tend to obsess over details. Forgotten details have broken more of my networks than grand design disasters. We tend to fuss over the big things. We make more out of someone who can drive a golf ball hundreds of yards than we do of the one who can consistently sink a ten-foot putt. I know that a lot of folks were pre-briefed on NSX. I wasn’t, so I’m playing catch up right now. I need to see it work in production to understand what value it brings to me. One thing is for sure – VMware needs to change the messaging around NSX to be less antagonistic towards network folks. Bring us into your solution. Let us use our years of experience to help rather than making us seem like pariahs responsible for all your application woes. Let us help you help everyone.

SDN and NFV – The Ups and Downs


I was pondering the dichotomy between Software Defined Networking (SDN) and Network Function Virtualization (NFV) the other day.  I’ve heard a lot of vendors and bloggers talking about how one inevitably leads to the other.  I’ve also seen a lot of folks saying that the two couldn’t be further apart on the scale of software networking.  The more I thought about these topics, the more I realized they are two sides of the same coin.  The problem, at least in my mind, is the perspective.

SDN – Planning The Paradigm

Software Defined Networking telegraphs everything about what it is trying to accomplish right there in the name.  Specifically, the “Definition” part of the phrase.  I’ve made jokes in the past about the lack of definition in SDN as vendors try to adapt their solutions to fit the buzzword mold.  What I finally came to realize is that the SDN folks are all about definition. SDN is the Top Down approach to planning.  SDN seeks to decompose the network into subsystems that can be replaced or reprogrammed to suit the needs of those things which utilize the network.

As an example, SDN breaks the idea of a switch down into things like “forwarding plane” and “control plane” and seeks to replace the control plane with alternative software, whether it be a controller-based architecture like OpenFlow or an overlay network similar to that of VMware/Nicira.  We can replace the OS of a switch with a concept like OpenFlow easily.  It’s just a mechanism for determining which entries are populated in the Content Addressable Memory (CAM) tables of the forwarding plane.  In top down design, it’s easy to create a stub entry or “black box” to hold information that flows into it.  We don’t particularly care how the black box works from the top of the design, just that it does its job when called upon.
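A small sketch makes that black box concrete.  Ryu is one of the open source OpenFlow controllers of this era; the snippet below is my own minimal example, with an invented MAC address and port number.  It pushes a single forwarding entry into a switch, which is exactly the control plane living in software instead of in the switch OS:

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class PopulateForwarding(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_up(self, ev):
        dp = ev.msg.datapath
        parser = dp.ofproto_parser
        # One forwarding entry: frames for this MAC go out port 2. The
        # controller, not the switch OS, decides what lands in the table.
        match = parser.OFPMatch(eth_dst="00:00:00:aa:00:02")
        actions = [parser.OFPActionOutput(2)]
        inst = [parser.OFPInstructionActions(
            dp.ofproto.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=match, instructions=inst))
```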

Top Down designs tend to run into issues when those black boxes lack detail or are missing some critical functionality.  What happens when OpenFlow isn’t capable of processing flows fast enough to keep the CAM table of a campus switch populated with entries?  Is the switch going to fall back to process switching the packets?  That could be a big issue.  Top Down designs are usually very academic and elegant.  They also have a tendency to lack concrete examples and real world experience.  When you think about it, that says a lot about the early days of SDN – lots of definition of terminology and technology, but a severe lack of actual packet forwarding.

NFV – Working From The Ground Up

Network Function Virtualization takes a very different approach to the idea of turning hardware networks into software networks.  The driving principle behind NFV is replication of existing technology in a software state.  This is classic Bottom Up design.  Rather than spending a large amount of time planning and assembling the perfect system, Bottom Up designers tend to build as they go.  They concentrate on making something work first, then making those things work together second.

NFV is great for hands-on folks because it gives concrete, real results almost immediately.  Once you’ve converted a load balancer or a router to a purely software-based construct, you can see right away how it works and what the limitations might be.  Does it consume too many resources on the hypervisor?  Does it excel at forwarding small packets?  Does switching a large packet locally cause a fault?  These are problems that can be corrected in the individual system rapidly rather than waiting to modify the overall plan to account for difficulties in the virtualization process.

Bottom Up design does suffer from some issues as well.  The focus in Bottom Up is on getting things done on a case-by-case basis.  What do you do when you’ve converted all your hardware to software?  Do your NFV systems need to talk to one another?  That’s usually where Bottom Up design starts breaking down.  Without a grand plan at a higher level to ensure that systems can talk to each other, this design methodology falls back to a series of “hacks” to get them connected.  Units developed in isolation aren’t required to play nice with everyone else until they are forced to do so.  That leads to increasingly complex and fragile interconnection systems that could fail spectacularly should the wrong thread be yanked with sufficient force.


Tom’s Take

Which method is better?  Should we spend all our time planning the system and hope that our Powerpoint Designs work the right way when someone codes them in a few months?  Or should we say “damn the torpedoes” and start building things left and right and hope that someone will figure out a way to tie all these individual pieces together at some point?

Surprisingly, the most successful design requires elements of both.  People need to have at least a basic plan when setting out to change the networking world.  Once the ideas are sketched out, you need a team of folks willing to burn the midnight oil and get the ideas implemented in real life to ensure that the plan works the right way.  The guidance from the top is essential to making sure everything works together in the end.

Whether you are leading from the top or the bottom, remember that everything has to meet in the middle sooner or later.

A Guide to SDN Spirit Animals

The world of computers and IT has always been linked with animals.  Whether you are referring to Tux the Penguin from the world of Linux or the various zoological specimens that have graced the covers of the O’Reilly Media library, you can find almost every member of the animal kingdom represented.  Many of these icons have become mascots for their users.  In the world of software defined networking (SDN), we have our own mascot as well.  However, I’m going to propose that we start considering a few more.

The Horned Wonder

If you’ve read any kind of blog post about SDN in the last year, you’ve probably seen reference to a unicorn at some point.  Unicorns are mythical creatures that are full of magic and wonder.  I referenced them once in a post concerning a network where I had trouble understanding how untagged packets were traversing VLANs without causing a meltdown.  When the network admin asked me how it was happening I replied, “They must be getting ferried around on the backs of unicorns!”  That started my association of magical things happening in networks and their subsequent attribution to unicorns.  Greg Ferro (@etherealmind) is fond of saying that new protocols without sufficient documentation must be powered by “unicorn tears”.  Ivan Pepelnjak (@ioshints) is also a huge fan of the unicorn, as evidenced by this picture:

Ivan rides his steed into battle

The unicorn is popular because it represents a fantastic explanation for a difficult problem.  However, people that I’ve talked to recently are getting tired of attributing mythical properties of various SDN-related technologies to the mighty unicorn.  I thought about it and realized that there are more suitable animals depending on what technology you’re talking about.

King of Beasts


If you ask most SDN companies, they’ll tell you that their spirit animal is the griffin.  The griffin is a mythical creature with the body and hindquarters of a lion combined with the head, wings, and front legs of an eagle.  This regal beast is regarded as a stately amalgam of the king of beasts and the king of birds.  It typically guards important and sacred treasures.  It is also a popular animal in heraldry, where it represents courage and boldness.

You can tell from that description that anyone writing an API for their existing OS or networking stack probably has one of these things hanging in their cubicle.  It stands for the best possible joining of two great ideas.  Those APIs guard the sacred treasures for those that have always wanted insight into the inner workings of a network operating system.  The griffin is the best case scenario for those that want to write an effective API or access methodology for enabling SDN.  But as we all know, sometimes the best strategies are poorly implemented.

Design by Committee


The opposite of the griffin would have to be the chimera.  A chimera is a mythical beast that has the body, head, and front legs of a lion.  It has a goat’s head jutting from the middle of the body and a snake’s head for a tail, although some sources say this is a dragon head with the associated dragon wings as well.  This nightmarish beast comes from Greek mythology, where it was an omen of disaster when spotted.

The chimera represents what happens when you try to combine things and end up with the worst possible combination.  Why is there a goat’s head in the middle?  What good does a snake head for a tail really do?  In much the same way, companies that are trying to create SDN strategies by throwing everything they can into the mix will have end results that should use a chimera for a mascot.  Rather than taking the approach of building the product with the best and most useful features, some designers feel the need to attach everything they can in an effort to replicate existing non-useful functionality.  “Better to have it and not need it” is the rallying cry most often heard.  This leads to the kind of unwieldy and bloated applications that scare people away from SDN and back to traditional networking methodology.

Tom’s Take

Every project needs a mascot.  Every product needs an icon or a fancy drawing on the product page.  Sooner or later, those mascots come to symbolize everything the project stands for.  Content penguins aside, most projects are looking for something cute or cuddly.  Security vendors are notorious for using scary-looking animals to get the point across that they aren’t to be messed with.  I think that using mythological creatures other than the unicorn to symbolize SDN projects is the way to go.  It focuses the developers to ground themselves in real features.  Hopefully it helps them avoid the mentality that could create nightmarish creatures like the chimera.

SDN and Toilets

I’ve been thinking a lot about SDN recently, as you can no doubt tell from the number of blog posts that I’ve been putting out about it.  A lot of my thinking is coming from the idea that we need to find better ways to relate SDN to real world objects and processes to help people understand better what the advantages and disadvantages of all the various parts can be.

One example of the apprehension that some feel with SDN occurred to me the other day when I was in a conference center restroom.  Despite all the joking about doing the best thinking in a bathroom, I found a nice example based on retrofitted old technology.  You’ve no doubt seen that many restrooms are starting to install touchless flush sensors on their toilets and urinals.  There are a myriad of health and sanitation benefits as well as water cost savings, not to mention reduced maintenance costs on the handles of these units.

The part that made me curious during this trip was the complete lack of any buttons on the unit for triggering a manual flush.  Most of the touchless toilets and urinals that I’ve seen have some sort of small button used to flush the unit at the behest of the user.  While these buttons are probably not used all that often, it is a bit reassuring to know they exist if needed.  Imagine my surprise when I found the units in this particular convention center with no button whatsoever.  A completely closed system.  While I was able to finish my business without further incident, it made me start thinking about these kinds of systems in relation to SDN constructs.

Black Boxes

My go-to example for this type of issue used to be an automotive one – the carburetor and the modern fuel injection system.  Carburetors are great ways to deliver a fuel/air mixture into an engine.  They also offer a multitude of customization options and performance tuning capabilities.  They also represent the type of arcane knowledge that’s needed to make one work right in the first place.  If you misalign a jet or don’t put things back together correctly, you can very easily cause your engine to run improperly or even cause your car not to start.  The customization ability exists along with the possibility of causing damage if you aren’t properly trained.

A fuel injection system, on the other hand, is tuned perfectly when it is installed.  Once it’s bolted on to the engine, it becomes a cryptic black box that does its job without any further input.  In fact, if something does go wrong with the fuel injection system there’s likely no way you’re going to be able to work on it unless you are an S.A.E. mechanic or fuel injector designer.  The system does its job without input because of the initial tuning.

How do both of these examples relate to SDN?  There are some that say that a properly functioning SDN system will use analysis and inputs to determine the best way to install flows into a device or build overlays in a way to maximize bandwidth to critical systems.  It’s a steady-state machine just like a fuel injection system or a buttonless toilet.  It offers no way for people to provide inputs into the system to influence behavior.  You might say that a system of this nature is far-fetched and fantastic.  Yet we seem to be leveraging a multitude of technologies for the purpose of removing as much input and decision making from the network as we can.  Is it that much of a leap to decide that we want to remove external variables totally from the equation?  I think that will be a focus of the next wave of SDN once the baselines have been established.
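Here’s a minimal sketch of what such a buttonless loop might look like, with a stubbed-out telemetry source and made-up thresholds.  Notice what’s missing: there is no function anywhere for a human to press:

```python
import random
import time

LINKS = ["uplink-1", "uplink-2"]
THRESHOLD = 0.8  # made-up utilization ceiling

def link_utilization(link):
    """Stub telemetry. A real system would poll sFlow, SNMP, or the like."""
    return random.uniform(0.0, 1.0)

def steer_flows(hot, cool):
    print("rebalancing flows: %s -> %s" % (hot, cool))

# The whole point of a closed system: the loop runs forever, takes no
# operator input, and offers no manual-flush override.
while True:
    loads = {link: link_utilization(link) for link in LINKS}
    hot = max(loads, key=loads.get)
    cool = min(loads, key=loads.get)
    if loads[hot] > THRESHOLD:
        steer_flows(hot, cool)
    time.sleep(5)
```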

People don’t like steady state black boxes.  They like having an override switch or a manual activation button.  It reassures them to know that they can have an impact on the system no matter how small.  It’s a lot like the crosswalk buttons on street corners.  Even if they are programmed to have no effect at all on the light schedule, pedestrians feel more comfortable having them around.  The average engineer hates having no input into a system.  That’s why full network automation is so scary.  What happens when things go off the rails?


Tom’s Take

If you really want to make sure that people feel comfortable with the idea of a fully automated SDN solution, the key is to give them meaningless input.  Make a button or a field that lets them think they are having an impact without really taking anything into account to create the best path through the network.  Routing protocols show what happens when people think they are smarter than algorithms.  Imagine what would happen if that level of interference would happen in a data center.  The fix might not be as easy as backing out a static route.  In truth, I don’t think the data center world is quite ready for a fully automated SDN solution right now.  Maybe once we’ve gotten them used to the idea of buttonless flush toilets, we can introduce the idea of a buttonless data center.

The SDNquisition


Network Engineer: Trouble in the data center.
Junior Admin: Oh no – what kind of trouble?
Network Engineer: VLAN PoC for VIP is SNAFU.
Junior Admin: Pardon?
Network Engineer: VLAN PoC for VIP is SNAFU.
Junior Admin: I don’t understand what you’re saying.
Network Engineer: [slightly irritatedly and with exaggeratedly clear accent] Virtual LAN Proof of Concept for the Very Important Person is…messed up.
Junior Admin: Well what on earth does that mean?
Network Engineer: *I* don’t know – the CIO just told me to come in here and say that there was trouble in the data center that’s all – I didn’t expect a kind of Spanish Inquisition.

[JARRING CHORD]

[The door flies open and an SDN Developer  enters, flanked by two junior helpers. An SDN Assistant [Jones] has goggles pushed over his forehead. An SDN Blogger [Gilliam] is taking notes for the next article]

SDN Developer: NOBODY expects the SDNquisition! Our chief weapon is orchestration…orchestration and programmability…programmability and orchestration…. Our two weapons are programmability and orchestration…and Open Source development…. Our *three* weapons are programmability, orchestration, and Open Source development…and an almost fanatical devotion to disliking hardware…. Our *four*…no… *Amongst* our weapons…. Amongst our weaponry…are such elements as programmability, orchestration…. I’ll come in again.

[The Inquisition exits]

Network Engineer: I didn’t expect a kind of Inquisition.

[JARRING CHORD]

[The cardinals burst in]

SDN Developer: NOBODY expects the SDNquisition! Amongst our weaponry are such diverse elements as: programmability, orchestration, Open Source development, an almost fanatical devotion to disliking hardware, and nice slide decks – Oh damn!
[To Cardinal SDN Assistant] I can’t say it – you’ll have to say it.
SDN Assistant: What?
SDN Developer: You’ll have to say the bit about ‘Our chief weapons are …’
SDN Assistant: [rather horrified]: I couldn’t do that…

[SDN Developer bundles the cardinals outside again]

Network Engineer: I didn’t expect a kind of Inquisition.

[JARRING CHORD]

[The cardinals enter]

SDN Assistant: Er…. Nobody…um….
SDN Developer: Expects…
SDN Assistant: Expects… Nobody expects the…um…the SDN…um…
SDN Developer: SDNquisition.
SDN Assistant: I know, I know! Nobody expects the SDNquisition. In fact, those who do expect –
SDN Developer: Our chief weapons are…
SDN Assistant: Our chief weapons are…um…er…
SDN Developer: Orchestration…
SDN Assistant: Orchestration and —
SDN Developer: Okay, stop. Stop. Stop there – stop there. Stop. Phew! Ah! … our chief weapons are Orchestration…blah blah blah. Cardinal, read the paradigm shift.
SDN Blogger: You are hereby charged that you did on diverse dates claim that hardware forwarding is preferred to software definition of networking…
SDN Assistant: That’s enough.
[To Junior Admin] Now, how do you plead?
Junior Admin: We’re innocent.
SDN Developer: Ha! Ha! Ha! Ha! Ha!

[DIABOLICAL LAUGHTER]

SDN Assistant: We’ll soon change your mind about that!

[DIABOLICAL ACTING]

SDN Developer: Programmability, orchestration, and Open Source– [controls himself with a supreme effort] Ooooh! Now, Cardinal — the API!

[SDN Assistant produces an API design definition. SDN Developer looks at it and clenches his teeth in an effort not to lose control. He hums heavily to cover his anger]

SDN Developer: You….Right! Open the IDE.

[SDN Blogger and SDN Assistant make a pathetic attempt to launch a cross-platform development kit]

SDN Developer: Right! What function will you software enable?
Junior Admin: VLAN creation?
SDN Developer: Ha! Right! Cardinal, write the API [oh dear] start a NETCONF parser.

[SDN Assistant stands there awkwardly and shrugs his shoulders]

SDN Assistant: I….
SDN Developer: [gritting his teeth] I *know*, I know you can’t. I didn’t want to say anything. I just wanted to try and ignore your dependence on old hardware constructs.
SDN Assistant: I…
SDN Developer: It makes it all seem so stupid.
SDN Assistant: Shall I…?
SDN Developer: No, just pretend for Casado’s sake. Ha! Ha! Ha!

[SDN Assistant types on an invisible keyboard at the IDE screen]

[Cut to them torturing a dear old lady, Marjorie Wilde]

SDN Developer: Now, old woman — you are accused of heresy on three counts — heresy by having no API definition, heresy by failure to virtualize network function, heresy by not purchasing an SDN startup for your own needs, and heresy by failure to have a shipping product — *four* counts. Do you confess?
Wilde: I don’t understand what I’m accused of.
SDN Developer: Ha! Then we’ll make you understand! SDN Assistant! Fetch…THE POWERPOINT!

[JARRING CHORD]

[SDN Assistant launches a popular presentation program]

SDN Assistant: Here it is, lord.
SDN Developer: Now, old lady — you have one last chance. Confess the heinous sin of heresy, reject the works of the hardware vendors — *two* last chances. And you shall be free — *three* last chances. You have three last chances, the nature of which I have divulged in my previous utterance.
Wilde: I don’t know what you’re talking about.
SDN Developer: Right! If that’s the way you want it — Cardinal! Animate the slides!

[SDN Assistant carries out this rather pathetic torture]

SDN Developer: Confess! Confess! Confess!
SDN Assistant: It doesn’t seem to be advancing to the next slide, lord.
SDN Developer: Have you got all the slides using the window shade dissolve?
SDN Assistant: Yes, lord.
SDN Developer [angrily closing the application]: Hm! She is made of harder stuff! Cardinal SDN Blogger! Fetch…THE NEEDLESSLY COMPLICATED VISIO DIAGRAM!

[JARRING CHORD]

[Zoom into SDN Blogger’s horrified face]

SDN Blogger [terrified]: The…Needlessly Complicated Visio Diagram?

[SDN Assistant produces a cluttered Visio diagram — a really cluttered one]

SDN Developer: So you think you are strong because you can survive the Powerpoint. Well, we shall see. SDN Assistant! Show her the Needlessly Complicated Visio Diagram!

[They shove the diagram into her face]

SDN Developer [with a cruel leer]: Now — you will study the Needlessly Complicated Visio Diagram until lunch time, with only a list of approved OpenFlow primitives. [aside, to SDN Assistant] Is that really all it is?
SDN Assistant: Yes, lord.
SDN Developer: I see. I suppose we make it worse by shouting a lot, do we? Confess, woman. Confess! Confess! Confess! Confess
SDN Assistant: I confess!
SDN Developer: Not you!

Software Defined Cars


I think everything in the IT world has been tagged as “software defined” by this point. There’s software defined networking, software defined storage, the software defined data center, and so on. Given that the definitions of the things I just enumerated are very hard to nail down, it’s no surprise that many in the greater IT community just roll their eyes when they start hearing someone talk about SD.

I try to find ways to discuss advanced topics like this with people that may not understand what a hypervisor is or what a forwarding engine is really supposed to be doing. The analogies that I come up with usually relate to everyday objects that are familiar to my readers. If I can frame the Internet as a highway and help people “get it,” then I’ve succeeded.

During one particularly interesting discussion, I started trying to relate SDN to the automobile. The car is a fairly stable platform that has been iterated upon many times in the 150 years that it has been around. We’ve seen steam-powered single seat models give way to 8+ passenger units capable of hauling tons of freight. It is a platform that is very much defined by the hardware. Engines and seating are the first things that spring to mind, but also wheels and cargo areas. The difference between a sports car and an SUV is very apparent due to hardware, much in the same way that a workgroup access switch only resembles a core data center switch in the most basic terms.

This got me to thinking: what would it take to software define a car? How could I radically change the thinking behind an automobile with software? At first, I thought about software programs running in the engine that assist the driver with things like fuel consumption or perhaps an on-demand traction and ride handling system. Those are great additional features for sure, but they don’t really add anything to the base performance of a car beyond a few extra tweaks. Even the most advanced “programming” tools offered to performance specialists, which allow for the careful optimization of transmission shifting patterns and fuel injector mixture recalibration, don’t really fall into the software defined category. While those programs offer a way to configure the car in a manner different from the original intent, they are difficult to operate and require a great deal of special knowledge to configure in the first place.

That’s when it hit me like a bolt out of the blue. We already have a software defined car. Google has been testing it for years. Only they call it a Driverless Car. Think about it in terms of our favorite topic of SDN. Google has taken the hardware that we are used to (the car) and replaced the control plane with a software construct (the robot steering mechanism). The software is capable of directing the forwarding of the hardware with no user intervention, as illustrated in this video:

That’s a pretty amazing feat when you think about it. Of course, programming a car to drive itself isn’t easy. There’s a ton of extra data that is generated as a car learns to drive itself that must be taken into account. In much the same way, the network is going to generate mountains of additional data that needs to be captured by some kind of sensor or management platform. That extra data represents the network feedback that allows you to do things like steer around obstacles, whether they be a deer in the road or a saturated uplink to a cloud provider.

In addition, the idea of a driverless software defined car is exciting because of the disruption that it represents. Once we don’t need a cockpit with a steering mechanism or access to propulsion mechanisms directly at our fingertips (or feet), we can go about breaking apart the historical construction of a car and make it a more friendly concept. Why do I need to look forward when my car does all the work? Why can’t I twist the seats 90 degrees and facilitate conversation among passengers while the automation is occurring? Why can’t I put in an uplink to allow me to get work done or a phone to make calls now that my attention doesn’t need to be focused on the road? When the car is doing all the driving, there are a multitude of ideas that need to be reconsidered for how we design the automobile.

When I started bouncing this idea off of some people, Stephen Foskett (@SFoskett) mentioned to me that some people would take issue with my idea of a software defined car because it’s a self-contained, closed ecosystem. What about a software defined network that collects data and provides for greater visibility to the management layer? Doesn’t it need to be a larger system in order to really take advantage of software definition? That’s the beauty of the software defined piece. Once we have a vehicle generating large amounts of actionable data, we can now collect that and do something with it. Google has traffic data in their Maps application. What if that data was being fed in real time by the cars themselves? What if the car could automatically recognize traffic congestion and reroute on the fly instead of merely suggesting that the driver take an alternate path? What if we could load balance our highway system efficiently because the car is getting real time data about conditions? Now Google has the capability to use their software defined endpoints to reconfigure as needed.

What if that same car could automatically sense that you were driving to the airport and check you into your flight based on arrival time without the need to intervene? How about inputting a destination, such as a restaurant or a sporting event, and having the car instantly reserve a parking spot near the venue based on reports from cars already in the lot or from sensors that report the number of empty spots in a parking garage nearby? The possibilities are really limitless even in this first or second stage. The key is that we capture the generated data from the software pieces that we install on top of existing hardware. We never knew we could do this because the interface into the data never existed prior to creating a software layer that we could interact with.  When you look at what Google has already done with their recent acquisition of Waze, the social GPS and map application, it does look like Google is starting down this path.  Why rely on drivers to update the Waze database when the cars can do it for you?


Tom’s Take

I have spent a very large portion of my IT career driving to and from customer sites. The idea of a driverless car is appealing, but it doesn’t really help me to just sit over in the passenger seat and watch a computer program do my old job. I still like driving long distances to a certain extent. I don’t want to lose that. It’s when I can start using the software layer to enable things that I never thought possible that I start realizing the potential. Rather than just looking at the driverless software defined car as a replacement for drivers, the key is to look at the potential that it unlocks to be more efficient and make me more productive on the road. That’s the key take away for us all. Those lessons can also be applied to the world of software defined networking/storage/data center as well. We just have to remember to look past the hype and marketing and realize what the future holds in store.

Why Facebook’s Open Compute Switches Don’t Matter. To You.

Facebook announced at Interop that they are soliciting ideas for building their own top-of-rack (ToR) switch via the Open Compute Project.  This sent the tech media into a frenzy.  People are talking about the end of the Cisco monopoly on switches.  Others claimed that the world would be a much different place now that switches are going to be built by non-vendors and open sourced to everyone.  I yawned and went back to my lunch.  Why?

BYO Networking Gear

As you browse the article that you’re reading about how Facebook is going to destroy the networking industry, do me a favor and take note of what kind of computer you’re using.  Is it a home-built desktop?  Is it something ordered from a vendor?  Is it a laptop or mobile device that you built? Or bought?

The idea that Facebook is building switches isn’t far fetched to me.  They’ve been doing their own servers for a while.  That’s because their environment looks wholly different than any other enterprise on the planet, with the exception of maybe Google (who also builds their own stuff).  Facebook has some very specialized needs when it comes to servers and to networking.  As they mention at conferences, the amount of data rushing into their network on an hourly, let alone daily, basis is mind boggling.  Shaving milliseconds off query times or reducing traffic by a few KB per flow translates into massive savings when you consider the scale they are operating at.

To that end, anything they can do to optimize their equipment to meet their needs is going to be a big deal.   They’ve got a significant motivation to ensure that the devices doing the heavy lifting for them are doing the best job they can.  That means they can invest a significant amount of capital into building their own network devices and still get a good return on the investment.  Much like the last time I built my own home desktop.  I didn’t find a single machine that met all of my needs and desires.  So I decided to cannibalize some parts out of an old machine and just build the rest myself.  Sure, it took me about a month to buy all the parts, ship them to my house, and then assemble the whole package together.  But in the end I was very happy with the design.  In fact, I still use it at home today.

That’s not to say that my design is the best for everyone, or anyone for that matter.  The decisions I made in building my own computer were ones that suited me.  In much the same way, Facebook’s ToR switches probably serve very different needs than existing data centers.  Are your ToR switches optimized for east-west traffic flow?  I don’t see a lot of data at Facebook directed to other internal devices.  I think Facebook is really pushing their systems for north-south flow.  Data requests coming in from users and going back out to them are more in line with what they’re doing.  If that’s the case, Facebook will have a switch optimized for really fast data flows.  Only they’ll be flowing in the wrong direction for what most people are using data flows for today.  It’s like having a Bugatti Veyron and living in a city with dirt roads.

Facebook admitted that there are things about networking vendors they don’t like.  They don’t want to be locked into a proprietary OS like IOS, EOS, or Junos.  They want a whitebox solution that will run any OS on the planet efficiently.  I think that’s because they don’t want to get locked into a specific hardware supplier either.  They want to buy what’s cheapest at the time and build large portions of their network rapidly as needed to embrace new technology and data flows.  You can’t get married to a single supplier in that case.  If you do, a hiccup in the production line or a delay could cost you thousands, if not millions.  Just look at how Apple ensures diversity in the iPhone supply chain to get an idea of what Facebook is trying to do.  If Apple were to lose a single part supplier there would be chaos in the supply chain.  In order to ensure that everything works like a well-oiled machine, they have multiple companies supplying each outsourced part.  I think Facebook is driving for something similar in their switch design.

One Throat To Choke

The other thing that gives me pause here is support.  I’ve long held that one of the reasons why people still buy computers from vendors or run Windows and OS X on machines is because they don’t want the headache of fixing things.  A warranty or support contract is a very reassuring thing.  Knowing that you can pick up the phone and call someone to get a new power supply or tell you why you’re getting a MAC flap error lets you go to sleep at night.  When you roll your own devices, the buck stops with you when you need to support something.  Can’t figure out how to get your web server running on Ubuntu?  Better head to the support forums.  Wondering why your BYOSwitch is dropping frames under load?  Hope you’re a Wireshark wizard.  Most enterprises don’t care that a support contract costs them money.  They want the assurance that things are going to get fixed when they break.  When you develop everything yourself, you are putting a tremendous amount of faith into those developers to ensure that bugs are worked out and hardware failures are taken care of.  Again, when you consider the scale of what Facebook is doing, the idea of having purpose-built devices makes sense.  It also makes sense that having people on staff that can fix those specialized devices is cost effective for you.

Face it.  The idea that Facebook is going to destroy the switching market is ludicrous.  You’re never going to buy a switch from Facebook.  Maybe you want to tinker around with Intel’s DPDK with a lab switch so you can install OpenFlow or something similar.  But when it comes time to forklift the data center or populate a new campus building with switches, I can almost guarantee that you’re going to pick up the phone and call Cisco, Arista, Juniper, Brocade, or HP.  Why?  Because they can build those switches faster than you can.  Because even though they are a big capital expenditure (capex), it’s still cheaper in the long run if you don’t have the time to dedicate to building your own stuff.  And when something blows up (and something always blows up), you’re going to want a TAC engineer on the phone sharing the heat with you when the CxOs come headhunting in the data center after everything goes down.

Facebook will go on doing their thing their way with their own servers and switches.  They’ll do amazing things with data that you never dreamed possible.  But just like buying a Sherman tank for city driving, their solution isn’t going to work for most people.  Because it’s built by them for them.  Just like Google’s server farms and search appliances.  Facebook may end up contributing a lot to the Open Compute Project and advancing the overall knowledge and independence of networking hardware.  But to think they’re starting a revolution in networking is about as far fetched as thinking that Myspace was going to be the top social network forever.

Brocade’s Pragmatically Defined Network

Most of the readers of my blog would agree that there is a lot of discussion in the networking world today about software defined networking (SDN) and the various parts and pieces that make up that umbrella term.  There’s argument over what SDN really is, from programmability to orchestration to network function virtualization (NFV).  Vendors are doing their part to take advantage of some, all, or in some cases none of the above to push a particular buzzword strategy to customers.  I like to make sure that everything is as clear as possible before I start discussing the pros and cons.  That’s why I jumped at the chance to get a briefing from Brocade around their new software and hardware releases that were announced on April 30th.

I spoke with Kelly Harrell, Brocade’s new vice president and general manager of the Software Business Unit.  If that name sounds somewhat familiar, it might be because Mr. Harrell was formerly at Vyatta, the software router company that was acquired by Brocade last year.  We walked through a presentation and discussion of the direction that Brocade is taking their software defined networking portfolio.  According to Brocade, the key is to be pragmatic about the new network.  New technologies and methodologies need to be introduced while at the same time keeping in mind that those ideas must be implemented somehow.  I think that a large amount of the frustration with SDN today comes from a lot of vaporware presentations and pie-in-the-sky ideas that aren’t slated to come to fruition for months.  Instead, Brocade talked to me about real products and use cases that should be shipping very soon, if not already.

The key for Brocade is to balance SDN against network function virtualization, something I referred to a bit in my Network Field Day 5 post about Brocade.  Back then, I called NFV “Networking Done (by) Software,” which was my sad attempt to point out how NFV is just the opposite of what I see SDN becoming.  During our discussion, Harrell pointed out that NFV and SDN aren’t totally dissimilar after all.  Both are designed to increase the agility with which a company can execute on strategy and create value for shareholders.  SDN is primarily focused on programmability and orchestration.  NFV is tied more toward lowering costs by implementing existing technology in a flexible way.

NFV seeks to take existing appliances that have been doing tasks, such as load balancers or routers, and free their workloads from being tied to a specific piece of hardware.  In fact, there has been an explosion of these types of migrations from a variety of vendors.  People are virtualizing entire business lines in an effort to remove the reliance on specialized hardware or reduce the ongoing support costs.  Brocade is seeking to do this with two platforms right now.  The first is the Vyatta vRouter, which is the extension of what came over in the Vyatta acquisition.  It’s a router and a firewall and even a virtual private networking (VPN) device that can run on just about anything.  It is hypervisor agnostic and cloud platform agnostic as well.  The idea is that Brocade can include a copy of the vRouter with application packages that can be downloaded from an enterprise cloud app store.  Once downloaded and installed, the vRouter can be fired up and pull a predefined configuration from the scripts included in the box.  By making it agnostic to the underlying platform, there’s no worry about support down the road.

The second NFV platform Brocade told me about is the virtual ADX application delivery switch.  It’s basically a software load balancer.  That’s not really the key point of the whole idea of applying the NFV template to an existing hardware platform.  Instead, the idea is that we’re taking something that’s been historically huge and hard to manage and moving it closer to the edge where it can be of better use.  Rather than sticking a huge load balancer at the entry point to the data center to ensure that flows are separated, the vADX allows the load balancer to be deployed very close to the server or servers that need to have the information flow metered.  Now, the agility of SDN/NFV allows these software devices to be moved and reconfigured quickly without needing to worry about how much reprogramming is going to be necessary to pull the primary load balancer out or change a ton of rules to reroute traffic to a vMotioned cluster.  In fact, I’m sure that we’re going to see a new definition of the “network edge” begin to emerge as more software-based NFV devices are deployed closer and closer to the devices that need them.

On the OpenFlow front, Brocade told me about their new push toward something they are calling “Hybrid Port OpenFlow.”  OpenFlow is a great disruptive SDN technology that is gaining traction today, in large part because of companies like Brocade and NEC that have embraced it and started pushing it out to their customer base well ahead of other manufacturers.  Right now, OpenFlow support really consists of two modes – ON and OFF.  OFF is pretty easy to imagine.  ON is a bit more complicated.  While a switch can be OpenFlow enabled and still forward normal traffic, the practice has always been to either dedicate the switch to OpenFlow forwarding, in effect turning it into a lab switch, or to enable OpenFlow selectively for a group of ports out of the whole switch, kind of like creating a lab VLAN for testing on a production box.  Brocade’s Hybrid Port OpenFlow model allows you to enable OpenFlow on a port and still allow it to do regular traffic forwarding sans OpenFlow.  That may be the best model for adopters going forward due to one overriding factor – cost.  When you take a switch or a group of ports on a switch and dedicate them to OpenFlow, you cost the enterprise something.  Every port on the switch costs a certain amount of money.  Every minute an engineer spends working on a crazy lab project incurs a cost.  By enabling the network engineers to turn on OpenFlow at will without disrupting the existing traffic flow, Brocade can reduce the opportunity cost of enabling OpenFlow to almost zero.  If OpenFlow just becomes something that works as soon as you enable it, like IPv6 in Windows 7, you don’t have to spend as much time planning for your end node configuration.  You just build the core and let the end nodes figure out they have new capabilities.  I figure that large Brocade networks will see their OpenFlow adoption numbers skyrocket simply because Hybrid Port mode turns the configuration into Easy Mode.
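Brocade didn’t share implementation details with me, but in generic OpenFlow terms hybrid forwarding is often expressed with the reserved OFPP_NORMAL port: a lowest-priority rule hands anything the flow table doesn’t claim back to the switch’s traditional pipeline.  Here’s a sketch of that idea using the Ryu controller framework, my example rather than Brocade’s code:

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class HybridFallback(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_up(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # Priority-0 catch-all: traffic no OpenFlow rule claims falls back
        # to the normal L2/L3 pipeline, so production forwarding continues.
        actions = [parser.OFPActionOutput(ofp.OFPP_NORMAL)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=0,
                                      match=parser.OFPMatch(),
                                      instructions=inst))
```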

The last interesting software piece that Brocade showed me is a prime example of the kinds of things that I expect SDN to deliver to us in the future.  Brocade has created an application called the Application Resource Broker (ARB).  It sits above the fray of the lower network layers and monitors indicators of a particular application’s health, such as latency and load.  When one of those indicators hits a specific threshold, ARB kicks in to request more resources from vCenter to balance things out.  If the demand on the application continues to rise beyond the available resources, ARB can dynamically move the application to a public cloud instance with a much deeper pool of resources, a process known as cloudbursting.  All of this can happen automatically without the intervention of IT.  This is one of the things that shows me what SDN can really do.  Software can take care of itself and dynamically move things around when abnormal demand happens.  Intelligent choices about the network environment can be made on solid data.  No guessing about what “might” be happening.  ARB removes doubt and lag in response time to allow for seamless network repair.  Try doing that with a telnet session.
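Brocade didn’t expose ARB’s actual API during the briefing, so take this as a shape-of-the-idea sketch with entirely hypothetical names: measure an application health indicator, compare it against a threshold, grow locally through vCenter, and burst to a public cloud when the local pool runs dry:

```python
import random
import time

LATENCY_SLO_MS = 50    # made-up health threshold
LOCAL_CAPACITY = 10    # made-up ceiling on local vCenter resources

def measure_latency_ms(app):
    """Stand-in for ARB's application health probes."""
    return random.uniform(10, 120)

def request_vcenter_resources(app):
    print("%s: requesting another instance from vCenter" % app)

def cloudburst(app, provider):
    print("%s: bursting to %s" % (app, provider))

def broker_loop(app):
    local_instances = 1
    while True:
        if measure_latency_ms(app) > LATENCY_SLO_MS:
            if local_instances < LOCAL_CAPACITY:
                request_vcenter_resources(app)
                local_instances += 1
            else:
                cloudburst(app, provider="public-cloud")
        time.sleep(30)

broker_loop("web-tier")
```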

There’s a lot more to the Brocade announcement than just software.  You can check it out at http://www.brocade.com.  You can also follow them on Twitter as @BRCDComm.


Tom’s Take

The future looks interesting at first glance.  Flying cars, moving sidewalks, and 3D user interfaces are all staples of futuristic science fiction.  The problem for many arises when we need to start taking steps to build those fanciful things.  A healthy dose of pragmatism helps to figure out what we need to do today to make tomorrow happen.  If we root our views of what we want to do with what we can do, then the future becomes that much more achievable.  Even the amazing gadgets we take for granted today have a basis in the real technology of the time they were first created.  By making those incremental steps, we can arrive where we want to be a whole lot sooner with a better understanding of how amazing things really are.