Facebook Wedge 100 – The Future of the Data Center?

 


Facebook is back in the news again. This time, it’s because of the release of their new Wedge 100 switch into the Open Compute Project (OCP). Wedge was already making headlines when Facebook announced it two years ago: a fast, open-sourced 40Gig Top-of-Rack (ToR) switch was huge. Now, Facebook is letting everyone in on the fun of a faster Wedge that has been deployed into production in Facebook data centers and is also being offered for sale through Edgecore Networks, a division of Accton. Accton has been leading the way in the whitebox switching market, and Wedge 100 may be one of the ways it climbs to the top.

Holy Hardware!

Wedge 100 is pretty impressive from the spec sheet. Facebook paid special attention to making sure the modules were expandable, especially for faster CPUs and special-purpose devices down the road. That’s possible because Wedge is already a highly specialized microserver. Rather than rearchitecting the guts of the whole thing, Facebook kept the CPU and the monitoring stack and simply put newer, faster modules on it to ramp up to 32x100Gig connectivity.


As many suspected, Facebook is using Broadcom Tomahawk as the base connectivity in the switch, which isn’t surprising. Tomahawk is the roadmap for all vendors to get to 100Gig. It also means that the downlink connectivity for these switches could conceivably work in 25/50Gig increments. However, given the enormous amount of east/west traffic that Facebook must generate, Facebook has also created a server platform they call Yosemite that has 100Gig links as well. Given the probable backplane there, you can imagine the data that’s getting thrown around those data centers.

That’s not all. Omar Baldonado has said that they are looking at going to 400Gig connectivity soon. That’s the kind of mind-blowing speed that you see in places like Google and Facebook. Remember that this hardware is built for a specific purpose. They don’t just have elephant flows. They have flows the size of an elephant herd. That’s why they fret about the operating temperature of optics or the rack design they want to use (standard versus Open Racks), because every little change matters a thousandfold at that scale.

Software For The People

The other exciting announcement from Facebook was on the software front. Of course, FBOSS has been updated to work with Wedge 100. I found it very interesting in the press release that much of the programming in FBOSS went into interoperability with Wedge 40 and with fixing the hardware side of things. This makes some sense when you realize that Facebook didn’t need to spend a lot of time making Wedge 40 interoperate with anything, since it was a wholesale replacement. But Wedge 100 would need to coexist with Wedge 40 as the rollout happens, so making everything play nice is a huge point on the checklist.

The other software announcement that got the community talking was support for third-party operating systems running on Wedge 100. The first one up was Open Network Linux from Big Switch Networks. ONL ran on the original Wedge 40 and now runs on the Wedge 100. This means that if you’re familiar with running BSN OSes on your devices, you can drop in a Wedge 100 in your spine or fabric and be ready to go.

The second exciting announcement about software comes from a new company, Apstra. Apstra announced their entry into OCP and their intent to get their Apstra Operating System (AOS) running on Wedge 100 by next year. That has a big potential impact for Apstra customers that want to deploy these switches down the road. I hope to hear more about this from Apstra during their presentation at Networking Field Day 13 next month.


Tom’s Take

Facebook is blazing a trail for fast ToR switches. They’ve got the technical chops to build what they need and release the designs to the rest of the world to be used for a variety of ideas. Granted, your data center looks nothing like Facebook’s. But the ideas they are pioneering are having an impact down the line. If Open Rack catches on, you may see different ideas in data center standardization. If the Six Pack catches on as a new chassis concept, it’s going to change spines as well.

If you want to get your hands dirty with Wedge, build a new 100Gig pod and buy one from Edgecore. The downlinks can break out into 10Gig and 25Gig links for servers and knowing it can run ONL or Apstra AOS (eventually) gives you some familiar ground to start from. If it runs as fast as they say it does, it may be a better investment right now than waiting for Tomahawk II to come to your favorite vendor.

 

 

Tomahawk II – Performance Over Programmability


Broadcom announced a new addition to their growing family of merchant silicon today. The new Broadcom Tomahawk II is a monster. It doubles the speed of its first-generation predecessor. It has 6.4 Tbps of aggregate throughput, divided up into 256 25Gbps ports that can be combined into 128 50Gbps or even 64 100Gbps ports. That’s fast no matter how you slice it.
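If you want to check the math on those configurations, it works out neatly. A quick Python sanity check of the numbers quoted above, nothing more:

```python
# Sanity check of the Tomahawk II numbers: 256 lanes at 25 Gbps
# should total 6.4 Tbps, and lanes gang together in pairs (50G)
# or quads (100G) to form the other port configurations.

LANES = 256
LANE_SPEED_GBPS = 25

aggregate_tbps = LANES * LANE_SPEED_GBPS / 1000
print(f"Aggregate: {aggregate_tbps} Tbps")                # 6.4 Tbps

for lanes_per_port, name in [(1, "25G"), (2, "50G"), (4, "100G")]:
    print(f"{LANES // lanes_per_port} x {name} ports")    # 256, 128, 64
```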

Broadcom is aiming to push these switches into niches like High-Performance Computing (HPC) and massive data centers doing big data/analytics or video processing to start. The use cases for 25/50Gbps haven’t really changed. What Broadcom is delivering now is port density. I fully expect to see top-of-rack (ToR) switches running 25Gbps down to the servers with new add-in cards, connected to 50Gbps uplinks that deliver them to the massive new Tomahawk II switches running in a spine or end-of-row (EoR) configuration for east-west traffic distribution.

Another curious fact about the Tomahawk II is the complete lack of 40Gbps support. Granted, that support was only paid lip service in the original Tomahawk. The real focus was on shifting to 25/50Gbps instead of the weird 10/40/100Gbps split we had in Trident II. I talked about this a couple of years ago and wasn’t very high on it back then, but I didn’t know the level of apathy people had for 40Gbps uplinks. The push to 25/50Gbps has only been held up so far by the lack of new NICs to enable faster speeds on servers. Now that those are starting to be produced in volume, expect the 40Gbps uplink to become a relic of the past.

A Foot In The Door

Not everyone is entirely happy about the new Broadcom Tomahawk II. I received an email today with a quote from Martin Izzard of Barefoot Networks, discussing their new Tofino platform. He said in part:

Barefoot led the way in June with the introduction of Tofino, the world’s first fully programmable switches, which also happen to be the fastest switches ever built.

It’s true that Tofino is very fast. It was the first 6.4 Tbps switch on the market. I talked a bit about it a few months ago. But I think that Barefoot is a bit off on its assessment here and has a bit of an axe to grind.

Barefoot is pushing something special with Tofino. They are looking to create a super-fast platform with programmability. Tofino is not quite an FPGA, and it’s not a fixed-function ASIC. It’s a switch stripped to its core and rebuilt around P4, a language all its own. That’s great if you’re a dev shop or a niche market that has to squeeze every ounce of performance out of a switch. In the world of cars, the best analogy would be looking at Tofino like a specialized sports car such as a Koenigsegg Agera. It’s very fast and very stylish, but it’s purpose built to do one thing – drive really fast on pavement and carry two passengers.
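To make the “language all its own” point a bit more concrete, here’s a toy Python sketch of the match-action table abstraction that P4 programs are built around. To be clear, this is not P4 and not Barefoot’s API; the table and actions are invented purely to illustrate the idea of programming the pipeline instead of accepting fixed behavior:

```python
# Toy model of a match-action pipeline. A P4 program declares tables
# like this and the compiler lays them out in hardware stages; here
# everything is invented Python for illustration only.

def drop(pkt):
    return None

def forward(port):
    def action(pkt):
        pkt["egress_port"] = port
        return pkt
    return action

# A "table" maps a header-field match to an action.
ipv4_table = {
    "10.0.0.0/24": forward(port=1),
    "10.0.1.0/24": forward(port=2),
}

def apply_table(pkt, table, default=drop):
    # Real hardware does longest-prefix match; exact match keeps it short.
    action = table.get(pkt["dst_prefix"], default)
    return action(pkt)

pkt = {"dst_prefix": "10.0.0.0/24"}
print(apply_table(pkt, ipv4_table))  # {'dst_prefix': '10.0.0.0/24', 'egress_port': 1}
```

The appeal is that the programmer, not the chip vendor, decides what the tables match on and what the actions do.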

Broadcom doesn’t really care about development shops. They don’t worry about niche markets, because those users are not their customers. Their customers are Arista, Cisco, Brocade, Juniper, and others. Broadcom really is the Intel of the switching world: their platforms power vendor offerings. Buying a basic Tomahawk II isn’t something you’re going to be able to do. Broadcom will only sell these in huge lots to companies that are building something with them. To keep the car analogy going, Tomahawk II is more like the old F-body platform from General Motors that went on to become Camaros, Firebirds, and Trans Ams. Each of those cars was distinctive and had its fans, but the chassis was the same underneath the skin.

Broadcom wants everyone to buy their silicon and use it to power the next generation of switches. Barefoot wants to sell specialist kit that is faster than anything else on the market, provided you’re willing to put the time into learning P4 and stripping out all the bits they feel are unnecessary. Your use case determines your hardware. That hasn’t changed, nor is it likely to change any time soon.


Tom’s Take

The data center will be 25/50/100Gbps top to bottom when the next switch refresh happens. It could even be there sooner if you want to move to a pod-based architecture instead of more traditional designs. The odds are very good that you’re going to be running Tomahawk or Tomahawk II depending on which vendor you buy from. You’re probably only going to be running something special like Tofino, or maybe even Cavium, if you’ve got a specific workload or architecture that needs extra performance or programmability.

Don’t wait for the next round of hardware to come out before you have an upgrade plan. Write it now. Think about where you want to be in 4 years. Now double your requirements. Start investigating. Ask your vendor of choice what their plans are. If their plans stink, ask their competitor. Get quotes. Get ideas. Be ready for the meeting when it’s scheduled. Make sure you’re ready to work with your management to bury the hatchet, not end up with a hatchet job of a network.

Thoughts on Theft


It’s been a busy week for me. In fact, it’s been a busy few weeks. I’ve had lots of time to enjoy NetApp Insight, Cloud Field Day, and Storage Field Day. I’ve also been doing my best to post interesting thoughts and ideas. Whether it’s taking on the CCIE program or keynote speakers, I feel like I owe a debt to the community and my readers to talk about topics that are important to them, or at least should be. Which is why I’m irritated right now about those ideas being stolen.

Beg, Borrow, and Steal

A large part of my current job is finding people that are writing great things and shining a spotlight on them. I like reading interesting ideas. And I like sharing those ideas with people. But when I share those ideas, I make absolutely sure that everyone knows where they came from originally. And if I use those ideas in my own content, I take special care to point out where they came from and try to provide the context of the original statement.

What annoys me to no end is when people take ideas as their own and try to use them for their own ends. It’s not all that difficult to do. You can use weasel words like “sources” or “I heard once” or even “I read this article”. Those are usually good signs that content is being appropriated for some purpose. It’s also a sign that research isn’t being done or attributed properly. It’s lazy journalism at best.

What really grinds my gears is when my ideas are specifically taken and used elsewhere without attribution. Luckily, I haven’t had to deal with it much so far. I have a fairly liberal policy about sharing my work. I just want people to recognize the original author. But when my words end up in someone else’s mouth, that’s when the problems start.

Credit Where It Is Due

Taking ideas given freely without offering a clue as to where they came from is theft. Plain and simple. It takes the hard work that someone has put into thinking through an issue and wraps it up in a cloudy mess. Now, who is to say (beyond dates) who was the originator of the idea? It’s just as easy to say that someone else came up with it. That’s what makes tracing the origin of things so difficult. Proper attribution for ideas is important in a society where knowledge carries so much weight.

I don’t expect to make millions of dollars from my ideas. I have opinions. I have thoughts. Sometimes people agree with them. Just as often, people disagree. The point is not to be right or wrong or rich. The true point is to make sure that the thoughts and ideas of a person are placed where they belong when the threads are all unwound.

Honestly, I don’t even really want a ton of credit. It does me little good to have someone shouting from the rooftops that I was the first person to talk about something. Or that I was right when everyone else was wrong. But when the butcher’s bill comes due, I’d at least like to have my name attached to my thoughts.


Tom’s Take

I’ve luckily been able to have most of my appropriated content taken down. Some have used it as fuel for a link-bait scheme to get paid. Others have used it as a way to build readership for a blog for some strange purpose. Thankfully, I’ve never run into anyone that was vocally taking credit for my writing and passing it off as their own. If you are a smart person and willing to write things down, do the best you can with what you have. You don’t need to take something that someone else has written and attempt to make it your own. That just tarnishes what you’re trying to do and makes all your writing suspect. Be the best you can be and no one will ever question who you are.

Keystone Keynotes


My distaste for keynotes is well known. With the possible exception of Justin Warren (@JPWarren) there may not be a person that dislikes them more than I do. I’ve outlined my reasons for it before, so I won’t go into much depth about it here. But I do want to highlight a few recent developments that are doing a great job of helping me find new things to dislike.

Drop The “Interviews”

When you walk into a keynote ballroom or arena and see two comfy chairs on stage, you know what’s coming. As someone told me recently, “This is when I know the next hour is going to suck.” The mock interview style of keynote speech is not good. It’s a thinly-veiled attempt to push an agenda. Perhaps it’s about innovation. Or transformation. Or some theme of the conference. Realistically, it’s mostly a chance for a keynote host (some form of VP) to provide forced banter with a celebrity that’s being paid to be there.

These “interviews” are rarely memorable. They seem self-serving and very plastic. The only ones that even stand out to me in recent memory are the ones that went off the rails. The time when Elon Musk was “interviewed” on stage at Dell World and responded with clipped answers and no elaboration. Or the time Richard Branson was hitting on the host at Cisco Live. Or the Cisco Live when William Shatner started taking shots at Cisco on stage!

It’s time to drop the fake interviews. Let the speakers tell their stories. Kevin Spacey at Cisco Live 2016 was a breath of fresh air. He was compelling. Invigorating. Engaging. People around me said it was the best keynote they’d heard in years. It was easily the best one I’d seen since John Cleese in Orlando in 2008. Give the people who spend their time telling stories a chance to shine. Don’t inject yourself into the process. Because actors and celebrity storytellers don’t play their stories. They live them.

All By My Selfie

If the keynote involves talking about community or the power of the user base or some other trite platitude, you can almost guarantee that the host VP is going to pause at some point, usually during the big celebrity interview, to take a selfie with their guest and the whole audience in the background. It’s a nod to how hooked in and in the know with the community they are. Think back to Ellen DeGeneres and her infamous Oscars selfie.

Except it’s not. It’s a big steaming pile of patronizing behavior. Hey everyone that paid $1,500 to hear our transformation strategy! Let me take a picture of myself on a stage with blurry, badly lit faces in the audience! Let me post it to Facegram and Instabook and Snapfilter Stories! Let me have my social team repost it and heart/like/favorite it as many times as it takes for me to look like I “get” social. And after the conference is over, my InstaFaceSnapgrambookfilter feed will go back to auto posting the content fed to it by a team of people trying to make me seem human but not be controversial or get us sued.

Don’t take a selfie with 4,000 people in a hall. Meet those users. The ones that paid you. The ones that run your hardware even though your competitor is knocking on the door every week trying to get them to dump you. The users and customers that are supporting your efforts to cut your nose off to spite your face as you transform yourself into a software company. Or a cloud provider. Or an app company. Don’t pretend that the little people matter in a selfie that needs Super Troopers-style ENHANCE to find my shining freckles in the dark. Be a human and take a selfie with one user that has stuck by you through thick and thin. Make their day. Don’t make yours.

Disrupting Disruption

“We’re like the Uber of….”

No. You aren’t. If you are a part of the market, you aren’t disrupting it. You may be shifting your ideas or repositioning your strategies, but that’s not disruption. You still support your old equipment and software. You’re not prepared to jettison your existing business models to move somewhere new. A networking company building networking software isn’t disruption. A server company buying a networking startup isn’t disruption. It’s strategy.

Uber is the business school case study for disruption. Every keynote in the last two years has mentioned them. Except their disruption of the transportation market is far from total, and far from impressive. Sure, they are farming out taxi services. They’re also cutting rates to drive business and increase profits, which doesn’t help the drivers stuck with those new lower rates. They are bullying municipalities to get laws passed to protect them. They’re driving other companies out of business to reduce competition. Does that sound like the Disruptors of Taxis? Or does it sound like the very cab companies that are getting run out of business by the same tactics they themselves have used?

Don’t tell me how you’re disrupting digital or accelerating change. Tired cliches are tired. Tell me what you’re doing. Tell me how you’re going to head off your competitors. Tell me how you’re addressing a shrinking market for hardware or a growing landscape of people doing it faster, cheaper, and better. This is one of the things that I enjoy about being an analyst. These briefings are generally a little more focused on the concrete and less on the cheerleading, which is a very pleasant surprise to me given my distaste for professional analyst firms.

If you’re tempted to say that you’re the Uber of your industry, do us all a favor and request one to drive you off the stage.


Tom’s Take

Does my dislike of keynotes show yet? Are some of you sitting in your chairs cheering? Good. Because it’s all a show for you. It’s a hand-holding, happy-hugging reinforcement of how awesome we are. Outside of a few dynamic speakers (who are usually made CTO or VP of Technology), we don’t get the good stuff any more.

If you’re sitting in your chair and getting offended that I’m picking on your event, you should know two things. First, I’m not singling anyone out; EVERY keynote I’ve seen in the last two years is guilty of these things. Second, if you think yours is too, you’re probably right. Fix it. Transform and Disrupt your own keynote. Let storytellers talk. Cut down on the attempts to relate to people. Tell your story. Tell people why they should be excited again. Don’t use cliches. Or funny videos. Or cameraphones. Get back to the business of telling people why you’re in business. Ditch the Keystone Keynotes and I promise you’ll have happier audiences. Including me.

Apple Watch Unlock, 802.11ac, and Time


One of the benefits of upgrading to macOS 10.12 Sierra is the ability to unlock my Mac laptop with my Apple Watch. Yet I’m not able to do that. Why? Turns out, the answer involves some pretty cool tech.

Somebody’s Watching You

The tech specs list the 2013 MacBook and higher as the minimum model needed to enable Watch Unlock on your Mac. You also need a few other things, like Bluetooth enabled and a Watch running watchOS 3. I checked my personal MacBook against the original specs and found everything in order. I installed Sierra and updated all my other devices and even enabled iCloud two-factor authentication to be sure. Yet, when I checked the Security and Privacy section, I didn’t see the checkbox to enable Watch Unlock. What gives?

It turns out that Apple quietly modified the minimum specs during the Sierra beta period. Instead of early-2013 MacBooks being supported, support shifted to mid-2013 MacBooks instead. I checked the spec sheets and mine is almost identical. The RAM, drive, and other features are the same. Why does Watch Unlock work on those Macs and not mine? The answer, it appears, is wireless.

Now AC The Light

The mid-2013 MacBook introduced Apple’s first 802.11ac wireless chipset. That was the major reason to upgrade over the earlier models. The Airport Extreme also supported 11ac starting in mid-2013 to increase speeds to more than 500Mbps transfer rates, or Wave 1 speeds.

While the majority of the communication between the Apple Watch and your phone or your MacBook happens via Bluetooth, that’s not the only way it communicates. The Apple Watch has a built-in wireless radio as well. It’s a 2.4GHz b/g/n radio. Normally, the 11ac card in the MacBook can’t talk to the Watch directly because of the frequency mismatch. But the 11ac card in the mid-2013 MacBook enables a different protocol that is the basis for the unlocking feature.

802.11v has been used for a while as a fast-roaming feature for mobile devices. Support for it was spotty before the wider adoption of 802.11ac Wave 1 access points. 802.11v allows client devices to exchange information about network topology. 11v also allows clients to measure network latency by timing the arrival of packets. That means a client can ping an access point or another client and get a precise timestamp of the arrival of that packet. This can be used for a variety of things, most commonly location services.

Time Is On Your Side

The 802.11v timestamp has been proposed for use in “time of flight” calculations all the way back to 2008. Apple has decided to use Time of Flight as a security mechanism for the Watch Unlock feature. Rather than just assuming that the Watch is in range because it’s communicating over Bluetooth, Apple wanted to increase the security of the Watch/Mac connection. When the Mac detects that the Watch it’s connected to via Handoff is within 3 meters, the Watch is in the right range to trigger an unlock. This is where the 11ac card works its magic.

When the Watch sends a Bluetooth signal to trigger the unlock, the Mac sends an additional 802.11v request to the watch via wireless. This request is then timed for arrival. Since the Mac knows the watch has to be within 3 meters, the timestamp on the packet has a very tight tolerance for delay. If the delay is within the acceptable parameters, the Watch unlock request is approved and your Mac is unlocked. If there is more than the acceptable deviation, such as when used via a Bluetooth repeater or some other kind of nefarious mechanism, the unlock request will fail because the system realizes the Watch is outside the “safe” zone for unlocking the Mac.
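Here’s a rough back-of-the-envelope sketch of how a time-of-flight check like this could work. The 3-meter threshold comes from the feature described above; the timestamp handling and tolerances are my assumptions for illustration, since Apple hasn’t published the exact math:

```python
# Sketch of an 802.11v-style time-of-flight proximity check.
# The 3-meter "safe" zone comes from the unlock feature described
# above; the processing-delay term is an assumed simplification.

SPEED_OF_LIGHT = 299_792_458      # meters per second
MAX_DISTANCE_M = 3.0              # unlock "safe" zone

def within_unlock_range(t_sent_ns, t_received_ns, t_processing_ns=0):
    """Estimate one-way distance from a timed round trip."""
    round_trip_s = (t_received_ns - t_sent_ns - t_processing_ns) * 1e-9
    distance_m = SPEED_OF_LIGHT * round_trip_s / 2
    return distance_m <= MAX_DISTANCE_M

# 3 meters each way is only ~20 ns of round-trip flight time.
print(within_unlock_range(0, 20))    # True  -> unlock allowed
# A Bluetooth repeater adds orders of magnitude more delay.
print(within_unlock_range(0, 500))   # False -> outside the safe zone
```

The key insight is that radio waves travel at a fixed speed, so a relay attack can hide a lot of things, but it can’t hide added time.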

Why does the Mac require an 802.11ac card for 802.11v support? The simple answer is that the Broadcom BCM43xx card in the early-2013 MacBooks and before doesn’t support the 802.11v timestamp field (page 5). Without support for the timestamp field, the 802.11v Time of Flight packet won’t work. The newer 802.11ac-compliant Broadcom BCM43xx card in the mid-2013 MacBooks does support the timestamp field, thus allowing the security measure to work.


Tom’s Take

All cool tech needs a minimum supported level. No one could have guessed 3-4 years ago that Apple would need support for 802.11v timestamp fields in their laptop Airport cards. So when they finally implemented it in mid-2013 with the 802.11ac refresh, they created a support boundary for a feature on a device that was still in the early development stages. Am I disappointed that my Mac doesn’t support Watch Unlock? Yes. But I also understand why, now that I’ve done the research. Unforeseen consequences of adoption decisions really can reach far into the future. But the technology that Apple is building into their security platform is cool whether it’s supported on my devices or not.

DevOps and the Infrastructure Dumpster Fire


We had a rousing discussion about DevOps at Cloud Field Day this week. The delegates talked about how DevOps was totally a thing and it was the way to go. Being the infrastructure guy, I had to take a bit of umbrage at their conclusions and go on a bit of a crusade myself to defend infrastructure from the predations of developers.

Stable, Boy

DevOps folks want to talk about continuous integration and continuous deployment (CI/CD) all the time. They want the freedom to make changes as needed to increase bandwidth, provision ports, and rearrange things to fit development timelines and such. It’s great that they have their thoughts and feelings about how responsive the network should be to their whims, but the truth of infrastructure today is that it’s on the verge of collapse every day of the week.

Networking is often a “best effort” type of configuration. We monkey around with something until it works, then roll it into production and hope it holds. As we keep building patches on top of patches or implementing new features that require something to be disabled or bypassed, we create a house of cards that will fall at the first stiff wind. It’s far too easy to cause a network to fall over because of a change in a routing table or a series of bad decisions that aren’t enough to cause chaos individually but do so together.

Jason Nash (@TheJasonNash) said that DevOps is great because it means communication. Developers are no longer throwing things over the wall for Operations to deal with. The problem is that the boulder they were historically throwing over in the form of monolithic patches that caused downtime was replaced by the storm of arrows blotting out the sun. Each individual change isn’t enough to cause disaster, but three hundred of them together can cause massive issues.


Networks are rarely stable. Yes, routing tables are mostly stabilized so long as no one starts withdrawing routes. Layer 2 networks are stable only up to a certain size. The more complexity you pile on networks, the more fragile they become. The network really is only one fat-fingered VLAN definition or VTP server mode foul-up away from coming down around our ears. That’s not a system that can support massive automation and orchestration. Why?

The Utility of Stupid Networking

The network is a source of pain not because of finicky hardware, but because of applications and their developers. When software is written, we have to make it work. If that means reconfiguring the network to suit the application, so be it. Networking pros have been dealing with crap like this for decades. Want proof?

  1. Applications can’t talk to multiple gateways at a time on layer 2 networks. So let’s create a protocol to make two gateways operate as one, with a fake MAC address that answers requests and ensures uptime. That’s how we got HSRP (a rough sketch of the idea follows this list).
  2. Applications can’t survive having the IP address of the server changed. Instead of using any of the many better ideas, we created vMotion to keep a server on the same layer 2 network and change the MAC <-> IP binding. vMotion and the layer 2 DCI issues it has caused have kept networking in the dark for the last eight years.
  3. Applications shouldn’t need to be rewritten to run in the cloud. People want to port them as-is to save money. So cloud networking configurations are a nightmare, because we have to support protocols that shouldn’t even be used anymore for the sake of legacy application support.
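To put a little meat on that first item, here’s a toy Python sketch of what an HSRP-style first-hop redundancy protocol does on the application’s behalf: two routers pretend to be one gateway, and the standby quietly takes over the virtual IP and MAC when the active router dies. Hello timers, preemption, and the actual protocol machinery are all elided, and the addresses are examples:

```python
# Conceptual sketch of HSRP-style gateway redundancy. Two routers
# share one virtual IP/MAC; the highest-priority live router answers
# for it. This is an illustration, not the real protocol.

VIRTUAL_IP = "192.168.1.1"
VIRTUAL_MAC = "00:00:0c:07:ac:01"   # HSRP well-known MAC for group 1

class Router:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.alive = True            # "alive" stands in for hello packets

def active_router(routers):
    """The highest-priority live router owns the virtual IP/MAC."""
    return max((r for r in routers if r.alive), key=lambda r: r.priority)

r1, r2 = Router("r1", priority=110), Router("r2", priority=100)
print(active_router([r1, r2]).name, "answers for", VIRTUAL_IP)   # r1

r1.alive = False                     # r1 dies; hellos stop arriving
print(active_router([r1, r2]).name, "answers for", VIRTUAL_IP)   # r2
```

The application never notices the failover, which is exactly the point, and exactly why it never learned to handle failure itself.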

This list could go on, but all these examples point to one truth: application developers have relied upon the network to solve their problems for years. So the network is unstable because it’s being extended beyond its use case. Newer applications, like Netflix and Facebook, thrive in the cloud because they were written from the ground up to avoid layer 2 DCI, or even to operate at layer 2 beyond the minimum amount necessary. They solve tricky problems like multi-host coordination and failover in the app instead of relying on protocols from the golden age of networking to fix things quietly behind the scenes.
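Contrast the HSRP sketch above with the app-level version of the same resiliency. A cloud-native client doesn’t wait for the network to hide a dead server; it knows about multiple replicas and fails over itself. A minimal sketch, with invented hostnames and a stubbed-out transport:

```python
import random

# App-level failover: the client knows its replicas and retries,
# instead of asking the network to hide failures behind a virtual IP.
# Hostnames are invented; send() is a stand-in for a real HTTP/RPC call.

REPLICAS = ["app-1.example.com", "app-2.example.com", "app-3.example.com"]

def send(host, request):
    # Stub transport: pretend one replica is down.
    if host == "app-2.example.com":
        raise ConnectionError(f"{host} unreachable")
    return f"{host} handled {request!r}"

def call_service(request, replicas=REPLICAS):
    """Try replicas in random order; fail over in the app, not the network."""
    for host in random.sample(replicas, k=len(replicas)):
        try:
            return send(host, request)
        except ConnectionError:
            continue                 # dead replica? just try the next one
    raise RuntimeError("all replicas unavailable")

print(call_service("GET /feed"))
```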

The network needs to evolve past being a science project for protocols that aim to fix stupid application programming decisions. Instead, the network needs to evolve with an eye toward stability and reduced functionality to get there. Take away the ability to even try those stupid tricks, and what you’re left with is a utility that is a resource for your developers. They can use it for transport without worrying about it crashing every day from some bug in a protocol no one has used in five years but that was still installed just in case someone accidentally turned on an old server.

Nowhere is this more apparent than in cloud networking stacks like AWS or Microsoft Azure. There, the networking is as simple as possible. The burden of advanced functionality per group of users isn’t pushed into a realm where network admins need to risk outages to fix a busted application. Instead, app developers can use the networking resources in a basic way that encourages them to think about failover and resiliency in a new way. It’s a brave new world!


Tom’s Take

I’ll admit that DevOps has potential. It gets the teams talking and helps analyze build processes and create more agile applications. But in order for DevOps to work the way it should, it’s going to need a stable platform to launch from. That means networking has to get its act together and remove the unnecessary things that can cause bad interactions. This was caused in part by application developers taking the easy road and pushing their problems onto the networking team of wizards. When those wizards push back and offer reduced capabilities in exchange for more uptime and fewer issues, you should start to see app developers coming around to work with the infrastructure teams to get things done. And that is the best way to avoid an embarrassing situation that involves fire.

Cloud Apps And Pathways


Applications are king. Forget all the things you do to ensure proper routing in your data center. Forget the tweaks for OSPF sub-second failover or BGP optimal path selection. None of it matters to your users. If their login to Siebel or Salesforce or Netflix is slow today, you’ve failed. They are very vocal when it comes to telling you how much the network sucks today. How do we fix this?

Pathways Aren’t Perfect

The first problem is the cloud focus of applications. Once our packets leave our border routers, it’s a giant game of chance as to how things are going to work next. The routing protocol games that govern the Internet are tried and true and straight out of RFC 1771 (yes, RFC 4271 supersedes it). BGP is a great tool with general-purpose abilities. It’s becoming the choice for web-scale networks like LinkedIn’s and Facebook’s. But it’s problematic for Internet routing. It scales well but doesn’t have the ability to make rapid decisions.

The stability of BGP is also the reason why it doesn’t react well to changes. In the old days, links could go up and down quickly. BGP was designed to avoid issues with link flaps. But today’s links are less likely to flap and more likely to need traffic moved around because of congestion or other factors. The pace that applications need to move traffic flows means that they tend to fight BGP instead of being relieved that it’s not slinging their traffic across different links.

BGP can be a good source of path suggestions. That’s how Facebook uses it for global routing. But decisions need to be made on top of BGP much faster. That’s why cloud providers don’t rely on it beyond basic connectivity. Things like load balancers and other devices make up for this as best they can, but they are also points of failure in the network and have scalability limitations. So what can we do? How can we build something that can figure out how to make applications run better without replacing the entire routing infrastructure of the Internet?
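As a toy illustration of what making decisions “on top of BGP” might look like: let BGP supply the set of valid paths, then pick among them using live latency measurements instead of AS-path length alone. The paths and numbers below are invented, and a real controller would also need to damp changes so traffic doesn’t oscillate between links:

```python
# Toy sketch of overlay path selection on top of BGP. BGP hands us
# the valid paths; the overlay picks by measured latency instead of
# AS-path length. All values are invented for illustration.

bgp_paths = {
    "via-provider-a": {"as_path_len": 3, "measured_rtt_ms": 180},  # short but congested
    "via-provider-b": {"as_path_len": 5, "measured_rtt_ms": 45},   # longer but clear
}

def bgp_best(paths):
    """Simplified BGP tiebreak: shortest AS path wins."""
    return min(paths, key=lambda p: paths[p]["as_path_len"])

def latency_best(paths):
    """Overlay decision: route toward the lowest measured latency."""
    return min(paths, key=lambda p: paths[p]["measured_rtt_ms"])

print(bgp_best(bgp_paths))      # via-provider-a -- what plain BGP picks
print(latency_best(bgp_paths))  # via-provider-b -- what the overlay picks
```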

GPS For Routing

One of the things that has some potential for fixing inefficiency in BGP and other basic routing protocols was highlighted at Networking Field Day 12 during the presentation from Teridion. They have a method for creating more efficiency between endpoints thanks to their agents, as founder Elad Rave explained in their presentation.

I like the idea of getting “traffic conditions” from endpoints to avoid congestion. For users of cloud applications, those conditions are largely unknown. Even multipath routing confuses tried-and-true troubleshooting like traceroute. What needs to happen is a way to collect the data for congestion and other inputs and make faster decisions that aren’t beholden to the underlying routing structure.

Overlay networking has tried to do this for a while now. Build something that can take more than basic input and make decisions on that data. But overlays have issues with scaling, especially past the boundary of the enterprise network. Teridion has potential to help influence routing decisions in networks outside your control. Sadly, even the fastest enterprise network in the world is only as fast as an overloaded link between two level 3 interconnects on the way to a cloud application.

Teridion has the right idea here. Alternate pathways need to be identified and utilized. But that data needs to be evaluated and updated regularly. Much like the issues with Waze dumping traffic into residential neighborhoods when major arteries get congested, traffic monitors could cause overloads on alternate links if shifts happen unexpectedly.

The other reason why I like Teridion is that they are doing things without hardware boxes or the need to install software anywhere but the end host. Anyone working with cloud-based applications knows that the provider is very unlikely to offer anything outside of their standard services for you. And even if they do, there is going to be a huge price tag. More often than not, that feature request will eventually become a selling point for a new service that may be of marginal benefit until everyone starts using it. Then application performance goes down again. Since Teridion is optimizing communications between hosts, it’s a win for everyone.


Tom’s Take

I think Teridion is on to something here. Crowdsourcing is the best way to gather information about traffic. Giving packets a better destination with shorter travel times means better application performance. Better performance means happier users. Happier users means more time spent solving other problems that have symptoms that aren’t “It’s slow” or “Your network sucks”. And that makes everyone happier. Even grumpy old network engineers.

Disclaimer

Teridion was a presenter during Networking Field Day 12 in San Francisco, CA. As a participant in Networking Field Day 12, my travel and lodging expenses were covered by Tech Field Day for the duration of the event. Teridion did not ask for, nor were they promised, any kind of consideration in the writing of this post. My conclusions here represent my thoughts and opinions about them and are mine and mine alone.