
About networkingnerd

Tom Hollingsworth, CCIE #29213, is a former network engineer and current organizer for Tech Field Day. Tom has been in the IT industry since 2002, and has been a nerd since he first drew breath.

Is The Rise Of SD-WAN Thanks To Ethernet?


SD-WAN has exploded in the market. Everywhere I turn, I see companies touting their new strategy for reducing WAN complexity, encrypting data in flight, and even doing analytics on traffic to help build QoS policies and traffic shaping for critical links. The first demo I ever watched for SDN was a WAN routing demo that chose best paths based on cost and time-of-day. It was simple then, but that kind of thinking has exploded in the last 5 years. And it’s all thanks to our lovable old friend, Ethernet.

Those Old Serials

When I started in networking, my knowledge was pretty much limited to switches and other layer 2 devices. I plugged in the cables, and the things all worked. As I expanded up the OSI model, I started understanding how routers worked. I knew about moving packets between different layer 3 areas and how routers controlled broadcast storms. This was also around the time when layer 3 switching was becoming a big thing in the campus. How was I supposed to figure out when I should be using a big router with 2-3 interfaces versus a switch that had lots of interfaces and could route just as well?

The key for me was media types. Layer 3 switching worked very well as long as you were only connecting Ethernet cables to the device. Switches were purpose built for UTP cable connectivity. That works really well for campus networks with Cat 5/5e/6 cabling. Switched Virtual Interfaces (SVIs) can handle a large amount of the routing traffic.

For WAN connectivity, routers were a must, because only routers were modular in a way that accepted cards for different media types. When I started my journey on WAN connectivity, I was setting up T1 lines. Sometimes they terminated in an old-fashioned serial connector.


Those connectors attached to external CSU/DSU modules, which were a pain to configure and had multiple points of failure. Eventually, we moved up in the world to integrated CSU/DSU modules.


Those were really awesome because all the configuration was done on the interface. They also took regular UTP cables instead of those crazy V.35 monsters.


But those UTP cables weren’t Ethernet. They were still designed to be used as serial connections.

It wasn’t until the rise of MPLS circuits and Transparent LAN services that Ethernet became the dominant force in WAN connectivity. I can still remember turning up my first managed circuit and thinking, “You mean I can use both FastEthernet interfaces? No cards? Wow!”

Today, Ethernet dominates the landscape of connectivity. Serial WAN interfaces are relegated to backwater areas where you can’t get “real WAN connectivity”. And in most of those cases, the desire to use an old, slow serial circuit can be superseded by a 4G/LTE USB modem that can be purchased from almost any carrier. It would appear that serial has joined the same Heap of History as token ring, ARCnet, and other venerable connectivity options.

Rise, Ethernet

The ubiquity of Ethernet is a huge boon to SD-WAN vendors. They no longer have to create custom connectivity options for their appliances. They can provide 3-4 Ethernet interfaces and 2-3 USB slots and cover a wide range of options. This also allows them to simplify their board designs. No more modular chassis. No requirements for WIC slots, NM slots, or any of the other arcane terminology that Cisco WAN engineers are all too familiar with.

Ethernet makes sense for SD-WAN vendors because they aren’t concerned with media types. All their intelligence resides in the software running on the box. They’d rather focus on creating automatic certificate-based IPsec VPNs than figuring out the clock rate on a T1 line. Hardware is not their end goal. It is much easier to order a reference board from Intel and plug it into a box than to wrestle with a serial connector and build a custom integration.
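
To make that contrast concrete, here is a minimal sketch in Python using the netmiko library. Everything specific in it is invented for illustration: the router address, the credentials, and the interface numbering. The T1 block uses classic IOS-style commands where framing, line coding, and channelization all have to match the carrier’s provisioning; the Ethernet block is just an address.

```python
# A minimal sketch (hypothetical device and credentials) of the legacy T1
# ceremony that an Ethernet-only SD-WAN appliance never has to perform.
from netmiko import ConnectHandler

router = ConnectHandler(
    device_type="cisco_ios",  # any netmiko-supported IOS device
    host="10.0.0.1",          # hypothetical WAN router
    username="admin",
    password="secret",
)

# IOS-style T1 setup: every line has to agree with the carrier.
t1_config = [
    "controller T1 0/0/0",
    " framing esf",
    " linecode b8zs",
    " channel-group 0 timeslots 1-24",
    "interface Serial0/0/0:0",
    " encapsulation ppp",
    " ip address 192.0.2.1 255.255.255.252",
]

# The Ethernet equivalent: plug in, address, done.
eth_config = [
    "interface GigabitEthernet0/1",
    " ip address 192.0.2.5 255.255.255.252",
    " no shutdown",
]

print(router.send_config_set(t1_config))
print(router.send_config_set(eth_config))
router.disconnect()
```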

Even SD-WAN vendors that are chasing after the service provider market are benefiting from Ethernet ubiquity. Service providers may still run serial connections in their networks, but managing those interfaces at the customer side is a huge pain. They require specialized technical abilities and are expensive to manage and difficult to troubleshoot remotely. Putting Ethernet handoffs at the CPE side makes life much easier. It also makes it much easier to offer in-line service appliances, like those of SD-WAN vendors. It’s a good choice all around.

Serial connectivity isn’t going away any time soon. It serves an important purpose for high-speed connectivity where fiber isn’t an option. It’s also still a huge part of the install base for circuits, especially in rural areas or places where new WAN circuits aren’t easily run. Traditional routers with modular interfaces are still going to service a large number of customers. But Ethernet connectivity is growing so quickly that it will soon eclipse these legacy serial circuits. And the advantage for SD-WAN vendors can only grow with it.


Tom’s Take

Ethernet isn’t the only reason SD-WAN has succeeded. Ease of use, a huge feature set, and flexibility are the real reasons why SD-WAN has moved past the concept stage and into deployment. WAN optimization now has SD-WAN components. Service providers are looking to offer it as a value-added service. SD-WAN has won out on the merits of the technology. But the underlying hardware and connectivity were radically simplified in the last 5-7 years, allowing SD-WAN architects and designers to focus on the software side of things instead of the difficulties of building complicated serial interfaces. SD-WAN may not owe its entire existence to Ethernet, but it got a huge push in the right direction for sure.

HPE Networking: Past, Present, and Future


I had the chance to attend HPE Discover last week by invitation from their influencer team. I wanted to see how HPE Networking had been getting along since the acquisition of Aruba Networks last year. There have been some moves and changes, including a new partnership with Arista Networks announced in September. What follows is my analysis of HPE’s Networking portfolio after HPE Discover London and where they are headed in the future.

Campus and Data Center Divisions

Recently, HPE reorganized their networking division along two different lines. The first is the Aruba brand that contains all the wireless assets along with the campus networking portfolio. This is where the campus belongs. The edge of the network is an ever-changing area where connectivity is king. Reallocating the campus assets to the capable Aruba team means that they will do the most good there.

The rest of the data center networking assets were loaded into the Data Center Infrastructure Group (DCIG). This group is headed up by Dominick Wilde and contains things like FlexFabric and Altoline. The partnership with Arista rounds out the rest of the switch portfolio. This helps HPE position their offerings across a wide range of potential clients, from existing data center infrastructure to newer cloud-ready shops focusing on DevOps and rapid application development.

After hearing Dom Wilde speak to us about the networking portfolio goals, I think I can see where HPE is headed going forward.

The Past: HPE FlexFabric

As Dom Wilde said during our session, “I have a market for FlexFabric and can sell it for the next ten years.” FlexFabric represents traditional data center networking. There is a big market in existing infrastructure for customers that have made a huge investment in HPE in the past. Dom is absolutely right when he says the market for FlexFabric isn’t going to shrink in the foreseeable future. Even though the migration to the cloud is underway, there are a significant number of existing applications that will never be cloud ready.

FlexFabric represents the market segment that will persist on existing solutions until a rewrite of critical applications can be undertaken to get them moved to the cloud. Think of FlexFabric as the vaunted buggy whip manufacturer. They may be the last one left, but for the people that need their products they are the only option in town. DCIG may have eyes on the future, but that plan will be financed by FlexFabric.

The Present: HPE Altoline

Altoline is where HPE has been pouring their research for the past year. Altoline is a product line that benefits from the latest in software-defined and webscale technologies. It uses OpenSwitch as its operating system. HPE initially developed OpenSwitch as an open, vendor-neutral platform before turning it over to the Linux Foundation this summer to continue development with a variety of different partners.

Dom brought up a couple of great use cases for Altoline during our discussion that struck me as brilliant. One of them was using it as an out-of-band monitoring solution. These switches don’t need to be big or redundant. They need to have ports and a management interface. They don’t need complexity. They need simplicity. That’s where Altoline comes into play. It’s never going to be as complex as FlexFabric or as programmable as Arista. But it doesn’t have to be. In a workshop full of table saws and drill presses, Altoline is a basic screwdriver. It’s a tool you can count on to get the easy jobs done in a pinch.

The Future: Arista

The Arista partnership, according to Dom Wilde, is all about getting ready for the cloud. For those customers that are looking at moving workloads to the cloud or creating a hybrid environment, Arista is the perfect choice. All of Arista’s recent solution sets have been focused on providing high-speed, programmable networking that can integrate a number of development tools. EOS is the most extensible operating system on the market and is a favorite for developers. Positioning Arista at the top of the food chain is a great play for customers that don’t have a huge investment in cloud-ready networking right now.

The question that I keep coming back to is…when does this Arista partnership become an acquisition? There is significant integration between the two companies. Arista has essentially displaced the top of the line for HPE. How long will it take for Arista to make the partnership more permanent? I can easily foresee HPE making a play for the potential revenues produced by Arista and the help they provide in moving things to the cloud.


Tom’s Take

I was the only networking person at HPE Discover this year because the HPE networking story has been simplified quite a bit. On the one hand, you have the campus tied up with Aruba. They have their own story to tell in a different area early next year. On the other hand, you have the simplification of the portfolio with DCIG and the inclusion of the Arista partnership. I think that Altoline is going to find a niche for specific use cases but will never really take off as a separate platform. FlexFabric is in maintenance mode as far as development is concerned. It may get faster, but it isn’t likely to get smarter. Not that it really needs to. FlexFabric will support legacy architecture. The real path forward is Arista and all the flexibility it represents. The question is whether HPE will try to make Arista a business unit before Arista takes off and becomes too expensive to buy.

Disclaimer

I was an invited guest of HPE for HPE Discover London. They paid for my travel and lodging costs as well as covering event transportation and meals. They did not ask for nor were they promised any kind of consideration in the coverage provided here. The opinions and analysis contained in this article represent my thoughts alone.

OpenFlow Is Dead. Long Live OpenFlow.


Remember OpenFlow? The hammer that was set to solve all of our vaguely nail-like problems? Remember how everything was going to be based on OpenFlow going forward and the world was going to be a better place? Or how heretics like Ivan Pepelnjak (@IOSHints), who dared to ask questions about scalability or application value, were derided and laughed at? Yeah, good times. Today, I stand here to eulogize OpenFlow, but not to bury it. And perhaps find out that OpenFlow has a much happier life after death.

OpenFlow Is The Viagra Of Networking

OpenFlow is not that much different from sildenafil, the active ingredient in Viagra. Both were initially developed to solve a problem that they didn’t end up actually solving. In the case of sildenafil, it was high blood pressure. The “side effect” of increasing blood flow to a specific body part wasn’t even realized until after the trials of the drug. That side effect became the primary focus of the medication, which was eventually developed into a billion-dollar industry.

In the same way, OpenFlow failed at its stated mission of replacing the forwarding plane programming method of switches. As pointed out by folks like Ivan, it had huge scalability issues. It was a bit clunky when it came to handling flow programming. The race from the 1.0 spec to 1.3 finalization left vendors in the dust, but the freeze on 1.3 for the past few years has really hurt innovation. Objectively, the fact that almost no major shipping product uses OpenFlow as a forwarding paradigm should be evidence of its failure.

The side effect of OpenFlow is that it proved that networking could be done in software just as easily as it could be done in hardware. Things that we historically thought we needed ASICs and FPGAs for could be done by a software construct. OpenFlow proved the viability of Software Defined Networking in a way that no one else could. Yet, even as people abandoned it for faster protocols or rewrote their stacks to take advantage of other methods, OpenFlow still had a great number of uses.

OpenFlow Is a Garlic Press, Not A Hammer

OpenFlow isn’t really designed to solve every problem. It’s not a generic tool that can be used in a variety of situations. It has some very specific use cases that it excels at, though. Think of it more like a garlic press. It’s a tool built for one very specific job, and it does that job very well.

This video from Networking Field Day 13 is a great example of OpenFlow being used for a specific task. NEC’s flavor of OpenFlow, ProgrammableFlow, is used in conjunction with higher-layer services like firewalls and security appliances to mitigate the spread of infections. That’s a huge win for networking professionals. Think about how hard it would be to track down these systems in a network of thousands of devices. Even worse, with the virulence of modern malware, it doesn’t take long before the infected system has infected others. It’s not enough to shut down the payload. The infection behavior must be removed as well.

What NEC is showing is the ultimate way to stop this from happening. By interrogating the flows against a security policy, the flow entries can be removed from switches across the network or have deny entries written to prevent communications. Imagine being able to block a specific workstation from talking to anything on the network until it can be cleaned. And have that happen automatically without human interaction. What if a security service could get new malware or virus definitions and install those flow entries on the fly? Malware could be stopped before it became a problem.
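
As a rough sketch of the mechanics (my own illustration, not NEC’s ProgrammableFlow implementation), here is what that quarantine step can look like with OpenFlow 1.3 and the open source Ryu controller framework. The MAC address and priority are hypothetical, and the datapath object would come from a Ryu event handler.

```python
# A rough sketch of OpenFlow-based quarantine using the Ryu framework.
# Not NEC's implementation; the MAC and priority are hypothetical values.
def quarantine_host(datapath, mac="00:11:22:33:44:55"):
    ofp = datapath.ofproto          # OpenFlow 1.3 constants
    parser = datapath.ofproto_parser
    match = parser.OFPMatch(eth_src=mac)

    # Remove the host's existing flow entries from every table.
    delete = parser.OFPFlowMod(
        datapath=datapath,
        command=ofp.OFPFC_DELETE,
        table_id=ofp.OFPTT_ALL,
        out_port=ofp.OFPP_ANY,
        out_group=ofp.OFPG_ANY,
        match=match,
    )
    datapath.send_msg(delete)

    # Write a high-priority deny entry. An empty instruction list means
    # matching packets are dropped until the host is cleaned.
    deny = parser.OFPFlowMod(
        datapath=datapath,
        priority=1000,
        match=match,
        instructions=[],
    )
    datapath.send_msg(deny)
```

A security service that receives new malware definitions could call something like this for every switch in the network, which is exactly the kind of human-free reaction described above.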

This is where OpenFlow will be headed in the future. It’s no longer about adapting the problems to fit the protocol. We can’t keep trying to frame the problem around how much it resembles a nail just so we can use the hammer in our toolbox. Instead, OpenFlow will live on as a point protocol in a larger toolbox that can do a few things really well. That means we’ll use it when we need to and reach for a different tool when the problem we’re actually trying to solve calls for one. That will ensure that the best tool is used for the right job in every case.


Tom’s Take

OpenFlow is still useful. Look at what Coho Data is using it for. Or NEC. Or any one of a number of companies that are still developing on it. But the fact that only a handful of companies are still putting significant investment and time into the protocol should tell you what the larger industry thinks. They believe that OpenFlow is a dead end that can’t magically solve the problems they have with their systems. So they’ve moved to a different hammer to bang away with. I think that OpenFlow is going to live a very happy life now that people are leaving it to solve the problems it’s good at solving. Maybe one day we’ll look back on the first life of OpenFlow not as a failure, but as the end of the beginning, when it started to become what it was always meant to be.

Nutanix and Plexxi – An Affinity to Converge


Nutanix has been lighting the hyperconverged world on fire as of late. Strong sales led to a big IPO for their stock. They are in a lot of conversations about using their solution in place of large traditional virtualization offerings that include things like blade servers or big boxes. And even coming off the recent Nutanix .NEXT conference there were some big announcements in the networking arena to help them complete their total solution. However, I think Nutanix is missing a big opportunity that’s right in front of them.

I think it’s time for Nutanix to buy Plexxi.

Software Says

If you look at the Nutanix announcements around networking from .NEXT, they look very familiar to anyone in the server space. The highlights include service chaining, microsegmentation, and monitoring all accessible through an API. If this sounds an awful lot like VMware NSX, Cisco ACI, or any one of a number of new networking companies then you are in the right mode of thinking as far as Nutanix is concerned.

SDN in the server space is all about overlay networking. Flow segmentation and service chaining exist because security is so hard to do in the networking space today. Trying to get traffic to behave in a certain way drives networking professionals nuts. Monitoring all of that to ensure that you’re actually doing what you say you’re doing just adds complexity. And the API is the way to do all of that without having to walk down to the data center to console into a switch and learn a new non-Linux CLI command set.

SDN vendors like VMware and Cisco naturally jumped on these complaints and difficulties in the networking world, and both have offered solutions for them with their products. For Nutanix to have bundled solutions like this into their networking offering is no accident. They are looking to battle VMware head-to-head and need to offer the kind of feature parity that it’s going to take to make medium to large shops shift their focus away from the VMware ecosystem and take a long look at what Nutanix is offering.

In a way, Nutanix and VMware are starting to reinforce the idea that the network isn’t a magical realm of protocols and tricks that make applications work. Instead, it’s a simple transport layer between locations. For instance, Amazon doesn’t rely on the magic of the interstate system to get your packages from the distribution center to your home. Instead, the interstate system is just a transport layer for their shipping overlays – UPS, FedEx, and so on. The overlay is where the real magic is happening.

Nutanix doesn’t care what your network looks like. They can do almost everything on top of it with their overlay protocols. That would seem to suggest that the focus going forward should be to marginalize or outright ignore the lower layers of the network in favor of something that Nutanix has visibility into and can offer control and monitoring of. That’s where the Plexxi play comes into focus.


Affinity for Awesome

Plexxi has long been a company in search of a way to sell what they do best. When I first saw them years ago, they were touting their Affinities idea as a way to build fast pathways between endpoints to provide better performance for applications that naturally talked to each other. This was a great idea back then. But it quickly got overshadowed by the other SDN solutions out there. It even caused Plexxi to go down a slightly different path for a while, looking at other options to compete in a market where they didn’t really have a perfect-fit product.

But the Affinities idea is perfect for hyperconverged solutions. Companies like Nutanix are marketing their solutions as the way to create application-focused compute nodes on-site without the need to mess with the cloud. It’s a scalable solution that will eventually lead to having multiple nodes in the future as your needs expand. Hyperconverged was designed to be consumable per compute unit as opposed to massively scaling out in leaps and bounds.

Plexxi Affinities is just the tip of the iceberg. Plexxi’s networking connectivity also gives Nutanix the ability to build out a high-speed interconnect network with one advantage – noninterference. I’m speaking about what happens when a customer needs to add more networking ports to support this architecture. They need to make a call to their Networking Vendor of Choice. In the case of Cisco, HPE, or others, that call will often involve a conversation about what they’re doing with the new network followed by a sales pitch for their hyperconverged solution or a partner solution that benefits both companies. Nutanix has a reputation for being the disruptor in traditional IT. The more they can keep their traditional competitors out of the conversation, the more likely they are to keep the business into the future.


Tom’s Take

Plexxi is very much a company with an interesting solution in need of a friend. They aren’t big enough to really partner with hyperconverged solutions, and most of the hyperconverged market at this point is either cozy with someone else or not looking to make big purchases. Nutanix has the rebel mentality. They move fast and strike quickly to get their deals done. They don’t take prisoners. They look to make a splash and get people talking. The best way to keep that up is to bundle a real non-software networking component alongside a solution that will make the application owners happy and keep the conversation focused on a single source. That’s how Cisco did it back in the day and how VMware has climbed to the top of the virtualization market.

If Nutanix were to spend some of that nice IPO money on a Plexxi Christmas present, I think 2017 would be the year that Nutanix stops being discussed in hushed whispers and becomes a real force to be reckoned with up and down the stack.

Visibility In Networking – Quick Thoughts from Networking Field Day


I’m at Networking Field Day 13 this week. You can imagine how much fun I’m having with my friends! I wanted to drop some quick thoughts on you all about what we’re hearing on the subject of visibility this week and raise some interesting questions.

I Can See Clearly Now

Visibility is a huge issue for companies. Seeing what’s going on is hard for people. Companies like Ixia talk about the need to avoid dropping any packets to make sure we have complete knowledge of the network. But that requires a huge amount of hardware and design. You’re always going to need traditional monitoring even when everything is using telemetry and other data models. Make sure you size things right.

Forward Networks told us that there is an increasing call for finding a way to monitor both the underlay network and the overlay network. Most overlay companies give you a way to tie into their system via API or other telemetry. However, there is no visibility into the underlay because of the event horizon. Likewise, companies like Forward Networks are focusing on the underlay with mapping technologies and modeling software but they can’t pass back through the event horizon to see into the overlay. Whoever ends up finding a way to marry both of these together is going to make a lot of money.

Apstra is taking the track of not caring what the underlay looks like. They’re going to give you the tools to manage it all without hard setup. You can rip and replace switches as needed with multivendor support. That’s a huge win if you run a heterogeneous network or you’re looking to start replacing traditional hardware with white or bright box options. Likewise, their ability to pull configs can help you visualize your device setup more effectively no matter what’s under there.


Tom’s Take

I’ve got some more Networking Field Day thoughts coming soon, but I wanted to get some thoughts out there for you to think about this weekend. Stay tuned for some new ideas coming out of the event!

How To Ask A Question At A Conference


The last time you went to a conference, did you ask any questions? Were you curious about a technology and wanted to know more? Was there something that you didn’t quite get and needed an explanation? Congratulations. You’re in a quiet group of people that ask questions for knowledge. More and more, we are seeing questions becoming a vehicle for more than just knowledge acquisition. If you want to learn how to ask a proper question at a conference, read on.

1. Have A Question

I know it goes without saying, but if you’re going to raise your hand at a conference to ask a question, you should actually have a question in mind. Some people grab a microphone without thinking through what they’re going to say. This leads to stammering and broken thoughts that usually culminate in a random question mark here or there. This makes it difficult for the speaker to figure out what you’re trying to ask.

If you’re going to raise your hand, jot some notes down first. Bullet points help as does making a note or two. This is especially true if the speaker is answering questions before yours. If they answer part of your question before you get to ask it, you may have to reframe your thoughts. It never hurts to have an idea of what you’re going to say before you say it.

2. Look For Knowledge, Not To Make A Statement

The other side of the coin from the above recommendation of actually having a question is to make sure that what you’re asking is actually a question and not a statement. A great example of this is a video from Scott Bradner during a recent ONUG meeting:

I’m sure Scott has seen his fair share of statements masquerading as questions during his time. And I can’t disagree with him. Far too often, people asking questions aren’t really asking to get information. Instead, they are trying to make a point about why they think they are right or why they disagree with the speaker. The question stops being a question and becomes a soliloquy or a soapbox. The most egregious offenders will usually end this rant with an actual question along the lines of, “So, what do you think of my opinion?”

Please, at all costs, avoid this behavior. This is singularly the most annoying thing a speaker has to deal with. It’s one thing to be questioned on your material, but it’s something else entirely to have to shift your thinking to someone else’s viewpoint while on stage. If you have a point you’d like to bring up with the speaker that is contrary to their thought process, you should do it after the presentation without people watching. Have a discussion and express opinions there. Don’t grandstand in front of the crowd just to satisfy your ego.


3. Make Sure Your Question Wasn’t Already Answered

This one’s a bit tougher. If you’re sitting in a session and you have a question, it’s important to make sure it wasn’t already asked and answered beforehand. This can be tougher if you have to duck out to take a call or you miss a section of the presentation. In these cases, you can ask for clarification or additional information but it would be better to ask after the session. Audiences tend to get a bit irritated if someone asks a question that was previously answered or that was covered earlier.

This one is probably the most forgivable of the question faux pas. People at conferences know that ducking out to deal with things is more common now. But if you are going to ask a question because you missed something, please make sure to say so when you ask. That helps everyone get the frame of reference for why you’re asking it. It will keep the audience on your side and less likely to boo you.


Tom’s Take

I ask lots of questions. I also answer them. And nothing irritates me more than having to deal with someone making a point during Q&A to try to make themselves look smarter than me. I get it. I have a hatred of keynotes and other speeches with no ability to get feedback. But at the same time, as Scott Bradner says above, the focus of the presentation is on the people presenting. It’s about the people doing the work and sharing the ideas. If you want to use Q&A time to pontificate about your position, then you need to volunteer to be a speaker.

Setting Sail on Secret Seas with Trireme


Container networking is a tough challenge to solve. Building virtual networks that allow inter-container communications is difficult on its own. But ensuring security at the same time is enough to make you pull your hair out. Lots of companies are taking a crack at it, as demonstrated recently by microsegmentation offerings from Cisco, VMware NSX, and many others. But a new development on this front set sail today. And the captain is an old friend.

Sailing the Security Sea

Dimitri Stiliadis did some great things in his time at Nuage Networks. He created a great overlay network solution that not only worked well for software defined systems but also extended into the container world as more and more people started investigating containers as the new way to provide application services. He saw many people rushing into this area with their existing solutions as well as building new ones. However, those solutions were all based on existing technology and methods that didn’t work well in the container world. If you ever heard someone say, “Oh, containers are just lightweight VMs…” you know what kind of thinking I’m talking about.

Late last year, Dimitri got together with some of his friends to build a new security solution for containers. He founded Aporeto, which comes from the Greek for “confidential”. And that really informs the whole idea of what they are trying to build. Container communications should be easy to secure. All the right pieces are in place. What’s missing is a way to do it easily and scale it up quickly. This is where current solutions miss the point by leaning on old ideas and constructs.

Enter Trireme. This project, an open source version of the technology Aporeto is working on, was released yesterday to help container admins understand why securing communications between containers is critical and yet simple to do. I got a special briefing from Dimitri yesterday, and once he helped me understand it I immediately saw the power of what they’ve done.

In The Same Boat

Trireme works by doing something very simple. Every container has a certificate that is generated at creation, which allows it to be verified for consistency and other things. Trireme uses a TCP Authorization Proxy to grab the digital identity of the container and insert it into the TCP SYN setup messages. Now, the receiving container will know who the sender is, because the confirmed identity of the sender is encoded in the setup message. If the sender is authorized to talk to the receiver, the communication can be set up. Otherwise, the connection is dropped.

This is one of the “so simple I can’t believe I missed it” moments. If there is already a secure identity set up for the container, it should be used. And adding that information to the TCP setup ensures that we don’t take for granted that containers with similar attributes are allowed to talk to each other just because they are on the same network. This truly is microsegmentation with the addition of identity protection. Even if you spin up a new container with identical attributes, it won’t have the same digital identity as the previous container, which means it will need to be authorized all over again.
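
Here’s a conceptual sketch of that signed-identity-in-the-handshake idea in Python, assuming a shared signing key and JSON-encoded container attributes. Trireme’s real wire format and key handling are its own; this only illustrates the concept.

```python
# Conceptual sketch of identity-in-the-handshake. This is NOT Trireme's
# actual wire format; the key and attributes are invented for illustration.
import hashlib
import hmac
import json
from typing import Optional

SIGNING_KEY = b"demo-key"  # hypothetical; real systems distribute real keys

def identity_token(attributes: dict) -> bytes:
    """Sender side: encode and sign container attributes for the SYN."""
    payload = json.dumps(attributes, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + sig

def verify_token(token: bytes) -> Optional[dict]:
    """Receiver side: confirm the identity before finishing the handshake."""
    payload, _, sig = token.rpartition(b".")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(expected, sig):
        return None  # identity can't be confirmed: drop the connection
    return json.loads(payload)

# The sender attaches identity_token({"app": "web", "env": "dev"}) during
# TCP setup; the receiver only completes the handshake if verify_token()
# returns attributes that the policy authorizes.
```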

Right now the security model is simple. If the attributes of the containers match, they are allowed to talk. You can set up some different labels and try it yourself. But with the power behind using Kubernetes as the management platform, you can extend this metaphor quite a bit. Imagine being able to create a policy that allows containers with the “dev” label to communicate if and only if they have the “shared” label as well. Or making sure that “dev” containers can never talk to “prod” containers for any reason, even if they are on the same network. It’s an extension of a lot of things already being looked at in the container world, but it has the benefit of built-in identity confirmation as well as scalability.
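
As a sketch of that label logic (my own illustration, not Trireme’s actual policy engine), the “dev”, “shared”, and “prod” examples above might look like this:

```python
# Illustrative label policy check using the examples above; this is not
# Trireme's policy engine, just the shape of the logic.
def allowed(src_labels: set, dst_labels: set) -> bool:
    # "dev" containers never talk to "prod" containers, even on one network.
    if "dev" in src_labels and "prod" in dst_labels:
        return False
    if "prod" in src_labels and "dev" in dst_labels:
        return False
    # "dev" containers may talk to each other only if both carry "shared".
    if "dev" in src_labels and "dev" in dst_labels:
        return "shared" in src_labels and "shared" in dst_labels
    # Default, per the simple model: require at least one matching label.
    return bool(src_labels & dst_labels)

assert allowed({"dev", "shared"}, {"dev", "shared"})
assert not allowed({"dev"}, {"dev", "shared"})
assert not allowed({"dev", "shared"}, {"prod"})
```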

How does Trireme scale? Well, it’s not running a central controller or database of any kind. Instead, the heavy lifting is done by a local process on the container host, and that’s how Trireme can scale. There’s no dependency on a central process or device failing and leaving everyone stranded, and no need to communicate with anything other than the local container host. Kubernetes has the infrastructure to push the policy changes down to processes in the container, which are then checked by the Trireme process. That means Trireme never has to leave the local container to make decisions. Everything that is needed is right on deck.


Tom’s Take

It took me a bit to understand what Dimitri and his group are trying to do with Trireme and later with their Aporeto solution. Creating digital signatures and signing communications between containers is going to be a huge leap forward for security. If all communications are secured by default then security becomes the kind of afterthought that we need.

The other thing that Aporeto illustrates to me is the need for containers to be isolated processes, not heavy VMs. By creating a process boundary per container, Trireme and other solutions can help keep things as close to completely secure as possible. Lowering the attack surface of a construct down to the process level is making it a tiny target in a big ocean.

Designer or Architect? It’s A Matter Of Choice


I had a great time at ONUG this past week. I got to hear a lot of great presentations from some great people, and I got a chance to catch up with some friends as well. One of those was Pete Lumbis (@PeteCCDE) who had a great presentation this past spring at Interop. We talked a lot about tech and networking, but one topic he brought up that made me stop and think for a moment was the wide gulf between design and architecture.

Binary Designers

Design is a critical part of an IT project. Things must fit and make sense before the implementors can figure out how to put the pieces together. Design is all about building a list of products and describing how they’ll interact once turned on. Proper design requires you to step away from the keyboard for a moment and think about a bigger picture than just hacking CLI commands or Python code to make some lights start blinking in the right order.

But design is inherently limited. Think about the last design you did, whether it be wireless or networking or even storage. When you start a design, you automatically make assumptions about what’s going on in the scenario. Perhaps they want to expand their near-line storage capacity. That brings a set of products into play that you choose from. But what if the goal is something different? What if they want a fast caching tier? What if the goal is to create a new pod for object storage?

All of these scenarios are broad enough to require a designer to come up with a good mix of products to fulfill the goals of the project. But the designer has already had assumptions put down for them: The scope and the requirements are pre-determined for them before they ever start thinking about the technology that will be involved in the setup.

Design is all about choices. You have to choose the right product to meet the goals. Once you know the product category, you have to make the right choices about which set of products to use. The orange ones or the blue ones? The cheap ones or the expensive ones? Design is about making good choices so implementers can focus on making those choices work.

Visionary Architects

Architecture, on the other hand, has very little to do with choice. Architects are idea people. They look at a problem faced by an organization and try to narrow the focus of the issue to make the designer’s choices easier. Architects don’t worry about individual products or even minor solution sets. They focus on technology areas.

Think back to our storage problem from above. How did the designer arrive at the near-line storage decision? Or the object storage idea? It’s because an architect is the one driving those ideas from a higher level. Architects may not know how to build an object storage bill of materials or how to assemble a chassis switch but they do know what those are used for. Architects instead know that you should be using flash storage in lower density, faster reaction systems when cost is sensitive. They know that a rack may only need a 1U ToR switch instead of a chassis if that ToR switch doesn’t have to provide power or advanced features. They won’t know the specific part number, but they know the technology.

Architects have vision. Designers know products. They need each other to make solutions work and designs happen. The same person can fulfill both roles provided they understand how things break down in the end. A designer-architect needs to know that the solutions to customer problems should come before any decisions are made about products. Too often, we find ourselves cornered in a mess because the product mix was decided before the solution was determined.

It’s like trying to bake a cake when all you have in the house is flour, eggs, and Swiss cheese. Maybe a cake isn’t what you should be making. The architect would realize that the problem is a limited set of ingredients. Instead of deciding on a cake, the architect can work with the designer to find a solution to the problem of cooking with limited ingredients. Perhaps the designer realizes what’s needed is a soufflé instead. The team figures out the problem and then the best design, instead of deciding on a design before knowing what the problem is.


Tom’s Take

I was a designer in my past life at a VAR. I still had to implement my designs at the end of the day, but I was the one making the decisions about the products that were needed to meet the solutions my customers had to have. Now, at Tech Field Day, I understand the technology at an architecture level. I know why you need this solution for that problem. My ability to hack CLI has gone down a bit, but my understanding of the bigger picture has increased several times over. I now think I have a better idea of what it takes to make tech work the right way and get implemented more easily: the architect’s vision frames the problems so that the designers can make the right choices.

Facebook Wedge 100 – The Future of the Data Center?

 


Facebook is back in the news again. This time, it’s because of the release of their new Wedge 100 switch into the Open Compute Project (OCP). Wedge was already making headlines when Facebook announced it two years ago. A fast, open sourced 40Gig Top-of-Rack (ToR) switch was huge. Now, Facebook is letting everyone in on the fun of a faster Wedge that has been deployed into production at Facebook data centers as well as being offered for sale through Edgecore Networks, which is itself a division of Accton. Accton has been leading the way in the whitebox switching market and Wedge 100 may be one of the ways it climbs to the top.

Holy Hardware!

Wedge 100 is pretty impressive from the spec sheet. They paid special attention to making sure the modules were expandable, especially for faster CPUs and special purpose devices down the road. That’s possible because Wedge is a highly specialized micro server already. Rather than rearchitecting the guts of the whole thing, Facebook kept the CPU and the monitoring stack and just put newer, faster modules on it to ramp to 32x100Gig connectivity.


As you might suspect, Facebook is using Broadcom Tomahawk as the base connectivity in their switch, which isn’t surprising. Tomahawk is the roadmap for all vendors to get to 100Gig. It also means that the downlink connectivity for these switches could conceivably work in 25/50Gig increments. However, given the enormous amount of east/west traffic that Facebook must generate, Facebook has created a server platform they call Yosemite that has 100Gig links as well. Given the backplane that probably sits behind that, you can imagine the data that’s getting thrown around the data centers.

That’s not all. Omar Baldonado has said that they are looking at going to 400Gig connectivity soon. That’s the kind of mind-blowing speed that you see in places like Google and Facebook. Remember that this hardware is built for a specific purpose. They don’t just have elephant flows. They have flows the size of an elephant herd. That’s why they fret about the operating temperature of optics or the rack design they want to use (standard versus Open Racks). Every little change matters a thousandfold at that scale.

Software For The People

The other exciting announcement from Facebook was on the software front. Of course, FBOSS has been updated to work with Wedge 100. I found it very interesting in the press release that much of the programming in FBOSS went into interoperability with Wedge 40 and with fixing the hardware side of things. This makes some sense when you realize that Facebook didn’t need to spend a lot of time making Wedge 40 interoperate with anything, since it was a wholesale replacement. But Wedge 100 would need to coexist with Wedge 40 as the rollout happens, so making everything play nice is a huge point on the checklist.

The other software announcement that got the community talking was support for third-party operating systems running on Wedge 100. The first one up was Open Network Linux from Big Switch Networks. ONL ran on the original Wedge 40 and now runs on the Wedge 100. This means that if you’re familiar with running BSN OSes on your devices, you can drop in a Wedge 100 in your spine or fabric and be ready to go.

The second exciting announcement about software comes from a new company, Apstra. Apstra announced their entry into OCP and their intent to get their Apstra Operating System (AOS) running on Wedge 100 by next year. That has a big potential impact for Apstra customers that want to deploy these switches down the road. I hope to hear more about this from Apstra during their presentation at Networking Field Day 13 next month.


Tom’s Take

Facebook is blazing a trail for fast ToR switches. They’ve got the technical chops to build what they need and release the designs to the rest of the world to be used for a variety of ideas. Granted, your data center looks nothing like Facebook’s. But the ideas they are pioneering are having an impact down the line. If Open Rack catches on, you may see different ideas in data center standardization. If the Six Pack catches on as a new chassis concept, it’s going to change spines as well.

If you want to get your hands dirty with Wedge, build a new 100Gig pod and buy one from Edgecore. The downlinks can break out into 10Gig and 25Gig links for servers and knowing it can run ONL or Apstra AOS (eventually) gives you some familiar ground to start from. If it runs as fast as they say it does, it may be a better investment right now than waiting for Tomahawk II to come to your favorite vendor.

 

 

Tomahawk II – Performance Over Programmability


Broadcom announced a new addition to their growing family of merchant silicon today. The new Broadcom Tomahawk II is a monster. It doubles the speed of its first-generation predecessor. It has 6.4 Tbps of aggregate throughput, divided up into 256 25Gbps ports that can be combined into 128 50Gbps or even 64 100Gbps ports. That’s fast no matter how you slice it.
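
The arithmetic is easy to sanity check. Here is a trivial Python sketch that just restates the announced numbers:

```python
# Sanity check on the Tomahawk II numbers from the announcement:
# every port combination adds up to the same 6.4 Tbps aggregate.
port_options = {25: 256, 50: 128, 100: 64}  # speed in Gbps -> port count
for speed, count in port_options.items():
    assert speed * count == 6400
    print(f"{count} x {speed}Gbps = {speed * count / 1000:.1f} Tbps")
```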

Broadcom is aiming to push these switches into niches like High-Performance Computing (HPC) and massive data centers doing big data/analytics or video processing to start. The use cases for 25/50Gbps haven’t really changed. What Broadcom is delivering now is port density. I fully expect to see top-of-rack (ToR) switches running 25Gbps down to the servers with new add-in cards connected to 50Gbps uplinks that deliver them to the massive new Tomahawk II switches running in the spine or end-of-row (EoR) configuration for east-west traffic distribution.

Another curious fact about the Tomahawk II is the complete lack of 40Gbps support. Granted, that support was only paid lip service in the Tomahawk I. The real focus was on shifting to 25/50Gbps instead of the weird 10/40/100Gbps split we had in Trident II. I talked about this a couple of years ago and wasn’t very high on it back then, but I didn’t know the level of apathy people had for 40Gbps uplinks. The push to 25/50Gbps has only been held up so far by the lack of availability of new NICs to enable faster speeds in servers. Now that those are starting to be produced in volume, expect 40Gbps uplinks to become a relic of the past.

A Foot In The Door

Not everyone is entirely happy about the new Broadcom Tomahawk II. I received an email today with a quote from Martin Izzard of Barefoot Networks, discussing their new Tofino platform. He said in part:

Barefoot led the way in June with the introduction of Tofino, the world’s first fully programmable switches, which also happen to be the fastest switches ever built.

It’s true that Tofino is very fast. It was the first 6.4 Tbps switch on the market. I talked a bit about it a few months ago. But I think that Barefoot is a bit off on its assessment here and has a bit of an axe to grind.

Barefoot is pushing something special with Tofino. They are looking to create a super fast platform with programmability. Tofino is not quite an FPGA and it’s not a traditional ASIC. It’s a switch stripped to its core and rebuilt around P4, a language all its own. That’s great if you’re a dev shop or a niche market that has to squeeze every ounce of performance out of a switch. In the world of cars, the best analogy would be looking at Tofino like a specialized sports car such as a Koenigsegg Agera. It’s very fast and very stylish, but it’s purpose built to do one thing – drive really fast on pavement and carry two passengers.

Broadcom doesn’t really care about development shops. They don’t worry about niche markets, because those users are not their customers. Their customers are Arista, Cisco, Brocade, Juniper, and others. Broadcom really is the Intel of the switching world. Their platforms power vendor offerings. Buying a bare Tomahawk II isn’t something you’re going to be able to do. Broadcom will only sell these in huge lots to companies that are building something with them. To keep the car analogy, Tomahawk II is more like the old F-body platform produced by GM that went on to become Camaros, Firebirds, and Trans Ams. Each of those cars was distinctive and had its fans, but the chassis was the same underneath the skin.

Broadcom wants everyone to buy their silicon and use it to power the next generation of switches. Barefoot wants a specialist kit that is faster than anything else on the market, provided you’re willing to put the time into learning P4 and stripping out all the bits they feel are unnecessary. Your use case determines your hardware. That hasn’t changed, nor is it likely to change any time soon.


Tom’s Take

The data center will be 25/50/100Gbps top to bottom when the next switch refresh happens. It could even be there sooner if you want to move to a pod-based architecture instead of more traditional designs. The odds are very good that you’re going to be running Tomahawk or Tomahawk II depending on which vendor you buy from. You’re probably only going to be running something special like Tofino, or maybe even Cavium, if you’ve got a specific workload or architecture that needs the performance or programmability.

Don’t wait for the next round of hardware to come out before you have an upgrade plan. Write it now. Think about where you want to be in 4 years. Now double your requirements. Start investigating. Ask your vendor of choice what their plans are. If their plans stink, ask their competitor. Get quotes. Get ideas. Be ready for the meeting when it’s scheduled. Make sure you’re ready to work with your management to bury the hatchet, not end up with a hatchet job of a network.