I’ve always been a science nerd, especially when it comes to astronomy, and stellar objects have always fascinated me. The most fascinating has to be the black hole: a region of space with intense gravity, formed from the collapse of a stellar body. I’ve read all about the peculiar properties of classical black holes. Of particular interest to the networking field is the idea of the event horizon.
An event horizon is a boundary beyond which events no longer affect observers. In layman’s terms, it is the point of no return for things falling into a black hole. Anything that falls below the event horizon disappears from the perspective of the observer. From the point of view of someone falling into the black hole, they never reach the event horizon yet are unable to contact the outside world. The event horizon marks the point at which information disappears in a system.
How does this apply to the networking world? Well, every system has a visibility boundary. We tend to summarize information heading in both directions. To a network engineer, everything above the transport layer of the OSI model doesn’t really matter. Those are application decisions made by programmers that don’t affect the system other than to be a drain on resources. To the programmers and admins, anything below the session layer of the OSI model is of little importance. As long as a call can be made to utilize network resources, who cares what’s going on down there?
Software Defined Networking (SDN) vendors are enforcing these event horizons. VMware NSX and the Microsoft Hyper-V virtual networking solution both function in a similar manner. They both create overlay networks that virtualize resources below the level of the host. Tunnels are created between devices or systems that ride on top of the physical network beneath. This means that the overlay can function no matter the state of the underlay, provided a path between the two hosts exists. However, it also means that the overlay obscures the details of the physical network.
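The tunneling idea can be sketched roughly as follows: the overlay wraps an inner (virtual network) frame in an outer header addressed between tunnel endpoints, so the physical underlay only ever forwards on the outer addresses. The names here (`Packet`, `Encapsulated`, the VXLAN-style 24-bit VNI) are illustrative assumptions, not either vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    src: str        # inner (virtual) source address, e.g. a VM
    dst: str        # inner (virtual) destination address
    payload: bytes

@dataclass
class Encapsulated:
    outer_src: str  # physical tunnel endpoint on the sending host
    outer_dst: str  # physical tunnel endpoint on the receiving host
    vni: int        # virtual network identifier (24-bit, VXLAN-style)
    inner: Packet

def encapsulate(pkt: Packet, tep_src: str, tep_dst: str, vni: int) -> Encapsulated:
    """Wrap an overlay packet; the underlay routes only on the outer header."""
    assert 0 <= vni < 2**24, "VNI is a 24-bit field"
    return Encapsulated(tep_src, tep_dst, vni, pkt)

# The underlay never inspects `inner` -- exactly the information
# boundary (the event horizon) described above.
frame = encapsulate(Packet("vm-a", "vm-b", b"hello"), "10.0.0.1", "10.0.0.2", 5001)
```

The point of the sketch is the asymmetry: as long as a path exists between `10.0.0.1` and `10.0.0.2`, the overlay works, but nothing about the underlay’s state is visible inside the tunnel.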
Many would call this abstraction. Why should the hosts care about the network state? All they really want is a path to reach a destination. It’s better to give them what they want and leave the gory details to routing protocols and layer 2 loop avoidance mechanisms. But that abstraction becomes an event horizon when the overlay is unwilling (or unable) to process information from the underlay network.
Applications and hosts should be aware enough to listen to network conditions. Overlays should not rely on OSPF or BGP to make tunnel endpoint rerouting decisions. Putting undue strain on network processing is part of what has led to the situation we have now, where network operating systems need to be complex and intensive to calculate solutions to problems that could be better solved at a higher level.
If the network reports a traffic condition, like a failed link or a congested WAN circuit, that information should be able to flow back up to the overlay and act as a data point to trigger an alternate solution or path. Breaking the event horizon for information flowing back up toward the overlay is crucial to allow the complex network constructs we’ve created, such as fabrics, to utilize the best possible solutions for application traffic.
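One way to picture breaking that horizon is an event flow from underlay to overlay: the underlay publishes link-state changes, and the overlay subscribes and recomputes its tunnel path when a link fails. This is a toy sketch under invented names; no real vendor interface is implied.

```python
from collections import deque

class Underlay:
    """Toy physical network that notifies subscribers of link failures."""
    def __init__(self, links):
        self.links = set(links) | {(b, a) for a, b in links}
        self.subscribers = []

    def fail_link(self, a, b):
        self.links -= {(a, b), (b, a)}
        for callback in self.subscribers:
            callback(("link-down", a, b))

    def path(self, src, dst):
        # Breadth-first search: shortest hop-count path, or None if cut off.
        seen, queue = {src}, deque([[src]])
        while queue:
            p = queue.popleft()
            if p[-1] == dst:
                return p
            for a, b in self.links:
                if a == p[-1] and b not in seen:
                    seen.add(b)
                    queue.append(p + [b])
        return None

class Overlay:
    """Tunnel endpoints that re-path when told the underlay changed."""
    def __init__(self, underlay, src, dst):
        self.underlay, self.src, self.dst = underlay, src, dst
        self.tunnel_path = underlay.path(src, dst)
        underlay.subscribers.append(self.on_underlay_event)

    def on_underlay_event(self, event):
        # Underlay information crosses the boundary; overlay reacts.
        self.tunnel_path = self.underlay.path(self.src, self.dst)

net = Underlay([("h1", "s1"), ("s1", "h2"), ("h1", "s2"), ("s2", "h2")])
tun = Overlay(net, "h1", "h2")
net.fail_link("h1", "s1")   # the event flows up; the tunnel re-paths via s2
```

The mechanics are trivial; the interesting part is the subscription itself, which is precisely what a hard overlay/underlay horizon forbids.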
That’s not to say the event horizon doesn’t exist in the other direction as well. The network has historically been ignorant of the needs of applications at a higher layer. Network engineers have spent thousands of hours creating things like Quality of Service in an attempt to meet the unique needs of higher-level programs. Sometimes this works in a vacuum with no problems, provided we’ve guessed accurately enough to predict traffic patterns. Other times, it fails spectacularly when the information changes too quickly.
The underlay network needs to destroy the event horizon that prevents information at higher layers from flowing down into the network. Companies that have historically concentrated on networking alone have started to see how important this intelligence can be. If the network can respond quickly to application needs, developers can provide enough information to ensure that their programs are treated fairly under changing network conditions, even without needing to listen to them. In this way, the application people can no longer claim the network is a “black hole”.
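In the downward direction, one mechanism applications already have for signaling their needs to the network is marking their own traffic, rather than leaving engineers to guess. A minimal sketch using the standard Python socket API; treating this socket’s traffic as voice-grade (DSCP Expedited Forwarding) is an example policy, not a recommendation:

```python
import socket

# DSCP Expedited Forwarding is codepoint 46; the IP_TOS byte
# carries the DSCP value in its upper six bits (DSCP << 2).
EF_TOS = 46 << 2   # 184

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_TOS)
# Every datagram sent on this socket now carries the EF marking,
# which underlay QoS policies can match on directly instead of
# inferring intent from traffic patterns.
```

Markings like this only help, of course, if the underlay is configured to trust and act on them, which is the flow-down half of the argument above.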
Even as I was writing this, a number of news stories came out from a paper by Professor Stephen Hawking that stated that the classical event horizon doesn’t exist. The short, short version is that the conditions close to a quantum singularity preclude a well-defined boundary that prevents the escape of all information, including light. Pretty heady stuff.
In networking, we do have the luxury of a well-defined boundary between underlay networks and overlay networks. We’ve seen the damage caused by the apparent event horizon for years. Critical information wasn’t flowing back and forth as needed to help each side provide the best experience for users and engineers. We need to ensure that this barrier is removed going forward. The networking people can’t exist in a vacuum pretending that applications don’t have needs. The overlay admins need to understand that the underlay is a storehouse of critical information and shouldn’t be ignored simply because tunnels are awesome. Knowing about the event horizon is the first step to finding a way to blast through it.
“Applications and hosts should be aware enough to listen to network conditions.”
How do you think applications would have this awareness, either today or in an SDN world? Presumably in an SDN world the SDN controller would process network analytics, then be able to report north to entities able to consume the data.
But consider an enterprise app with the usual outward facing, business logic, and persistent data components. Would something in the business logic accept network reporting, then act on it, possibly modifying the application’s behavior or sending informed requests to the SDN controller? How about federated SDN controllers? What’s the scope of the app request, cascading through the network?
By “hosts” do you mean OS serving the app or hypervisor serving the OS serving the app?
I don’t mean to put too much baggage into a single sentence, but what you’re suggesting seems a crucial app design issue for the future.
If SDN enables bidirectional communication between the apps and the network, it stands to reason that you would begin to architect each of them differently. Obviously you have to start with making it possible; no one will change anything if there is no support for it. But you could create applications that take advantage of network information.
Imagine massive data replication jobs. If they are not time critical, you could schedule them and create pipes across the network. You could serve content from caches that were less congested. You could do things like variable bit rate for mobile connections that are shifting from 3G to Edge and back to LTE on a train ride.
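The cache example could be sketched roughly like this. The congestion figures would in practice come from an SDN controller’s northbound reporting; the function name, the cache names, and the utilization scale are all hypothetical.

```python
def pick_cache(caches: dict) -> str:
    """Choose the replica behind the least-congested path.

    `caches` maps cache name -> current path utilization (0.0-1.0),
    as reported by a (hypothetical) controller northbound API.
    """
    return min(caches, key=caches.get)

# Snapshot of controller-reported path utilization (invented numbers).
congestion = {"cache-east": 0.85, "cache-west": 0.30, "cache-central": 0.55}
best = pick_cache(congestion)   # serve content from "cache-west"
```

The same pattern applies to the replication case: a scheduler polls the controller and launches the bulk job only when the reported utilization on the replication path drops below a threshold.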
Ultimately, I agree with the premise of this post. I don’t think the future is overlays that are completely agnostic to the underlying network. I think there will be a desire to pin the overlays to the physical infrastructure and allow for the dynamic optimization of the physical transport to suit whatever is happening on the overlay.
Mike Bushong (@mbushong)
“I think there will be a desire to pin the overlays to the physical infrastructure and allow for the dynamic optimization of the physical transport to suit whatever is happening on the overlay.”
Agreed. My opinion is that this is all but inevitable but I am also thinking outside of the data center as well.
I can see how, even long term, the majority of applications won’t care and will be OK relying on layer 4 delivery: the “did you get it? No? OK, I’ll resend slower” approach. But this doesn’t work well for all applications, and we need to advance past it.
Also, optimization of bulky data transmission scheduling, as well as optimization mechanisms that go beyond hop count and bandwidth (i.e., choosing the best path taking jitter and delay into account), are going to CONTINUE to be appealing… look at EIGRP.
It would be great if business apps would be aware of network conditions and network state changes. And, as I think we’re agreeing, SDN enables the interfaces so the network could be informed of app requests for service and report network service availability.
I’m just not clear, except at the highest conceptual level, how a business application would consume the SDN controller’s reporting or how the app would request service or changes in service. There has to be something in the program’s logic that has these functions.
I guess the app would also have to have a tie-in to other monitoring to understand its performance in terms of transaction goals and load acceleration changes, which may not strictly be fully network dependent.
Well said! No one wants to be looking at two network management tools, one for the overlay and one for the underlay, and trying to correlate what’s going on.
We need something that straddles the overlay/underlay boundary and can report status/intent in a concise way each side can use, much as a RAID controller would report “degraded” status or an application would add QoS tags to traffic.
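That RAID-style status could be sketched as a tiny shared vocabulary; the names, thresholds, and telemetry fields below are invented for illustration only.

```python
from enum import Enum

class LinkHealth(Enum):
    OK = "ok"
    DEGRADED = "degraded"   # usable, but the overlay may want to re-path
    DOWN = "down"

def summarize(loss_pct: float, usable: bool) -> LinkHealth:
    """Collapse raw underlay telemetry into one word both sides understand."""
    if not usable:
        return LinkHealth.DOWN
    return LinkHealth.DEGRADED if loss_pct > 1.0 else LinkHealth.OK

# 2.5% loss on a live link: "degraded", much like a RAID array
# that still serves reads while a disk rebuilds.
status = summarize(loss_pct=2.5, usable=True)
```

The value is that neither side needs the other’s full detail: the underlay boils its telemetry down, and the overlay only needs to react to three states.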