Positioning Policy Properly

Who owns the network policy for your organization? How about the security policy? Identity policy? Sound like easy questions, don’t they? The first two are pretty standard. The last generally comes down to one or two different teams depending upon how much Active Directory you have deployed. But have you ever really thought about why?

During Future:NET this week, those poll questions were asked to an audience of advanced networking community members. The answers pretty much fell in line with what I was expecting to see. But then I started to wonder about the reasons behind those decisions. And I realized that in a world full of cloud and DevOps/SecOps/OpsOps people, we need to get away from teams owning policy and have policy owned by a separate team.

Specters of the Past

Where does the networking policy live? Most people will jump right in with a list of networking gear. Port profiles live on switches. Routing tables live on routers. Networking policy is executed in hardware. Even if the policy is programmed somewhere else.

What about security policy? Firewalls are probably the first thing that comes to mind. More advanced organizations have a ton of software that scans for security issues. Those policy decisions are dictated by teams that understand the way their tools work. You don’t want someone who doesn’t know how traffic flows through a firewall to be trying to manage that device, right?

Let’s consider the identity question. For many years the identity policy has been owned by the Active Directory (AD) admins, because identity was always closely tied to the server directory system. Novell (now NetIQ) eDirectory and Microsoft AD were the kings of the hill when it came to identity. Today’s world has so much distributed identity that it’s been handed to the security teams to help manage. AD doesn’t control the VPN concentrator or the cloud-enabled email services all the time. There are identity products specifically designed to aggregate all this information and manage it.

But let’s take a step back and ask that important question: why? Why is it that the ownership of a policy must be by a hardware team? Why must the implementors of policy be the owners? The answer is generally that they are the best arbiters of how to implement those policies. The network teams know how to translate applications into ports. Security teams know how to create firewall rules to implement connection needs. But are they really the best people to do this?

Look at modern policy tools designed to “simplify” networking. I’ll use Cisco ACI as an example but VMware NSX certainly qualifies as well. At a very high level, these tools take into account the needs of applications to create connectivity between software and hardware. You create a policy that allows a database to talk to a front-end server, for example. That policy knows what connections need to happen to get through the firewall and also how to handle redundancy to other members of the cluster. The policy is then implemented automatically in the network by ACI or NSX and magically no one needs to touch anything. The hardware just works because policy automatically does the heavy lifting.
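To make that idea concrete, here is a minimal sketch of what "application-level policy rendered into device rules" could look like. This is not the ACI or NSX API. The `AppPolicy` class, the `ENDPOINTS` inventory, the addresses, the port, and the rule format are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppPolicy:
    """Application-level intent: which group may talk to which, and how."""
    source: str
    destination: str
    protocol: str
    port: int


# Hypothetical inventory mapping application groups to member addresses.
ENDPOINTS = {
    "web-frontend": ["10.0.1.10", "10.0.1.11"],
    "database": ["10.0.2.20", "10.0.2.21"],
}


def render_rules(policy, endpoints):
    """Expand one intent into a permit rule per source/destination pair."""
    rules = []
    for src in endpoints[policy.source]:
        for dst in endpoints[policy.destination]:
            rules.append(
                f"permit {policy.protocol} {src} -> {dst}:{policy.port}"
            )
    return rules


# One statement of intent; the port number is an assumed example value.
policy = AppPolicy("web-frontend", "database", "tcp", 5432)
rules = render_rules(policy, ENDPOINTS)
for rule in rules:
    print(rule)
```

The point of the sketch is that the person writing `AppPolicy` never names a switch or a firewall. Adding a cluster member to the inventory changes the rendered rules without anyone rewriting the policy.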

So let’s step back for a moment and discuss this. Why does the networking team need to operate ACI or NSX? Sure, it’s because those devices still touch hardware at some point like switches or routers. But we’ve abstracted the need for anyone to actually connect to a single box or a series of boxes and type in lines of configuration that implement the policy. Why does it need to be owned by that team? You might say something about troubleshooting. That’s a common argument that whoever needs to fix it when it breaks is who needs to be the gatekeeper implementing it. But why? Is a network engineer really going to SSH into every switch and correct a bad application tag? Or is that same engineer just going to log into a web console and fix the tag once and propagate that info across the domain?

Ownership of policy isn’t about troubleshooting. It’s about territory. The tug-of-war to classify a device when it needs to be configured is all about collecting and consolidating power in an organization. If I’m the gatekeeper of implementing workloads then you need to pay tribute to me in order to make things happen.

If you don’t believe that, ask yourself this: If there was a Routing team and a Switching team in an organization, who would own the routed SVI interface on a layer 3 switch? The switching team has rights because it’s on their box. The routing team should own it because it’s a layer 3 construct. Both are going to claim it. And both are going to fight over it. And those are teams that do essentially the same job. When you start pulling in the security team or the storage team or the virtualization team you can see how this spirals out of control.

Vision of the Future

Let’s change the argument. Instead of assigning policy to the proper hardware team, let’s create a team of people focused on policy. Let’s make sure we have proper representation from every hardware stack: Networking, Security, Storage, and Virtualization. Everyone brings their expertise to the team for the purpose of making policy interactions better.

Now, when someone needs to roll out a new application, the policy team owns that decision tree. The Policy Team can have a meeting about which hardware is affected. Maybe we need to touch the firewall, two routers, a switch, and perhaps a SAN somewhere along the way. The team can coordinate the policy changes and propose an implementation plan. If there is a construct like ACI or NSX to automate that deployment then that’s the end of it. The policy is implemented and everything is good. Perhaps some older hardware exists that needs manual configuration of the policy. The Policy Team then contacts the hardware owner to implement the policy needs on those devices. But the Policy Team still owns that policy decision. The hardware team is just working to fulfill a request.
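As a rough sketch of that workflow, the Policy Team's plan can be thought of as splitting the affected devices into two piles: those an automation layer can push to, and those that need a request to the hardware owner. The `Device` class, the device names, and the `plan_change` function below are hypothetical, not part of any real product.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    owner: str        # hardware team that operates the device
    automated: bool   # True if an automation layer can push policy to it


def plan_change(devices):
    """Split a policy change into automated pushes and owner requests."""
    automated = [d.name for d in devices if d.automated]
    requests = {}
    for d in devices:
        if not d.automated:
            # Group manual work by the team that will fulfill the request.
            requests.setdefault(d.owner, []).append(d.name)
    return automated, requests


# Hypothetical set of devices affected by one new application rollout.
affected = [
    Device("fw-1", "security", True),
    Device("rtr-1", "networking", True),
    Device("rtr-2", "networking", True),
    Device("sw-legacy", "networking", False),
    Device("san-1", "storage", False),
]

automated, requests = plan_change(affected)
print("push automatically:", automated)
print("request from owners:", requests)
```

Either way the decision stays in one place. The hardware teams only see a list of devices to configure, not a debate about who owns the policy.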

Extend the metaphor past hardware now. Who owns the AWS network when your workloads move to the cloud? Is it still the networking team? They’re the best team to own the network, right? Except there are no switches or routers. They’re all software as far as the instance is concerned. Does that mean your cloud team is now your networking team as well? Moving to the cloud muddies the waters immensely.

Let’s step back into the discussion about the Policy Team. Because they own the policy decisions, they also own that policy when it changes hardware or location. If those workloads for email or productivity suite move from on-prem to the cloud then the policy team moves right along with them. Maybe they add a public cloud person to the team to help them interface with AWS but they still own everything. That way, there is no argument about who owns what.

The other beautiful thing about this Policy Team concept is that it also allows the rest of the hardware to behave as a utility in your environment. Because the teams that operate networking or security or storage are just fulfilling requests from the policy team they don’t need to worry about anything other than making their hardware work. They don’t need to get bogged down in policy arguments and territorial disputes. They work on their stuff and everyone stays happy!


Tom’s Take

I know it’s a bit of a stretch to think about pulling all of the policy decisions out of the hardware teams and into a separate team. But as we start automating and streamlining the processes we use to implement application policy the need for it to be owned by a particular hardware team is hard to justify. Cutting down on cross-faction warfare over who gets to be the one to manage the new application policy means enhanced productivity and reduced tension in the workplace. And that can only lead to happy users in the long run. And that’s a policy worth implementing.

Rules Shouldn’t Have Exceptions


On my way to Virtualization Field Day 4, I ran into a bit of a snafu at the airport that made me think about policy and application. When I put my carry-on luggage through the X-ray, the officer took it to the back and gave it a thorough screening. During that process, I was informed that my double-edged safety razor would not be able to make the trip (or the blade at least). I was vexed, as this razor had flown with me for at least a whole year with nary a peep from security. When I related as much to the officer, the response was “I’m sorry no one caught it before.”

Everyone Is The Same, Except For Me

This incident made me start thinking about policies in networking and security and how often they are arbitrarily enforced. We see it every day. The IT staff comes up with a new plan to reduce mailbox sizes or reduce congestion by enforcing quality of service (QoS). Everyone is all for the plan during the discussion stages. When the time comes to implement the idea, the exceptions start happening. Upper management won’t have mailbox limitations. The accounting department is exempt from the QoS policy. The list goes on and on until it’s larger than the policy itself.

Why does this happen? How can a perfect policy go from planning to implementation before it falls apart? Do people sit around making up rules they know they’ll never follow? That does happen in some cases, but more often the folks the policy will end up impacting the most have no representation in the planning process.

Take mailboxes for example. The IT department, being diligent technology users, strive for inbox zero every day. They process and deal with messages. They archive old mail. They keep their mailbox a barren wasteland of in-process things and shuffle everything else off to the static archive. Now, imagine an executive. These people are usually overwhelmed by email. They process what they can but the wave will always overtake them. In addition, they have no archive. Their read mail sits around in folders for easy searching and quick access when a years-old issue becomes present again.

In modern IT, any policies limiting mailbox sizes would be decided by the IT staff based on their mailbox size. Sure, a 1 GB limit sounds great. Then, when the policy is implemented the executive staff pushes back with their 5 GB (or larger) mailboxes and says that the policy does not apply to them. IT relents and finds a way to make the executives exempt.

In a perfect world, the executive team would have been consulted or had representation on the planning team prior to the decision. The idea of having huge mailboxes would have been figured out in the planning stage and dealt with early instead of making exceptions after the fact. Maybe the IT staff needed to communicate more. Perhaps the executive team needed to be more involved. Those are problems that happen every day. So how do we fix them?

Exceptions Are NOT The Rule

The way to increase buy-in for changes and increase communication between stakeholders is easy but not without pain. When policies are implemented, no deviations are allowed. It sounds harsh. People are going to get mad at you. But you can’t budge an inch. If a policy exception is not documented in the policy it will get lost somewhere. People will continue to be uninvolved in the process as long as they think they can negotiate a reprieve after the fact.
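One way to picture "documented in the policy" is a quota table where every tier is written down and there is no side door. The sketch below is illustrative only. The tier names, the `quota_for` and `is_compliant` helpers, and the choice of binary gigabytes are assumptions; the 1 GB and 5 GB figures echo the mailbox example above.

```python
GB = 1024 ** 3  # assumption: quotas are counted in binary gigabytes

# Hypothetical policy: every tier is agreed during planning and written down.
MAILBOX_POLICY = {
    "standard": 1 * GB,
    "executive": 5 * GB,
}


def quota_for(role):
    """Return the documented quota; unknown roles are rejected outright."""
    try:
        return MAILBOX_POLICY[role]
    except KeyError:
        # No per-user override exists. The only fix is to amend the policy.
        raise ValueError(
            f"no documented tier for role {role!r}; amend the policy instead"
        ) from None


def is_compliant(role, mailbox_bytes):
    """Check a mailbox against the quota documented for its role."""
    return mailbox_bytes <= quota_for(role)


print(is_compliant("executive", 4 * GB))
print(is_compliant("standard", 20 * GB))
```

A request that does not fit an existing tier fails loudly instead of being quietly waved through, which pushes the conversation back into the planning stage where it belongs.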

IT needs to communicate up front exactly what’s going into the change before the implementation. People need to know how they will be impacted. Ideally, that will mean that people have talked about the change up front so there are no surprises. But we all know that doesn’t happen. So making a “no exceptions” policy or rule change will get them involved. Because not being able to get out of a rule means you want to be there when the rules get decided so you can make your position clear and ensure the needs of you and your department are met.


Tom’s Take

As I said yesterday on Twitter, people don’t mind rules and policies. They don’t even mind harsh or restrictive rules. What they have a problem with is when those rules are applied in an arbitrary fashion. If the corporate email policy says that mailboxes are supposed to be no more than 1 GB in size then people in the organization will have a problem if someone has a 20 GB mailbox. The rules must apply to everyone equally to be universally adopted. Likewise, rules must encompass as many outlying cases as possible in order to prevent one-off exceptions for almost everyone. Planning and communication are more important than ever when crafting those rules.