The Privacy Pickle

I recorded a fantastic episode of The Network Collective last night with some great friends from the industry. The topic was privacy. Originally I thought we were just going to discuss how NAT both was and wasn’t a form of privacy and how EUI-64 addressing wasn’t the end of days for people worried about being tracked. But as the show wore on, I realized a few things about privacy.

Booming In Peace

My mom is a Baby Boomer. We learn about them as a generation based on some of their characteristics, most notably their rejection of the values of their parents. One of the things they hold most dear is their privacy. They grew up in a world where they could be private people. They weren’t living in a one- or two-room house with multiple siblings. They had the right to privacy. They could have a room all to themselves if they so chose.

Baby Boomers, like my mom, are intensely private adults. They marvel at the idea that targeted advertisements can work for them. When Amazon shows them an ad for something they just searched for they feel like it’s a form of dark magic. They also aren’t trusting of “new” things. I can still remember how shocked my mother was that I would actively get into someone else’s car instead of a taxi. When I explained that Uber and Lyft do a similar job of vetting their drivers it still took some convincing to make her realize that it was safe.

Likewise, the Boomer generation’s personal privacy doesn’t mesh well with today’s technology. While there are always exceptions to every rule, the number of people in their mid-50s and older who use Twitter and Snapchat is far, far smaller than the number in the target demographic for each service. I used to wonder if it was because older people didn’t understand the technology. But over time I started to realize that it was more based on the fact that older people just don’t like sharing that kind of information about themselves. They’re not open books. Instead, Baby Boomers take a lot of studying to understand.

Zee Newest

On the opposite side of the spectrum is my son’s generation, Generation Z. GenZ is the opposite of the Boomer generation when it comes to privacy. They have grown up in a world that has never known anything but the ever-present connectivity of the Internet. They don’t understand that people can live a life without being watched by cameras and having everything they do uploaded to YouTube. Their idea of celebrity isn’t just TV and movie stars but also extends to video game streamers on Twitch or Instagram models.

Likewise, this generation is more open about their privacy. They understand that the world is built on data collection. They sign away their information. But they tend to be crafty about it. Rather than acting like previous generations that would fill out every detail of a form this generation only fills out the necessary pieces. And they have been known to put in totally incorrect information for no other reason than to throw people off.

GenZ realizes that the privacy genie is out of the bottle. They have to deal with the world they were born into, just like the Baby Boomers and the other generations that came before them. But the way that they choose to deal with it is not through legislation but instead through self-regulation. They choose what information they expose so as not to create a trail or a profile that big data consuming companies can use to fingerprint them. And in most cases, they don’t even realize they’re doing it! My son is twelve and he instinctively knows that you don’t share everything about yourself everywhere. He knows how to navigate his virtual neighborhood just as surely as I knew how to ride my bike around my physical one back when I was his age.

Tom’s Take

Where does that leave me and my generation? Well, we’re a weird mashup of Generation X and Generation Y/Millennials. We aren’t as private as our parents and we aren’t as open as our children. We’re cynical. We’re rebelling against what we see as our parents’ generation and their complete privacy. Yet, just like our parents, we are almost aghast at the idea that our children could be so open. We’re coming to live in a world where Big Data is learning everything about us. And our children are growing up in that world. Their children, the generation after GenZ, will only know a world where everyone knows everything already. Will it be like Minority Report, where advertising works with retinal patterns? Or will it be a generation where we know everything but really know nothing because no one tells the whole truth about who they are?

Data Is Not The New Oil, It’s Nuclear Power

Big Data. I believe that one phrase could get millions in venture capital funding. I don’t even have to put a product with it. Just say it. And make no mistake about it: the rest of the world thinks so too. Data is “the new oil”. At least, according to some pundits. It’s a great headline-making analogy that describes how data is driving business and how controlling it can lead to an empire. But data isn’t really oil. It’s nuclear power.

Black Gold, Texas Tea

Crude oil is a popular resource. Prized for a variety of uses, it is traded and sold as a commodity and refined into plastics, gasoline, and other essential items of modern convenience. Oil creates empires and causes global commerce to hinge on every turn of the market. I live in a state that is a big oil producer, so the exploration and refining of oil has a big impact here.

However, when compared to Big Data, oil isn’t the right metaphor. Much like oil, data needs to be refined before use. But oil can be refined into many different distinct things. Data can only be turned into information. Oil burns up when consumed. Aside from some smoke and a small amount of residuals oil all but disappears after it expends the energy trapped within. Data doesn’t disappear after being turned into information.

In fact, the biggest issue that I have with the entire “Data as Oil” argument is that oil doesn’t stick around. We don’t see massive pools of oil on the side of the road from spills. We don’t hear about our massive issues with oil disposal or securing our spent oil to prevent theft. People that treat data like oil are only looking at the refined product as the final form. They tend to forget that the raw form of data sticks around after the transformation. Most people will tell you that’s a good thing because you can run analytics and machine learning against static datasets and continue to derive value from them. But that doesn’t make me think of oil at all.

Welcome To The Nuclear Age

In fact, Big Data reminds me most of a nuclear power plant. Much like oil, the initial form of radioactive material isn’t very useful. It radiates and creates a small amount of heat but not enough to run the steam generators in a power plant. Instead, you must bombard the uranium-235 pellets with neutrons to start a fission reaction. Once you have a sustained, controllable reaction the amount of generated heat rises and creates the resource you need to power the rest of your machinery.

Much like data, nuclear fission reactions don’t do much without the proper infrastructure to harness them. Even after you transform your data into information you need to parse, categorize, and analyze it. The byproduct of the transformation is the critical part of the whole process.

Much like nuclear fuel rods, data stays in place for years. It continues to produce the resource after being modified and transformed. It sits around in the hopes that it can be useful until the day that it no longer serves its purpose. Data that is past its useful shelf life goes into a data warehouse where it will eventually be forgotten. Spent nuclear fuel rods are also eventually removed and placed somewhere where they can’t affect other things. Maybe they’re buried deep underground. Or shot into space. Or placed under enough concrete that they will never be found again.

The danger in data and in nuclear power is not what happens when everything goes right. Instead, it’s what happens when everything goes wrong. With nuclear power, wrong is a chain reaction meltdown. Or wrong could be improper disposal of waste. It could be a disaster at the plant or even a theft of fissile nuclear material from the plant. The fuel rods themselves are simultaneously our source of power and the source of our potential disaster.

Likewise, data that is just sitting around and stored improperly can lead to huge disasters. We’re only four years removed from Target’s huge data breach. And how many more are waiting out there to happen? It seems the time between data leaks is shrinking as more and more bad actors are finding ways to steal, manipulate, and appropriate data for their own ends. And, much like nuclear fuel rods, the methods of protecting the data are few compared to other fuel sources.

Data isn’t something that can easily be hidden or compacted. It needs to be readable to be useful. It needs to be fast to be useful. All the things that make it easy to use also make it easy to exploit. And once we’re done with it, the way that it is stored in perpetuity only increases the likelihood of it being used improperly. Unless we’re willing to bury it under metaphorical concrete we’re in for a bad future if we forget how to handle spent data.


Tom’s Take

Data as Oil is a stupid metaphor. It’s meant to impress upon finance CEOs and Wall Street wonks how important it is for data to be taken seriously. Data as Oil is something a data scientist would say to get a job. By drawing a bad comparison, you make data seem like a commodity to be traded and used as collateral for empire building. It’s not. It’s a ticking bomb of disastrous proportions when not handled correctly. Rather than coming up with a pithy metaphor for cable news consumption and page views, let’s treat data with the respect it deserves and make sure we plan for how we’re going to deal with something that won’t burn up into smoke whenever it’s convenient.

Devaluing Data Exposures

I had a great time this week recording the first episode of a new series with my co-worker Rich Stroffolino. The Gestalt IT Rundown is hopefully the start of some fun news stories with a hint of snark and humor thrown in.

One of the things I discussed in this episode was my belief that no data is truly secure any more. Thanks to recent attacks like WannaCry and Bad Rabbit and the rise of other state-sponsored hacking and malware attacks, I’m totally behind the idea that soon everyone will know everything about me and there’s nothing that anyone can do about it.

Just Pick Up The Phone

Personal data is important. Some pieces of personal data are sacrificed for the greater good. Anyone who is in IT or works in an area where they deal with spam emails and robocalls has probably paused for a moment before putting contact information down on a form. I have an old Hotmail address I use to catch spam if I’m relatively certain that something looks shady. I give out my home phone number freely because I never answer it. These pieces of personal data have been sacrificed in order to provide me a modicum of privacy.

But what about other things that we guard jealously? How about our mobile phone number? When I worked for a VAR that was the single most secretive piece of information I owned. No one, aside from my coworkers, had my mobile number. In part, it’s because I wanted to make sure that it got used properly. But also because I knew that as soon as one person at the customer site had it, soon everyone would. I would be spending my time answering phone calls instead of working on tickets.

That’s the world we live in today. So many pieces of information about us are being stored. Our Social Security Number, which has truthfully been misappropriated as an identification number. US Driver’s Licenses, which are also used as identification. Passport numbers, credit ratings, mother’s maiden name (which is very handy for opening accounts in your name). The list could be a blog post in and of itself. But why is all of this data being stored?

Data Is The New Oil

The first time I heard someone in a keynote use the phrase “big data is the new oil”, I almost puked. Not because it’s a platitude that underscores the value of data. I lost it because I know what people do with vital resources like oil, gold, and diamonds. They hoard them. Stockpiling the resources until they can be refined. Until every ounce of value can be extracted. Then the shell is discarded until it becomes a hazard.

Don’t believe me? I live in a state that is legally required to run radio and television advertisements telling children not to play around old oilfield equipment that hasn’t been operational in decades. It’s cheaper for them to buy commercials than it is to clean up their mess. And that precious resource? It’s old news. Companies that extract resources just move on to the next easy source instead of cleaning up their leftovers.

Why does that matter to you? Think about all the pieces of data that are stored somewhere that could possibly leak out about you. Phone numbers, date of birth, names of children or spouses. And those are the easy ones. Imagine how many places your SSN is currently stored. Now, imagine half of those companies go out of business in the next three years. What happens to your data then? You’d better believe that it’s not going to get destroyed or encrypted in such a way as to prevent exposure. It’s going to lie fallow on some forgotten server until someone finds it and plunders it. Your only real hope is that it was being stored on a cloud provider that destroys the storage buckets after the bill isn’t paid for six months.

Devaluing Data

How do we fix all this? Can this be fixed? Well, it might be able to be done, but it’s not going to be fun, cheap, or easy. It all starts by making discrete data less valuable. An SSN is worthless without a name attached to it, for instance. If all I have are 9 random numbers with no context I can’t tell what they’re supposed to be. The value only comes when those 9 numbers can be matched to a name.

We’ve got to stop using SSN as a unique identifier for a person. It was never designed for that purpose. In fact, storing SSN at all is a really bad idea. Users should be assigned a new, random ID number when creating an account or filling out a form. SSN shouldn’t be stored unless absolutely necessary. And when it is, it should be treated like a nuclear launch code. It should take special authority to query it, and the database that holds it shouldn’t be directly attached to anything else.
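Here is a minimal sketch of that separation in Python. The class names (`SsnVault`, `AccountStore`) and the `authorized` flag are hypothetical stand-ins for a real access-control system. The point is only that the everyday record carries a random ID and no SSN.

```python
import secrets


class SsnVault:
    """Hypothetical, separate store for SSNs, keyed only by a random ID."""

    def __init__(self):
        self._ssn_by_id = {}

    def put(self, account_id, ssn):
        self._ssn_by_id[account_id] = ssn

    def get(self, account_id, authorized=False):
        # Reading an SSN back takes special authority (a stand-in flag here).
        if not authorized:
            raise PermissionError("SSN lookup requires special authority")
        return self._ssn_by_id[account_id]


class AccountStore:
    """Hypothetical everyday database: names and random IDs, no SSNs."""

    def __init__(self, vault):
        self._vault = vault
        self.accounts = {}

    def create(self, name, ssn=None):
        # New, random ID that is not derived from the SSN in any way.
        account_id = secrets.token_hex(16)
        self.accounts[account_id] = {"name": name}
        if ssn is not None:  # store only when absolutely necessary
            self._vault.put(account_id, ssn)
        return account_id
```

Someone who steals the account table in this sketch gets names and random IDs, but no nine-digit numbers to match them against.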

Critical data should be stored in a vault that can only be accessed in certain ways and never exposed. A prime example is the Secure Enclave in an iPhone. This enclave, when used for TouchID or FaceID, stores your fingerprints and your face map. Pretty important stuff, yes? However, even with biometric ID systems becoming more prevalent there isn’t any way to extract that data from the enclave. It’s stored in such a way that it can only be queried in a specific manner and a result of yes/no returned from the query. If you stole my iPhone tomorrow, there’s no way for you to reconstruct my fingerprints from it. That’s the template we need to use going forward to protect our data.
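The yes/no query pattern can be sketched in a few lines of Python. This is not how the iPhone enclave is implemented, and real biometric matching is fuzzy rather than an exact comparison. The `MatchOnlyVault` class below is a hypothetical illustration of an interface that answers “does this match?” and has no method for reading the secret back out.

```python
import hashlib
import hmac
import os


class MatchOnlyVault:
    """Hypothetical vault: enroll a secret once, then only ask yes/no questions."""

    def __init__(self):
        self._salt = os.urandom(16)
        self._digest = None

    def _derive(self, secret: bytes) -> bytes:
        # Keep only a salted, slow hash; the original secret is never stored.
        return hashlib.pbkdf2_hmac("sha256", secret, self._salt, 100_000)

    def enroll(self, secret: bytes) -> None:
        self._digest = self._derive(secret)

    def matches(self, candidate: bytes) -> bool:
        if self._digest is None:
            return False
        # Constant-time comparison; the caller learns only True or False.
        return hmac.compare_digest(self._digest, self._derive(candidate))
```

A caller can enroll a value and later call `matches()` with a candidate, but nothing in the object lets it reconstruct what was enrolled.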


Tom’s Take

I’m getting tired of being told that my data is being spread to the four winds thanks to it lying around waiting to be used for both legitimate and nefarious purposes. We can’t build fences high enough around critical data to keep it from being broken into. We can’t keep people out, so we need to start making the data less valuable. Instead of keeping it all together where it can be reconstructed into something of immense value, we need to make it hard to get all the pieces together at any one time. That means it’s going to be tough for us to build systems that put it all together too. But wouldn’t you rather spend your time solving a fun problem like that rather than making phone calls telling people your SSN got exposed on the open market?

Don’t Build Big Data With Bad Data

I was at Pure Accelerate 2017 this week and I saw some very interesting things around big data and the impact that high speed flash storage is going to have. Storage vendors serving that market are starting to include analytics capabilities on the box in an effort to provide extra value. But what happens when these advances cause issues in the training of algorithms?

Garbage In, Garbage Out

One story that came out of a conversation was about training a system to recognize people. In the process of training the system, the users imported a large number of faces in order to help the system start the process of differentiating individuals. The data set they started with? A collection of male headshots from the Screen Actors Guild. By the time the users caught the mistake, the algorithm had already proven that it had issues telling the difference between test subjects of particular ethnicities. After scrapping the data set and using some different diverse data sources, the system started performing much better.

This started me thinking about the quality of the data that we are importing into machine learning and artificial intelligence systems. The old computer adage of “garbage in, garbage out” has never been more apt than it is today. Before, bad inputs caused data to be suspect when extracted. Now, inputting bad data into a system designed to make decisions can have even more far-reaching consequences.

Look at all the systems that we’re programming today to be more AI-like. We’ve got self-driving cars that need massive data inputs to help navigate roads at speed. We have network monitoring systems that take analytics data and use it to predict things like component failures. We even have these systems running in the background of popular apps that provide us news and other crucial information.

What if the inputs into the system cause it to become corrupted or somehow compromised? You’ve probably heard the story about how importing UrbanDictionary into Watson caused it to start cursing constantly. These kinds of stories highlight how important the quality of data being used for the basis of AI/ML systems can be.

Think of a future when self-driving cars are being programmed with failsafes to avoid living things in the roadway. Suppose that the car has been programmed to avoid humans and other large animals like horses and cows. But, during the import of the small animal data set, the table for dogs isn’t imported for some reason. Now, what would happen if the car encountered a dog in the road? Would it make the right decision to avoid the animal? Would the outline of the dog trigger a subroutine that helped it make the right decision? Or would the car not be able to tell what a dog was and do something horrible?

Do You See What I See?

After some chatting with my friend Ryan Adzima, he taught me a bit about how facial recognition systems work. I had always assumed that these systems could differentiate on things like colors. So it could tell a blond woman from a brunette, for instance. But Ryan told me that it’s actually very difficult for a system to tell fine colors apart.

Instead, systems try to create contrast in the colors of the picture so that certain features stand out. Those features have a grid overlaid on them and then those grids are compared and contrasted. That’s the fastest way for a system to discern between individuals. It makes sense considering how CPU-bound things are today and the lack of high definition cameras to capture information for the system.

But we also must realize that we have to improve data collection for our AI/ML systems in order to ensure that the systems are receiving good data to make decisions. We need to build validation models into our systems and checks to make sure the data looks and sounds sane at the point of input. These are the kinds of things that take time and careful consideration when planning to ensure they don’t become a hindrance to the system. If the very safeguards we put in place to keep data correct end up causing problems, we’re going to create a system that falls apart before it can do what it was designed to do.
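As one small example of a check at the point of input, the Python sketch below confirms that every category a system is expected to know about actually shows up in the training import. The function name, the `(label, data)` sample shape, and the minimum-count parameter are all assumptions made for illustration.

```python
from collections import Counter


def validate_labels(samples, required_labels, min_per_label=1):
    """Return a list of problems found in (label, data) samples.

    An empty list means the import looks sane for this one check.
    """
    counts = Counter(label for label, _ in samples)
    problems = []
    # Every required category must be present in sufficient numbers.
    for label in sorted(required_labels):
        if counts[label] < min_per_label:
            problems.append(
                f"label {label!r}: {counts[label]} samples, need {min_per_label}"
            )
    # Labels nobody asked for are worth flagging too.
    for label in sorted(set(counts) - set(required_labels)):
        problems.append(f"unexpected label {label!r}")
    return problems
```

Run against an import that is missing its dog table, a check like this would report the gap before the car ever left the lab.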


Tom’s Take

I thought the story about the AI training was a bit humorous, but it does point to a huge issue with computer systems going forward. We need to be absolutely sure of the veracity of our data as we begin using it to train systems to think for themselves. Sure, teaching a Jeopardy-winning system to curse is one thing. But if we teach a system to be racist or murderous because of what information we give it to make decisions, we will have programmed a new life form to exhibit the worst of us instead of the best.

AI, Machine Learning, and The Hitchhiker’s Guide

I had a great conversation with Ed Horley (@EHorley) and Patrick Hubbard (@FerventGeek) last night around new technologies. We were waxing intellectual about all things related to advances in analytics and intelligence. There have been more than a few questions here at VMworld 2016 about the roles that machine learning and artificial intelligence will play in the future of IT. But during the conversation with Ed and Patrick, I finally hit on the perfect analogy for machine learning and artificial intelligence (AI). It’s pretty easy to follow along, so don’t panic.

The Answer

Machine learning is an amazing technology. It can extrapolate patterns in large data sets and provide insight from seemingly random things. It can also teach machines to think about problems and find solutions. Rather than go back to the tired Target big data example, I much prefer this example of a computer learning to play Super Mario World:

You can see how the algorithms learn how to play the game and find newer, better paths throughout the level. One of the things that’s always struck me about the computer’s decision skills is how early it learned that spin jumps provide more benefit than regular jumps for a given input. You can see the point in the video when this is figured out by the system, whereafter all jumps become spinning for maximum effect.

Machine learning appears to be insightful and amazing. But the weakness of machine learning is exemplified by Deep Thought from The Hitchhiker’s Guide to the Galaxy. Deep Thought was created to find the answer to the ultimate question of life, the universe, and everything. It was programmed with an enormous dataset: literally every piece of knowledge in the known universe. After seven and a half million years, it finally produces The Answer (42, if you’re curious). Which leads to the plot of the book and other hijinks.

Machine learning is capable of great leaps of logic, but it operates on a fundamental truth: all inputs are a bounded data set. Whether you are performing a simple test on a small data set or combing all the information in the universe for answers you are still operating on finite information. Machine learning can do things very fast and find pieces of data that are not immediately obvious. But it can’t operate outside the bounds of the data set without additional input. Even the largest analytics clusters won’t produce additional output without more data being ingested. Machine learning is capable of doing amazing things. But it won’t ever create new information outside of what it is operating on.

The Question

Artificial Intelligence (AI), on the other hand, is more like the question in The Hitchhiker’s Guide. Deep Thought admonishes the users of the system that rather than looking for the answer to Life, The Universe, and Everything, they should have been looking for The Question instead. That involves creating a completely different computer to find the Question that matches the Answer that Deep Thought has already provided.

AI can provide insight above and beyond a given data set input. It can provide context where none exists. It can make leaps of logic similar to those that humans are capable of making. AI doesn’t simply stop when it faces an incomplete data set. Even though we are seeing AI in its infancy today, the most advanced systems are capable of “filling in the blanks” to cover missing information. As the algorithms learn more and more how to extrapolate they’ll become better at making decisions from incomplete data.

The reason computers are so good at making quick decisions is that they don’t operate outside the bounds of the possible. If the entire universe for a decision is a data set, they won’t try to look around that. That ability to look beyond and try to create new data where none exists is the hallmark of intelligence. Using tools to create is a uniquely biological function. Computers can create subsets of data with tools but they can’t do a complete transformation.

AI is pushing those boundaries. Given enough time and the proper input, AI can make the leaps outside of bounds to come up with new ideas. Today it’s all about context. Tomorrow may find AI providing true creativity. AI will eventually pass a Turing Test because it can look outside the script and provide the pseudorandom type of conversation that people are capable of.


Tom’s Take

Computers are smart. They think faster than we do. They can do math better than we can. They can produce results in a fraction of the time at a scale that boggles the mind. But machine learning is still working from a known set. No matter how much work we pour into that aspect of things, we are still going to hit a limit of the ability of the system to break out of its bounds.

True AI will behave like we do. It will look for answers when there are none. It will imagine when confronted with the impossible. It will learn from mistakes and create solutions to them. That’s the power of intelligence versus learning. Even the most powerful computer in the universe couldn’t break out of its programming. It needed something more to question the answers it created. Just like us, it needed to sit back and let its mind wander.

Networking Needs Information, Not Data

Networking Field Day 12 starts today. There are a lot of great presenters lined up. As I talk to more and more networking companies, it’s becoming obvious that simply moving packets is not the way to go now. Instead, the real sizzle is in telling you all about those packets. Not packet inspection but analytics.

Tell Me More, Tell Me More

Ask any networking professional and they’ll tell you that the systems they manage have a wealth of information. SNMP can give you monitoring data for a set of points defined in database files. Other protocols like NetFlow or sFlow can give you more granular data about a particular packet group or data flow in your network. Even more advanced projects like Intel’s Snap are building on the idea of using telemetry to collect disparate data sources and build collection methodologies to do something with them.

The concern that becomes quickly apparent is the overwhelming amount of data being received from all these sources. It reminds me a bit of this scene:

How can you drink from this firehose? Maybe you should be asking whether you should instead.

Order From Chaos

Data is useless. We need to perform analysis on it to get information. That’s where a new wave of companies is coming into the networking market. They are building on the frameworks and systems that are aggregating data and presenting it in a way that makes it useful information. Instead of random data points about NetFlow, these solutions tell you that you’ve got a huge problem with outbound traffic of a specific type that is sent at a specific time with a specific payload. The difference is that instead of sorting through data to make sense of it, you’ve got a tool delivering the analysis instead of the raw data.
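To make the data-versus-information distinction concrete, here is a rough Python sketch that rolls individual flow records up into a single statement about the biggest outbound pattern. The record keys (`hour`, `dst_port`, `bytes`) are hypothetical and are not actual NetFlow field names.

```python
from collections import defaultdict


def top_outbound(flows, n=1):
    """Sum bytes per (hour, destination port) and return the n largest groups.

    flows: iterable of dicts with hypothetical keys 'hour', 'dst_port', 'bytes'.
    """
    totals = defaultdict(int)
    for flow in flows:
        totals[(flow["hour"], flow["dst_port"])] += flow["bytes"]
    # Largest byte totals first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

The input is a pile of data points. The output is one line that says which kind of traffic, at which hour, deserves a closer look.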

Sometimes it’s as simple as color-coding lines of Wireshark captures. Resets are bad, so they show up red. Properly torn down connections are good so they are green. You can instantly figure out how well things are going by looking for the colors. That’s analysis from raw data. The real trick in modern network monitoring is to find a way to analyze and provide context for massive amounts of data that may not have an immediate correlation.
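That red/green idea fits in a few lines of Python. This is a sketch of the concept using the standard TCP flag bits, not Wireshark’s actual coloring rules, and the function names are made up for the example.

```python
from collections import Counter

# Standard TCP flag bit values from the TCP header.
FIN = 0x01
RST = 0x04


def color_for(tcp_flags: int) -> str:
    """Map a packet's TCP flags byte to a display color."""
    if tcp_flags & RST:
        return "red"    # resets are bad
    if tcp_flags & FIN:
        return "green"  # orderly teardown is good
    return "none"


def summarize(flag_values):
    """Count how many packets fall into each color."""
    return Counter(color_for(flags) for flags in flag_values)
```

A capture that summarizes as mostly green with a stray red or two tells a very different story than one that is a wall of red.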

Networking professionals are smart people. They can intuit a lot of potential issues from a given data set. They can make the logical leap to a specific issue given time. What reduces that ability is the sheer amount of things that can go wrong with a particular system and the speed at which those problems must be fixed, especially at scale. A hiccup on one end of the network can be catastrophic on the others if allowed to persist.

Analytics can give us the context we need. It can provide confidence levels for common problems. It can ensure that symptoms are indeed happening above a given baseline or threshold. It can help us narrow the symptoms and potential issues before we even look at the data. Analytics can exclude the impossible while highlighting the more probable causes and outcomes. Analytics can give us peace of mind.
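A baseline check can be as simple as the Python sketch below, which flags a new sample only when it sits well above what the history says is normal. The three-standard-deviation default is an assumption picked for illustration, not a recommendation from any particular product.

```python
from statistics import mean, pstdev


def above_baseline(history, sample, sigmas=3.0):
    """Return (is_anomalous, threshold) for a new sample.

    The threshold is the historical mean plus `sigmas` standard deviations.
    """
    baseline = mean(history)
    spread = pstdev(history)
    threshold = baseline + sigmas * spread
    return sample > threshold, threshold
```

With a check like this in front of the data, a reading that is merely on the high side of normal never reaches a human, while a real spike does.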


Tom’s Take

Analytics isn’t doing our job for us. Instead, it’s giving us the ability to concentrate. Anyone that spends their time sifting through data to try and find patterns is losing the signal in the noise. Patterns are things that software can find easily. We need to leverage the work being put into network analytics systems to help us track down the issues before they blow up into full problems. We need to apply the thing that makes network professionals the best suited to look at the best information we can gather about a situation. Our concentration on what matters is where our job will be in five years. Let’s take the knowledge we have and apply it.

Drowning in the Data of Things

If you saw the news coming out of Cisco Live Berlin, you probably noticed that Internet of Things (IoT) was in every other announcement. I wrote about the impact of the new Digital Ceiling initiative already, but I think that IoT is a bit deeper than that. The other thing that seems to go hand in hand with discussion of IoT is big data. And for most of us, that big data is going to be a big problem.

Seen And Not Heard

Internet of Things is about dumb devices getting smart. Think Flowers for Algernon. Only now, instead of just being smarter, they are going to be very talkative too. The amount of data that these devices used to hold captive will be unleashed on something. We assume that the data is going to be sent to a central collection point or polled from the device by an API call or a program that is mining the data for another party. But do you know who isn’t going to be getting that data? Us.

IoT devices are going to be talking to providers and data collection systems and, in a lot of cases, each other. But they aren’t going to be talking directly to end users or IT staff. That’s because IoT is about making devices intelligent enough to start making their own decisions about things. Remember when SDN came out and everyone started talking about networks making determinations about forwarding paths and topology changes without human inputs? Remember David Meyer talking about network fragility?

Now imagine that’s not the network any more. Imagine it’s everything. Devices talking to other devices and making decisions without human inputs. IoT gives machines the ability to make a limited number of decisions based on data inputs. Factory floor running a bit too hot for the milling machines to keep up? Talk to the environmental controls and tell them to lower the temperature by two degrees for the next four hours. Is the shelf in the fridge where the milk is stored getting close to the empty milk jug weight? Order more milk. Did a new movie come out on Netflix that meets your viewing guidelines? Add that movie to the queue and have the TV turn on when your phone enters the house geofence.

Think about those processes for a moment. All of them are fairly easy conditional statements. If this, then do that. But conditional statements aren’t cut and dried. They require knowledge of constraints and combinations. And all that knowledge comes from data.
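The factory floor example shows how quickly a simple “if this, then do that” picks up constraints. In the Python sketch below, the trigger temperature and the minimum setpoint are assumed values, and `CoolingRule` is a hypothetical name. Only the two-degree step comes from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class CoolingRule:
    max_temp_c: float = 30.0      # assumed trigger temperature
    step_c: float = 2.0           # lower the setpoint by two degrees
    min_setpoint_c: float = 18.0  # assumed constraint: never cool below this

    def decide(self, floor_temp_c, setpoint_c):
        """Return a new setpoint, or None if nothing should change."""
        if floor_temp_c <= self.max_temp_c:
            return None  # condition not met
        new_setpoint = max(setpoint_c - self.step_c, self.min_setpoint_c)
        if new_setpoint == setpoint_c:
            return None  # the constraint blocks any further change
        return new_setpoint
```

Even this toy rule needs three numbers that somebody has to know in advance, and each of them comes from data about the room and the machines in it.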

More Data, More Problems

All of that data needs to be collected somehow. That means transport networks are going to be stressed now that there are ten times more devices chatting on them. And a good chunk of those devices, especially in the consumer space, are going to be wireless. Hope your wireless network is ready for that challenge. That data is going to be transported to some data sink somewhere. As much as we would like to hope that it’s a collector on our network, the odds are much better that it’s an off-site collector. That means your WAN is about to be stressed too.

How about storing that data? If you are lucky enough to have an onsite collection system you’d better start buying drives for it now. This is a huge amount of data. Nimble Storage has been collecting analytics data from their storage arrays for a while now. Every four hours they collect more data than there are stars in the Milky Way. Makes you wonder where they keep it? And how long are they going to keep that data? Just like the crap in your attic that you swear you’re going to get around to using one day, big data and analytics platforms will keep every shred of information you want to keep for as long as you want to have it taking up drive space.

And what about security? Yeah, that’s an even scarier thought. Realize that many of the breaches we’ve read about in the past months have been hackers having access to systems for extended periods of time and only getting caught after they have exfiltrated data from the system. Think about what might happen if a huge data sink is sitting around unprotected. Sure, terabytes worth of data may be noticed if someone tries to smuggle it out of the DLP device. But all it takes is a quick SQL query against the users tables for social security numbers, a program to transpose those numbers into letters to evade the DLP scanner, and you can just email the file to yourself. Script a change from letters back to numbers and you’ve got a gold mine that someone left unlocked and lying around. We may be concentrating on securing the data in flight right now, but even the best armored car does no good if you leave the bank vault door open.


Tom’s Take

This whole thing isn’t rain clouds and doom and gloom. IoT and Big Data represent a huge challenge for modern systems planning. We have the ability to unlock insight from devices that couldn’t tell us their secrets before. But we have to know how deep that pool will be before we dive in. We have to understand what these devices represent before we connect them. We don’t want our thermostats DDoSing our home networks any more than we want the milling machines on the factory floor coming to life and trying to find Sarah Connor. But the challenges we have with transporting, storing, and securing the data from IoT devices are no different than trying to program on punch cards or figure out how to download emails from across the country. Technology will give us the key to solve those challenges. Assuming we can keep our heads above water.