Throw More Storage At It


All of this has happened before. All of this will happen again.

The more time I spend listening to storage engineers talk about the pressing issues they face in designing systems in this day and age, the more I’m convinced that we fight the same problems over and over again with different technologies. Whether it be networking, storage, or even wireless, the same architecture problems crop up in new ways and require different engineers to solve them all over again.

Quality is Problem One

A chance conversation with Howard Marks (@DeepStorageNet) at Storage Field Day 4 led me to start thinking about these problems.  He was talking during a presentation about the difficulty that storage vendors have faced in implementing quality of service (QoS) in storage arrays.  As Howard described some of the issues with isolating neighboring workloads and ensuring they can’t cause performance issues for a specific application, I started thinking about the implementation of low latency queuing (LLQ) for voice in networking.

LLQ was created to solve a specific problem. High-volume, low-bandwidth flows can starve traditional priority queuing systems. In much the same way, applications that demand a huge number of input/output operations per second (IOPS) while storing very little data can cause huge headaches for storage admins. Storage has tried to solve this problem with hardware in the past by creating things like write-back caching or even super-fast flash caching tiers.
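To make the parallel concrete, here is a minimal, purely illustrative sketch of the queuing behavior rather than any vendor’s implementation: strict priority queuing lets a chatty priority class starve everything else, while LLQ adds a policer that caps how much of the link the priority class can claim. The link rate, cap, and traffic mix below are invented numbers.

```python
from collections import deque

LINK_RATE = 10     # packets serviced per tick (assumed link capacity)
PRIORITY_CAP = 4   # LLQ policer: priority packets allowed per tick

def run(ticks, priority_arrivals, bulk_arrivals, cap=None):
    """Service a priority queue and a bulk queue; cap polices the priority class."""
    pq, bq = deque(), deque()
    sent_priority = sent_bulk = 0
    for _ in range(ticks):
        pq.extend(["voice"] * priority_arrivals)
        bq.extend(["bulk"] * bulk_arrivals)
        budget = LINK_RATE
        # The priority queue is always serviced first...
        allowed = budget if cap is None else min(cap, budget)
        while pq and allowed > 0:
            pq.popleft()
            sent_priority += 1
            budget -= 1
            allowed -= 1
        # ...and bulk traffic only gets whatever budget is left over.
        while bq and budget > 0:
            bq.popleft()
            sent_bulk += 1
            budget -= 1
    return sent_priority, sent_bulk

# Strict priority queuing: a chatty priority class consumes the entire link.
print("strict PQ:", run(100, priority_arrivals=12, bulk_arrivals=5))
# LLQ: the policer caps the priority class, so bulk traffic still gets through.
print("with LLQ :", run(100, priority_arrivals=12, bulk_arrivals=5, cap=PRIORITY_CAP))
```

Swap packets for IOPS and the neighbor-isolation problem Howard described looks much the same.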

Make It Go Away

In fact, a lot of the problems in storage mirror those from the networking world many years ago. Performance issues used to have a very simple solution – throw more hardware at the problem until it goes away. In networking, you throw bandwidth at the issue. In storage, more IOPS are the answer. When hardware isn’t pushed to the absolute limit, the answer will always be to keep driving it higher and higher. But what happens when performance can’t fix the problem any more?

Think of a sports car like the Bugatti Veyron. It is the fastest production car available today, with a top speed well over 250 miles per hour. In fact, Bugatti refuses to talk about the absolute top speed, instead countering “Would you ask how fast a jet can fly?” One of the limiting factors in attaining a true measure of the speed isn’t found in the car’s engine. Instead, the subsystems around the engine begin to fail at such stressful speeds. At 258 miles per hour, the tires on the car will completely disintegrate in 15 minutes. The fuel tank will be emptied in 12 minutes. Bugatti wisely installed a governor on the engine limiting it to a maximum of 253 miles per hour in an attempt to prevent people from pushing the car to its limits. A software control designed to prevent performance issues by creating artificial limits. Sound familiar?

Storage has hit the wall when it comes to performance. PCIe flash storage devices are capable of hundreds of thousands of IOPS. A single PCIe card has the throughput of an entire data center. Hungry applications can quickly saturate the weak link in a system. In days gone by, that was the IOPS capacity of a storage device. Today, it’s the link that connects the flash device to the rest of the system. Not until the advent of PCIe was the flash storage device fast enough to keep pace with workloads starved for performance.

Storage isn’t the only technology with hungry workloads stressing weak connection points. Networking is quickly reaching this point with the shift to cloud computing. Instead of the east-west traffic between data center racks being the point of contention, the connection between the user and the public cloud will now be stressed to the breaking point. WAN connections have the same issues that flash storage devices do with non-PCIe interfaces. They are quickly saturated by the amount of outbound traffic generated by hungry applications. In this case, those applications are located in some far away data center instead of being next door on a storage array. The net result is the same – resource contention.


Tom’s Take

Performance is king.  The faster, stronger, better widgets win in the marketplace.  When architecting systems it is often much easier to specify a bigger, faster device to alleviate performance problems.  Sadly, that only works to a point.  Eventually, the performance problems will shift to components that can’t be upgraded.  IOPS give way to transfer speeds on SATA connectors.  Data center traffic will be slowed by WAN uplinks.  Even the CPU in a virtual server will eventually become an issue after throwing more RAM at a problem.  Rather than just throwing performance at problems until they disappear, we need to take a long hard look at how we build things and make good decisions up front to prevent problems before they happen.

Like the Veyron engineers, we have to make smart choices about limiting the performance of a piece of hardware to ensure that other bottlenecks do not happen.  It’s not fun to explain why a given workload doesn’t get priority treatment.  It’s not glamorous to tell someone their archival data doesn’t need to live in a flash storage tier.  But it’s the decision that must be made to ensure the system is operating at peak performance.

Replacing Nielsen With Big Data


I’m a fan of television. I like watching interesting programs. It’s been a few years since one has caught my attention long enough to keep me watching for multiple seasons. Part of that reason is due to my fear that an awesome program is going to fail to reach a “target market” and will end up canceled just when it’s getting good. It’s happened to several programs that I liked in the past.

Sampling the Goods

Part of the issue with tracking the popularity of television programs comes from the archaic method in which programs are measured. Almost everyone has heard of the Nielsen ratings system. This sampling method was created in the 1930s as a way to measure radio advertising reach. In the 50s, it was adapted for use in television.

Nielsen selects target audiences that represent the greater whole. They ask users to keep written diaries of their television watching habits. They also have the ability to install a device called a set meter which allows viewers to punch in a code to identify themselves via age groups and lock in a view for a program. The set meter can tell the instant a channel is changed or the TV is powered off.

In theory, the sampling methodology is sound. In practice, it’s a bit shaky. Viewing diaries are unreliable because people tend to overreport their viewing habits. If they feel guilty that they haven’t been writing anything down, they tend to look up something that was on TV and write it down. Diaries also can’t determine if a viewer watched the entire program or changed the channel in the middle. Set meters aren’t much better. The reliance on PIN codes to identify users can lead to misreported results. Adults in a hurry will sometimes punch in an easier code assigned to their children, leading to skewed age results.

Both the diary and the set meter fail to take into account the shift in viewing habits in modern households. Most of the TV viewing in my house takes place through time-shifted DVR recordings. My kids tend to monopolize the TV during the day, but even they are moving to using services like Netflix to watch all episodes of their favorite cartoons in one sitting. Neither of these viewing habits is easily tracked by Nielsen.

How can we find a happy medium? Sample sizes have been reduced significantly due to cord-cutting households moving to Internet distribution models. People tend to exaggerate or manipulate self-reported viewing results. Even “modern” Nielsen technology can’t keep up. What’s the answer?

Big Data

I know what you’re saying: “We’ve already got that with Nielsen, right?” Not quite. TV viewing habits have shifted in the past few years. So has TV technology. Thanks to the shift from analog broadcast signals to digital and the explosion of set top boxes for cable decryption and movie service usage, we now have a huge portal into the living room of every TV watcher in the world.

Think about that for a moment. The idea of a sample size works provided it’s a good representative sample. But tracking this data is problematic. If we have access to a way to crunch the actual data instead of extrapolating from incomplete sets, shouldn’t we use that instead? I’d rather believe the real numbers instead of trying to guess from unreliable sources.
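As a toy illustration of the difference (with entirely made-up numbers), compare a Nielsen-style panel estimate against simply counting every set top box: the panel’s answer wobbles from sample to sample, while the census is just the real figure.

```python
import random

random.seed(1)
POPULATION = 1_000_000   # assumed number of set top boxes
TRUE_VIEWERS = 120_000   # assumed boxes actually tuned to the show
boxes = [1] * TRUE_VIEWERS + [0] * (POPULATION - TRUE_VIEWERS)

PANEL_SIZE = 1_000       # a Nielsen-style sample panel
for trial in range(3):
    panel = random.sample(boxes, PANEL_SIZE)
    estimate = sum(panel) / PANEL_SIZE * POPULATION
    print(f"panel estimate #{trial + 1}: {estimate:,.0f} viewers")

print(f"full census        : {sum(boxes):,} viewers")
```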

This also fixes the issue of time-shifted viewing. Those same set top boxes are often responsible for recording the programs. They could provide information such as the number of shows recorded versus viewed and whether or not viewers skip through commercials. For those who view on mobile devices, that data could be compiled as well through integration with the set top box. User logins are already required for mobile apps today. It’s just a small step to integrating the whole package.

It would require a bit of technical upgrading on the client side. We would have to enable the set top boxes to report data back to a service. We could anonymize the data to a point to be sure that people aren’t being unnecessarily exposed. It will also have to be configured as an opt-out setting to ensure that the majority is represented. Opt-in won’t work because those checkboxes never get checked.
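As a rough sketch of what such a report might look like, with invented field names and a salted hash standing in for a real, vetted privacy design, each box could send something like this whenever reporting is left on (the opt-out default):

```python
import hashlib
import json
from datetime import datetime, timezone

REPORTING_OPT_OUT = False    # opt-out model: reporting stays on unless disabled

def viewing_event(device_id, program_id, seconds_watched, time_shifted, salt="rotating-salt"):
    """Build a report keyed by a salted hash so the raw box ID never leaves home."""
    hashed_id = hashlib.sha256((salt + device_id).encode()).hexdigest()[:16]
    return {
        "viewer": hashed_id,               # pseudonymous, not the real device ID
        "program": program_id,
        "seconds_watched": seconds_watched,
        "time_shifted": time_shifted,      # DVR and streaming views count too
        "reported_at": datetime.now(timezone.utc).isoformat(),
    }

if not REPORTING_OPT_OUT:
    event = viewing_event("STB-0042", "jericho-s01e01", 2580, time_shifted=True)
    print(json.dumps(event, indent=2))     # would be sent to the ratings service
```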

Advertisers are going to demand the most specific information about people that they can. The ratings service exists to work for the advertisers. If this plan is going to work, a new company will have to be created to collect and analyze this data. This way the analysis company can ensure that the data is specific enough to be of use to the advertisers while at the same time ensuring the protection of the viewers.


Tom’s Take

Every year, promising new TV shows are yanked off the airwaves because advertisers don’t see any revenue. Shows that have a great premise can’t get up a head of steam because of ratings. We need to fix this system. In the old days, the deluge of data would have drowned Nielsen. Today, we have the technology to collect, analyze, and store that data for eternity. We can finally get the real statistics on how many people watched Jericho or AfterMASH. Armed with real numbers, we can make intelligent decisions about what to keep on TV and what to jettison. And that’s a big data project I’d be willing to watch.

Google+ And The Quest For Omniscience


When you mention Google+ to people, you tend to get a very pointed reaction. Outside of a select few influencers, I have yet to hear anyone say good things about it. This opinion isn’t helped by the recent moves by Google to make Google+ the backend authentication mechanism for their services. What’s Google’s aim here?

Google+ draws immediate comparisons to Facebook. Most people will tell you that Google+ is a poor implementation of the world’s most popular social media site. I would tend to agree for a few reasons. I find it hard to filter things in Google+. The lack of a real API means that I can’t interact with it via my preferred clients. I don’t want to log into a separate web interface simply to ingest endless streams of animated GIFs with the occasional nugget of information that was likely posted somewhere else in the first place.

It’s the Apps

One thing the Google of old was very good at doing was creating applications that people needed. GMail and Google Apps are things I use daily. Youtube gets visits all the time. I still use Google Maps to look things up when I’m away from my phone. Each of these apps represents a separate development train and a unique way of looking at things. They were more integrated than some of the attempts I’ve seen to tie applications together at other vendors. They were missing one thing as far as Google was concerned: you.

Google+ isn’t a social network. It’s a database. It’s an identity store that Google uses to nail down exactly who you are. Every +1 tells them something about you. However, that’s not good enough. Google can only prosper if they can refine their algorithms.  Each discrete piece of information they gather needs to be augmented by more information.  In order to do that, they need to increase their database.  That means they need to drive adoption of their social network.  But they can’t force people to use Google+, right?

That’s where the plan to integrate Google+ as the backend authentication system makes nefarious sense. They’ve already gotten you hooked on their apps. You already comment on Youtube or use Maps to figure out where the nearest Starbucks is. Google wants to know that. They want to figure out how to structure AdWords to show you more ads for local coffee shops or categorize your comments on music videos to sell you Google Play music subscriptions. Above all else, they can use that information as a product to advertisers.

Build It Before They Come

It’s devilishly simple. It’s also going to be more effective than Facebook’s approach. Ask yourself this: when’s the last time you used Facebook Mail? Facebook started out with the lofty goal of gathering all the information that it could about people. Then they realized the same thing that Google did: You have to collect information on what people are using to get the whole picture. Facebook couldn’t introduce a new system, so they had to start making apps.

Except people generally look at those apps and push them to the side. Mail is a perfect example. Even when Facebook tried to force people to use it as their primary communication method, their users rebelled against the idea. Now, Facebook is being railroaded into using their data store as a backend authentication mechanism for third party sites. I know you’ve seen the “Log In With Facebook” buttons already. I’ve even written about it recently. You probably figured out this is going to be less successful for a singular reason: control.

Unlike Google+ and its integration with all Google apps, third parties that utilize Facebook logins can choose to restrict the information that is shared with Facebook. Given the climate of privacy in the world today, it stands to reason that people are going to start being very selective about the information that is shared with these kinds of data sinks. Thanks to the Facebook login API, a significant portion of the collected information never has to be shared back to Facebook. On the other hand, Google+ is just making a simple backend authorization. Given that they’ve turned on Google+ identities for Youtube commenting without a second thought, it does make you wonder what other data they’re collecting without really thinking about it.
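To see why that control sits with the third party, consider a minimal sketch of the login flow from the site’s side: the app builds the standard OAuth dialog URL and asks only for the scopes it actually wants, and anything it never requests is never part of the exchange. The client ID, redirect URI, and scope choices below are illustrative assumptions, not any particular site’s configuration.

```python
from urllib.parse import urlencode

def facebook_login_url(client_id, redirect_uri, scopes):
    """Build the login dialog URL, asking only for the listed permission scopes."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": ",".join(scopes),   # anything NOT listed here is never requested
    }
    return "https://www.facebook.com/dialog/oauth?" + urlencode(params)

# A site that only needs to know who you are can stop at the basics.
print(facebook_login_url(
    client_id="EXAMPLE_APP_ID",
    redirect_uri="https://example.com/auth/callback",
    scopes=["public_profile", "email"],
))
```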


Tom’s Take

I don’t use Google+. I post things there via API hacks. I do it because Google as a search engine is too valuable to ignore. However, I don’t actively choose to use Google+ or any of the apps that are now integrated into it. I won’t comment on Youtube. I doubt I’ll use the Google Maps functions that are integrated into Google+. I don’t like having a half-baked social media network forced on me. I like it even less when it’s a ham-handed attempt to gather even more data on me to sell to someone willing to pay to market to me. Rather than trying to be the anti-Facebook, Google should stand up for the rights of their product…uh, I mean customers.

The Visibility Event Horizon


I’ve always been a science nerd.  Especially when it comes to astronomy.  I’ve always been fascinated by stellar objects.  The most fascinating has to be the black hole.  A region of space with intense gravity formed from the collapse of a stellar body.  I’ve read all about the peculiar properties of classical black holes.  Of particular interest to the networking field is the idea of the event horizon.

An event horizon is a boundary beyond which events no longer affect observers.  In layman’s terms, it is the point of no return for things falling into a black hole.  Anything that falls below the event horizon disappears from the perspective of the observer.  From the point of view of someone falling into the black hole, they never reach the event horizon yet are unable to contact the outside world.  The event horizon marks the point at which information disappears in a system.

How does this apply to the networking world?  Well, every system has a visibility boundary.  We tend to summarize information heading in both directions.  To a network engineer, everything above the transport layer of the OSI model doesn’t really matter.  Those are application decisions that are made by programmers that don’t affect the system other than to be a drain on resources.  To the programmers and admins, anything below the session layer of the OSI model is of little importance.  As long as a call can be made to utilize network resources, who cares what’s going on down there?

Looking Down

Software Defined Networking (SDN) vendors are enforcing these event horizons.  VMware NSX and the Microsoft Hyper-V virtual networking solution both function in a similar manner.  They both create overlay networks that virtualize resources below the level of the host.  Tunnels are created between devices or systems that ride on top of the physical network beneath.  This means that the overlay can function no matter the state of the underlay, provided a path between the two hosts exists.  However, it also means that the overlay obscures the details of the physical network.

There are many that would call this abstraction.  Why should the hosts care about the network state?  All they really want is a path to reach a destination.  It’s better to give them what they want and leave the gory details up to routing protocols and layer 2 loop avoidance mechanisms.  But, that abstraction becomes an event horizon when the overlay is unwilling (or unable) to process information from the underlay network.

Applications and hosts should be aware enough to listen to network conditions.  Overlays should not rely on OSPF or BGP to make tunnel endpoint rerouting decisions.  Putting undue strain on network processing is part of what has led to the situation we have now, where network operating systems need to be complex and intensive to calculate solutions to problems that could be better solved at a higher level.

If the network reports a traffic condition, like a failed link or a congested WAN circuit, that information should be able to flow back up to the overlay and act as a data point to trigger an alternate solution or path.  Breaking the event horizon for information flowing back up toward the overlay is crucial to allow the complex network constructs we’ve created, such as fabrics, to utilize the best possible solutions for application traffic.
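What that feedback loop could look like is sketched below with entirely hypothetical class names, events, and paths: an overlay controller holds a preferred and a backup path for a tunnel, and a condition reported by the underlay triggers the reroute directly rather than waiting on OSPF or BGP to sort it out.

```python
from dataclasses import dataclass, field

@dataclass
class UnderlayEvent:
    kind: str    # e.g. "link_down" or "congestion"
    link: str    # which physical link the condition applies to

@dataclass
class OverlayTunnel:
    name: str
    preferred_path: list
    backup_path: list
    active_path: list = field(default_factory=list)

    def __post_init__(self):
        self.active_path = self.preferred_path

    def handle(self, event):
        """Reroute only if the reported condition touches the path in use."""
        if event.link in self.active_path and self.active_path is not self.backup_path:
            print(f"{self.name}: {event.kind} on {event.link} -> shifting to backup path")
            self.active_path = self.backup_path
        else:
            print(f"{self.name}: {event.kind} on {event.link} -> no action needed")

tunnel = OverlayTunnel("web-to-db",
                       preferred_path=["leaf1-spine1", "spine1-leaf3"],
                       backup_path=["leaf1-spine2", "spine2-leaf3"])

tunnel.handle(UnderlayEvent("congestion", "spine1-leaf3"))   # on our path: reroute
tunnel.handle(UnderlayEvent("link_down", "leaf2-spine1"))    # not on our path: ignore
```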

Gazing Up

That’s not to say the event horizon doesn’t exist in the other direction as well. The network has historically been ignorant of the needs of applications at a higher layer. Network engineers have spent thousands of hours creating things like Quality of Service in an attempt to meet the unique needs of higher level programs. Sometimes this works in a vacuum with no problems, provided we’ve guessed accurately enough to predict traffic patterns. Other times, it fails spectacularly when the information changes too quickly.

The underlay network needs to destroy the event horizon that prevents information at higher layers from flowing down into the network. Companies that have historically concentrated on networking alone have started to see how important this intelligence can be. By allowing the network to respond quickly to the needs of applications, developers can provide enough information to ensure that their programs are treated fairly by changing network conditions, even without needing to listen to them. In this way, the application people can no longer claim the network is a “black hole”.


Tom’s Take

Even as I was writing this, a number of news stories came out from a paper by Professor Stephen Hawking that stated that the classical event horizon doesn’t exist.  The short, short version is that the conditions close to a quantum singularity preclude a well-defined boundary that prevents the escape of all information, including light.  Pretty heady stuff.

In networking, we do have the luxury of a well-defined boundary between underlay networks and overlay networks. We’ve seen the damage caused by the apparent event horizon for years. Critical information wasn’t flowing back and forth as needed to help each side provide the best experience for users and engineers. We need to ensure that this barrier is removed going forward. The networking people can’t exist in a vacuum pretending that applications don’t have needs. The overlay admins need to understand that the underlay is a storehouse of critical information and shouldn’t be ignored simply because tunnels are awesome. Knowing about the event horizon is the first step to finding a way to blast through it.

The Value of the Internet of Things

The recent sale of IBM’s x86 server business to Lenovo has people in the industry talking. Some of the conversation has centered around the selling price. Lenovo picked up IBM’s servers for $2.3 billion, which is more than 60% less than the initial asking price of $6 billion two years ago. That price drew immediate comparisons to the Google acquisition of Nest, which was $3.2 billion. Many people asked how a gadget maker with only two shipping products could be worth more than the entirety of IBM’s server business.

Are You Being Served?

It says a lot about the decline of hardware manufacturing, especially at the low end. IT departments have been moving away from smaller, task-focused servers for many years now. Instead of buying a new 1U, dual socket machine to host an application, developers have used server virtualization as a way to spin up new services quickly with very little additional cost. That means that older low end servers aren’t being replaced when they reach the end of their life. Those workloads are being virtualized and moved away while the equipment is permanently retired.

It also means that the target for server manufacturers is no longer the low end. IT departments that have seen the benefits of virtualization now want larger servers with more memory and CPU power to insert into virtual clusters. Why license several small servers when I can save money by buying a really big server? With advances in SAN technology and parts that can be replaced without powering down the system, the need to have multiple systems for failover is practically negated.

And those virtual workloads are easily migrated away from onsite hardware as well. The shift to cloud computing is the coup de grâce for the low end server market. It is just as easy to spin up an Amazon Web Services (AWS) instance to test software as it is to provision new hardware or a virtual cluster. Companies looking to do hybrid cloud testing or public cloud deployments don’t want to spend money on hardware for the data center. They would rather pour that money into AWS instances.

Those Internet Things

I think the disparity in the purchase price also speaks volumes for the value yet to be recognized in the Internet of Things (IoT). Nest was worth so much to Google because it gave them an avenue not previously available. Google wants to have as many devices in your home as it can afford to acquire. Each of those devices can provide data to tune Google’s algorithms and provide quality data to advertisers that pay Google handsomely for those analytics.

IoT devices don’t need home servers. They don’t ask for DNS entries. They don’t have web interfaces. The only setup needed out of the box is a connection to the wireless network in your home. Once that happens, IoT devices usually connect back to a server in the cloud. The customer accesses the device via an application downloaded from an app store. No need for any additional hardware in the customer’s home.

IoT devices need infrastructure to work effectively. However, they don’t need that infrastructure to exist on premises. The shift to cloud computing means that these devices are happy to exist anywhere without dependence on hardware. Users are more than willing to download apps to control them instead of debating how to configure the web UI. Without the need for low end hardware to run these devices, the market for that hardware is effectively dead.

Tom’s Take
I think IBM got exactly what they wanted when they offloaded their server business. They can now concentrate on services and software. The kinds of things that are going to be important in the Internet of Things. Rather than lamenting the fire sale price of a dying product line, we should instead be looking to the value still locked inside IoT devices and how much higher it can go.

A Plan To Fix E-Rate

The federal E-Rate program is in the news again. This time, it is due to a mention in the president’s State of the Union speech. He asked a deceptively simple question: “Why can’t schools have the same kind of wifi we have in coffee shops?” After the speech, press releases went flying from the Federal Communications Commission (FCC) talking about a restructuring plan that will eliminate older parts of the Federal Universal Service Fund (USF) like pagers and dial-up connections.

This isn’t the first time that E-Rate has been skewered by the president. Back in June of 2013, he asked some tough questions about increasing the availability of broadband connections to schools. Back then, I thought a lot about what his aim was and how easily it would be accomplished. With the recent news, I feel that I have to say something that the government doesn’t want to hear but needs to be said.

Mr. President, E-Rate is fundamentally broken and needs to be completely overhauled before your goals can be met.

E-Rate hasn’t really changed since its inception in 1997. All that’s happened is more rules being piled on top to combat fraud and attempt to keep up with changing technologies. Yet no one has really looked at the landscape of education technology today to see how best to use E-Rate funding. Computer labs have given way to laptop carts or even tablet carts. T1 WAN connections are now metro fiber handoffs at speeds of 100Mbit or better. Servers do much more than run DNS or web servers.

E-Rate has to be completely overhauled. The program no longer meets the needs of its constituents. Today, it serves as a vehicle for service providers and resellers to make money while forcing as much technology into schools as their meager budgets can afford. When schools with a 90% discount percentage are still having a hard time meeting their funding commitments, you have to take a long hard look at the prices being charged and the value being offered to the schools.

With that in mind, I’ve decided to take a stab at fixing E-Rate. It’s not a perfect solution, but I think it’s a great start. We need to build on the important pieces and discard the things that no longer make sense. To that end, I’m suggesting the Priority 1 / Priority 2 split be abolished. Cisco even agrees with me (PDF Link). In its place, we need to take a hard look at what our schools need to educate the youth of America.

Tier 1: WAN Connections

Schools need faster WAN connections. Online testing has replaced Scantrons. Streaming video is augmenting the classroom. The majority of traffic is outbound to the Internet, not internal. T1/T3 doesn’t cut it any more. Schools are going to need 100Mbit or better to meet student needs. Yet providers are reluctant to build out fiber networks that are unprofitable. Schools don’t want to pay for expensive circuits that are going to be clogged with junk.

Tier 1 in my proposal will be funding for fast WAN circuits and the routers that connect them. In the current system, that router is Priority 2, so even if you get the 10Gbit circuit you asked for, you may not be able to light it if P2 doesn’t come through. Under my plan, these circuits would be mandated to be fiber. That way, you can increase the amount of bandwidth to a site without the need to run a new line. That’s important, since most schools find themselves quickly consuming additional bandwidth before they realize it. Having a circuit with additional headroom is key to the future.

Service providers would also be capped at the amount that they could charge on a monthly basis for the circuit. It does a school no good to order a 1Gbps fiber circuit if they can’t afford to pay for it every month. By capping the amount that SPs can charge, they will be forced to compete or find other means to fund build outs.

Tier 2: Wireless Infrastructure

Wireless is key to LAN connectivity in schools today. The days of wiring honeycombing the walls are over. Yet Priority 2 still has a cabling component. It’s time to bring our schools into the 21st century. To that end, Tier 2 of my plan will be focused entirely on improving school wireless connectivity. No more cable runs unless they have a wireless AP on the end. Switches must be PoE/PoE+ capable to support the wireless infrastructure.

In addition, wireless site surveys must be performed before any installation plan is approved. VARs tend to skimp on the surveys now due to the inability to recover costs in a straightforward manner. Under this plan, they must do them. The cost of the site survey will be a line item for the site that is capped based on discount percentage. Proper surveys lead to an overall reduction in the amount of equipment ordered and installed, so the costs are easy to justify. The capped amount keeps VARs from price gouging with unnecessary additional work that isn’t critical to the infrastructure.

Tier 3: Server Infrastructure

Servers are still an important part of education IT. Even though the applications and services they provide are increasingly being outsourced to hosted facilities, there will still be a need to maintain existing equipment. However, current E-Rate rules only allow servers to serve Internet functions like DNS, DHCP, or web servers. This is ridiculous. DNS is an integral part of Active Directory, so almost every server that is a member of a domain is running it. DHCP is a minuscule part of a server’s function. Given the costs of engineering multiple DHCP servers in a network, having this as a valid E-Rate function is pointless. And when’s the last time a school had their own web server? Hosting services provide scale and ease-of-use that can’t be matched by a small box running in the back of the data center.

Tier 3 of my plan has servers available for schools. However, the hardware can run only one approved role: hypervisor. If you take a server under my E-Rate plan, it has to run ESX/Hyper-V/KVM on the bare metal. This means that ordering fewer, bigger servers to run virtual workloads will be key. The cost allocation nightmare is over. These servers will be performing hypervisor duties all the time. The end user will be responsible for licensing the OS running on the guest. That gets rid of the gray areas we have today.

If you take a virtual server under Tier 3, you must provide a migration plan for your existing non-virtualized workloads. That means that once you accept Tier 3 funding for a server, you have one calendar year to migrate your workloads to that system. After that year, you can no longer claim existing servers as eligible. Moving to the future shouldn’t be painful, but buying a big server and not taking advantage of it is silly. If you show the foresight to use virtualization you’re going to use it all the way.

Of course, for those schools that don’t want to take a server because their workloads already exist in public clouds like Amazon Web Services (AWS) or Rackspace, there will be funding for AWS as well. We have to embrace the available options to ensure our students are learning at the fullest capacity.

Tom’s Take

E-Rate is a fifteen-year-old program in need of a remodel. The current system is underfunded, prone to gaming, and will eventually collapse in on itself. When USF is forced to rely on rollover funds from previous years to meet funding goals, even at 90%, something has to change. Priority 1 is bleeding E-Rate dry. The above plan focuses on the technology needed for schools to continue efficiently educating students in the coming years. It meets the immediate needs of education without starving the fund, since an increase is unlikely to come, even though other parts of USF have a sketchy reputation at best, as a quick Google search about the USF-funded cell phone program will attest. As Damon Killian said in The Running Man, “Hard times call for hard choices.” We have to be willing to blow up E-Rate as we know it in order to refocus it to make it serve the ultimate goal: educating our students.

Disclaimer

Because I know someone from the FCC or SLD is going to read this, here’s my disclaimer: I don’t work for an E-Rate provider. While I have in the past, this post does not reflect the opinions of anyone at that organization or any other organization involved in the consulting or execution of E-Rate. Don’t assume that because I think the program is broken, the people still working with the program should be punished. They are doing good work while still conforming to the craziest red tape ever. Don’t punish them because I spoke out. Instead, fix the system so no one has to speak out again.

The Slippery Slope of Social Sign-In


At the recent Wireless Field Day 6, we got a chance to see a presentation from AirTight Networks about their foray into Social Wifi. The idea is that businesses can offer free guest wifi for customers in exchange for a Facebook Like or by following the business on Twitter. AirTight has made the process very seamless by allowing an integrated Facebook login button. Users can just click their way to free wifi.

I’m a bit guarded about this whole approach. It has nothing to do with AirTight’s implementation. In fact, several other wireless companies are racing to have similar integration. It does have everything to do with the way that data is freely exchanged in today’s society. Sometimes more freely than it should be.

Don’t Forget Me

Facebook knows a lot about me. They know where I live. They know who my friends are. They know my wife and how many kids we have. While I haven’t filled out the fields, there are others that have indicated things like political views and even more personal information like relationship status or sexual orientation. Facebook has become a social data dump for hundreds of millions of people.

For years, I’ve said that Facebook holds the holy grail of advertising – a searchable database of everything a given demographic “likes”. Facebook knows this, which is why they are so focused on growing their advertising arm. Every change to the timeline and every misguided attempt to “fix” their profile security has a single aim: convincing businesses to pay for access to your information.

Now, with social wifi, those businesses can get access to a large amount of data easily. When you create the API integration with Facebook, you can select a large number of discrete data points easily. It’s just a bunch of checkboxes. Having worked in IT before, I know the siren call that could cause a business owner to check every box he could with the idea that it’s better to collect more data rather than less. It’s just harmless, right?

Give It Away Now

People don’t safeguard their social media permissions and data like they should. If you’ve ever gotten DM spam from a follower or watched a Facebook wall swamped with “on behalf of” postings you know that people are willing to sign over the rights to their accounts for a 10% discount coupon or a silly analytics game. And that’s after the warning popup telling the user what permissions they are signing away. What if the data collection is more surreptitious?

The country came unglued when it was revealed that a government agency was collecting metadata and other discrete information about people that used online services. The uproar led to hearings and debate about how far-reaching that program was. Yet many of those outraged people don’t think twice about letting a coffee shop have access to a wealth of data that would make the NSA salivate.

Providers are quick to say that there are ways to limit how much data is collected. It’s trivial to disable the ability to see how many children a user has. But what if that’s the data the business wants? Who is to say that Target or Walmart won’t collect that information for an innocent purpose today only to use it to target advertisements to users at a later date? That’s the exact kind of thing that people don’t think about.

Big data and our analytics integrations are allowing it to happen with ease today. The abundance of storage means we can collect everything and keep it forever without needing to worry about when we should throw things away. Ubiquitous wireless connectivity means we are never truly disconnected from the world. Services that we rely on to tell us about appointments or directions collect data they shouldn’t because it’s too difficult to decide how to dispose of it. It may sound a bit paranoid, but you would be shocked to see what people are willing to trade without realizing it.


Tom’s Take

Given the choice between paying a few dollars for wifi access or “liking” a company’s page on Facebook, I’ll gladly fork over the cash. I’d rather give up something of middling value (money) instead of giving up something more important to me (my identity). The key for vendors investigating social wifi is simple: transparency. Don’t just tell me that you can restrict the data that a business can collect. Show me exactly what data they are collecting. Don’t rely on the generalized permission prompts that Facebook and Twitter provide. If businesses really want to know how I voted in the last election, then the wifi provider has a social responsibility to tell me that before I sign up. If shady businesses are forced to admit they are overstepping their data collection bounds, then they might just change their tune. Let’s make technology work to protect our privacy for once.

Linux Lost The Battle But Won The War

I can still remember my first experience with Linux.  I was an intern at IBM in 2001 and downloaded the IBM Linux Client for e-Business onto a 3.5″ floppy and set about installing it to a test machine in my cubicle.  It was based on Red Hat 6.1.  I had lots of fun recompiling kernels, testing broken applications (thanks Lotus Notes), and trying to get basic hardware working (thanks deCSS).  I couldn’t help but think at the time that there was great potential in the software.

I’ve played with Linux on and off for the last twelve years. SuSE, Novell, Ubuntu, Gentoo, Slackware, and countless other distros too obscure to rank on Google. Each of them met needs the others didn’t. Each tried to unseat Microsoft Windows as the predominant desktop OS. Despite a range of options and configurability, they never quite hit the mark. I think every year since 2005 has been the “Year of Desktop Linux”. Yet year after year I see more Windows laptops out there and very few being offered with Linux installed from the factory. It seems as though Linux might not ever reach the point of taking over the desktop. Then I saw a chart that forced me to look at the battle from a new perspective:

[Chart: Android dominance]

Consider that Android is based on Linux kernel version 3.4 with some Google modifications. That means it runs Linux under the hood, even if the interface doesn’t look anything like KDE or GNOME. And it’s running on millions of devices out there. Phones and tablets in the hands of consumers worldwide. Linux doesn’t need to win the desktop battle any more. It’s already ahead in the war for computing dominance.

It happened not because Linux was a clearly superior alternative to Windows-based computing.  It didn’t happen because users finally got fed up with horrible “every other version” nonsense from Redmond.  It happened because Linux offered something Windows has never been able to give developers – flexibility.

I’ve said more than once that the inherent flexibility of Linux could be considered a detriment to desktop dominance. If you don’t like your window manager you can trade it out. Swap GNOME for Xfce or KDE if you prefer something different. You can trade filesystems if you want. You can pull out pieces of just about everything whenever you desire, even the kernel. Without the mantra of forcing the user to accept what’s offered, people not only swap around at the drop of a hat but are also free to spin their own distro whenever they want. As of this writing, Ubuntu has 72 distinct projects based on the core distro. Is it any wonder people have a hard time figuring out what to install?

Android, on the other hand, has minimal flexibility when it comes to the OS.  Google lets the carriers put their own UI customizations in place, and the hacker community has spun some very interesting builds of their own.  But the rank and file mobile device user isn’t going to go out and hack their way to OS nirvana.  They take what’s offered and use it in their daily computing lives.  Android’s development flexibility means it can be installed on a variety of hardware, from low end mobile phones to high end tablets.  Microsoft has much more stringent rules for hardware running their mobile OS.  Android’s licensing model is also a bit more friendly (it’s hard to beat free).

If the market is really driving toward a model of mobile devices replacing larger desktop computing, then Android may have given Linux the lead that it needs in the war for computing dominance.  Linux is already the choice for appliance computing.  Virtualization hypervisors other than Hyper-V are either Linux under the hood or owe much of their success to Linux.  Mobile devices are dominated by Linux.  Analysts were so focused on how Linux was a subpar performer when it came to workstation mindshare that they forgot to see that the other fronts in the battle were being quietly lost by Microsoft.


Tom’s Take

I’m not going to jump right out there and say that Linux is going to take over the desktop any time soon. It doesn’t have to. With the backing of Google and Android, it can quietly keep right on replacing desktop machines as they die off and mobile devices start replicating that functionality. While I spend time on my old desktop PC now, it’s mostly for game playing. The other functions that I use computers for, like email and web surfing, are slowly being replaced by mobile devices. Whether or not you realize it, Linux and *BSD make up a large majority of the devices that people use in everyday computing. The hearts and minds of the people were won by Linux without unseating the king of the desktop. All that remains is to see how Microsoft chooses to act. With a lead like the one Android has already in the mobile market, the war might be over before we know it.

Let’s Hear It For Uptime

I recently had to have a technician come troubleshoot a phone issue at my home.  I still have a landline with my cable provider.  Mostly because it would be too expensive to change to a package without a phone.  The landline does come in handy on occasion, so I needed to have it fixed.  When I was speaking with the technician that came to fix things, I inquired about something the customer service people on the phone had said about upgrading my equipment.  The field tech told me, “You don’t want that.  Your old system is much better.”  When he explained how the low voltage system would be replaced by a full voice over IP (VoIP) router, I agreed with him.  My thoughts were mostly around the uptime of my phone in the event of a power outage.

Uptime is something that we have grown accustomed to in today’s world.  If you don’t believe me, go unplug your wireless router for the next five minutes.  If your family isn’t ready to burn you at the stake then you are luckier than most.  For the rest of us we measure our happiness in the availability of services.  Cloud email, streaming video, and Internet access all have to be available at the touch of a button.  Whether it be for work or for personal use, uptime is very important in a connected world.

It still surprises me that people don’t focus on uptime as an important metric of their solutions.  Selling redundant equipment or ensuring redundant paths should be one of the first considerations you have when planning a system.  As Greg Ferro once told me, “When I tell you to buy one switch, I always mean two.” Backup equipment is as important as anything you can install.

You have to test your uptime as well.  You don’t have to go to all the trouble of building your own chaos monkey, but you need to pull the plug on the primary every so often to be sure everything works.  You also need to make sure that your backup systems are covered all the way down.  Switches may function just fine with two control engines, but everything stops without power.  Generators and battery backups are important.  In the above case, I would need to put my entire network on a battery backup system in order to ensure I have the same phone uptime that I enjoy now with a relatively low-tech system.
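Even something as simple as the sketch below, with assumed hostnames and a TCP connect standing in for a real health check, catches the most embarrassing failure mode: a backup that quietly died while the primary stayed healthy.

```python
import socket

PRIMARY = ("primary.example.com", 443)   # assumed hostnames for illustration
BACKUP = ("backup.example.com", 443)

def reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

primary_up = reachable(*PRIMARY)
backup_up = reachable(*BACKUP)

if primary_up and not backup_up:
    print("WARNING: backup is down while the primary is healthy -- fix it now")
elif not primary_up and backup_up:
    print("Primary down, backup carrying the load")
elif not primary_up and not backup_up:
    print("CRITICAL: both paths are down")
else:
    print("Primary and backup both reachable")
```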

You also have to account for other situations as well.  Several gaming sites were taken offline recently due to the efforts of a group launching distributed denial of service (DDoS) attacks against soft targets like login servers.  You have to make sure that the important aspects of your infrastructure are protected against external issues like this.  Customers don’t know the difference between a security related attack and an outage.  They all look the same in the eyes of a person paying for your service.

We should all strive to provide the most uptime possible for everything that we do.  Potential customers may scoff at the idea of paying for extra parts they don’t currently use.  That usually falls away once you explain what happens in the event of an outage.  We should also strive to point out issues with contingency plans when we see them.  Redundant circuits from a provider aren’t really redundant if they share the same last mile. You’ll never know how this affects you until you test your settings.  When it comes to uptime, take nothing for granted.  Test everything until you know that it won’t quit when failure happens.


Don’t just plan for downtime. Forget how many nines you support. I was once told that a software vendor had “seven nines” of uptime. I responded by telling them, “That’s three seconds of downtime allowed per year. Wouldn’t it just be easier to say you never go down?” Rather than having the mindset that something will eventually fail, you should instead have the idea that everything will stay up and running. It’s a subtle shift in thinking, but changing your perception does wonders for designing solutions that are always available.
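The arithmetic behind that quip is worth seeing once: each additional nine cuts the allowed downtime by a factor of ten, and at seven nines there are only about three seconds a year left.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

for nines in range(3, 8):
    availability = 1 - 10 ** -nines
    downtime = SECONDS_PER_YEAR * (1 - availability)
    print(f"{nines} nines ({availability:.5%} uptime): {downtime:,.1f} seconds of downtime per year")
```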

The Compost-PC Era


I realized the other day that the vibration motor in my iPhone 5s had gone out.  Thankfully, my device was still covered under warranty.  I set up an appointment to have it fixed at the nearest Apple store.  I figured I’d go in and they’d just pop in a new motor.  It is a simple repair according to iFixit.  I backed my phone up one last time as a precaution.  When I arrived at the store, it took no time to determine what was wrong.

What shocked me was that the Genius tech told me, “We’re just going to replace your whole phone. We’ll send the old one off to get repaired.” I was taken aback. This was a $20 part that should have taken all of five minutes to pop in. Instead, I got my phone completely replaced after just three months. As the new phone synced from my last iCloud backup, I started thinking about what this means for the future of devices.

Bring Your Own Disposable

Most mobile devices are a wonder of space engineering. Cramming an extra-long battery in with a vibrant color screen and enough storage to satisfy users is a challenge in any device. Making it small enough and light enough to hold in the palm of your hand is even more difficult. Compromises must be made. Devices today are held together as much by glue and adhesive as they are by nuts, bolts, and screws. Gaining access to a device to repair a broken part is becoming all but impossible with each new generation.

I can still remember opening the case on my first PC to add a sound card and an Overdrive processor. It was a bit scary but led to a career in repairing computers. I’m downright terrified to pop open an iPhone. The ribbon cables are so fragile that it doesn’t take much to render the phone unusable. Even Apple knows this. They are much more likely to have the repairs done in a separate facility rather than at the store. Other than screen replacements, the majority of broken parts result in a new phone being given to the customer. After all, it’s very easy to replace devices when the data is safe somewhere.

The Cloud Will Save It All

Use of cloud storage and backup is the key to the disposable device trend.  If you tell me that I’m going to lose my laptop and all the data on it I’m going to get a little concerned.  If you tell me that I’m going to lose my phone, I don’t mind as much thanks to the cloud backup I have configured.  In the above case, my data was synced back to my phone as I shopped for a new screen protector.  Just like a corporate system, data loss is the biggest concern on a device.  Cloud storage is a lot like a roaming profile.  I can sync that data back to a fresh device and keep going after a short interruption.  Gone are the wasted hours of reinstallation of operating system and software.

Why repair devices when they can easily be replaced at little cost? Why should you pay someone to spend their time diagnosing a bad CPU or bad RAM when you can just unwrap a new mobile device, sync your profile and data, and move on with your project? The implications for PC repair techs are legion, as are the implications for manufacturers that create products that are easy to open and contain field-replaceable parts.

Why go to all the extra effort of making a device that can be easily repaired if it’s much cheaper to just glue it together and recycle what parts you can after it breaks? Customers have already shown their propensity to upgrade devices with every new cycle each year. They’d rather buy everything new instead of upgrading the old to match. That means making the device field-repairable (or upgradable) is extra cost you don’t need. Making devices that aren’t easily fixed in the field means you need to spend less of your budget training people how to repair them. In fact, it’s just easier to have the customer send the device back to the manufacturing plant.


Tom’s Take

The cloud has enabled us to keep our data consistent between devices. While it has helped blur the lines between desktop and mobile device, it has also helped blur the lines tying people to a specific device. If I can have my phone or tablet replaced with almost no impact, I’m going to elect to have that done rather than finding replacement parts to keep the old one running just a bit longer. It also means that after pulling the useful parts out of those mildly broken devices, they will end up in the same landfill that analysts are saying will be filled with rejected desktop PCs.