Data retention is a huge deal for many companies. When you say “tape backup”, the first thing that leaps to people’s minds is backup operations: servers with Digital Audio Tape (DAT) drives or newer Linear Tape-Open (LTO) units, judiciously saving those bits for the future when you might just need to dig up one or two in order to recover emails or databases. After visiting with SpectraLogic at their 2013 Spectra Summit, I’m starting to see that tape isn’t just for saving the day. It’s for saving everything.
Let’s Go To The Tape
Tape is cheap. As outlined in this Computerworld article, for small applications of fewer than six tape drives, tape is 1/6th the cost of disk backup. It also lasts virtually forever. I’ve still got VHS tapes from the 80s that I can watch if I so desire. And that’s consumer-grade magnetic media. Imagine how well the enterprise-grade stuff holds up. It’s also portable. You can eject a tape and take it home on the weekends as a form of disaster recovery. If you have at least one tape offsite in the grandfather-father-son rotation, you can be assured of getting at least some of your data back in the event of a disaster.
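If you haven’t run a grandfather-father-son rotation before, here is a minimal sketch of how a backup date maps to a tape tier. The specific rules are my assumptions for illustration (last day of the month is the “grandfather”, Friday is the “father”, everything else is a “son”); real rotation policies vary from shop to shop.

```python
from datetime import date, timedelta


def gfs_tier(day: date) -> str:
    """Classify a backup date under a simple grandfather-father-son scheme."""
    # Assumption: the last day of each month gets the monthly "grandfather" tape.
    if (day + timedelta(days=1)).month != day.month:
        return "grandfather"
    # Assumption: Friday backups are the weekly "father" tapes (Monday is 0).
    if day.weekday() == 4:
        return "father"
    # Every other day is a daily "son" tape that gets reused the soonest.
    return "son"
```

In a scheme like this, the grandfather and father tapes are the ones you would most likely keep offsite, since they are overwritten least often.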
Tape has drawbacks. It’s slow. Really slow. The sequential access of tape drives makes them inefficient as a storage medium. You can batch writes to a cluster of drives, but good luck if you ever want to get that data back in a reasonable time frame. I once heard someone refer to tape as “Write Once, Read Never”. It also has trouble scaling. In the end, you need to cluster several tape units together in order to achieve the kind of scale that you need to capture data from the virtual firehose today.
SpectraLogic launched a product called DeepStorage. That is in no way affiliated with Howard Marks (@DeepStorageNet). DeepStorage is the idea that you can save files forever. It uses a product called BlackPearl to eliminate one of the biggest issues with tape: speed. BlackPearl comes with SSD drives to use as a write cache for data being sent to the tape archive. BlackPearl uses a SpectraLogic protocol called DS3, which stands for DeepS3, to hold the data until it can be written to the tape archive in the most efficient manner. DS3 looks a lot like Amazon S3. That’s on purpose. With the industry as a whole moving toward RESTful APIs and more web interfaces, making a RESTful API for tape storage seems like a great fit for SpectraLogic.
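To make the write cache idea concrete, here is a toy model of fast storage sitting in front of a sequential device. This is not BlackPearl’s actual logic or the DS3 API; the class name, the byte threshold, and the list standing in for tape are all hypothetical. It only shows the general pattern of absorbing small writes and flushing them as one sequential batch.

```python
class WriteCache:
    """Toy model: buffer incoming objects, then flush them to 'tape' in batches."""

    def __init__(self, flush_threshold_bytes: int):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.pending = []        # (name, payload) pairs waiting in the cache
        self.pending_bytes = 0   # total size of the pending payloads
        self.tape = []           # each entry is one sequential batch of object names

    def put(self, name: str, payload: bytes) -> None:
        # Accept the write immediately; the caller never waits on the tape drive.
        self.pending.append((name, payload))
        self.pending_bytes += len(payload)
        if self.pending_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self) -> None:
        # Write everything pending as a single sequential batch, then reset.
        if not self.pending:
            return
        self.tape.append([name for name, _ in self.pending])
        self.pending = []
        self.pending_bytes = 0
```

The point of the pattern is that the slow, sequential medium only ever sees large ordered writes, while clients see the latency of the cache.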
It goes a little deeper than that, though (pardon the pun). One other thing that made me pause was LTFS – the Linear Tape File System. LTFS allows for a more open environment to write data. In the past, any data that you backed up to tape left you at the mercy of the software you used to write that data. CommVault couldn’t read Veritas volumes. ARCServe didn’t play nicely with Symantec. With LTFS, you can not only read data from multiple different backup vendors, but you can also stop treating tape drives like Write Once, Read Never devices. LTFS allows a cluster of tape units to look and act just like a storage array. A slow array to be sure, but still an array.
SpectraLogic took the ideas behind LTFS and coupled them with DeepStorage to create an idea – “buckets”. Buckets function just like the buckets you find in Amazon S3. These are user-defined constructs that hold data. The BlackPearl caches these buckets and optimizes the writes to your tape array. Where the bucket metaphor works well is the portability of the bucket. Let’s say you wanted to transfer long-term data like phone records or legal documents between law firms that are both using DeepStorage. All you need to do is identify the bucket in question, eject the tape (or tapes) needed to recreate that bucket, and then send the tapes to the destination. Once there, the storage admin just needs to import the bucket from the tapes in question and all the data in that bucket can be read. No software version mismatches. No late night panicked calls because nothing will mount. Data exchange without hassles.
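The “identify the bucket, eject the tapes” step boils down to a catalog lookup. Here is a small sketch, assuming a hypothetical catalog of (bucket, object, tape barcode) records. The record layout and the barcodes are made up for illustration and are not how DeepStorage actually tracks placement.

```python
def tapes_for_bucket(catalog, bucket):
    """Return the sorted, de-duplicated tape barcodes that hold a bucket's objects."""
    # catalog is an iterable of (bucket_name, object_name, tape_barcode) records.
    return sorted({tape for b, _obj, tape in catalog if b == bucket})


# Hypothetical catalog: two buckets spread across three tapes.
catalog = [
    ("case-files", "smith-v-jones.wpd", "TAPE001"),
    ("case-files", "doe-deposition.wpd", "TAPE002"),
    ("case-files", "exhibit-a.tif", "TAPE001"),
    ("phone-records", "2012-q4.csv", "TAPE003"),
]
```

Calling `tapes_for_bucket(catalog, "case-files")` gives you the list of tapes to pull and ship, and nothing else needs to leave the library.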
The Tape Library of Congress
The ideas here boggle the mind. While at the Spectra Summit, we heard from companies like NASCAR and Yahoo. They are using BlackPearl and DS3 as a way to store large media files virtually forever. There’s no reason you can’t do something similar. I had to babysit a legal server migration one night because it had 480,000 WordPerfect documents that represented their entire case log for the last twenty years. Why couldn’t that be moved to long-term storage? For law offices that still have paper records of everything and don’t want to scan it all in for fear of an OCR mistake, why not just make an image of every file and store it on an LTFS volume fronted by DS3?
The flexibility of a RESTful API means that you can create a customized interface virtually on the fly. Afraid the auditors aren’t going to be able to find data from five years ago? Make a simple searching interface that is customized to their needs. Want to do batch processing across multiple units with parallel writes for fault tolerance? You can program that as well. With REST calls, anything is possible.
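As a rough idea of what an auditor-facing search might look like, here is a sketch that filters object metadata by keyword and date range. I’m assuming the metadata has already been pulled down through whatever listing calls the API provides. The `ArchivedObject` type and its fields are hypothetical and not part of DS3.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedObject:
    bucket: str
    name: str
    archived_on: date


def search(objects, keyword, start, end):
    """Return objects whose name contains keyword and whose date is in [start, end]."""
    needle = keyword.lower()
    return [
        obj
        for obj in objects
        if needle in obj.name.lower() and start <= obj.archived_on <= end
    ]
```

A thin web form on top of a function like this is probably all most auditors would ever need.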
DS3 is going to enable you to keep data forever. No more worrying about throwing things out. No need to rent storage lockers for cardboard boxes full of files. No need to worry about the weather or insects. Just keeping the data center online is enough to keep your data in a readable format from now until forever.
I was a guest of SpectraLogic for their 2013 Spectra Summit. They paid for my flight and lodging during the event. They also provided a t-shirt, a jacket, and a 2 GB USB drive containing marketing collateral. They did not ask for any consideration in the writing of this review, nor were they promised any. The conclusions reached herein are mine and mine alone. In addition, any errors or omissions are mine as well.