Critical Data Archiving - CD? DVD? USB? What?
Written by Virgil   
Tuesday, 02 February 2010 18:10

 

Lately, I've been talking with some IT gurus and some other normal people about data integrity for long-term storage. There's talk of a 'catastrophe' in 2012, but chances are it will be just as big as the Millennium Bug - nada, nothing, zip - unless your mission-critical software is based on the Mayan calendar.

 

Yes, the words "data integrity" and "long term storage" often make most of us want to run the other way, especially if they come up in a social setting. So, I'm going to keep it light and breezy at the beginning and then get into the nerdy stuff a bit later in the article so that your eyes won't glaze over too soon. 

 

Almost everything we do these days is digital: financial, medical, communication, family photos and videos. The media it's all stored on are optical, magnetic or silicon-based. Each of these is flawed for archival purposes - some more than others.

 

In the cases above, all of the information is something that you are likely to want to keep for quite a few years. That's where the problem starts with all digital media. 

 

The simplest solution is often the best, and in this case it's paper. Want to keep it a long time? Print it on paper - either by inkjet or laser - and it's going to be around a lot longer than you are, if it's stored correctly. If.

 

Let's have a look at your digital media storage options:

  • Replicated CD / DVD: costly and only good if you want more than 500 copies. Not really useful as an archiving medium. Lifetime? More than 20 years easily.
  • CD-R / DVD-R: cheap, pretty effective. The results of long term storage can be a bit iffy, depending on the media used and the condition of the burner used to make the disc. Lifetime? Two to five years on average, but your mileage may vary up or down. 
  • Backup Tapes: Not cheap and not for the average user. A reasonable quality DLT drive will set you back at least five thousand dollars and the tapes are not cheap either, not to mention bulky to store and often a bit sensitive to environmental factors. Head maintenance schedules need to be adhered to which adds more expense. Also, not all tapes written by all DLT drives on all operating systems can be read by each other. Data interchange fail. Expected lifetime of the tape? Somewhere between four and ten years on average but there are claims of much longer.
  • Hard disk: Commonly available and cheap as. However, they tend to last between about two and five years in most cases.
  • Flash memory (SD Cards / USB Drives / etc): Not particularly cheap per Gigabyte. Cheap ones fail sooner. No ifs or buts about it, but then again, same thing goes for all the other options above. How long will a good one last? About eight to ten years on a set-and-forget basis. 
  • RAID Arrays of Hard Disks (RAID 5): Moderately costly and really only for the enthusiast or the professional user. With five or more drives, plus configured redundant drives, they're hot and noisy and not cheap to run. However, if a drive fails they will keep running and can automatically rebuild the content from the dead drive onto one of the redundant drives, if they're configured that way. Top stuff! However - as all the drives are likely to have been bought at the same time, a second drive may fail soon after the first - perhaps while the rebuild of the dead drive is occurring - which means all your data is pretty much gone. Lifetime? Hard to assess accurately because of statistical failure of one or more components in a complex system. At a guess? Bet on two to five years before a failure, but make sure you've got those redundant drives installed and configured for a better chance of survival.
  • USB RAID Arrays: A fairly recent innovation that has only really become possible because of USB 3.0. I've seen some earlier homebrew ones, but they were very, very slow and not practical except for the serious enthusiast. However, because they're sealed units, the whole "statistical failure of a complex system" issue comes into play, and they're not generally repairable in the manner of a RAID array based on normal hard disks. Lifetime? Fingers crossed, but hope it's more than the warranty.
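
For the nerds: the "statistical failure of a complex system" point can be put into rough numbers. Here's a minimal Python sketch, assuming every drive fails independently and with the same yearly probability. The 5% figure and the function name `prob_any_failure` are illustrative assumptions, not measured data.

```python
def prob_any_failure(n_drives: int, annual_failure_rate: float, years: float = 1.0) -> float:
    """Chance that at least one of n independent drives fails within the period."""
    # Probability that a single drive survives the whole period.
    survive_one = (1.0 - annual_failure_rate) ** years
    # All drives must survive for the array to see no failure at all.
    return 1.0 - survive_one ** n_drives


# Assumed figure for illustration only: 5% chance of failure per drive per year.
for n in (1, 5, 8):
    chance = prob_any_failure(n, 0.05)
    print(f"{n} drive(s): {chance:.1%} chance of at least one failure in a year")
```

With those assumed numbers, a five-drive array has roughly a 23% chance of losing at least one drive in a year. Drives bought in the same batch tend not to fail independently, so the real worry is failures that cluster together, which this simple model doesn't capture.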

 

As you can see, pretty much none of the digital media - CD-R, DVD-R, hard disk, tape and RAID - is, on its own, sufficiently reliable to protect really valuable data - and that's just for normal business and personal use! In specialist applications such as at sea on ships and boats or in industrial facilities, let alone for government and corporate archivists, these general estimates of useful data carrier life reduce dramatically. There are quite a few independent papers which agree on this point. Some of the top sources for knowledge about this are the US National Archives and the National Archives of Australia. There are also some interesting papers about it on UNESCO's 'Memory of the World' project pages.

 

Some people are really enthusiastic about "the cloud", referring to offsite backup or immediate storage of files via the internet to a remote server where you pay per Gigabyte, per month, etc.

The cloud adds another failure vector: connectivity. If you can't connect, your data or content may as well be in the clouds.

Even if the service provider you choose has exemplary connectivity and redundancy, there's a chance that they will cut you off and remove your content if you don't pay their bills or if they go bust. Or if they are not telling you the truth about their backup generator, redundant connections and all those fine buzzwords that seem to flow from the spiel you read before you pay the first time.

That's a problem that none of the physical media above have: reliance on ongoing payments to ensure your data integrity, and reliance on the other guy not going out of business and your data not ending up at a law-department auction, available for all to see and commercially exploit.

 

The easiest way to overcome these challenges is by data rotation of physical digital media which you own, have and hold. This also protects against the obsolescence of your chosen playback device in the case of removable media (CD, DVD, ZIP drive, Bernoulli Box, Jaz drive, USB, etc).

While it's not perfect, it's pretty much the only solution available right now. If data rotation periods can be safely extended, then that method will be a winner which saves valuable content and time.

 

Data rotation is part of an ongoing method to preserve archived or collected data beyond the probable MTTF (Mean Time To Failure) or MTBF (Mean Time Between Failures) of your chosen data carrier. It usually involves checking the data integrity at a specified service interval to make sure the content is still readable and in some cases transferring it to whatever new or state of the art storage media is available at the time.
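
The "checking the data integrity at a specified service interval" step can be sketched in a few lines of Python. This is only one possible approach, assuming you record a SHA-256 checksum for every file when the archive is created and compare against it at each check. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for this illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Record a checksum for every file at the time the archive is written."""
    return {
        p.relative_to(archive_dir).as_posix(): sha256_of(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(archive_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing, unreadable or no longer match."""
    problems = []
    for relative_name, expected in manifest.items():
        try:
            actual = sha256_of(archive_dir / relative_name)
        except OSError:
            # Missing or unreadable file: the media may be starting to fail.
            problems.append(relative_name)
            continue
        if actual != expected:
            problems.append(relative_name)
    return problems
```

In use, you would build the manifest once, keep a copy of it somewhere other than the archive media itself, and run the verify step at each service interval. Any file it reports is your cue to restore from another copy and move the lot to fresh media.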

Need a simple example? It's something along the lines of transferring Grandma's 8 mm movies to Betamax in the early 1980s and then transferring them to VHS in the 1990s... and onwards to DVD-R in the early 2000s. See? Data rotation in normal, everyday life - preserving your precious family memories.

 

In a commercial environment for legal, medical and financial practitioners it's much the same.

Data from finished jobs gets recorded onto CD-R or DVD-R and is sent to your off-site storage facility, perhaps never to be seen again or at least until the statute of limitations in your jurisdiction expires - if there even is a statute of limitations relating to any allegations later brought against you.

Chances are the data will expire before the statute does, which could be rather embarrassing... or leave you in a legal minefield or worse.

Some organisations that undertake storage of professional documents have a proper climate controlled vault. Some do not. Be aware. 

 

Data integrity, as dull as it is, is the backbone of our personal memories, our necessary records and our professional integrity. Don't let anyone persuade you to the contrary. Good, truthful data equals good, truthful times.

 

If you're not able to involve yourself in some of the superior methods above, either for lack of financial resources or for lack of willingness, then at least consider using top grade recordable media. It's a little more expensive, but at least you can be sure that your valuable personal content or critical corporate data is reasonably well protected.

 

And yes, we're looking for a better solution for you.