This content is submitted by a BigAdmin user. It has not been reviewed for technical accuracy by Sun Microsystems, though it may have been lightly edited to improve readability. If you find an error or would like to comment on the article, please contact the submitter or use the comment field at the bottom of the article. Community submissions may not follow Sun trademark guidelines. For information on Sun trademarks, please see http://www.sun.com/suntrademarks/.
 
 

Using ZFS to Fight Data Rot, the Silent Killer

Kevin McAleer, January 2009

Previously, I wrote an article for BigAdmin about why I chose the ZFS file system to ensure my data was safe: How I Used Solaris OS and ZFS to Solve My Mac OS X Storage Problem.

One of the reasons I chose the ZFS file system over Apple HFS+, Linux ext3, or Microsoft Windows NTFS is that ZFS stores a checksum for all the data written to it and verifies that checksum whenever the data is read back. This might seem unnecessary, a little obsessive, or even CPU-hungry, but it is essential for long-term data storage and for detecting data rot.

So what is data rot, why should I fear it, and most importantly, what can I do about it?

Quite simply, data rot is the result of tiny changes in the magnetic particles that make up the media in hard disks. The effect this has on your data is random but predictable: data loss. The corruption might hit the contents of a file, the file header that describes those contents, or, worse, the file system metadata (such as the file allocation table) that records where the file is stored. The file might be a system file or a data file; either way, it's eventually going to be bad news.

According to a recent study, Analyzing the Effects of Disk-Pointer Corruption (pdf), 0.66% of SATA disks and 0.06% of Fibre Channel disks developed corruption in 17 months of use. The same study describes how some kinds of corruption are worse than others and explains that most modern file systems are unable to deal with them effectively (excluding the ZFS file system, of course!).

So you're probably thinking, "Doesn't chkdsk detect and correct this kind of problem (or the fsck utility in Linux, or Disk Utility in Mac OS X)?" Well, maybe, maybe not, depending on where the corruption occurs. If the corruption occurs in the file system structure, then see the References listed below. If it occurs in the file content, then the answer is "probably not": these tools check that the file system structure is consistent, not that the contents of each file are still what was originally written.

We've established what data rot is and how existing tools are not suited to detecting, correcting, or preventing it. Now, on to why you should care about this...

How important is your data? I mean, really? Think about it. I personally have the following data stored on my computer: photos and videos of my daughter since birth, software downloads I've purchased (including Adobe Photoshop and Adobe Dreamweaver, which weren't cheap), my iTunes library (on which I must have spent several hundred, if not thousands, of dollars), and various work projects.

I'm not prepared to let anything happen to this data. So I've taken steps to avoid obvious problems:

  • The file server is a dedicated box.
  • My data is separated out to avoid accidental deletion.
  • I back up my data regularly (on the Mac with Time Machine and on OpenSolaris with the ZFS send and receive commands).
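As a rough sketch of the ZFS half of that backup routine, the short Python helper below only builds the command lines for a snapshot followed by a send and receive; it does not run them. The dataset, snapshot, and target names are hypothetical placeholders, not the names used on my server.

```python
import shlex


def backup_commands(dataset, snapshot, target):
    """Return the shell commands for a snapshot-based ZFS backup.

    Nothing is executed here; the caller decides how to run the commands.
    All three arguments are illustrative names chosen by the caller.
    """
    snap = f"{dataset}@{snapshot}"
    return [
        # Take a read-only, point-in-time snapshot of the dataset.
        f"zfs snapshot {shlex.quote(snap)}",
        # Stream the snapshot and recreate it under the backup dataset.
        f"zfs send {shlex.quote(snap)} | zfs receive {shlex.quote(target)}",
    ]


if __name__ == "__main__":
    # Hypothetical pool and dataset names, for illustration only.
    for cmd in backup_commands("tank/photos", "weekly-01", "backup/photos"):
        print(cmd)
```

Printing the commands first makes it easy to review them before running anything against a real pool.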

I've also taken steps to design my storage solution correctly: I use several disks in a RAID configuration (RAID-Z with a hot spare) to ensure a single disk failure can't cause data loss.

Finally, I chose the ZFS file system because I know that it stores a checksum for every block it writes and verifies that checksum on every read, ensuring that my data is as it was when it was written to disk.
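To make the idea concrete, here is a toy sketch in Python of checksum-on-write and verify-on-read. It is not how ZFS is implemented: the class name, the in-memory "disk", and the use of SHA-256 for every block are assumptions made purely for illustration.

```python
import hashlib


class ChecksummedStore:
    """Toy block store: records a checksum at write time, verifies it at read time."""

    def __init__(self):
        self.blocks = {}     # block id -> bytearray holding the data (the "disk")
        self.checksums = {}  # block id -> hex digest recorded when the block was written

    def write(self, block_id, data):
        self.blocks[block_id] = bytearray(data)
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        data = bytes(self.blocks[block_id])
        # Recompute the checksum and compare it with the one stored at write time.
        if hashlib.sha256(data).hexdigest() != self.checksums[block_id]:
            raise IOError(f"checksum mismatch in block {block_id}")
        return data


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write(0, b"photo of my daughter")
    store.blocks[0][3] ^= 0x01  # simulate data rot: flip a single bit on "disk"
    try:
        store.read(0)
    except IOError as err:
        print("detected:", err)
```

A file system without checksums would have returned the damaged bytes without complaint; here the single flipped bit is caught the moment the block is read.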

I run a "scrub" of the ZFS file system every week to ensure that no data has become corrupted by data rot, and this week, it detected over 20 instances of it. Thankfully, ZFS effortlessly replaced the corrupted data with good data held elsewhere on disk (thanks to RAID-Z) without any loss whatsoever.
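The sketch below shows, in simplified form, what a scrub does: it walks every block, verifies it against the checksum recorded at write time, and repairs any bad copy from a good one. To keep it short, it assumes two mirrored copies of each block rather than RAID-Z parity, and the function names are my own. On a real pool the scrub is started with the zpool scrub command and its results are shown by zpool status.

```python
import hashlib


def digest(data):
    """Checksum used throughout this sketch (SHA-256, chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies, checksums):
    """Verify every block on every copy and repair bad copies from a good one.

    copies:    list of dicts, one per disk, mapping block id -> bytes
    checksums: dict mapping block id -> digest recorded at write time
    Returns (repaired_copies, unrecoverable_blocks).
    """
    repaired = 0
    unrecoverable = 0
    for block_id, expected in checksums.items():
        bad = [c for c in copies if digest(c[block_id]) != expected]
        if not bad:
            continue  # every copy of this block is intact
        good = next(
            (c[block_id] for c in copies if digest(c[block_id]) == expected),
            None,
        )
        if good is None:
            unrecoverable += 1  # no intact copy left to repair from
            continue
        for c in bad:
            c[block_id] = good  # overwrite the rotten copy with known-good data
            repaired += 1
    return repaired, unrecoverable


if __name__ == "__main__":
    disk_a = {0: b"holiday video", 1: b"itunes library"}
    disk_b = dict(disk_a)
    sums = {block_id: digest(data) for block_id, data in disk_a.items()}
    disk_a[1] = b"itunes librarx"  # simulate data rot on one disk
    print(scrub([disk_a, disk_b], sums))
```

The point of scrubbing regularly is that damage is found and repaired while a good copy still exists, rather than being discovered years later when the file is finally opened.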

Conclusion: To protect your data against data rot, choose the ZFS file system.

Although I didn't lose data, the experience did drive me to write this article, because I wanted to make people aware of this issue.

References

About the Author

Kevin McAleer is the director of Advice Factory, offering advice and IT consultancy services to businesses in the UK. He is an Apple Mac fan and also an evangelist for Sun's ZFS technology.
