NASLite Network Attached Storage

www.serverelements.com
Task-specific simplicity with low hardware requirements.
All times are UTC - 5 hours [ DST ]




Posted: Wed Jan 04, 2012 9:11 am

Joined: Fri Feb 25, 2005 11:50 pm
Posts: 139
I've started using one of our servers as a "mirror machine" to preserve family photos/videos. All rsyncs are local, and eventually I'll have a swappable drive cage in place to facilitate easy removal for offsite storage.

Everything seems to be working well, but I'd still like to hear of any known "best practices" for maintaining large archival file collections. I'm already working on "locking down" potential file corruption with the "ICE ECC" parity archive product http://www.ice-graphics.com/ICEECC/IndexE.html (kind of a poor man's ZFS).

Is the NASLite rsync completely trustworthy for bit-level mirroring, and is there anything I should watch for while generating these multiple backups? I'm currently using WD20EARS Advanced Format drives for their power efficiency and error correction.


Posted: Wed Jan 04, 2012 10:20 pm

Joined: Sun Apr 02, 2006 9:05 pm
Posts: 1688
Location: Up State NY in the USA!!!!
As far as I know the only accepted media for long term storage is still tape. DLT is used at this time and holds a fair amount of data per cartridge. There are archival CDs and DVDs that use dye that has a much longer life rating for data retention.

Hard drives are not a good choice, as the areal density on the latest drives is very high and the magnetic domains are very small and prone to degradation over time. Remember that they use some form of PRML to recover said data, kinda scary!

Mike


Posted: Wed Jan 04, 2012 10:38 pm

Joined: Fri Feb 25, 2005 11:50 pm
Posts: 139
mikeiver1 wrote:
As far as I know the only accepted media for long term storage is still tape. DLT is used at this time and holds a fair amount of data per cartridge. There are archival CDs and DVDs that use dye that has a much longer life rating for data retention.

Hard drives are not a good choice, as the areal density on the latest drives is very high and the magnetic domains are very small and prone to degradation over time. Remember that they use some form of PRML to recover said data, kinda scary!

Mike


I know... but these are large TIFF and PNG files from scanned film. The JPEGs I keep on S3 (which supposedly controls data rot), but for files this large the options are more limited (cost/media availability). I also still use DVD-RAM and, until recently, tape. That tape had an order of magnitude better error spec than any hard drive and it still failed me. Badly. Only the other backups saved the day.

I keep these large files around as a labor-saving device: they spare me from repeating the labor-intensive scanning process with frozen film. And as the software gets better, I reprocess these "originals" into better JPEGs.


Posted: Wed Jan 04, 2012 10:43 pm

Joined: Fri Feb 25, 2005 11:50 pm
Posts: 139
mikeiver1 wrote:
As far as I know the only accepted media for long term storage is still tape. DLT is used at this time and holds a fair amount of data per cartridge. There are archival CDs and DVDs that use dye that has a much longer life rating for data retention.

Hard drives are not a good choice, as the areal density on the latest drives is very high and the magnetic domains are very small and prone to degradation over time. Remember that they use some form of PRML to recover said data, kinda scary!

Mike



I'm really trying to get a feeling for the validity of the rsync function (which also has many command-line options). Are the settings used with NASLite conservative?

I need to know, given all these multiple mirrors I'm "burning". I'm trying to make my ad hoc setup more systematic.
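
For context on the "many command line options": by default rsync decides which files to transfer with a quick check of size and modification time, and only compares file contents when asked to with --checksum. The thread does not say which options NASLite passes to rsync, so the following is only a generic sketch of how a mirror could be verified by content without changing it. The paths /mnt/master and /mnt/mirror1 and both function names are hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_verify_command(src: str, dst: str) -> List[str]:
    """Build an rsync command that compares two trees by content, read-only."""
    return [
        "rsync",
        "--archive",          # recurse and preserve times, permissions, etc.
        "--checksum",         # compare file contents, not just size and mtime
        "--dry-run",          # report only; never modify the destination
        "--itemize-changes",  # print one line per item rsync would update
        src.rstrip("/") + "/",  # trailing slash: compare directory contents
        dst.rstrip("/") + "/",
    ]


def differing_items(src: str, dst: str) -> List[str]:
    """Run the check and return rsync's itemized lines (empty list = in sync)."""
    result = subprocess.run(
        build_verify_command(src, dst),
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


# Hypothetical mount points, shown only to illustrate the resulting command.
print(shlex.join(build_verify_command("/mnt/master", "/mnt/mirror1")))
```

A checksum pass reads every file on both sides, so on multi-terabyte mirrors it is slow. It also only shows that the two copies differ, not which one is correct.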


Posted: Thu Jan 05, 2012 10:24 pm

Joined: Sun Apr 02, 2006 9:05 pm
Posts: 1688
Location: Up State NY in the USA!!!!
Well, you could go with a RAID 6 array with a hot spare. With double parity, you would have to have quite a bit of failure to lose it all.

As far as tape is concerned, you might look at a DLT or AIT drive. These are not "Consumer" grade devices. These are used extensively in the enterprise. That being said there are still issues with tape.

Likely your best solution is to simply make a complete copy of your files to another drive or drives and take them off site. Do a complete backup every month and hope for the best. You may end up with one or two corrupted files in a worst-case situation.

Mike


Posted: Thu Jan 05, 2012 10:59 pm

Joined: Fri Feb 25, 2005 11:50 pm
Posts: 139
mikeiver1 wrote:
Well, you could go with a RAID 6 array with a hot spare. With double parity, you would have to have quite a bit of failure to lose it all.



Likely your best solution is to simply make a complete copy of your files to another drive or drives and take them off site. Do a complete backup every month and hope for the best. You may end up with one or two corrupted files in a worst-case situation.

Mike


Actually, that's what I'm in the process of doing, hence the questions concerning rsync. Wiki claims a possible "hash collision probability" of 2^-160, but I'm betting that is when rsync is used at the limits of its "accuracy". So I'm wondering if that is really the default for NASLite? If so, pretty good deal...
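
One plausible reading of the 160-bit figure, offered as an assumption rather than something the thread confirms, is that it adds the 32 bits of rsync's rolling checksum to a 128-bit strong hash (MD4 or MD5, depending on rsync version). A quick sketch of that arithmetic, with the two widths labeled as assumptions:

```python
from fractions import Fraction

ROLLING_BITS = 32   # assumed width of rsync's weak rolling checksum
STRONG_BITS = 128   # assumed width of the strong (MD4/MD5) checksum


def collision_probability(bits: int) -> Fraction:
    """Chance that two unrelated inputs share one ideal hash of `bits` bits."""
    return Fraction(1, 2 ** bits)


p = collision_probability(ROLLING_BITS + STRONG_BITS)
print(float(p))  # on the order of 7e-49
```

That number describes an accidental match between two different blocks when both checksums are compared. It says nothing about files that rsync never hashes because their size and timestamp already agree.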

Google's technical paper regarding drive errors/failures suggests that "multiple copies" on different drives is the best protection strategy - yet I've been unable to find any "voting" difference applications which could identify the one file out of five which might likely be bad. You would think that if Google relied on this "multiple copy strategy" - such tools would be all over the place? And this "Greyhole" project seems bent on doing the Google plan on a smaller scale - but I don't see anything suggesting they really know how to select that one bad file for removal. (Looney Tune Coyote rapidly shaking head from side to side)
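
The "voting" idea can be sketched in a few lines once a digest has been computed for each copy of a file. This is a minimal illustration, not an existing tool: the function name `vote` and the drive labels are hypothetical, and it assumes the copies fail independently. If a corrupted master was already rsynced to every mirror, all copies will agree on the bad data.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def vote(digests: Dict[str, str]) -> Tuple[Optional[str], List[str]]:
    """Pick the digest held by a strict majority of copies.

    `digests` maps a copy label (e.g. a drive name) to that copy's hex digest.
    Returns (majority digest, labels of copies that disagree with it).
    If no digest has a strict majority, returns (None, all labels), because
    there is then no basis for deciding which copy is good.
    """
    if not digests:
        raise ValueError("need at least one copy to vote on")
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(digests) / 2:
        return None, sorted(digests)
    return winner, sorted(name for name, d in digests.items() if d != winner)


# Five hypothetical mirrors of one file; one copy has drifted.
copies = {"disk1": "aa11", "disk2": "aa11", "disk3": "ff00",
          "disk4": "aa11", "disk5": "aa11"}
print(vote(copies))  # ('aa11', ['disk3'])
```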

I'm pretty frustrated about this issue, because the harder you dig - the messier it looks. Beginning to believe that no one really knows what they are talking about. Everything we "know" is just stuff endlessly repeated with no one doing any real testing work.

Guess that's why the Smithsonian still uses frozen film? :) Arrghhhhhhhh

Mike, you've always been very helpful here, and I appreciate your input. Anything you can provide regarding the "envelope" of rsync function within NASLite would be appreciated. If NASLite *is* doing this correctly, then perhaps half my battle is won.

35mm converted to TIFF on a Nikon negative scanner yields a file averaging 150MB each. I sometimes compress them a little to PNG, but as you noted in your comment regarding high-density disks, that's askin' for it.

Got my SATA swap card cage in, and plan to finish up those mirrors next week (total 5). If NASLite does rsync correctly, I'd say it's probably the most cost-effective archiving tool available. All investment goes to disks (and I bought mine before the Thailand price hike).


Posted: Fri Jan 06, 2012 6:12 pm

Joined: Sun Apr 02, 2006 9:05 pm
Posts: 1688
Location: Up State NY in the USA!!!!
As far as finding errors in a file stored on multiple drives goes, there are a number of ways that will likely work, none of them fast though.

You could generate an MD5 hash for each copy and have a script do a simple compare of them all. There are also bit-for-bit comparators that one could run against the drives, but again this would be slow.
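
A minimal sketch of that kind of script, assuming each mirror is mounted as an ordinary directory tree. The function names are hypothetical; the only library calls used are Python's standard hashlib and os modules.

```python
import hashlib
import os
from typing import Dict, List


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so 150MB scans never sit fully in RAM."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path relative to `root` to its MD5 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = md5_of_file(full)
    return manifest


def compare_manifests(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing on one side or hash differently."""
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))
```

Saving each manifest to a file alongside the mirror would also let a later run detect drift on a single drive, without needing a second copy to compare against.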

To be honest, there are really no fast ways to check the consistency of a saved file other than to save it with parity or double parity. This will eat a S#!t load of storage to say the least, and it is not foolproof either, but I think it is likely the best and most reasonable solution for you at this point.
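
As a rough illustration of what single parity buys you, here is a toy XOR-parity sketch over three equal-sized blocks. It is not how any particular RAID controller or parity archive tool is implemented, and the block contents are made up. It shows the basic property: if you know which one block is lost, the parity block plus the survivors rebuild it.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    if any(len(block) != len(blocks[0]) for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Made-up data blocks standing in for stripes on three drives.
data = [b"scan", b"film", b"tiff"]
parity = xor_blocks(data)

# Pretend the second block is lost; rebuild it from the rest plus parity.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])  # True
```

Note the limit: if a block is silently altered rather than missing, single parity can show that the stripe no longer adds up but cannot say which block is wrong, which is one reason parity alone is not foolproof.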

The real underlying issue here is Entropy. The physics law simply proclaims that there is a natural tendency for things, all things, to move from an orderly to disorderly state and no known form of storage is immune to this.

My money would be on a hardware RAID6 controller and drives.

Mike

