On ZFS data recovery
ZFS, the Zettabyte File System, is a fortress for data. You feel its true strength during a disaster. Here are some of my experiences with ZFS data recovery.
If you’ve accidentally destroyed a pool…
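That’s not a problem. zpool import -D (capital D; lowercase -d only sets the directory to search for devices) lists destroyed pools, and zpool import -D followed by the pool name will import it back for you.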
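A minimal sketch of those two steps as shell functions. The function names are mine, and `tank` in the usage note is a hypothetical pool name.

```shell
# List pools that were destroyed but whose labels are still on the disks.
list_destroyed_pools() {
    zpool import -D
}

# Import a destroyed pool back by name.
# zpool may ask for -f if the pool looks like it is in use elsewhere.
undestroy_pool() {
    zpool import -D "$1"
}
```

Run `list_destroyed_pools` first to see what is recoverable, then `undestroy_pool tank`.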
If some other disaster strikes…
First of all, keep calm. Write down everything you’re going to do.
Second, if possible, work on a copy of your data. Here’s a trick: if you have a spare disk or disks slightly larger than your broken pool, create a ZFS pool on them, copy the broken pool’s partition(s) into file(s) there, and take a snapshot. Now you can work on the copy and roll back after any mistake.
Snapshot often. You’ve tried something and it seemed to take you closer to recovery? Snapshot. You’ve messed everything up? Roll back.
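The copy-and-snapshot trick could look roughly like this. It is a sketch, not a recipe: the function names, the `rescue` pool and the `images` dataset are all hypothetical, and the device paths are whatever your system calls them.

```shell
# Create a scratch pool on the spare disk, with a dataset for the images.
make_workspace() {      # usage: make_workspace rescue /dev/SPARE
    zpool create "$1" "$2" && zfs create "$1/images"
}

# Copy one partition of the broken pool into an image file.
# conv=noerror,sync keeps going past read errors and pads the failed block with zeros.
image_partition() {     # usage: image_partition /dev/SOURCE /rescue/images/NAME.img
    dd if="$1" of="$2" bs=1048576 conv=noerror,sync
}

# Snapshot the dataset that holds the images before each experiment.
checkpoint() {          # usage: checkpoint rescue/images LABEL
    zfs snapshot "$1@$2"
}

# Throw away everything done since the named snapshot.
undo_to() {             # usage: undo_to rescue/images LABEL
    zfs rollback -r "$1@$2"
}
```

The broken pool can then be imported from the image files with `zpool import -d /rescue/images`. Export it again before rolling the images back, so that nothing has the files open while they change underneath.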
Third, get the latest OpenZFS release. It keeps getting more stable, and new features are occasionally added to zdb.
Fourth, read the documentation. Do it while your data is being copied, and do it twice if you have no place to copy it and have to work on your live (hopefully) pool.
Always import a broken pool with -o readonly=on. It is much, much safer.
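A read-only import might look like this. The function name and the alternate root are my own additions: -R mounts the pool’s filesystems under a separate directory so they don’t land on top of the running system.

```shell
# Import a pool read-only, mounted under an alternate root.
import_readonly() {     # usage: import_readonly POOL /mnt/rescue
    zpool import -o readonly=on -R "$2" "$1"
}
```

For example, `import_readonly tank /mnt/rescue`, where both names are placeholders.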
The case of broken disk
It’s actually the simplest one. ZFS is copy-on-write, so in most cases you have at least a fair chance to get your data back.
Just turn off verification of data and metadata at import (check the tunable names for your system). On FreeBSD, run
sysctl vfs.zfs.spa.load_verify_metadata=0
sysctl vfs.zfs.spa.load_verify_data=0
and zpool will import whatever you have on your disks. Some of the data may be broken, but you did set checksum to fletcher4, didn’t you?
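On Linux the same two tunables are, as far as I know, exposed as zfs module parameters rather than sysctls. A sketch under that assumption; the function name and the ZFS_PARAMS variable are mine:

```shell
# Assumption: OpenZFS on Linux keeps its module parameters here.
ZFS_PARAMS=${ZFS_PARAMS:-/sys/module/zfs/parameters}

# Switch import-time verification off (0) or back on (1). Needs root.
set_load_verify() {     # usage: set_load_verify 0
    echo "$1" > "$ZFS_PARAMS/spa_load_verify_metadata"
    echo "$1" > "$ZFS_PARAMS/spa_load_verify_data"
}
```

Call `set_load_verify 0` before the import and `set_load_verify 1` once you are done.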
The case of exploded data
Sometimes your pool just explodes. I’ve met this disaster once in, say, 200 server-years of operation. It looks like I hit a bug when destroying a filesystem that used the shiny new zstd compression (as of January 2022, don’t use it unless you’re 100% sure). ZFS explodes much less often than any other system I’ve worked with, and it allows recovery where all the others do not.
zpool import claimed that the pool was in the FAULTED state with corrupted data. zdb agreed and did not show any datasets.
Turning off data verification made no difference.
An attempt to import with -FX (read the manual) ended in a panic during import.
Then I read, read and read until I got the idea. zdb -u (combined with -l on a device of the pool, it dumps the uberblocks kept in the labels) gave me a list of uberblocks, and with it a list of possible transactions!
I asked zpool to rewind to the oldest of them (zpool import -T takes the transaction number), and, wow, my data was there. The state was about 30 minutes older than the crash, and some 2–3 files (out of some 3 terabytes) were broken, but that was enough for me and I did not try later transactions.
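Roughly, the rewind looks like the sketch below. It assumes that `zdb -lu` on a pool device prints each uberblock with `txg = N` and `timestamp = N UTC = ...` lines; check the output of your version before trusting the parser. The function names are mine.

```shell
# Read `zdb -lu` output on stdin and print "txg timestamp" pairs, oldest first.
# The same uberblock shows up once per label, so duplicates are dropped.
list_txgs() {
    awk '$1 == "txg" { txg = $3 } $1 == "timestamp" { print txg, $3 }' | sort -n -u
}

# Import the pool as it was at the given transaction, read-only.
# -T is hazardous on a writable pool, so stay read-only until the data is copied off.
rewind_import() {       # usage: rewind_import TXG POOL
    zpool import -o readonly=on -T "$1" "$2"
}
```

Typical use is `zdb -lu /dev/DEVICE | list_txgs` to pick a transaction, then `rewind_import` with that number and the pool name.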
After you’ve settled the case…
Now it’s time to think about the things you’ve missed. ZFS is cool, but RAID does not help in some cases. Make backups, and do not keep them in the same room. I’ve seen a server room after a fire, and one after a flood. Keep your backup somewhere far away, maybe even in a different country. Restoring from a backup is generally faster than recovering a broken pool, and gives you a 100% predictable result.
Check your disks often. SMART self-tests and zpool scrub, automated, at least monthly, with working error reporting, are your best friends.
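A sketch of what the monthly job could run. It assumes smartctl from smartmontools is installed; the function names are mine, and how you schedule it (cron, periodic, a systemd timer) and where the error report goes is up to you.

```shell
# Start a scrub and a long SMART self-test on every disk of the pool.
monthly_check() {       # usage: monthly_check POOL /dev/DISK...
    pool=$1
    shift
    zpool scrub "$pool"
    for disk in "$@"; do
        smartctl -t long "$disk"
    done
}

# Succeed only if zpool reports the pool as healthy; hook your alerting to a failure.
pool_is_healthy() {     # usage: pool_is_healthy POOL
    zpool status -x "$1" | grep -q "is healthy"
}
```

Both the scrub and the self-test run in the background, so check the results later with `pool_is_healthy` and `smartctl -a`, not right after starting them.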
Beware of the rare but deadly dying-twins case: if your disks, due to a firmware bug, are programmed to die after a certain amount of data written, and you’ve put them in a RAID, they’ll die together (with all of your data). So it’s a good idea to stagger the disks by a month when setting up or replacing a RAID.
Replacing. HDDs live for 3–5 years (sometimes less, sometimes more, but after three years their time is running out). Replacing them every 3 years protects you from disaster and also gives you bigger and faster disks. Remember: those who do not upgrade on schedule end up recovering at the most inappropriate time.