complete pool corruption and lost trust

All your general support questions for OpenZFS on OS X.

Postby hansmueller » Wed May 26, 2021 3:02 am

Hello there,

I recently reformatted my personal backup drive with ZFS, in order to use it with both Linux and Mac OS X, and restored our family photos onto it. This pool was created on the Linux side.
Then I reformatted a spare partition on my late-2013 MBP (Catalina) with ZFS (1.9.4), created on the Mac side, and copied the photos over for use with Apple Photos. I am not sure, but I suspect I copied them using zfs send/receive from the Linux-formatted ZFS backup drive, and that might have caused problems.

Everything was fine for maybe 2 weeks, but then ZFS complained about corruption. I ran scrub and it identified a long list of corrupted "files", with garbage file names.
I might have forgotten to export the pool manually during reboots (which might have happened only a handful of times at most), but I do not believe a zfs pool would crash that way because of that.
I tried to force-import the pool a couple of times, but was unsuccessful. I also tried:
Code:
zpool clear -F zimagepool
cannot open 'zimagepool': no such pool


Now I am no longer able to import the pool at all using zpool import. Strangely, the partition is auto-mounted by macOS as an MSDOS partition with garbage names, and I can even cd into that mysteriously mounted volume. I guess this has only made the corruption worse.

I then booted Ubuntu 21 on my Mac, which gave the following results:
Code:
root@ubuntu:/home/ubuntu# zpool import
no pools available to import
root@ubuntu:/home/ubuntu# zpool import -d /dev/sda5
   pool: zimagepool
     id: 4626414534438887032
  state: ONLINE
status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
   the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
 config:

   zimagepool  ONLINE
     sda5      ONLINE
root@ubuntu:/home/ubuntu# zpool import zimagepool
cannot import 'zimagepool': no such pool available
root@ubuntu:/home/ubuntu# zpool import 4626414534438887032
cannot import '4626414534438887032': no such pool available
root@ubuntu:/home/ubuntu# zpool import -f 4626414534438887032
cannot import '4626414534438887032': no such pool available
root@ubuntu:/home/ubuntu# man zpool import
root@ubuntu:/home/ubuntu# zpool import -F 4626414534438887032
cannot import '4626414534438887032': no such pool available



The problem now is not data loss, since this data had just been restored from backup anyway. My problem is that I have now lost my trust in openzfsonosx, and I don't dare to even mount the ZFS backup drive, because if it corrupted that one, that would be disastrous.

The only safe-enough way I see to use the backup drive with my Mac is indirectly, via an SMB share served from a separate Linux laptop connected to the drive. I don't trust it enough to even import it read-only.
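
For reference, a read-only import would look roughly like the sketch below. `readonly=on` is a documented `zpool import` property; the wrapper name `import_readonly` and the `DRY_RUN` variable are made up for illustration, and the function only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Import a pool read-only, searching the given directory for devices.
# With DRY_RUN=1 (the default in this sketch) the command is only printed,
# so nothing touches the disk until you opt in with DRY_RUN=0.
import_readonly() {
  pool="$1"
  dir="${2:-/dev}"
  # Rebuild the positional parameters as the command to run.
  set -- zpool import -o readonly=on -d "$dir" "$pool"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Calling `import_readonly zimagepool` prints the command; if the pool was last used on another system, `-f` would presumably have to be added as well.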

Can anyone comment on what might have caused the corruption, and what I have to consider when using ZFS with both Mac and Linux?

Thanks for any hints...

Ingvar
hansmueller
 
Posts: 3
Joined: Mon May 10, 2021 1:41 am

Re: complete pool corruption and lost trust

Postby hansmueller » Wed May 26, 2021 3:27 am

Is it possible that deduplication with the rmlint tool could have caused this?
hansmueller
 
Posts: 3
Joined: Mon May 10, 2021 1:41 am

Re: complete pool corruption and lost trust

Postby lundman » Thu May 27, 2021 1:17 am

Hello, I'm sorry you've had a poor experience with ZFS. Some comments:
Code:
zpool clear -F zimagepool
cannot open 'zimagepool': no such pool


That just means the pool is not imported. Similarly:

Code:
root@ubuntu:/home/ubuntu# zpool import -d /dev/sda5
   pool: zimagepool


The "-d" switch for import actually takes a directory to look inside, so the command should be:

Code:
zpool import -d /dev


That ZOL handles being given a path ("/dev/sda5") is an oddity, and you probably should not rely on it.
So the proper command to import the pool is:

Code:
zpool import -d /dev zimagepool
or
zpool import -d /dev 4626414534438887032


Now, the original cause would be interesting. Check the output of diskutil list to make sure the partition isn't labelled wrong. If it had the GUID of MSDOS, it could be that macOS tried to mount it as MSDOS, which would be bad.


Otherwise it looks OK at a distance; you should import it and then run a scrub.

Because it was last imported on a different system, you need to use "-f"; that is not a concern.

So then it would be:

Code:
zpool import -fd /dev/ zimagepool
zpool scrub zimagepool
zpool status
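
A minimal sketch of the partition-label check mentioned in this post, for the Linux side. It assumes the disk uses GPT and that `lsblk` from util-linux is available; the function name `classify_parttype` is made up for illustration. The two GUIDs are the standard GPT type codes for Microsoft basic data (the kind an OS may try to mount as MSDOS/FAT) and for Solaris /usr, which is also the code used for Apple ZFS.

```shell
#!/bin/sh
# Classify a GPT partition type GUID (case-insensitive).
classify_parttype() {
  guid=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  case "$guid" in
    6a898cc3-1dd2-11b2-99a6-080020736631) echo "zfs" ;;              # Solaris /usr, Apple ZFS
    ebd0a0a2-b9e5-4433-87c0-68b6b72699c7) echo "msdos-basic-data" ;; # Microsoft basic data
    "") echo "unknown" ;;                                            # no GUID supplied
    *) echo "other" ;;
  esac
}

# Example (not run here): look up the type GUID of the partition from this thread.
# classify_parttype "$(lsblk -no PARTTYPE /dev/sda5)"
```

If this reports `msdos-basic-data` for the ZFS partition, that would fit the theory that macOS tried to mount it as MSDOS.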
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: complete pool corruption and lost trust

Postby hansmueller » Fri May 28, 2021 8:08 am

Hello lundman,

Thank you very much for your answer. Note that the import "-d" parameter in the Linux version also accepts devices.

I managed to import my pool! It still has a few hundred files reported as corrupt.

I tried to run scrub and clear, but the errors, logically, persist.
Some, but by far not all, of the corrupt files were part of a deduplication run using the rmlint tool. I believe this caused the corruption, but I am still quite uncertain why.

As I said, luckily the files can be replaced. What haunts me is whether I should trust the OS X ZFS implementation, and whether I can risk using our main backup drive in read-write mode on a regular basis.

Thanks
hansmueller
 
Posts: 3
Joined: Mon May 10, 2021 1:41 am

Re: complete pool corruption and lost trust

Postby jawbroken » Sun May 30, 2021 2:58 am

You haven't made it clear where you ran rmlint, macOS or Linux, or why you suspect the macOS version of OpenZFS rather than the Linux one. I see some dangerous looking deduplication options for rmlint (e.g. clone, reflink) that I would guess might cause issues if the filesystem didn't support the relevant features. If you still have the rmlint command that you ran, or the shell script it produced, then that would be useful information also, especially to compare to the list of corrupted files you are seeing.
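
The comparison suggested above could be sketched as follows. It assumes you have already written two plain text files by hand, one path per line: the corrupted files reported by the pool, and the files that rmlint acted on. The function name `common_paths` and the file names are hypothetical, and extracting the paths from the rmlint script is left out because its format is not shown in this thread.

```shell
#!/bin/sh
# Print every path that appears in both lists (exact, whole-line match).
# $1: file listing corrupted paths, $2: file listing paths rmlint acted on.
# Output follows the order of the second file.
common_paths() {
  grep -Fxf "$1" "$2"
}

# Example (not run here):
# common_paths corrupt.txt rmlint-paths.txt | wc -l
```

A large overlap would point at the deduplication run; a small one would suggest looking elsewhere.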
jawbroken
 
Posts: 61
Joined: Wed Apr 01, 2015 4:46 am

