Use in production; zpool split and zpool attach fail

Post by zfsfan » Sun Mar 19, 2023 6:10 pm

I'm convinced of the merits of ZFS and am considering using it with Macs. However, I am wondering whether the current Homebrew edition (zfs-macOS-2.1.6-1) is ready for use in production.

I found minor bugs while experimenting:
  1. zpool attach: no such device in pool (with workaround) https://github.com/openzfs/zfs/issues/580#issuecomment-1475471258
  2. zpool split: cannot open device - busy (but worked earlier) https://github.com/openzfs/zfs/issues/807#issuecomment-1475482816

What worries me most, though, is the file corruption reported at https://openzfsonosx.org/forum/viewtopic.php?f=26&t=3783. Is this solved? It seems to happen only under certain circumstances (quoting that thread: "It seems like the combination of ZFS, Ventura and patch has a bug.").

What is the general experience?
Would you say the current Homebrew edition (zfs-macOS-2.1.6-1: https://formulae.brew.sh/cask/openzfs) is ready for use in production?
zfsfan
 
Posts: 16
Joined: Sun Mar 19, 2023 5:38 pm

Re: Use in production; zpool split and zpool attach fail

Post by lundman » Sun Mar 19, 2023 7:30 pm

The "corruption" is actually caused by decmpfs compression, when using macOS tools that have decmpfs support baked in (ditto, tar, etc.). It is an unfortunate case where Apple's tools assume everything can do decmpfs (even though they have a way to query the filesystem to see if it is supported) and then just "use it" without us being able to stop it.
Plus, decmpfs is private in the kernel, so we can't use it ourselves (not that we'd want to, as it wouldn't work on other ZFS platforms).

So we have some code that works around tools trying to use decmpfs. This code was a bit lacking: it did not truncate the file in the specific case where a change only made the file shorter. It didn't corrupt the file per se, it just didn't make it smaller, and so it looked corrupted.
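
To illustrate the symptom only (this is not the actual workaround code, and the function names here are made up for the example), a minimal Python sketch of what a missing truncate looks like from user space: overwriting a file in place with shorter content leaves the old tail behind unless the file is also truncated.

```python
import os
import tempfile


def overwrite_without_truncate(path: str, new_data: bytes) -> None:
    """Write new_data at offset 0 but leave the file length unchanged."""
    with open(path, "r+b") as f:
        f.write(new_data)


def overwrite_with_truncate(path: str, new_data: bytes) -> None:
    """Write new_data at offset 0, then cut the file off at the end of it."""
    with open(path, "r+b") as f:
        f.write(new_data)
        f.truncate()  # truncates at the current position, i.e. len(new_data)


def demo() -> tuple[bytes, bytes]:
    """Return (content after missing truncate, content after proper truncate)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as f:
            f.write(b"AAAAAAAAAA")  # original 10-byte file

        overwrite_without_truncate(path, b"BBBB")
        with open(path, "rb") as f:
            stale = f.read()  # new bytes followed by the old tail

        overwrite_with_truncate(path, b"BBBB")
        with open(path, "rb") as f:
            clean = f.read()  # only the new bytes

    return stale, clean


if __name__ == "__main__":
    stale, clean = demo()
    print("without truncate:", stale)
    print("with truncate:   ", clean)
```

The first result still has the original length, with stale bytes after the new content, which is why such a file looks corrupted even though the newly written data is correct.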

It is a risk we have to live with. Luckily, the guys in here do some excellent testing.

There are known 2.1.6 xattr issues that we will do a 2.1.7 release for.
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Use in production; zpool split and zpool attach fail

Post by Bottacco » Sun Mar 19, 2023 11:27 pm

lundman wrote: It is a risk we have to live with. Luckily, the guys in here do some excellent testing.

There are known 2.1.6 xattr issues that we will do a 2.1.7 release for.

I am now running a server with three datastores and 15 people connected to it on a cMP 2010, and it is working fine, but the corruption worries me too. We are not using ditto, tar or similar, just copying over SMB and backing up to another machine with Carbon Copy Cloner.

Any suggestions on which utilities to use and which to avoid? Also, is there any time frame for the 2.1.7 version?

Thanks a lot for your advice.
Bottacco
 
Posts: 10
Joined: Thu Feb 09, 2023 1:27 pm

Re: Use in production; zpool split and zpool attach fail

Post by zfsfan » Mon Mar 20, 2023 8:49 am

lundman wrote: The "corruption" is actually decmpfs compression, and using macOS tools that have support baked-in for decmpfs (ditto, tar, etc). It is an unfortunate case where Apple will assume everything can do decmpfs (even though they have a way to query the filesystem to see if it is supported) then just "use it" without us being able to stop it.


Thanks.
So using system tools (= macOS tools) like "cp" is risky, but using external tools like "rsync" (https://formulae.brew.sh/formula/rsync) is safe?

I basically want to run a backup server where user data from Macs is copied to ZFS using rsync.
(Maybe direct access via a network protocol such as SMB might be added in the future.)


lundman wrote: This code was a bit lacking, and did not "truncate" the file when it happened to exactly change a file to only-be-shorter. Didn't corrupt it per-se, just didn't make it smaller, and so it looked corrupted.


Strictly speaking, this is data corruption, since the file content differs (and therefore so does the SHA-256 checksum), and in some cases the data might not work with the intended application.
Just for the record.
I guess xattr isn't so important. I can't remember any case where xattr was relevant.
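
For anyone who wants to check a copy for this kind of difference, here is a minimal Python sketch that compares SHA-256 checksums between a source tree and its copy. The function names and the directory layout are assumptions for illustration, not part of any OpenZFS tooling, and it only compares file contents (no xattrs or other metadata).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(src_root: str, dst_root: str) -> list[str]:
    """List relative paths whose copy is missing or has different content."""
    src_base = Path(src_root)
    dst_base = Path(dst_root)
    mismatches = []
    for src in sorted(src_base.rglob("*")):
        if not src.is_file():
            continue  # skip directories and other non-regular entries
        rel = src.relative_to(src_base)
        dst = dst_base / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(rel.as_posix())
    return mismatches


if __name__ == "__main__":
    import sys

    # Usage (paths are placeholders): python verify_copy.py /source /copy
    for rel_path in find_mismatches(sys.argv[1], sys.argv[2]):
        print("MISMATCH:", rel_path)
```

A file that was written shorter but not truncated would show up here as a mismatch, since its content and length differ from the source.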
zfsfan
 
Posts: 16
Joined: Sun Mar 19, 2023 5:38 pm

Re: Use in production; zpool split and zpool attach fail

Post by lundman » Mon Mar 20, 2023 4:43 pm

I don't think "cp" knows about decmpfs, but ditto and tar definitely do. They have flags to not use it, but it is on by default. The built-in rsync does too, but you have to use a flag to enable it, as it defaults to off, so generally people don't come across it. Non-Apple rsync, say from MacPorts or Homebrew, does not.


The split/attach/replace drive issue is that you need to use /dev/diskX names for them, BUT you want to use by-id/by-serial/by-uuid names during normal operations. Luckily, it is trivial to swap. So to replace a drive etc., you swap to /dev/disk names, do the replace, then switch back.
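
A rough sketch of that swap, written as a Python dry run that only prints the commands rather than running them. It assumes the swap is done by exporting the pool and re-importing it with `zpool import -d <dir>`, and that the stable names live under /var/run/disk/by-id; the pool and disk names are placeholders, so check everything against your own system before running anything.

```python
import shlex

# Assumptions for illustration: adjust both to match your system.
DEV_DIR = "/dev"
BY_ID_DIR = "/var/run/disk/by-id"


def plan_replace(pool: str, old_dev: str, new_dev: str,
                 stable_dir: str = BY_ID_DIR) -> list[list[str]]:
    """Build (but do not run) the command sequence for a drive replace."""
    return [
        # 1. Re-import the pool using /dev/diskX names.
        ["zpool", "export", pool],
        ["zpool", "import", "-d", DEV_DIR, pool],
        # 2. Do the replace while the pool uses /dev/diskX names.
        ["zpool", "replace", pool, old_dev, new_dev],
        # 3. Check progress; let the resilver finish before switching back.
        ["zpool", "status", pool],
        # 4. Re-import using the stable names for normal operation.
        ["zpool", "export", pool],
        ["zpool", "import", "-d", stable_dir, pool],
    ]


if __name__ == "__main__":
    # "tank", "disk5" and "disk6" are hypothetical names.
    for cmd in plan_replace("tank", "disk5", "disk6"):
        print(shlex.join(cmd))
```

Printing the plan first makes it easy to review the device names before touching the pool, which matters here because /dev/diskX numbers can change between boots.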
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Use in production; zpool split and zpool attach fail

Post by Bottacco » Thu Mar 23, 2023 11:30 am

Does the problem with /dev/disk names occur only if the enclosure connects via USB, or does it also happen if it connects via a SAS HBA?
Bottacco
 
Posts: 10
Joined: Thu Feb 09, 2023 1:27 pm

Re: Use in production; zpool split and zpool attach fail

Post by lundman » Sun Mar 26, 2023 4:27 pm

The issue is the drive rename. disk5 might be disk3 next time. With normal data disks in ZFS, that is not an issue, as they are labelled. But cache and log disks, if used in your pool, are not labelled, so ZFS can just open the last-known /dev/diskX and write to it, even if it isn't a ZFS disk at all. This is why we recommend using by-X names instead.
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Use in production; zpool split and zpool attach fail

Post by zfsfan » Mon Mar 27, 2023 1:57 pm

The order of /dev/disk1, /dev/disk2, etc. usually just reflects the order in which the disks were mounted at boot time, which is random and shouldn't be relied upon.
Hence using the IDs, which stay the same all the time, is the correct approach.

However, one problem linked in the first post was that the device is busy, which wasn't caused by not using IDs.

On rare occasions, unmounting the ZFS filesystem fails (busy, or the process freezes), so I have to reboot the system to clean up the situation.
But the filesystem remains readable, and generally OpenZFS on OS X seems to work well.
So I'm very happy with ZFS.

It's a pity that Apple has given up supporting ZFS natively.
zfsfan
 
Posts: 16
Joined: Sun Mar 19, 2023 5:38 pm

Re: Use in production; zpool split and zpool attach fail

Post by Sharko » Tue Mar 28, 2023 10:27 am

I'm not at my Mac to double-check this, but I believe that Carbon Copy Cloner still uses rsync under the hood to do the copying. Hopefully without the explicit enabling of decmpfs.
Sharko
 
Posts: 230
Joined: Thu May 12, 2016 12:19 pm

