Is OpenZFS on OS X stable?


Post by dguisinger01 » Fri Jun 26, 2020 10:11 pm

I ask because I keep having hard crashes on macOS 10.15.5.

Twice I have tried to run zpool clear: once after an actual power failure, and once after accidentally unplugging the dock that sits between the computer and the TB3 Thunderbay (oops).
Both times, zpool clear froze the terminal window and prevented my computer from rebooting.

The second time, it wouldn't even import my pool after the reboot, saying every single disk in the RAID-Z2 array was corrupt. A second reboot fixed it.

Tonight I used ZetaWatch to export my pool so I could unplug my laptop.
Within seconds of exporting, I went to unplug the cable, and just as I pulled it out ZetaWatch notified me that my pool had been auto-imported. My menu bar gave me the spinning beach ball, my Dock froze, and I had to force power off my laptop once again.

I can't imagine there is a valid reason for the ZFS tools to repeatedly lock up my system. I've only had my RAID array and ZFS for a week... I'm now wary of putting any more data on the system and am considering wiping the pool and replacing it with SoftRAID, or sending the Thunderbay back and getting a Synology box to use these disks with instead.
dguisinger01
 
Posts: 21
Joined: Fri Jun 12, 2020 9:51 am

Re: Is OpenZFS on OS X stable?

Post by jawbroken » Sun Jun 28, 2020 5:36 am

Have you tried it without the dock? I can imagine the tools not dealing well with the drives dropping out, which seems to be what is happening to you for various reasons. I can only say that I don't have the same problems as you with a similar setup (macOS 10.15.5, OpenZFS on OS X 1.9.4, 3x 4-bay TB2 Thunderbay).
jawbroken
 
Posts: 61
Joined: Wed Apr 01, 2015 4:46 am

Re: Is OpenZFS on OS X stable?

Post by 0xdeadbeef » Sun Jun 28, 2020 8:06 am

I've been using OpenZFS on OS X on multiple Macs for many years now. I've got four Akitio Thunder Quad X enclosures (TB3, four 3.5" bays), filled with SATA SSDs and HDDs across three pools in a stripe+mirror setup, and they get extensive use. Aside from the hiccups with new major macOS releases, the only problem I had was when I accidentally turned off one of the TB3 drive cases before exporting the zpool. It was impossible to get the pool to do anything without rebooting, and the OS had several processes stuck in blocked I/O because the pool was stuck. However, since no data was being written during that incident, a forced reboot took care of it. A scrub afterwards showed no errors.
0xdeadbeef
 
Posts: 8
Joined: Tue Feb 23, 2016 11:23 pm

Re: Is OpenZFS on OS X stable?

Post by dguisinger01 » Sun Jun 28, 2020 10:03 am

I haven't tried without the dock. As far as the Mac is concerned it's SATA controllers on a PCIe interface, so I'm not sure the number of connections it passes through makes a difference. I'm not having an issue while everything stays connected, only when power drops, either through my own action or the grid going down. I will have the power routed through an APC UPS later this week once it's delivered. Part of my problem is that my laptop is obviously battery backed, so if power fails the laptop keeps running and ZFS notices the drives going away. I guess I'll try connecting it directly if I keep having major issues.

0xdeadbeef, so after you disconnected your pool, are you saying the zpool tools did nothing, or did they lock the system up like mine do? I wouldn't expect a missing non-boot drive to lock up the filesystem code or require a forced reboot. Previously I had a Time Machine backup drive plugged into the dock, and it didn't take down the system if I forgot to unmount it first....

I have noticed the drive names change every time I plug it in: sometimes they list the Toshiba part number, other times the PCIe/SATA attachment path, and other times the media-{uuid} name. It's like macOS or ZFS can't make up its mind. I don't know whether the names changing within a single macOS session after an accidental disconnect could be confusing the tools.

-edit-
I reported the auto-import-after-export bug to ZetaWatch. It looks like the author has confirmed finding a bug in testing that was re-importing my pool while I was unplugging it..... at least I'm not fully crazy and that one is real lol
dguisinger01
 
Posts: 21
Joined: Fri Jun 12, 2020 9:51 am

Re: Is OpenZFS on OS X stable?

Post by 0xdeadbeef » Mon Jun 29, 2020 12:01 pm

dguisinger01 wrote:
0xdeadbeef, so after you disconnected your pool, are you saying the zpool tools did nothing, or did they lock the system up like mine do?

Every access to the mounted filesystems on that pool left the calling process in state "U" (STAT column in ps(1) output). That quickly spreads to other system and UI processes.

dguisinger01 wrote:
I wouldn't expect a missing non-boot drive to lock up the filesystem code or require a forced reboot.

From a user's point of view you're right. However, plenty of processes access the filesystems affected by the pool outage, from Finder checking free space (to show in the Finder window status bar) to Spotlight and several more. With those processes "locked" (stuck in "uninterruptible wait" state) by the kernel, it quickly starts to affect many things for the user.
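For anyone who wants to check this on their own machine, here is a minimal sketch (not taken from the posts above) that filters ps output down to processes whose STAT field starts with "U". The function name filter_uninterruptible is made up for illustration, and it assumes the columns arrive in pid,stat,comm order.

```shell
# Hypothetical helper: keep only rows whose STAT column begins with "U".
# macOS/BSD ps(1) marks uninterruptible wait with "U"; Linux ps uses "D".
# Expects ps output on stdin with STAT as the second column.
filter_uninterruptible() {
    # NR > 1 skips the header row; $2 is the STAT column.
    awk 'NR > 1 && $2 ~ /^U/ { print }'
}
```

On macOS it could be used as `ps -axo pid,stat,comm | filter_uninterruptible`. An empty result just means nothing was blocked at the moment ps ran.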

However...

dguisinger01 wrote:
Previously I had a Time Machine backup drive plugged into the dock, and it didn't take down the system if I forgot to unmount it first....

... I'd agree with the argument that this shouldn't happen: the process should receive a read error and handle it that way. That's what happens with HFS and APFS filesystems and it's what I'd expect from a user's perspective as well. I'm not deep into the implementation details of the ZFS kext (or the Solaris compat/abstraction layer it's based on); maybe @lundman can shed some light on why this isn't happening when he finds some time?

dguisinger01 wrote:
I have noticed the drive names change every time I plug it in: sometimes they list the Toshiba part number, other times the PCIe/SATA attachment path, and other times the media-{uuid} name. It's like macOS or ZFS can't make up its mind. I don't know whether the names changing within a single macOS session after an accidental disconnect could be confusing the tools.

Dunno, I only see the media-* ones, based on the UUID from the drive's GPT label. I'd retry the zpool replace approach to move them all to the media UUID names, tying each one to the drive UUID instead of the physical location.
0xdeadbeef
 
Posts: 8
Joined: Tue Feb 23, 2016 11:23 pm

Re: Is OpenZFS on OS X stable?

Post by dguisinger01 » Mon Jun 29, 2020 12:14 pm

Yeah, this is what I get.... but as far as I can tell, it looks different on every reboot.
Sometimes it's all one style, sometimes it's all another, and other times it's a strange mix like below.

Code:
  pool: thunderbay8
 state: ONLINE
  scan: none requested
config:

   NAME                                                                                                             STATE     READ WRITE CKSUM
   thunderbay8                                                                                                      ONLINE       0     0     0
     raidz2-0                                                                                                       ONLINE       0     0     0
       media-1F05BCAF-68AB-114B-BD1A-BA1F22032F11                                                                   ONLINE       0     0     0
       TOSHIBA_MG07ACA14TE-79C0A07EF94G:1                                                                           ONLINE       0     0     0
       TOSHIBA_MG07ACA14TE-10C0A02EF94G:1                                                                           ONLINE       0     0     0
       PCI0@0-PEG1@1,1-UPSB@0-DSB1@1-UPS0@0-pci-bridge@4-pci-bridge@0-pci-bridge@2-pci197b,585@0-PRT1@1-PMP@0-@0:1  ONLINE       0     0     0
       media-E71C76CC-FFE8-0F4E-AADB-D6CD251B3707                                                                   ONLINE       0     0     0
       media-A31C3A93-C451-7948-B896-BB6C0817EBF6                                                                   ONLINE       0     0     0
       PCI0@0-PEG1@1,1-UPSB@0-DSB1@1-UPS0@0-pci-bridge@4-pci-bridge@0-pci-bridge@1-pci197b,585@0-PRT3@3-PMP@0-@0:1  ONLINE       0     0     0
       PCI0@0-PEG1@1,1-UPSB@0-DSB1@1-UPS0@0-pci-bridge@4-pci-bridge@0-pci-bridge@2-pci197b,585@0-PRT3@3-PMP@0-@0:1  ONLINE       0     0     0
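
To make a mix like this easier to see at a glance, here is a rough sketch that labels each device row of zpool status by naming style. The function names and the style labels (uuid, model-serial, attachment-path, bsd) are my own, and the patterns are guessed from the names in the output above, so treat them as assumptions rather than anything the ZFS tools define.

```shell
# Hypothetical classifier: guess the naming style from the shape of the name.
# Patterns are inferred from the status output in this thread only.
classify_vdev() {
    case "$1" in
        media-*)    echo "uuid" ;;             # media-<UUID>
        *@*)        echo "attachment-path" ;;  # PCI0@0-...-@0:1
        disk[0-9]*) echo "bsd" ;;              # disk5s1
        *:[0-9]*)   echo "model-serial" ;;     # TOSHIBA_...-<serial>:1
        *)          echo "other" ;;            # pool name, raidz2-0, etc.
    esac
}

# Read `zpool status` text on stdin and print "<style> <name>" for every
# five-column row except the header. Pool and raidz rows come out as "other".
label_vdevs() {
    awk 'NF == 5 && $2 != "STATE" { print $1 }' |
    while read -r name; do
        printf '%s %s\n' "$(classify_vdev "$name")" "$name"
    done
}
```

Usage would be something like `zpool status thunderbay8 | label_vdevs`. The attachment-path check comes before the model-serial one because the path names also end in ":1".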
dguisinger01
 
Posts: 21
Joined: Fri Jun 12, 2020 9:51 am

Re: Is OpenZFS on OS X stable?

Post by FadingIntoBlue » Thu Jul 02, 2020 10:24 pm

dguisinger01 wrote:
I have noticed the drive names change every time I plug it in: sometimes they list the Toshiba part number, other times the PCIe/SATA attachment path, and other times the media-{uuid} name. It's like macOS or ZFS can't make up its mind. I don't know whether the names changing within a single macOS session after an accidental disconnect could be confusing the tools.


Changing the device names on an existing pool can be done by simply exporting the pool and re-importing it with the -d option to specify which new names should be used.

To use the names in /var/run/disk/by-id,

Code:
$ sudo zpool export tank
$ sudo zpool import -d /var/run/disk/by-id tank


To use the names in /var/run/disk/by-serial,

Code:
$ sudo zpool export tank
$ sudo zpool import -d /var/run/disk/by-serial tank


To use the names in /var/run/disk/by-path,

Code:
$ sudo zpool export tank
$ sudo zpool import -d /var/run/disk/by-path tank


To use the less safe (because they vary) BSD disk names in /dev,

Code:
$ sudo zpool export tank
$ sudo zpool import -d /dev tank
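
Since every variant above is the same export/import pair with a different directory, it can be wrapped in a small helper. This is only a sketch: the function name reimport_with_names and its "run" switch are invented here, and by default it just prints the commands so you can read them before anything touches the pool.

```shell
# Hypothetical wrapper around the export/import pair shown above.
# Usage: reimport_with_names POOL DIR [run]
# Without "run" as the third argument it only prints the commands.
reimport_with_names() {
    pool=$1
    dir=$2
    mode=${3:-print}
    if [ -z "$pool" ] || [ -z "$dir" ]; then
        echo "usage: reimport_with_names POOL DIR [run]" >&2
        return 2
    fi
    if [ "$mode" = "run" ]; then
        # Only import if the export succeeded.
        sudo zpool export "$pool" && sudo zpool import -d "$dir" "$pool"
    else
        printf 'sudo zpool export %s\n' "$pool"
        printf 'sudo zpool import -d %s %s\n' "$dir" "$pool"
    fi
}
```

For example, `reimport_with_names tank /var/run/disk/by-id` prints the two commands, and adding `run` as a third argument executes them.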


Even if you are using invariant paths (by-id, by-serial, or by-path), you can reveal the "normal" BSD disk names at any time,

Code:
$ zpool status -L tank


Combine that with diskutil list and you can tell exactly which disk is which.
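One way to do that combining, as a hedged sketch: pull the BSD names out of the zpool status -L output, strip the slice suffix, and hand each whole disk to diskutil list. The helper names whole_disk and pool_disks are made up, and the sketch assumes the -L output shows names shaped like disk5s1 (with or without a /dev/ prefix).

```shell
# Hypothetical helper: strip /dev/ and the slice suffix, disk5s1 -> disk5.
whole_disk() {
    name=${1#/dev/}
    echo "${name%s[0-9]*}"
}

# Read `zpool status -L` text on stdin and print each whole disk once.
# Assumes device rows have five columns and a name containing diskN.
pool_disks() {
    awk 'NF == 5 && $1 ~ /disk[0-9]+/ { print $1 }' |
    while read -r dev; do
        whole_disk "$dev"
    done | sort -u
}
```

Usage might look like `zpool status -L tank | pool_disks | while read -r d; do diskutil list "$d"; done`, which shows the partition table for just the disks in the pool.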

NB: I didn't write the above and have lost the attribution; apologies to the original author. It lays things out very clearly, which is why I kept it.
FadingIntoBlue
 
Posts: 106
Joined: Tue May 27, 2014 12:25 am

