Auto-import (I think) caused 2 drives to drop out of my pool

Postby mkush » Wed Oct 28, 2020 11:55 am

Auto-import was not working for me, so on a suggestion I tried giving Full Disk Access to /bin/bash, which seemed to fix it.

HOWEVER, the pool was imported with 2 of 8 drives showing as faulted. The pool contains 8 16TB HDDs in a RAIDZ2 config, in an OWC Thunderbay 8, connected via a Thunderbolt 2 optical cable to my 2019 Mac Pro. (The array is located out of earshot of the computer.)

I have noticed in the past that it sometimes takes a while for all the drives to appear to the OS after the computer is booted or after the Thunderbolt cable leading to the array is plugged in. I'm guessing that not all drives were available yet when the array was auto-imported.

The question is: how do I fix this? The array is not currently imported and I have not written any data to it since this problem occurred.

mkush@Aslan ~ % sudo zpool import
Password:
   pool: Backup
     id: 5299986820443420611
  state: DEGRADED
 status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices.  The
   fault tolerance of the pool may be compromised if imported.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
 config:

   Backup                               DEGRADED
     raidz2-0                           DEGRADED
       ST16000NM001G-2KK103-ZL244X5X:1  ONLINE
       ST16000NM001G-2KK103-ZL24NTAD:1  ONLINE
       ST16000NM001G-2KK103-ZL24PXL4:1  ONLINE
       ST16000NM001G-2KK103-ZL24P1KA:1  ONLINE
       ST16000NM001G-2KK103-ZL24MYTG:1  ONLINE
       10336031023599959445             FAULTED  corrupted data
       13072125643004748429             FAULTED  corrupted data
       ST16000NM001G-2KK103-ZL24PYCT:1  ONLINE
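
As a quick check before importing, a listing like the one above can be scanned for members that are not ONLINE. This is only a minimal sketch: `count_unhealthy` is a hypothetical helper name, and it assumes the device state is the second whitespace-separated column, as in the output shown here.

```shell
# count_unhealthy: read `zpool import` (or `zpool status`) output on stdin
# and print how many lines report a non-healthy device state in column 2.
# Pool-level DEGRADED lines are deliberately not counted.
count_unhealthy() {
    awk '$2 ~ /^(FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ { n++ } END { print n + 0 }'
}

# Usage (needs root for the zpool call):
#   sudo zpool import | count_unhealthy
```

For the listing above this would report the two FAULTED members; a result of 0 suggests every member was visible when the scan ran.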
mkush
 
Posts: 53
Joined: Tue Sep 30, 2014 1:17 pm

Re: Auto-import (I think) caused 2 drives to drop out of my

Postby lundman » Wed Oct 28, 2020 4:02 pm

The autoimport script waits up to 60s for InvariantDisks to say it is idle - all disks probed. So in theory, it shouldn't import incomplete pools.
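
The waiting behaviour described here can be pictured as a simple polling loop. This is a minimal sketch, not the actual autoimport script: `wait_for_path` is a hypothetical helper, and the idle-marker path in the usage comment is an assumption that should be checked against the installed script.

```shell
# wait_for_path PATH [TIMEOUT_SECONDS]
# Poll once per second until PATH exists. Returns 0 as soon as it appears,
# or 1 if it is still missing after TIMEOUT_SECONDS (default 60).
wait_for_path() {
    wfp_path="$1"
    wfp_timeout="${2:-60}"
    wfp_waited=0
    while [ ! -e "$wfp_path" ]; do
        if [ "$wfp_waited" -ge "$wfp_timeout" ]; then
            return 1
        fi
        sleep 1
        wfp_waited=$((wfp_waited + 1))
    done
    return 0
}

# Usage (marker path is an assumption, not taken from this thread):
#   wait_for_path /var/run/disk/invariant.idle 60 && zpool import -a
```

Note that a loop like this only delays the import. If the marker appears before a slow enclosure has presented all of its drives, the import would still proceed with whatever is visible at that moment.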

Have you tried exporting, and re-importing the pool manually to make sure all disks are present?
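
A manual export and re-import could look like the sketch below. `reimport_pool` is a hypothetical helper; the pool name Backup comes from the listing earlier in the thread, while the /var/run/disk/by-id default is an assumption about where InvariantDisks publishes its device links. Setting ZPOOL=echo prints the commands instead of running them.

```shell
# reimport_pool POOL [DEVICE_DIR]
# Export POOL if it is currently imported, import it again while searching
# DEVICE_DIR for its members, then print the resulting status.
# Run from a root shell, since zpool needs root privileges.
reimport_pool() {
    rp_pool="$1"
    rp_dir="${2:-/var/run/disk/by-id}"   # assumed default, adjust as needed
    rp_zpool="${ZPOOL:-zpool}"

    # `zpool list POOL` fails when the pool is not imported, so the export
    # is skipped in that case.
    if "$rp_zpool" list "$rp_pool" >/dev/null 2>&1; then
        "$rp_zpool" export "$rp_pool" || return 1
    fi
    "$rp_zpool" import -d "$rp_dir" "$rp_pool" || return 1
    "$rp_zpool" status "$rp_pool"
}

# Usage:
#   reimport_pool Backup
```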
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Auto-import (I think) caused 2 drives to drop out of my

Postby mkush » Wed Oct 28, 2020 6:43 pm

Yes, several times, with the same result. Those 2 disks are seen in Disk Management but they won’t show up in good condition in the pool. Obviously I’d like to get them back into the pool somehow without having to rebuild them, since the pool is now in a fragile state. I’d also like to know for sure what happened. It seems odd that 2 disks would both suddenly get corrupted. The timing matches exactly when I allowed Full Disk Access and rebooted.

Re: Auto-import (I think) caused 2 drives to drop out of my

Postby tangles » Sun Nov 01, 2020 6:33 am

Assuming you have a backup and so you just want to save time:

1. Get your pool healthy.
2. Systematically test the bays that had the failed/missing disks with known-good disks.

1.
Write down which disk is currently in which bay so you don't lose track.

Physically identify the missing disks, power down, remove them, then start up and see if your pool mounts.

If it doesn't, even after forcing the import and using the -d option, then maybe Master Lundy can help, because something else is going on and there's no point continuing.

If the pool does mount okay, then I'd reseat the questionable disks in your OWC chassis while it is shut down and see if they're recognised back into the pool after restarting.

If not, then I'd wipe the two disks in question, using the dd command to write zeros over the first 1GB or so of each.

Then zpool replace the first missing disk (i.e. with itself) and see if it resilvers.

If it fails to resilver, then I'd focus on the OWC chassis, specifically the bays holding these two drives, and also on its power supply.

Obviously don't move the disks around just yet, because you don't have any redundancy left up your sleeve should the root cause actually be a dodgy bay(s).
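
The wipe-and-replace steps above might be sketched as follows. This is an illustration under stated assumptions, not a tested recovery procedure: `run_step` and `wipe_and_replace` are hypothetical helpers, the disk identifier (e.g. disk9) is a placeholder that must be confirmed with `diskutil list`, and the GUID is the one shown for the faulted member in the import listing. By default nothing is executed; commands are only printed unless CONFIRM=yes is set.

```shell
# run_step CMD...: print the command, or run it when CONFIRM=yes.
run_step() {
    if [ "${CONFIRM:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# wipe_and_replace POOL OLD_GUID DISK
# DISK is a macOS identifier such as disk9 (placeholder, verify first!).
# Real runs need a root shell. The dd step destroys data on DISK.
wipe_and_replace() {
    wr_pool="$1"
    wr_guid="$2"
    wr_disk="$3"
    # Make sure nothing on the disk is mounted before writing to it.
    run_step diskutil unmountDisk "/dev/${wr_disk}" || return 1
    # Zero the first 1 GiB via the raw device (BSD dd uses lowercase 1m).
    run_step dd if=/dev/zero "of=/dev/r${wr_disk}" bs=1m count=1024 || return 1
    # Replace the faulted member (referenced by GUID) with the wiped disk.
    run_step zpool replace "$wr_pool" "$wr_guid" "/dev/${wr_disk}" || return 1
    run_step zpool status "$wr_pool"
}

# Usage (dry run):
#   wipe_and_replace Backup 10336031023599959445 disk9
```

One caveat: ZFS also keeps label copies at the end of each device, so zeroing only the start leaves those in place. If `zpool replace` refuses the disk afterwards, `zpool labelclear` may be worth a look.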

2.
You could mount a couple disks internally if you have the special/customised mini 6pin power cable thingy for SATA disks for the Mac Pro (going from memory here).
Alternatively get your hands on 2 cheap external cases just to test with.

If you get the pool resilvered using internal ports or external enclosures, then systematically move the disks back into your OWC chassis, but put known-good disks into the two bays in question and boot up.

If you have problems with the good disks in the same bays, then I think it's fair to say your OWC chassis has crapped out on you.
If the same two disks have continued issues in different bays, then you've got two flaky disks, so replace them.

It's not strange to have multiple disks go bad at the same time, especially if you purchased them all at once (i.e. from the same batch).

I keep the ridiculous Addonics 10-port SATA HBA around in case a chassis or SAS controller misbehaves on me. That way I can rule out disk vs. chassis immediately. You might want to keep something similar on the shelf too.
tangles
 
Posts: 195
Joined: Tue Jun 17, 2014 6:54 am

Re: Auto-import (I think) caused 2 drives to drop out of my

Postby mkush » Sun Nov 01, 2020 8:47 pm

Thanks! The pool will mount fine, it just reports itself as degraded. I’m fairly sure that all the disks are fine and that somehow this happened because the pool was imported before all disks were ready. Since I have written no data to the pool, I was hoping there was some way to get the two drives back into the pool without resilvering them. This is mostly to avoid two massive resilvering operations that will be hard on the disks and time-consuming. I have all the data elsewhere, so I do have the option of wiping out the pool and starting over. I just thought maybe there was a quick trick to get the disks back in without resilvering.
