Pool suspended when trying to remove corrupted files.



Postby zisper » Wed Aug 17, 2016 4:34 am

Hi;

So, I have a mirrored pool with a corrupted file on it. It was just a movie that was being copied to the pool via rsync from another server when it was shut down uncleanly, so no big deal; I figured I'd just delete it. However, as soon as I try to do so, the pool gets suspended. Output from system.log below:

Code:
Aug 17 22:05:35 Mini-i5 sudo[574]:  zfsuser : TTY=ttys000 ; PWD=/Users/zfsuser ; USER=root ; COMMAND=/usr/local/bin/zpool status -v
Aug 17 22:05:39 Mini-i5 kernel[0]: Sandbox: mdworker(567) deny(1) mach-lookup com.apple.distributed_notifications@1v3
Aug 17 22:05:58 Mini-i5 sudo[577]:  zfsuser : TTY=ttys000 ; PWD=/Users/zisper/Music/iTunes/iTunes Media/Movies/Two Mules for Sister Sara ; USER=root ; COMMAND=/bin/rm -fr deleteme.m4v
Aug 17 22:05:58 Mini-i5 zed[580]: eid=7 class=delay pool=PlayPool
Aug 17 22:05:58 Mini-i5 kernel[0]: SPL: Warning: Pool 'PlayPool' has encountered an uncorrectable I/O failure and has been suspended.
Aug 17 22:05:58 Mini-i5 zed[582]: eid=8 class=delay pool=PlayPool
Aug 17 22:05:58 Mini-i5 zed[584]: eid=9 class=delay pool=PlayPool
Aug 17 22:05:58 Mini-i5 zed[586]: eid=10 class=delay pool=PlayPool
Aug 17 22:05:58 Mini-i5 zed[588]: eid=11 class=data pool=PlayPool
Aug 17 22:05:59 Mini-i5 zed[595]: error: data-notify.sh: eid=11: failed to lock "/var/run/zed.zedlet.state.lock": /etc/zfs/zed.d/zed-functions.sh: line 128: flock: command not found
Aug 17 22:05:59 Mini-i5 zed[608]: error: data-notify.sh: eid=11: failed to unlock "/var/run/zed.zedlet.state.lock": /etc/zfs/zed.d/zed-functions.sh: line 166: flock: command not found
Aug 17 22:05:59 Mini-i5 zed[614]: eid=12 class=io_failure pool=PlayPool


The file wasn't always called "deleteme", by the way; I was just testing whether renaming would still work. Any advice on how to go about deleting the file? Should I be concerned that zed is calling flock, which isn't a standard OS X command (and isn't on my computer)?
Also, if it gets into this state again, is it possible to un-suspend the pool? I can't find a way to cleanly unmount or export the pool, or shut down the machine, as it is.
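
On the flock question, a quick way to confirm what the log is saying is to check whether any flock binary can be found on the PATH. This is only a minimal sketch: the helper name have_cmd is made up for illustration, and zed may well run with a different PATH than an interactive shell, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# have_cmd: hypothetical helper, succeeds if the named command
# can be found on the current PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd flock; then
    echo "flock found at: $(command -v flock)"
else
    echo "flock not found on PATH"
fi
```

If this reports "not found", that matches the "flock: command not found" lines from zed-functions.sh in the log above.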

Re: Pool suspended when trying to remove corrupted files.

Postby zisper » Wed Aug 17, 2016 4:41 am

I just tried "zpool clear" (which didn't work), but it reminded me: the pool is actually a three-way mirror with only 2 disks connected at the moment. Could the I/O hang be related to deleting the file whilst the third disk isn't connected? These couple of lines from dmesg, which appeared when I tried to clear the pool, make me wonder:

Code:
ZFS: vdev_disk_open('/private/var/run/disk/by-id/media-60987131-D035-9E4B-811B-FD2208DBFA96') failed error 2
SPL: Warning: Pool 'PlayPool' has encountered an uncorrectable I/O failure and has been suspended.

Mini-i5:~ zfsuser$ zpool status
  pool: PlayPool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Aug 17 21:59:24 2016
    21.4G scanned out of 481G at 9.99M/s, 13h4m to go
    21.4G resilvered, 4.44% done
config:

   NAME                                            STATE     READ WRITE CKSUM
   PlayPool                                        DEGRADED     1     0     0
     mirror-0                                      DEGRADED     4     0     0
       media-7D5A5263-F73A-C841-8F1D-16AA3345F709  ONLINE       0     0     4  (resilvering)
       6520973245515165263                         UNAVAIL      0     0     0  was /private/var/run/disk/by-id/media-60987131-D035-9E4B-811B-FD2208DBFA96
       media-6EA69111-A906-9B41-B8A9-33D457B225AF  ONLINE       0     0     4  (resilvering)




And as it turns out, that's exactly what it was. I reconnected the third drive, ran "zpool clear", and the I/O operations started again (including the file delete, which was still waiting for the pool I/O to resume).
I think that's a feature, not a bug? I don't think I need to raise an issue against it; I was just misunderstanding what was going on. (The missing "flock" command seems like it might be worth following up on, though.)
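
For anyone who lands here with the same symptoms, the check-then-clear sequence can be sketched roughly as below. The helper name list_unavail is hypothetical, it simply looks for the UNAVAIL state string in the second column of the zpool status output shown above, and the pool name PlayPool is the one from this thread.

```shell
#!/bin/sh
# list_unavail: hypothetical helper. Reads `zpool status` output on
# stdin and prints the name of every device whose STATE column is
# UNAVAIL (the second whitespace-separated field on the line).
list_unavail() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Usage, assuming the pool name from this thread:
#   zpool status PlayPool | list_unavail
#   (reconnect any device that is listed)
#   sudo zpool clear PlayPool
```

With the status output from the earlier post, this would print the numeric ID of the disconnected third disk, which is the one to plug back in before running zpool clear.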

Re: Pool suspended when trying to remove corrupted files.

Postby lundman » Wed Aug 24, 2016 11:30 pm

I think the zpool property failmode comes into play here as well. The default is "wait", which seems to be what you experienced; "continue" would make it carry on without it. As always, these settings have side-effects :)
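
A rough sketch of inspecting and changing that property follows. The wrapper name set_failmode is made up for illustration; it only checks the value against the three settings documented for failmode (wait, continue, panic) before handing off to zpool set. The pool name is the one from this thread.

```shell
#!/bin/sh
# set_failmode: hypothetical wrapper around `zpool set` that rejects
# anything other than the three documented failmode values.
set_failmode() {
    pool="$1"
    mode="$2"
    case "$mode" in
        wait|continue|panic) ;;
        *) echo "unknown failmode: $mode" >&2; return 2 ;;
    esac
    zpool set "failmode=$mode" "$pool"
}

# Usage (run as root), assuming the pool name from this thread:
#   zpool get failmode PlayPool
#   set_failmode PlayPool continue
```

Worth reading the failmode entry in the zpool man page before switching: with "continue", new writes get an I/O error back rather than blocking, which not every application handles gracefully.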