zpool getting dropped during scrub

This forum is to find answers to problems you may be having with ZEVO Community Edition.

Moderators: jhartley, MSR734, nola

zpool getting dropped during scrub

Post by satadru » Wed Apr 03, 2013 11:39 am

I have 5 disks in an external sata enclosure with a esata port multiplier, which is connected via a SIL3124 expresscard to a macbook pro.

disks 1 & 2 are in a zfs stripe pool
disks 3 & 4 are in another zfs mirror pool, tank2
disk 5 is in a zfs mirror pool with a Firewire 800 connected drive.

The first mirror pool (tank2) keeps dropping during scrubs. (All pools are scrubbing simultaneously.)

Here's the error I see:

zfsx_domount: 'tank2' zfsvfs_create err 2
ZFSLabelScheme:stop: 0xffffff803ed0ea00 goodbye 'zfs vdev for 'tank2''
ZFSLabelScheme:stop: 0xffffff803ed0ea00 goodbye 'zfs vdev for 'tank2''
ZFSLabelScheme:probe: label 'tank2', vdev 5272918757791093917
ZFSLabelScheme:start: 'tank2' critical mass with 1 vdev(s) (skip import)
ZFSLabelScheme:probe: label 'tank2', vdev 17283157620683678563
zfsx_vdm_open: 'tank2' disk3s2
zfsx_vdm_open: 'tank2' disk2s2
zfsx_mount: '/Volumes/tank2'
zfsx_unmount: '/Volumes/tank2' forced (umount)
zfsvfs_teardown: '/Volumes/tank2' (txg_wait_synced in 1471 ms)
zfsx_vdm_close: 'disk3s2'
zfsx_vdm_close: 'disk2s2'

Rebooting brings the pool back.

Any ideas? Running OS X 10.8.3, MacBookPro3,1, current version of Zevo. The system is obviously under fairly heavy I/O load at this time... but the volume shouldn't drop.
satadru Offline


 
Posts: 20
Joined: Fri Sep 14, 2012 10:07 pm

Re: zpool getting dropped during scrub

Post by grahamperrin » Thu Apr 04, 2013 10:02 pm

Output please from the following command:

kextstat | grep -v com.apple

Also please paste (as code) a comprehensive set of messages from system.log from around the time of the issue. To include maybe the five minutes beforehand.

zfsx_domount

zfsx_domount: 'tank2' zfsvfs_create err 2


For a file system within the affected pool, that's thought-provoking. Who or what attempted that mount?

For me

There was an unexpected goodbye under unexpected force when adding a cache vdev, but the component device there was peculiar (a low-end 8 GB USB flash drive) and there's Core Storage in the mix … so nothing conclusive there.

Postscript

Not long after ZEVO Community Edition was released, I began testing multiple concurrent scrubs … whilst doing so is not recommended with this edition, for me it wasn't a problem.
grahamperrin Offline

User avatar
 
Posts: 1596
Joined: Fri Sep 14, 2012 10:21 pm
Location: Brighton and Hove, United Kingdom

Re: zpool getting dropped during scrub

Post by satadru » Mon Apr 08, 2013 9:36 am

This happened earlier today, during a multi-drive scrub.

Code: Select all
kextstat | grep -v com.apple
Index Refs Address            Size       Wired      Name (Version) <Linked Against>
   40    0 0xffffff7f80a99000 0x15000    0x15000    com.SiliconImage.driver.Si3132 (1.2.5) <39 10 5 4 3>
  101    1 0xffffff7f80799000 0x19b000   0x19b000   com.getgreenbytes.filesystem.zfs (2012.09.23) <13 7 5 4 3 1>
  102    0 0xffffff7f80936000 0x6000     0x6000     com.getgreenbytes.driver.zfs (2012.09.14) <101 13 7 5 4 3 1>
  120    0 0xffffff7f8098a000 0x5000     0x5000     com.Cycling74.driver.Soundflower (1.5.1) <115 5 4 3>
  127    3 0xffffff7f8256d000 0x42000    0x42000    org.virtualbox.kext.VBoxDrv (4.2.8) <7 5 4 3 1>
  128    0 0xffffff7f825af000 0x8000     0x8000     org.virtualbox.kext.VBoxUSB (4.2.8) <127 49 32 7 5 4 3 1>
  129    0 0xffffff7f825b7000 0x5000     0x5000     org.virtualbox.kext.VBoxNetFlt (4.2.8) <127 7 5 4 3 1>
  130    0 0xffffff7f825bc000 0x6000     0x6000     org.virtualbox.kext.VBoxNetAdp (4.2.8) <127 5 4 1>


From the system log:
Code: Select all
Apr  8 01:45:28 bmachine sandboxd[81387] ([81379]): mdworker(81379) deny mach-lookup com.apple.ls.boxd
Apr  8 01:45:28 bmachine sandboxd[81387] ([81376]): mdworker(81376) deny mach-lookup com.apple.ls.boxd
Apr  8 01:45:33 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (2,0)
Apr  8 01:45:33 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Library/PrivilegedHelperTools/com.apple.serverd' [113] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (100000,0)
Apr  8 01:46:59 bmachine ARDAgent[483]: error from send No route to host, addr: 192.168.4.141
Apr  8 01:46:59 bmachine ARDAgent[483]: UDPWritePacket result looks wrong -1 No route to host
Apr  8 01:47:04 bmachine ARDAgent[483]: Result from new send: -1 to address '192.168.4.141' errno=64: Host is down
Apr  8 01:47:04 bmachine ARDAgent[483]: error from send Host is down, addr: 192.168.4.141
Apr  8 01:53:38 bmachine mdworker[81419]: Unable to talk to lsboxd
Apr  8 01:53:38 bmachine mdworker[81423]: Unable to talk to lsboxd
Apr  8 01:53:38 bmachine mdworker[81421]: Unable to talk to lsboxd
Apr  8 01:53:38 bmachine mdworker[81422]: Unable to talk to lsboxd
Apr  8 01:53:38 bmachine mdworker[81420]: Unable to talk to lsboxd
Apr  8 01:53:38 bmachine kernel[0]: Sandbox: sandboxd(81430) deny mach-lookup com.apple.coresymbolicationd
Apr  8 01:53:39 bmachine sandboxd[81430] ([81419]): mdworker(81419) deny mach-lookup com.apple.ls.boxd
Apr  8 01:53:39 bmachine sandboxd[81430] ([81423]): mdworker(81423) deny mach-lookup com.apple.ls.boxd
Apr  8 01:53:39 bmachine sandboxd[81430] ([81421]): mdworker(81421) deny mach-lookup com.apple.ls.boxd
Apr  8 01:53:39 bmachine sandboxd[81430] ([81422]): mdworker(81422) deny mach-lookup com.apple.ls.boxd
Apr  8 01:53:39 bmachine sandboxd[81430] ([81420]): mdworker(81420) deny mach-lookup com.apple.ls.boxd
Apr  8 01:54:49 bmachine kernel[0]: zfsx_domount: 'tank2' zfsvfs_create err 2
Apr  8 01:54:50 bmachine kernel[0]: ZFSLabelScheme:stop: 0xffffff802d4f4000 goodbye 'zfs vdev for 'tank2''
Apr  8 01:54:50 --- last message repeated 1 time ---
--
Apr  8 01:54:50 --- last message repeated 1 time ---
Apr  8 01:54:50 bmachine kernel[0]: ZFSLabelScheme:probe: label 'tank2', vdev 5272918757791093917
Apr  8 01:54:51 bmachine kernel[0]: ZFSLabelScheme:start: 'tank2' critical mass with 1 vdev(s) (skip import)
Apr  8 01:54:51 bmachine kernel[0]: ZFSLabelScheme:probe: label 'tank2', vdev 17283157620683678563
Apr  8 01:54:51 bmachine kernel[0]: zfsx_vdm_open: 'tank2' disk3s2
Apr  8 01:54:51 bmachine kernel[0]: zfsx_vdm_open: 'tank2' disk2s2
Apr  8 01:54:53 bmachine fseventsd[38]: event logs in /Volumes/tank2/.fseventsd out of sync with volume.  destroying old logs. (89800 0 161799)
Apr  8 01:54:53 bmachine kernel[0]: zfsx_mount: '/Volumes/tank2'
Apr  8 01:54:53 bmachine kernel[0]: zfsx_unmount: '/Volumes/tank2' forced (umount)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to unlink old log file /Volumes/tank2/.fseventsd/0000000012b9ea12 (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to unlink old log file /Volumes/tank2/.fseventsd/0000000012cc53b0 (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to unlink old log file /Volumes/tank2/.fseventsd/0000000012cc53b1 (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to unlink old log file /Volumes/tank2/.fseventsd/0000000012dd17cf (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to unlink old log file /Volumes/tank2/.fseventsd/0000000012dd17d0 (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to create the uuid file /Volumes/tank2/.fseventsd/fseventsd-uuid
Apr  8 01:54:54 bmachine fseventsd[38]: could not open <</Volumes/tank2/.fseventsd/fseventsd-uuid>> (No such file or directory)
Apr  8 01:54:54 bmachine fseventsd[38]: failed to create the uuid file /Volumes/tank2/.fseventsd/fseventsd-uuid
Apr  8 01:54:54 bmachine fseventsd[38]: disk logger: failed to open output file /Volumes/tank2/.fseventsd/0000000012f643c5 (No such file or directory). mount point /Volumes/tank2/.fseventsd
Apr  8 01:54:55 --- last message repeated 1 time ---
Apr  8 01:54:55 bmachine kernel[0]: zfsx_vdm_close: 'disk3s2'
Apr  8 01:54:55 bmachine kernel[0]: zfsx_vdm_close: 'disk2s2'
Apr  8 01:54:56 bmachine AppleFileServer[28295]: _Assert: /SourceCache/afpserver/afpserver-607.7/afpserver/../afpserver/SharePointSyncTask.cpp, 215 (4294967253)
Apr  8 01:55:26 --- last message repeated 1 time ---
Apr  8 01:55:49 bmachine mdworker[81502]: Unable to talk to lsboxd
--
--
Apr  8 06:51:33 bmachine mdworker[84480]: Unable to talk to lsboxd
Apr  8 06:51:33 bmachine mdworker[84479]: Unable to talk to lsboxd
Apr  8 06:51:33 bmachine mdworker[84477]: Unable to talk to lsboxd
Apr  8 06:51:33 bmachine mdworker[84481]: Unable to talk to lsboxd
Apr  8 06:51:33 bmachine kernel[0]: Sandbox: sandboxd(84488) deny mach-lookup com.apple.coresymbolicationd
Apr  8 06:51:33 bmachine sandboxd[84488] ([84478]): mdworker(84478) deny mach-lookup com.apple.ls.boxd
Apr  8 06:51:33 bmachine sandboxd[84488] ([84480]): mdworker(84480) deny mach-lookup com.apple.ls.boxd
Apr  8 06:51:33 bmachine sandboxd[84488] ([84479]): mdworker(84479) deny mach-lookup com.apple.ls.boxd
Apr  8 06:51:33 bmachine sandboxd[84488] ([84477]): mdworker(84477) deny mach-lookup com.apple.ls.boxd
Apr  8 06:51:33 bmachine sandboxd[84488] ([84481]): mdworker(84481) deny mach-lookup com.apple.ls.boxd
Apr  8 06:53:43 bmachine mdworker[84504]: Unable to talk to lsboxd
Apr  8 06:53:43 bmachine mdworker[84501]: Unable to talk to lsboxd
Apr  8 06:53:43 bmachine mdworker[84502]: Unable to talk to lsboxd
Apr  8 06:53:43 bmachine mdworker[84503]: Unable to talk to lsboxd
Apr  8 06:53:44 bmachine sandboxd[84509] ([84504]): mdworker(84504) deny mach-lookup com.apple.ls.boxd
Apr  8 06:53:44 bmachine kernel[0]: Sandbox: sandboxd(84509) deny mach-lookup com.apple.coresymbolicationd
Apr  8 06:53:44 bmachine sandboxd[84509] ([84501]): mdworker(84501) deny mach-lookup com.apple.ls.boxd
Apr  8 06:53:44 bmachine sandboxd[84509] ([84502]): mdworker(84502) deny mach-lookup com.apple.ls.boxd
Apr  8 06:53:44 bmachine sandboxd[84509] ([84503]): mdworker(84503) deny mach-lookup com.apple.ls.boxd
Apr  8 06:54:49 bmachine kernel[0]: zfsx_domount: 'tank2' zfsvfs_create err 2
Apr  8 06:54:50 bmachine kernel[0]: ZFSLabelScheme:stop: 0xffffff802f7ca600 goodbye 'zfs vdev for 'tank2''
Apr  8 06:54:50 --- last message repeated 1 time ---
--
Apr  8 06:54:50 --- last message repeated 1 time ---
Apr  8 06:54:50 bmachine kernel[0]: ZFSLabelScheme:probe: label 'tank2', vdev 5272918757791093917
Apr  8 06:54:50 bmachine kernel[0]: ZFSLabelScheme:start: 'tank2' critical mass with 1 vdev(s) (skip import)
Apr  8 06:54:50 bmachine kernel[0]: ZFSLabelScheme:probe: label 'tank2', vdev 17283157620683678563
Apr  8 06:54:50 bmachine kernel[0]: zfsx_vdm_open: 'tank2' disk3s2
Apr  8 06:54:50 bmachine kernel[0]: zfsx_vdm_open: 'tank2' disk2s2
Apr  8 06:54:54 bmachine fseventsd[38]: event logs in /Volumes/tank2/.fseventsd out of sync with volume.  destroying old logs. (161799 1 179800)
Apr  8 06:54:54 bmachine kernel[0]: zfsx_mount: '/Volumes/tank2'
Apr  8 06:54:54 bmachine fseventsd[38]: log dir: /Volumes/tank2/.fseventsd getting new uuid: E0CF01C3-D2A7-49E5-B3A9-B0BB6DA9C266
Apr  8 06:54:54 bmachine kernel[0]: zfsx_unmount: '/Volumes/tank2' forced (umount)
Apr  8 06:54:55 bmachine fseventsd[38]: disk logger: failed to open output file /Volumes/tank2/.fseventsd/0000000012f8adeb (No such file or directory). mount point /Volumes/tank2/.fseventsd
Apr  8 06:54:55 --- last message repeated 1 time ---
Apr  8 06:54:55 bmachine AppleFileServer[28295]: _Assert: /SourceCache/afpserver/afpserver-607.7/afpserver/../afpserver/SharePointSyncTask.cpp, 215 (4294967253)
Apr  8 06:54:55 --- last message repeated 1 time ---
Apr  8 06:54:55 bmachine kernel[0]: zfsx_vdm_close: 'disk3s2'
Apr  8 06:54:55 bmachine kernel[0]: zfsx_vdm_close: 'disk2s2'
Apr  8 06:54:56 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (2,0)
Apr  8 06:54:56 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Library/PrivilegedHelperTools/com.apple.serverd' [113] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (100000,0)
Apr  8 06:54:57 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (2,0)
Apr  8 06:54:57 bmachine com.apple.SecurityServer[22]: Succeeded authorizing right 'system.privilege.admin' by client '/Library/PrivilegedHelperTools/com.apple.serverd' [113] for authorization created by '/Applications/Server.app/Contents/ServerRoot/usr/libexec/ServerEventAgent' [201] (100000,0)
Apr  8 06:55:54 bmachine mdworker[84583]: Unable to talk to lsboxd
Apr  8 06:55:54 bmachine sandboxd[84585] ([84583]): mdworker(84583) deny mach-lookup com.apple.ls.boxd
Apr  8 06:55:54 bmachine kernel[0]: Sandbox: sandboxd(84585) deny mach-lookup com.apple.coresymbolicationd
Apr  8 06:55:57 bmachine servermgrd[84580]: -[AccountsRequestHandler(AccountsSystemConfigurationObservation) registerForKeychainEventNotifications]: SecKeychainAddCallback() status: -25297
--
--



From dmesg:
Code: Select all
Sandbox: sandboxd(84337) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84361) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84376) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84402) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84422) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84439) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84458) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84488) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84509) deny mach-lookup com.apple.coresymbolicationd
zfsx_domount: 'tank2' zfsvfs_create err 2
ZFSLabelScheme:stop: 0xffffff802f7ca600 goodbye 'zfs vdev for 'tank2''
ZFSLabelScheme:stop: 0xffffff802f7ca600 goodbye 'zfs vdev for 'tank2''
ZFSLabelScheme:probe: label 'tank2', vdev 5272918757791093917
ZFSLabelScheme:start: 'tank2' critical mass with 1 vdev(s) (skip import)
ZFSLabelScheme:probe: label 'tank2', vdev 17283157620683678563
zfsx_vdm_open: 'tank2' disk3s2
zfsx_vdm_open: 'tank2' disk2s2
zfsx_mount: '/Volumes/tank2'
zfsx_unmount: '/Volumes/tank2' forced (umount)
zfsx_vdm_close: 'disk3s2'
zfsx_vdm_close: 'disk2s2'
Sandbox: sandboxd(84585) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84612) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84781) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(84993) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(85208) deny mach-lookup com.apple.coresymbolicationd
Sandbox: sandboxd(85320) deny mach-lookup com.apple.coresymbolicationd
satadru Offline


 
Posts: 20
Joined: Fri Sep 14, 2012 10:07 pm

Re: zpool getting dropped during scrub

Post by raattgift » Tue Apr 09, 2013 11:42 pm

There isn't enough information to be anywhere close to sure, but it looks like your Silicon Image controller is choking under load.
There could be a wiring problem on the path between it and any of the esata drives.

"The first mirror pool (tank2) keeps dropping during scrubs. (All pools are scrubbing simultaneously.)"

What happens if you stop the scrubs "zpool scrub -s <poolname>" and run them one at a time?

Will tank2 become fully mounted and available if there are no scrubs running at all? What if there are other scrubs (but not tank2) running?

Also, you could try the following:

1. boot, stop all scrubs, zpool export all the pools
2. for i in /dev/dsk/GPTE_* ; do dd if=$i of=/dev/null bs=128k & done
3. watch the logs as all the drives zfs knows about are accessed sequentially

If #3 completes without problem and without that would be very interesting.

It would be useful to check "zpool status -v" during this.

It may also be useful to try importing tank2 with only one side of the mirror at a time to try to tie the problem down to either one of the drives themselves (or the wiring to it) or some other part of the system.
raattgift Offline


 
Posts: 98
Joined: Mon Sep 24, 2012 11:18 pm

Link

Post by grahamperrin » Tue Apr 16, 2013 11:41 pm

raattgift wrote:… it looks like your Silicon Image controller is choking …


Towards a multi-topic roundup from recent weeks:

grahamperrin Offline

User avatar
 
Posts: 1596
Joined: Fri Sep 14, 2012 10:21 pm
Location: Brighton and Hove, United Kingdom


Return to Troubleshooting

Who is online

Users browsing this forum: hlxpgxmum and 1 guest

cron