The panics occur without any clearly visible pattern, although running Thunderbird and performing IMAP synchronization with large mailboxes (> 3 GB per mailbox file) seems to increase the probability of an immediate kernel panic. However, panics also occur when Thunderbird is not running.
Another program that significantly increases the crash risk is Spotlight (more precisely: mds and mdimporter) trying to rebuild the Spotlight index after a crash. I think Spotlight is responsible for most, if not all, of the crashes that occur shortly after rebooting from a previous crash.
Relatively often (in roughly a quarter of all cases) I see two panics in quick succession, usually a page fault first, followed by a failed assertion. See below for the full list. The panics started around 6 days after installing ZEVO.
All panics of type "zero io_children" show similar but not identical backtraces.
System configuration:
- MacBook Pro from late 2010
- Mac OS X Snow Leopard 10.6.8, fully updated
- 8 GB of RAM
- internal SSD disk with four partitions (five, counting the EFI partition):
s2 is system / boot partition,
s3 is ZFS pool
s4 is L2ARC for that pool,
s5 is another HFS+ volume
- ZEVO community edition 1.1.1 (build 2012-09-23)
- MacPorts version 2.1.3
The ZFS pool has three child file systems, with mount points set to /Developer, /Users/bj (my main account) and /opt (for holding MacPorts).
On all datasets, I have enabled compression. Dedup is off, copies is set to 1 (the default), and normalization is formD.
Here are all non-default properties:
bj $ zfs get all | grep -v -e '@2013' | grep -v -e 'default$' -e '-$'
NAME                PROPERTY     VALUE       SOURCE
ZFSStore/Developer  mountpoint   /Developer  local
ZFSStore/Developer  compression  on          local
ZFSStore/bj         mountpoint   /Users/bj   local
ZFSStore/bj         compression  on          local
ZFSStore/bj         snapdir      visible     local
ZFSStore/opt        mountpoint   /opt        local
ZFSStore/opt        compression  on          local
A cron job takes automatic snapshots of the /Users/bj dataset once every hour between 7:00 and 23:00. Currently the system has 454 snapshots.
The pool size is 288 GB, with 188 GB allocated, i.e. 65% used. 115 GB are used by file systems; the rest is used by snapshots.
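As a quick cross-check of these figures, here is a minimal Python sketch. The variable names are illustrative only; the three numbers are the ones quoted above, and the snapshot share is simply assumed to be whatever is allocated but not held by the file systems.

```python
# Figures quoted in the post, in GB.
pool_size_gb = 288
allocated_gb = 188
filesystems_gb = 115

# Share of the pool that is currently allocated.
used_fraction = allocated_gb / pool_size_gb

# Assumption: everything allocated but not held by file systems belongs to snapshots.
snapshots_gb = allocated_gb - filesystems_gb

print(f"pool used: {used_fraction:.0%}")
print(f"held by snapshots: {snapshots_gb} GB")
```

So, under that assumption, roughly 73 GB of the pool is held by the snapshots.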
List of panics, including a link to each panic file:
- zevo-crash-2013-02-25.txt uptime 157h58 Kernel trap at 0xffffff7f8135a768, type 14=page fault
- zevo-crash-2013-02-25_02.txt uptime 00h54 zio.c:474 ZFS assertion failed: *countp > 0 (0x0 > 0x0)
- zevo-crash-2013-02-26_01.txt uptime 11h37 Kernel trap at 0xffffff7f81a7c768, type 14=page fault
- zevo-crash-2013-02-26_02.txt uptime 00h11 zio.c:474 ZFS assertion failed: *countp > 0 (0x0 > 0x0)
- zevo-crash-2013-02-26_03.txt uptime 03h07 Kernel trap at 0xffffff7f80c740a6, type 14=page fault
- zevo-crash-2013-02-26_04.txt uptime 02h44 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-02-26_05.txt uptime 00h00'26 Kernel trap at 0xffffff7f81a3e365, type 14=page fault
- zevo-crash-2013-02-27_01.rtf uptime 15h39 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-02-27_02.rtf uptime 00h16 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-01_01.rtf uptime 23h44 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf576718 0 0xffffff80bf5769a0 ... zio.c:478
- zevo-crash-2013-03-04_01.txt uptime 43h27 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-04_02.txt uptime 00h00'7 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)
- zevo-crash-2013-03-04_03.txt uptime 01h57 Kernel trap at 0xffffff7f819ff55d, type 14=page fault
- zevo-crash-2013-03-04_04.txt uptime 08h53 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-04_05.txt uptime 00h15 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-05_01.txt uptime 22h07 Kernel trap at 0xffffff7f819ff55d, type 14=page fault
- zevo-crash-2013-03-06_01.txt uptime 14h42 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf444400 0 0xffffff80bf444688 ... zio.c:478
- zevo-crash-2013-03-07_01.txt uptime 14h26 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf5db720 0xffffff80bf5db9d8 0xffffff80bf5db9a8 ... zio.c:478
- zevo-crash-2013-03-07_02.txt uptime 01h23 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf63ca58 0xffffff80bf63cd10 0xffffff80bf63cce0 ... zio.c:478
- zevo-crash-2013-03-07_03.txt uptime 09h05 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff810824a568 0 0xffffff810824a7f0 ... zio.c:478
- zevo-crash-2013-03-08_01.txt uptime 06h13 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf57e730 0 0xffffff80bf57e9b8 ... zio.c:478
- zevo-crash-2013-03-08_02.txt uptime 00h25 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-08_03.txt uptime 00h18 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf4fb0b8 0xffffff80bf4fb378 0xffffff80bf4fb340 ... zio.c:478
- zevo-crash-2013-03-09_01.txt uptime 05h02 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf8cbd60 0 0xffffff80bf8cbfe8 ... zio.c:478
- zevo-crash-2013-03-09_02.txt uptime 08h43 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80cd4bd7f8 0 0xffffff80cd4bda80 ... zio.c:478
- zevo-crash-2013-03-13_01.txt uptime 38h46 Kernel trap at 0xffffff7f819e2a96, type 14=page fault
- zevo-crash-2013-03-13_02.txt uptime 10h04 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf7c20a0 0 0xffffff80bf7c2328 ... zio.c:478
- zevo-crash-2013-03-13_03.txt uptime 04h23 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf613720 0xffffff80bf6139e0 0xffffff80bf6139a8 ... zio.c:478
- zevo-crash-2013-03-13_04.txt uptime 04h04 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-14_01.txt uptime 15h07 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-14_02.txt uptime 04h20 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-15_01.txt uptime 12h17 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf4ab708 0 0xffffff80bf4ab990 ... zio.c:478
- zevo-crash-2013-03-15_02.txt uptime 02h24 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)
- zevo-crash-2013-03-15_03.txt uptime 00h00'6 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)
- zevo-crash-2013-03-15_04.txt uptime 00h00'5 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)
- zevo-crash-2013-03-15_05.txt uptime 03h55 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff811fbf2ae0 0xffffff811fbf2d98 0xffffff811fbf2d68 ... zio.c:478
- zevo-crash-2013-03-15_06.txt uptime 00h01'3 Kernel trap at 0xffffff7f81a7c0a6, type 14=page fault
- zevo-crash-2013-03-15_07.txt uptime 01h50 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)
- zevo-crash-2013-03-15_08.txt uptime 00h13 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf1bc400 0xffffff80bf1bc6c0 0xffffff80bf1bc688 ... zio.c:478
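To get an overview of how the panics split by type, the list above can be tallied with a short script. This is only a sketch: the function names `classify` and `tally` and the category labels are made up for illustration, and the matching relies on nothing more than the substrings that appear in the list entries. The `sample` holds four entries copied from the list; the full list can be pasted in the same way.

```python
from collections import Counter

def classify(entry: str) -> str:
    """Map one line of the panic list to a coarse panic type (labels are illustrative)."""
    if "zero io_children" in entry:
        return "zero io_children"
    if "DVA_IS_VALID" in entry:
        return "DVA_IS_VALID assertion"
    if "*countp > 0" in entry:
        return "countp assertion"
    if "type 14=page fault" in entry:
        return "page fault"
    return "other"

def tally(entries):
    """Count panics per type, ignoring blank lines."""
    return Counter(classify(e) for e in entries if e.strip())

# Four entries copied from the list above.
sample = [
    "- zevo-crash-2013-02-25.txt uptime 157h58 Kernel trap at 0xffffff7f8135a768, type 14=page fault",
    "- zevo-crash-2013-02-25_02.txt uptime 00h54 zio.c:474 ZFS assertion failed: *countp > 0 (0x0 > 0x0)",
    "- zevo-crash-2013-03-01_01.rtf uptime 23h44 zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf576718 0 0xffffff80bf5769a0 ... zio.c:478",
    "- zevo-crash-2013-03-04_02.txt uptime 00h00'7 metaslab.c:1428 ZFS assertion failed: DVA_IS_VALID(dva)",
]

for kind, count in tally(sample).most_common():
    print(kind, count)
```

Counting all 39 entries of the list this way gives 18 page faults, 14 "zero io_children" panics, 5 DVA_IS_VALID assertions and 2 countp assertions.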
Almost 40 crashes in 18 days render ZEVO pretty much unusable. Even the old Z-410 beta was more stable.
@ Don Brady et al.: If you need more information, please let me know. I'm also happy to run any extra debug version you may have, or to help solve this in any other way I can. I have several years of programming experience and some experience in Mac OS X kernel debugging. If needed, I can set up a remote machine for interactive kernel tracing and provide remote access. Feel free to contact me.
The 14 crashes with "zero io_children" are distributed over 3 different backtraces:
panic(cpu 0 caller 0xffffff7f81a7c0da): "zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf576718 0 0xffffff80bf5769a0\n"@/staging/zevo/src/uts/common/fs/zfs/zio.c:478
Backtrace (CPU 0), Frame : Return Address
0xffffff80c0263d70 : 0xffffff8000204d15
0xffffff80c0263e70 : 0xffffff7f81a7c0da
0xffffff80c0263ec0 : 0xffffff7f81a7ad31
0xffffff80c0263ef0 : 0xffffff7f81a78369
0xffffff80c0263f30 : 0xffffff7f819e8c39
0xffffff80c0263fa0 : 0xffffff80002c8527
Kernel Extensions in backtrace (with dependencies):
com.getgreenbytes.filesystem.zfs(2012.09.23)@0xffffff7f819db000->0xffffff7f81b1afff
dependency: com.apple.iokit.IOStorageFamily(1.6.3)@0xffffff7f810f0000
panic(cpu 1 caller 0xffffff7f81a7c0da): "zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf5db720 0xffffff80bf5db9d8 0xffffff80bf5db9a8\n"@/staging/zevo/src/uts/common/fs/zfs/zio.c:478
Backtrace (CPU 1), Frame : Return Address
0xffffff80c2163d70 : 0xffffff8000204d15
0xffffff80c2163e70 : 0xffffff7f81a7c0da
0xffffff80c2163ec0 : 0xffffff7f81a7ad31
0xffffff80c2163ef0 : 0xffffff7f81a78369
0xffffff80c2163f30 : 0xffffff7f819e8c39
0xffffff80c2163fa0 : 0xffffff80002c8527
Kernel Extensions in backtrace (with dependencies):
com.getgreenbytes.filesystem.zfs(2012.09.23)@0xffffff7f819db000->0xffffff7f81b1afff
dependency: com.apple.iokit.IOStorageFamily(1.6.3)@0xffffff7f810f0000
panic(cpu 0 caller 0xffffff7f81a7c0da): "zio_notify_parent: zero io_children [0, 0], err 0, 0xffffff80bf444400 0 0xffffff80bf444688\n"@/staging/zevo/src/uts/common/fs/zfs/zio.c:478
Backtrace (CPU 0), Frame : Return Address
0xffffff80aa4dbd60 : 0xffffff8000204d15
0xffffff80aa4dbe60 : 0xffffff7f81a7c0da
0xffffff80aa4dbeb0 : 0xffffff7f81a7ad31
0xffffff80aa4dbee0 : 0xffffff7f81a78369
0xffffff80aa4dbf20 : 0xffffff7f819e968c
0xffffff80aa4dbfa0 : 0xffffff80002c8527
Kernel Extensions in backtrace (with dependencies):
com.getgreenbytes.filesystem.zfs(2012.09.23)@0xffffff7f819db000->0xffffff7f81b1afff
dependency: com.apple.iokit.IOStorageFamily(1.6.3)@0xffffff7f810f0000
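Because the kext may be loaded at a different address after each boot, it can be easier to compare backtraces by offset into the ZFS kext rather than by absolute address. The following Python sketch does that for the first and the last backtrace shown above, using the load range printed in the panic reports. The helper names `kext_offset` and `differing_frames` are hypothetical, and the sketch assumes the load range belongs to the same boot as the backtrace it is applied to.

```python
# Load range of com.getgreenbytes.filesystem.zfs as printed in the panic reports above.
KEXT_START = 0xffffff7f819db000
KEXT_END = 0xffffff7f81b1afff

def kext_offset(addr):
    """Return the offset of addr into the kext, or None if addr lies outside it."""
    if KEXT_START <= addr <= KEXT_END:
        return addr - KEXT_START
    return None

def differing_frames(a, b):
    """Return the indices of frames whose return addresses differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Return addresses copied from the first and the last backtrace above.
first = [0xffffff8000204d15, 0xffffff7f81a7c0da, 0xffffff7f81a7ad31,
         0xffffff7f81a78369, 0xffffff7f819e8c39, 0xffffff80002c8527]
last = [0xffffff8000204d15, 0xffffff7f81a7c0da, 0xffffff7f81a7ad31,
        0xffffff7f81a78369, 0xffffff7f819e968c, 0xffffff80002c8527]

for i in differing_frames(first, last):
    for name, trace in (("first", first), ("last", last)):
        off = kext_offset(trace[i])
        where = hex(off) if off is not None else "outside kext"
        print(f"frame {i} ({name}): {where}")
```

The two traces differ only in the second-to-last frame, i.e. in the caller that leads into the common zio path; the frames that lie outside the kext range are kernel addresses.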
Best regards
Björn