zpool scrub + zfs unmount = kernel panic

Post by s34gull » Sat Dec 08, 2012 1:29 am

So, I've been working on a script (http://code.google.com/p/clairvoyant/) to facilitate backup to a ZFS-formatted sparsebundle disk image, and once a week the script kicks off a scrub (as part of a standard filesystem check). The scrub backgrounds itself and returns control to the script, which then attempts an unmount/remount of the file system (to ensure a clean mount). This behavior was roughly consistent with what was happening with other, synchronous fsck tools. I've since opted to skip the scrub (noting in the log that a scrub should be performed), but before I modified the scripts, issuing a 'zfs unmount' while a 'zpool scrub' was running in the background resulted in a nasty kernel panic (which also explained why my Mac rebooted overnight, when the first weekly scrub was attempted).

The summary of the KP is below:
Code:
Fri Dec  7 15:20:19 2012
panic(cpu 0 caller 0xffffff7f9cbb01b7): "/staging/zevo/src/uts/common/fs/zfs/zil.c:572 ZFS assertion failed: !keep_first"@/staging/zevo/src/uts/darwin/os/printf.c:43

      Kernel Extensions in backtrace:
         com.getgreenbytes.filesystem.zfs(2012.9.23)[04497DBB-8849-31D8-8496-BE10E5711C53]@0xffffff7f9cba5000->0xffffff7f9cd3ffff
            dependency: com.apple.iokit.IOStorageFamily(1.8)[5BA4CD36-E96D-3A9E-ADFF-A863BBD63BC7]@0xffffff7f9cb78000

BSD process name corresponding to current thread: zfs


The full stack trace is available at https://gist.github.com/4238919

Regards,
Jonathan
Last edited by s34gull on Wed Dec 12, 2012 4:31 pm, edited 1 time in total.

Re: zpool scrub + zfs unmount = kernel panic

Post by grahamperrin » Sat Dec 08, 2012 1:50 am

How do you define a mount that is not clean?

I should not detract from the subject (kernel panic) so if you like, post to a separate topic. Thanks.

timing of unmount then (re)mount

Post by grahamperrin » Sat Dec 08, 2012 3:36 am

s34gull wrote:… unmount/remount …


Timing.

Please, did the script specify any waiting period between the two?

Re: zpool scrub + zfs unmount = kernel panic

Post by s34gull » Wed Dec 12, 2012 4:24 pm

grahamperrin wrote:Please, did the script specify any waiting period between the two?


@grahamperrin => Because the 'zpool scrub' operation is asynchronous and runs in the background for a period that scales with the size of your zpool, it didn't make sense to issue a sleep. At this point, if an active scrub is detected, the backup script just exits.
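
For anyone wanting to do the same, here is a minimal sketch of that kind of guard. It is not the actual clairvoyant code: the function names are made up for illustration, and it assumes that 'zpool status' prints the phrase "scrub in progress" while a scrub is running, which is worth checking against the output of your own ZFS build.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given status text
# reports a running scrub.
# Assumption: the text contains the phrase "scrub in progress".
scrub_in_progress() {
    printf '%s\n' "$1" | grep -q 'scrub in progress'
}

# Hypothetical guard: returns 1 when the named pool is being scrubbed,
# so the caller can skip the unmount and the rest of the backup run.
skip_if_scrubbing() {
    pool="$1"
    if scrub_in_progress "$(zpool status "$pool" 2>/dev/null)"; then
        echo "scrub active on $pool; skipping this backup run" >&2
        return 1
    fi
    return 0
}
```

A backup script could call something like `skip_if_scrubbing backup || exit 0` before touching the mount, where "backup" stands in for the real pool name.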

Re: zpool scrub + zfs unmount = kernel panic

Post by s34gull » Wed Dec 12, 2012 4:30 pm

It seems that this issue is related to another bug posted on these forums. The proposed workaround is to issue a 'zpool export' in lieu of a 'zfs unmount'. This works in my case because the zpool has only a single filesystem, but it isn't a general-purpose fix. Even with this workaround, however, I was still experiencing kernel panics when exporting frequently, so I've altered the script to just change the 'readonly' property; the zpool/filesystem stays mounted between runs.
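
Roughly, the readonly approach can be sketched as below. This is an illustration rather than the script itself: the function names and the dataset name in the usage note are hypothetical, and the only ZFS command used is 'zfs set readonly=on|off'.

```shell
#!/bin/sh
# Hypothetical helper: set the readonly property of a dataset to "on" or "off".
set_readonly() {
    dataset="$1"
    state="$2"
    case "$state" in
        on|off) zfs set "readonly=$state" "$dataset" ;;
        *) echo "set_readonly: state must be 'on' or 'off'" >&2; return 2 ;;
    esac
}

# Hypothetical wrapper: make the dataset writable, run the given backup
# command, then make the dataset read-only again. The dataset stays
# mounted throughout; nothing is unmounted or exported.
run_backup_window() {
    dataset="$1"
    shift
    set_readonly "$dataset" off || return 1
    backup_rc=0
    "$@" || backup_rc=$?
    set_readonly "$dataset" on || return 1
    return "$backup_rc"
}
```

For example, `run_backup_window backup/data rsync -a /Users/ /Volumes/backup/` would open a write window around the copy, where both the dataset name and the rsync paths are placeholders.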

Code:
Mon Dec 10 10:26:15 2012
panic(cpu 1 caller 0xffffff7f887b01b7): "/staging/zevo/src/uts/darwin/fs/zfs/zfsx_vfsops.c:175 ZFS assertion failed: list_is_empty(&zfsvfs->z_searches_list)"@/staging/zevo/src/uts/darwin/os/printf.c:43

      Kernel Extensions in backtrace:
         com.getgreenbytes.filesystem.zfs(2012.9.23)[04497DBB-8849-31D8-8496-BE10E5711C53]@0xffffff7f887a5000->0xffffff7f8893ffff
            dependency: com.apple.iokit.IOStorageFamily(1.8)[5BA4CD36-E96D-3A9E-ADFF-A863BBD63BC7]@0xffffff7f88778000

BSD process name corresponding to current thread: umount


The full kernel panic report is available at https://gist.github.com/4271797

the panic reports appear different

Post by grahamperrin » Thu Dec 13, 2012 12:37 am

At a glance, the 2012-12-10 kernel panic with list_is_empty may be different from the 2012-12-07 panic without that string.

Good to have a workaround, but I remain interested in the earlier panic.

The question about waiting time "between the two" refers to these two things:
unmount/remount