cannot scrub

Posted: Sun Jun 05, 2016 11:41 am
by haer22
Guys, I'd appreciate some help on this.

I cannot scrub one of my pools. Whenever the scrub runs, the system reboots. No panic, no dump, no I/O error, it just reboots.

The only way out of this is to disconnect enough disks that the pool does not get imported at boot. Because the scrub is still active, the machine would otherwise just loop: boot, import, resume the scrub, reboot, import...

Then I set "sysctl kstat.zfs.darwin.tunable.zfs_no_scrub_io=1", reconnect the disks, and manually import the pool. After that I can stop the scrub with "zpool scrub -s".

Finally I set "sysctl kstat.zfs.darwin.tunable.zfs_no_scrub_io=0". If this is not done, ANY resilvering (due to a bad block or whatever) will be ignored. I think this is a bug (or a very bad feature).
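
For reference, the sequence above could be scripted roughly as follows. This is only a sketch: the pool name "tank" and the DRY_RUN switch are placeholders added for illustration, and it assumes the disks have already been reconnected and that it is run as root. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the workaround described above; not a tested recovery tool.
# POOL is a hypothetical pool name, DRY_RUN is an illustrative safety switch.
POOL="${POOL:-tank}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

stop_stuck_scrub() {
    # 1. Disable scrub I/O so the import does not restart the crashing scrub.
    run sysctl kstat.zfs.darwin.tunable.zfs_no_scrub_io=1 || return 1
    # 2. Import the pool by hand (disks must be reconnected at this point).
    run zpool import "$POOL" || return 1
    # 3. Stop the active scrub.
    run zpool scrub -s "$POOL" || return 1
    # 4. Re-enable scrub I/O; otherwise later resilvering is skipped as well.
    run sysctl kstat.zfs.darwin.tunable.zfs_no_scrub_io=0 || return 1
}

stop_stuck_scrub
```

Running it with DRY_RUN=0 would execute the four commands in order and stop at the first one that fails.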

Anyway, I have a large pool (12TB) which I cannot scrub. How can I track down the bug?

Re: cannot scrub

Posted: Sun Jun 05, 2016 11:57 pm
by Brendon
Hi,

I had a look at the way zfs_no_scrub_io is applied. It's identical to upstream, so regardless of whether it's a bug or not, the discussion needs to be held in a different venue.

Regarding your pool, it sounds like it's corrupt. If your machine is not generating panic logs, then there is little we can do in terms of diagnostics.

Cheers
Brendon

Re: cannot scrub

Posted: Mon Jun 06, 2016 6:17 am
by haer22
Thanks for the response, Brendon.

I have two avenues to choose from:
1. Zap the pool, re-create it, and then copy everything back from my backup. The problem is that it will take several days, and the pool will essentially be down during those days.
2. Replace each disk one by one and hope that the resilvering will fix the problem. This will keep my pool up and running, but it may not fix the problem.

What does your gut feeling say?

Is there any "debug mode" I can put the scrub in, so I can see where it stops scanning and from there figure out what is wrong?

Re: cannot scrub

Posted: Mon Jun 06, 2016 12:47 pm
by Brendon
Gut feel.

Back it up, zap, restore. I feel that the alternative may be an exercise in futility, and it will take a very long time. Having said that, how you operate your assets is up to you; I have neither enough context nor enough experience to tell you what to do.

- Brendon

Re: cannot scrub

Posted: Tue Jun 07, 2016 12:34 pm
by haer22
OK, my gut feeling agrees with yours. As it will take several days, I just wanted some extra gut feelings :)

Here we go, a 15.9 TB re-load. I'd better do some extrapolation on whether I should reconfigure the vdevs while I am zapping the content.