kernel_task constantly running at over 100%

Postby RobRehnmark » Wed Oct 18, 2017 4:01 pm

kernel_task is running at 100% CPU utilisation.
It's no immediate disaster since I have 6 cores / 12 threads, but there is definitely something going on and I think it has to do with ZFS.
Earlier I had a problem with instant reboots when trying to scrub, but after that I could scrub again.
Importing the pool, or at least some of the file systems on it, seems to be problematic. It takes quite a bit longer than it did before.
Memory usage seems to be ballooning a bit too. I don't think I have used the file system much since rebooting, but kernel_task is now at 12.67 GB out of the 24 GB I have.

I would be grateful if someone could advise on how to get the right logs, etc. in order to try to figure out what's going on.

I'm running High Sierra and just installed from source a few days ago.

Code:
  138    1 0xffffff7f82e55000 0x3f8      0x3f8      net.lundman.kernel.dependencies.30 (12.5.0) EABE2046-57AE-4F8D-8EEB-15176843E226
  139    1 0xffffff7f82e56000 0x11f5000  0x11f5000  net.lundman.spl (1.6.2) 47DF1BD7-98FE-38D1-88DF-2BB62C86AAC9 <138 7 5 4 3 1>
  140    1 0xffffff7f8404b000 0x2b4000   0x2b4000   net.lundman.zfs (1.6.2) E26C25E4-D481-3968-B88C-BD1B70BA4403 <139 26 7 5 4 3 1>


Code:
Processes: 413 total, 2 running, 2 stuck, 409 sleeping, 2246 threads                                                                       01:58:33
Load Avg: 2.73, 2.89, 2.72  CPU usage: 1.50% user, 8.96% sys, 89.52% idle    SharedLibs: 248M resident, 60M data, 30M linkedit.
MemRegions: 66402 total, 5064M resident, 287M private, 1932M shared. PhysMem: 23G used (13G wired), 986M unused.
VM: 1893G vsize, 1091M framework vsize, 0(0) swapins, 0(0) swapouts. Networks: packets: 7305008/2689M in, 8801675/9351M out.
Disks: 1190676/22G read, 1567230/19G written.

PID   COMMAND      %CPU  TIME     #TH    #WQ  #PORT MEM    PURG   CMPR PGRP PPID STATE    BOOSTS          %CPU_ME %CPU_OTHRS UID  FAULTS   COW
0     kernel_task  101.6 14:07:10 585/13 0    2     13G+   0B     0B   0    0    running   0[0]           0.00000 0.00000    0    128834+  0
302   WindowServer 4.7   19:11.18 5      2    626   219M   97M    0B   302  1    sleeping *0[1]           0.00167 0.00246    88   1577465+ 25156


Edit:
I tried exporting the pool but got nowhere, it didn't even finish the process.
Then I couldn't even get anywhere with zpool status.

Edit 2:
It seems it was one of the disks going bad.
The SMART command failed a couple of times during BIOS POST, and now ZFS has removed the drive from the pool, degrading it.
No more 100% kernel_task.
The drive will be replaced.
Super grateful to the people developing OpenZFS on OS X!
RaidZ2 working well for storing my data, OS X Server share for Time Machine, FCPX, iTunes, Photos, etc. on my Hackintosh.
RobRehnmark
 
Posts: 59
Joined: Sat Oct 17, 2015 12:23 am

Re: kernel_task constantly running at over 100%

Postby lundman » Wed Oct 18, 2017 4:46 pm

It would have been interesting to see a spindump while it was going crazy; maybe we have too tight a loop when something is dying...
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: kernel_task constantly running at over 100%

Postby RobRehnmark » Wed Oct 18, 2017 5:16 pm

I should be able to replicate.
Please advise on how to get the info you want.

Re: kernel_task constantly running at over 100%

Postby lundman » Thu Oct 19, 2017 4:16 pm

sudo spindump

Then copy the /tmp/spindump.txt file it generates to somewhere we can get it.
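
A minimal sketch of the copy step, assuming the report lands at /tmp/spindump.txt as stated above. The helper name save_spindump and the destination directory are hypothetical and are not part of spindump or O3X; the helper only keeps a timestamped copy so that a later run does not overwrite the report.

```shell
#!/bin/sh
# save_spindump: hypothetical helper that copies a spindump report to a
# timestamped file. The default source path is the one named in the post.
save_spindump() {
    src="${1:-/tmp/spindump.txt}"   # report to copy
    dest_dir="${2:-$HOME}"          # where to keep the copy (assumption)
    if [ ! -r "$src" ]; then
        echo "no readable report at $src" >&2
        return 1
    fi
    dest="$dest_dir/spindump-$(date +%Y%m%d-%H%M%S).txt"
    # Print the new path only if the copy succeeded.
    cp "$src" "$dest" && echo "$dest"
}

# Typical use while kernel_task is busy (needs macOS and root):
#   sudo spindump && save_spindump
```

The printed path is the file to attach or upload.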

Re: kernel_task constantly running at over 100%

Postby RobRehnmark » Thu Oct 19, 2017 7:39 pm

Please excuse me if this question seems dumb but how do I bring the removed device online again?
I tried using <zpool online POOL DEVICE-ID/NODE NAME/UUID/GUID> but only get "no such device in pool".

Example from terminal:
Code:
pc61:~ robert$ zpool status -L ocean
  pool: ocean
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub canceled on Wed Oct 18 12:39:16 2017
config:

        NAME        STATE     READ WRITE CKSUM
        ocean       DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            disk6   ONLINE       0     0     0
            disk3   ONLINE       0     0     0
            disk4   ONLINE       0     0     0
            disk7   ONLINE       0     0     0
            disk2   ONLINE       0     0     0
            disk5   REMOVED      0     0     0

errors: No known data errors
pc61:~ robert$ zpool online ocean disk5
cannot online disk5: no such device in pool
pc61:~ robert$
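
When a pool has many disks, it can help to filter status output like the above down to the entries that are not ONLINE. The following is only a sketch: list_unhealthy is a hypothetical helper, not a zpool subcommand, and it assumes the config table layout shown above (a NAME/STATE header row, followed later by an "errors:" line).

```shell
#!/bin/sh
# list_unhealthy: hypothetical helper that reads `zpool status` text on stdin
# and prints "name state" for every config entry whose state is not ONLINE.
list_unhealthy() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The table ends before the errors line.
        /^errors:/ { in_config = 0 }
        # Skip blank lines; report anything that is not ONLINE.
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage on a live system:
#   zpool status ocean | list_unhealthy
```

For the status output above, this would list ocean and raidz2-0 as DEGRADED and disk5 as REMOVED.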


EDIT:
OK, so after I exported and imported with -d /dev, it seemed like I could online the device, but it still said REMOVED.
I then did another export (-f) and a normal import.
The disk is still labeled REMOVED, and now zpool online is not accepted at all:
Code:
bash-3.2# zpool online ocean media-41A0CCCD-87AA-834D-B08E-9199274F3250
cannot online media-41A0CCCD-87AA-834D-B08E-9199274F3250: cannot relabel '/private/var/run/disk/by-id/media-41A0CCCD-87AA-834D-B08E-9199274F3250': unable to read disk capacity

It seems I will have to accept that the disk is really dead, but maybe I can get it to play nice enough to bring it online for some more trouble and a spindump.
A new drive will arrive in the mail today or tomorrow.

Re: kernel_task constantly running at over 100%

Postby RobRehnmark » Fri Oct 20, 2017 6:48 am

I'm sorry I never managed to bring the disk online to get the spindump.
The new drive is resilvering now.
Many thanks to all the people working on O3X; my files are safe.

