Extreme Performance Issues with v2.1.6 (and v2.2.3)


Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Wed Dec 20, 2023 2:54 am

lundman wrote:Heh, you only got 14 cycles into the spindump? Peculiar, guess it is busy.

So of your busiest threads, you have
Code: Select all
 1 Thread 0x2a4a    Thread name "z_wr_iss_1"    14 samples (1-14)    priority 76 (base 76)    cpu time 2.975s (11.6G cycles, 1961.3M instructions, 5.93c/i)
 2 Spindump 797.6M
 3 mdsync [5649]
 4 com.apple.spindump.stackshot_parsing (28)    1 sample (2)    priority 31 (base 31)    cpu time 0.061s (255.4M cycles, 319.4M instructions, 0.80c/i)
 5 fseventsd [150]
 6 Activity Monitor [6079]


Since #1 and #5 (ZFS writing) are ending up in vm_map_find_space(), your system is completely out of RAM and is desperate to find more. I think you should tell ZFS to use less (keep halving), and see if it improves. There could be a memory leak as well that you trigger; the full output of sysctl for ZFS should show that, once it has started to slow down.

RAM isn't the issue; I have 64GB of RAM and I have ZFS configured to use up to 12GB for ARC. By the time I eventually gave up, kernel_task was at around 20GB of RAM, which is about normal (the SPL size usually reports around 16GB once the ARC has filled up, so another 4GB for everything else isn't unreasonable), and meanwhile I still had nearly 34GB of free RAM.

A memory leak also seems unlikely, as performance was terrible from the moment v2.2.2 was installed and datasets were mounted and in use. Whatever is wrong, the problem is immediate and fundamental to how ZFS is operating (or not operating); what confuses me is why nobody else seems to be affected. For me the problem is immediate and so severe that I have to downgrade to regain use of my system.

I'll try to get a full output from sysctl, but is there more debug data I can grab at the same time? Upgrading and downgrading the ZFS version to get one piece of data at a time is extremely time consuming, which is why it's taken me so long to try again. Also, is it possible to tell ZFS not to use hardware acceleration? That would let me eliminate it as a possible cause; otherwise I'll have to try creating a dataset without compression or encryption, though IIRC hardware acceleration also covers some of the hashing methods as well?
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Sun Jan 21, 2024 3:47 am

I'm still waiting on a list of other debug data I could gather at the same time as the output from sysctl, and on whether it's possible to disable hardware acceleration to see if that affects the performance issues I'm seeing.

I can try setting up a dataset without encryption and compression and testing that, but my understanding is that the newer hardware acceleration also covers hashing, though I guess I could disable that on a dataset as well. Really, though, I'd prefer to be testing against the datasets where I'm seeing the problems, so turning off hardware acceleration entirely and using the software fallbacks (the same as v2.1.0 already uses) seems like it would be a more useful test.

I often have long-running tasks I can't interrupt, so I can't just restart whenever I want. It isn't worth doing if all I'm grabbing is the sysctl output and you then ask me for something else afterwards; I'd rather gather as much as possible all at once.

But I'd also very much like to know whether I can disable hardware acceleration and use the software fallbacks somehow; it's one of the main performance-related differences between v2.1.0 and newer versions, so it seems a reasonably likely culprit. If the software implementations work as they do for me right now, then I could at least continue running a newer version of ZFS while gathering more data to help fix the hardware acceleration (or improve the self-tests that select hardware over software).
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Mon Mar 04, 2024 9:46 am

I'd really like to do more debugging of this problem this weekend, but I still don't know what I can do to narrow down the cause and get this fixed. I'm effectively stuck on v2.1.0, as every newer version I try on my 2018 Mac Mini is completely unusable because of this major performance problem.

While I can grab the output of sysctl -a and experiment a little on my own, I'd really like more suggestions for things to try and data to gather so that this issue can actually be fixed.

So once again I'm asking: is it possible for me to disable hardware acceleration? I'd really like to test the newer versions with it disabled to a) confirm this is the problem and b) if it is, perhaps maintain my current (non-accelerated) performance level and finally upgrade ZFS.

The only other way I can think to test this is to copy one of my datasets into a new dataset with checksums, compression and encryption disabled, to see whether it exhibits the same major performance issues. But I'd much prefer to do it by disabling hardware acceleration, as that would be simpler and would test directly against datasets that I know trigger the problem.
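
If I do go that route, what I have in mind is something like the following sketch (the dataset names and paths are made up, and I know an unencrypted child can only be created if the parent isn't itself encrypted):

```shell
# Hypothetical sketch: create a test dataset with the acceleration-sensitive
# features turned off, then copy real data in. "tank/plain" and the source
# path are placeholder names, not my actual layout.
sudo zfs create -o checksum=off -o compression=off -o encryption=off tank/plain
sudo rsync -a /Volumes/tank/data/ /Volumes/tank/plain/
```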
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby jawbroken » Tue Mar 05, 2024 4:21 am

Perhaps I misunderstand what you're trying to test, but can't you use the same _impl tunables you mention in this thread to set the implementation (e.g. to generic) to test without certain forms of hardware acceleration?
jawbroken
 
Posts: 64
Joined: Wed Apr 01, 2015 4:46 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Tue Mar 05, 2024 5:43 am

jawbroken wrote:Perhaps I misunderstand what you're trying to test, but can't you use the same _impl tunables you mention in this thread to set the implementation (e.g. to generic) to test without certain forms of hardware acceleration?

If the _impl tunables let me set the method, then great, but what should I be setting them to? This is what my output currently looks like on v2.1.0 (the version that has usable, if not amazing, performance):
Code: Select all
sysctl -a | grep '_impl:'
kstat.zfs.darwin.tunable.zfs_vdev_raidz_impl: cycle [fastest] original scalar sse2
kstat.zfs.darwin.tunable.icp_gcm_impl: cycle [fastest] generic
kstat.zfs.darwin.tunable.icp_aes_impl: cycle [fastest] generic x86_64
kstat.zfs.darwin.tunable.zfs_fletcher_4_impl: [fastest] scalar superscalar superscalar4 sse2

I'm assuming newer versions will show aesni etc. for me to choose from, but I'm unclear which implementation(s) I'm currently using (cycle maybe? Whatever that is?)
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby jawbroken » Thu Mar 07, 2024 5:01 am

I think those are just in alphabetical order, so I doubt you're currently using cycle. I don't know if there's any way to see what [fastest] has actually picked from that list. If you want to try to rule out a bug in one of the optimised implementations then I would personally start by just selecting whatever sounds like the baseline version (presumably something like original, generic, generic, scalar). You might find some other hints here.
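
On the tunables from the listing above, that would look something like this (names copied from the v2.1.0 sysctl output; a sketch, not something I've tested myself):

```shell
# Select the baseline (non-accelerated) implementation for each tunable.
# Tunable names are taken from the v2.1.0 sysctl output in this thread.
sudo sysctl kstat.zfs.darwin.tunable.zfs_vdev_raidz_impl=original
sudo sysctl kstat.zfs.darwin.tunable.icp_gcm_impl=generic
sudo sysctl kstat.zfs.darwin.tunable.icp_aes_impl=generic
sudo sysctl kstat.zfs.darwin.tunable.zfs_fletcher_4_impl=scalar
```

Setting them back to fastest should presumably restore the auto-selected versions.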
jawbroken
 
Posts: 64
Joined: Wed Apr 01, 2015 4:46 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby lundman » Sat Mar 16, 2024 2:45 pm

The CPUID probe function was wrong before 2.2.3-rc1; it's what finds out whether your CPU has AES-NI and all those other features. Can you check what the _impl tunables have available to you with that release instead?
lundman
 
Posts: 1337
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Mon Mar 18, 2024 5:44 am

I tried v2.2.3-rc3 on my 2018 Mac Mini today and I seem to be getting the CPUID details I expect (I didn't spot the actual CPUID value in the log anywhere; is it now only showing the capabilities?):

Code: Select all
log show --last 10m --source --predicate 'sender == "zfs"' --style compact
2024-03-18 11:54:32.080 Df kernel[0:2132] (zfs) <zfs`org_openzfsonosx_zfs_zvol::start(IOService*)> ZFS: Loading module ...
2024-03-18 11:54:32.080 Df kernel[0:2132] (zfs) <zfs`spl_start> SPL: loading
2024-03-18 11:54:32.093 Df kernel[0:2132] (zfs) <zfs`spl_processor_init> CPUID: osxsave sse sse2 sse3 ssse3 sse4.1 sse4.2
2024-03-18 11:54:32.093 Df kernel[0:2132] (zfs) <zfs`spl_processor_init> CPUID: avx avx2 aes pclmulqdq movbe
2024-03-18 11:54:32.093 Df kernel[0:2132] (zfs) <zfs`org_openzfsonosx_zfs_zvol::start(IOService*)> ZFS: hostid set to ac938a5b from UUID '98A28080-37FC-5F46-B29E-EBFD29E61994'
2024-03-18 11:54:33.064 Df kernel[0:2132] (zfs) <zfs`zfs_init> ZFS filesystem version: 5
2024-03-18 11:54:33.078 Df kernel[0:2132] (zfs) <zfs`org_openzfsonosx_zfs_zvol::start(IOService*)> ZFS: Loaded module v2.2.3-rc3, ZFS pool version 5000, ZFS filesystem version 5


However, I've been unable to verify whether this has helped with performance, as I've encountered a new kernel panic when calling sysctl -a; this makes it impossible to grab the sysctl values for posting here, or to check new options. I can, however, check and set known-good options without issue.

What are the new sysctl hardware values? Could those be causing the panic? Otherwise I'm not sure how to narrow it down, but my guess would be one of the sysctl values that fetches live data from the kernel, since the option values seem okay.
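
As a workaround I may try querying the ZFS entries one OID at a time rather than dumping everything, so a single bad entry identifies itself instead of taking the whole run down; a rough sketch:

```shell
# Query known OIDs individually instead of `sysctl -a`; any OID that fails
# is reported by name. The list here is just the handful mentioned in this
# thread, not an exhaustive set.
for oid in \
  kstat.zfs.darwin.tunable.icp_aes_impl \
  kstat.zfs.darwin.tunable.icp_gcm_impl \
  kstat.zfs.darwin.tunable.zfs_fletcher_4_impl; do
  sysctl "$oid" 2>/dev/null || echo "failed: $oid"
done
```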

----------

In the meantime I'm going to try v2.2.2 again; despite the CPUID bug it shows me all of the implementation methods I expect for my CPU, but I'm going to have to experiment with them to see which is being selected and which is causing the performance problems. Setting kstat.zfs.darwin.tunable.icp_aes_impl=x86_64 seems promising so far, but it's hard to tell, as upgrading ZFS seems to have triggered Spotlight into deciding it needs to re-index absolutely everything, which is hammering my system a bit at the moment.

Here are the implementations I see under v2.2.2 on my 2018 Mac Mini:

Code: Select all
kstat.zfs.darwin.tunable.icp_aes_impl: cycle fastest generic [x86_64] aesni
kstat.zfs.darwin.tunable.icp_gcm_impl: cycle [fastest] generic pclmulqdq
kstat.zfs.darwin.tunable.zfs.fletcher_4_impl: [fastest] scalar superscalar superscalar4 sse2 ssse3
kstat.zfs.darwin.tunable.zfs.blake3_impl: cycle [fastest] generic sse2 sse41
kstat.zfs.darwin.tunable.zfs.sha256_impl: cycle [fastest] generic x64 ssse3
kstat.zfs.darwin.tunable.zfs.sha512_impl: cycle [fastest] generic x64
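
For my own sanity while flipping these: the bracketed entry is the active setting, and a quick text-munging helper (just a sed sketch, no ZFS required) pulls it out of each line:

```shell
# Extract the name in [brackets] (the currently selected implementation)
# from a sysctl tunable line. Pure text processing on the output above.
selected_impl() {
  sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

echo 'kstat.zfs.darwin.tunable.icp_aes_impl: cycle fastest generic [x86_64] aesni' \
  | selected_impl
# prints: x86_64
```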


Maybe the CPUID for my 2018 Mac Mini just happens to include bits in the correct places to work despite the CPUID parsing bug?

You also asked for the output of sysctl -a to check whether I was experiencing some kind of memory limitation (despite having 64GB of RAM); I've retrieved this for v2.1.0 (the version I've been using with okay performance) and for v2.2.2, on which I've experienced the performance issues. I'm attaching these to the post, since it's the full output of the ZFS entries minus vdev/pool entries.

I haven't had a chance to try v2.2.3-rc3 on my older (2009) Mac Mini yet, as its copy operation is still running at a snail's pace; it's onto verification, which ASR won't let me skip, so hopefully it will finish sometime today. That's the last time I use Disk Utility's restore-disk feature!
Attachments
sysctl-a-v2.1.0-v2.2.2.zip
(11.64 KiB) Downloaded 39 times
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby Haravikk » Mon Mar 18, 2024 9:29 am

So I've got both good news and bad news. The good news is that no specific implementation of AES, Fletcher-4 etc. appears to be at fault, as I observed no discernible difference no matter what combination of options I tried; whatever performance difference they might make is totally eclipsed by the poor all-around performance I'm suffering on v2.2.2 compared to v2.1.0.

The bad news is that I'm still seeing terrible all-around performance on v2.2.2 compared to v2.1.0, and I have no idea what the cause might be.

I've noticed that this seems to occur in bursts; sometimes the system is fully responsive for a short time (maybe a minute or two), the same as when running v2.1.0 or with no ZFS datasets mounted, while at other times it's crawling along and even simple tasks like switching windows become sluggish.

I've grabbed a bunch of spindumps and attached five of them; I tried to grab them whenever I saw kernel_task spiking in CPU usage, but that's very hard to do as Activity Monitor is sluggish along with the rest of the system. I've also noticed the sheer amount of CPU time consumed both by kernel_task (ZFS) and by the processes most affected by the poor performance.

After around 4 hours of uptime, kernel_task has 1:02:01.25 of CPU time; next is WindowServer at 24:06.83, then corespotlightd at 21:27.65, because the mds hammering just doesn't seem to want to end, probably since it's being badly delayed by the poor system performance.

I just don't know what to make of this, or why I specifically am suffering the problem. My typical usage does have a few mounted datasets (two user accounts, each with around 8 datasets covering things like media folders with larger recordsizes, cache folders that can be easily excluded from backups, etc.), all of which are encrypted and have zstd compression enabled. But none of that is a problem under v2.1.0; it's only a problem when I try to upgrade to anything more recent.

I'm out of ideas as to what to try or what the potential cause might be; hopefully someone else can see some clues in the spindumps, as nothing stands out to me at all. Whatever it is, it's in ZFS and it's new since v2.1.0, but it doesn't appear to be any of the tunable forms of hardware acceleration, as I first theorised (though it's nice to have eliminated that, I suppose).

My system shouldn't be CPU constrained, as I'm on a six-core i7 and the average load is only around 15% even under v2.2.2; the performance issue seems to be entirely the load on kernel_task, or some kind of bottleneck somewhere. I'm not memory constrained either, as I've got 64GB of RAM with at least 25GB "free" (cache). I have 12GB assigned to the ARC, and I set the dirty write buffer to around 6GB, as one of my disks is SMR and needs the extra help when writing larger files; this puts kernel_task at around 20GB of RAM including overhead, other kernel modules etc. I'm not seeing any sign of an obvious memory leak, as kernel_task's memory usage appears stable.
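
For reference, the limits I described are set with tunables along these lines (names as they appear in my sysctl output; treat this as a sketch of my setup rather than recommended values):

```shell
# Cap the ARC at 12GB and the dirty-data (write buffer) limit at 6GB,
# matching the configuration described above.
sudo sysctl kstat.zfs.darwin.tunable.zfs_arc_max=$((12 * 1024 * 1024 * 1024))
sudo sysctl kstat.zfs.darwin.tunable.zfs_dirty_data_max=$((6 * 1024 * 1024 * 1024))
```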

It's also worth mentioning that after downgrading (yet again) to v2.1.0, Spotlight still wanted to index everything (I'm not sure what triggered it in the first place, but it was determined to finish), yet the impact on the system is negligible. Under v2.2.2 I'd see mds spiking to 90% or higher CPU usage; under v2.1.0 it can still get quite high (I saw mds_stores at 60% at one point), but at no point does the system ever feel unresponsive, nothing else has high CPU usage, etc. So while it's possible mds/mds_stores/corespotlightd are exacerbating the problem, they are definitely not causing it.
Attachments
spindumps-v2.2.2-2024-03-18.zip
(9.64 MiB) Downloaded 45 times
Last edited by Haravikk on Tue Mar 19, 2024 12:49 am, edited 1 time in total.
Haravikk
 
Posts: 82
Joined: Tue Mar 17, 2015 4:52 am

Re: Extreme Performance Issues with v2.1.6 (and v2.2.2)

Postby armdn » Mon Mar 18, 2024 10:49 pm

Haravikk wrote:RAM isn't the issue; I have 64gb of RAM and I have ZFS configured to use up to 12gb for ARC. [...] For me the problem is immediate and so severe I have to downgrade to regain use of my system.

Nope, you're not the only one with this trouble. I have the same problem on two different machines: a Mac Pro 2008 with 32GB of RAM and a Mac Pro 2010 with 128GB of RAM, and it's the same thing when I start write operations on the pools. The pools themselves are also different; at first I thought it was hard drive corruption somewhere, but no, there isn't any. On the MP2008 I downgraded OpenZFS from 2.2.2 to version 2.1.0 and everything started to work smooth and nice. Of course, I recreated the pool.

On 2.1.6 through 2.2.2 the system hit I/O locks and was almost unusable whenever I started writing anything. After several gigabytes the write speed drops to 1-3 MB/s and huge system stalls appear.
armdn
 
Posts: 16
Joined: Mon Mar 24, 2014 9:05 am
