zlib-ng?

Developer discussions.


Postby RJVB » Tue May 23, 2017 1:04 pm

Hi,

A quick glance at the code suggests that ZFS uses zlib for the gzip compression option. If that's correct, has anyone looked into the zlib-ng project (github:dead2/zlib-ng), which aims, among other things, to provide SSE-accelerated compression? The gains aren't astronomical but "significant enough" to be noticeable.
RJVB
 
Posts: 54
Joined: Tue May 23, 2017 12:32 pm

Re: zlib-ng?

Postby lundman » Tue May 23, 2017 3:51 pm

The default compression used to be zlib, or rather, lzjb, but more recently it is recommended to use "lz4", and OpenZFS has been updated to use that on metadata as well, when enabled. lzjb will stick around to be able to read old pools, of course. There have been discussions on upgrading lz4 to its improved sibling, but I do not know the state of that yet. It is generally something that upstream OpenZFS will add and we pull in.

Having said that, ZOL had some improvements in fletcher and sha, to use SSE2 etc. Alas, that code did not want to work for us (incompatible assemblers). There is a branch for it should someone want to play with it.
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: zlib-ng?

Postby RJVB » Wed May 24, 2017 12:12 am

I know about the different compression types, and have been comparing lz4 with HFS's post-hoc compression (which uses zlib) in practice. Looking purely at compression ratios, HFS compression does significantly better in most cases that interest me (I run a parallelised version of afsctool using an accelerated zlib on my source and build directories, with level 8).
I doubt that it's possible to make it as fast as lz4 (without offloading it to the GPU?) but I'd expect that it wouldn't hurt if ZFS's best compression scheme could become a little faster by just using another zlib implementation.

Some figures from a recent comparison I ran

Code:
# 2.7 Ghz i7, OS X 10.9, -O3 -march=native, clang-4.0, 262Mb source tarball as test
# 4.641 user_cpu 0.083 kernel_cpu 0:04.72 total_time 100.0%CPU {978944M 0F 275R 0I 0O 0k 0w 134c}
# 4.644 user_cpu 0.083 kernel_cpu 0:04.73 total_time 99.7%CPU {978944M 0F 275R 0I 0O 0k 0w 130c}
# 4.671 user_cpu 0.084 kernel_cpu 0:04.75 total_time 100.0%CPU {978944M 0F 275R 0I 0O 0k 0w 125c}
# stock zlib 1.2.11, idem but with -flto:
# 7.704 user_cpu 0.089 kernel_cpu 0:07.80 total_time 99.7%CPU {966656M 0F 272R 0I 0O 0k 0w 245c}
# 7.676 user_cpu 0.089 kernel_cpu 0:07.77 total_time 99.7%CPU {958464M 0F 270R 0I 0O 0k 0w 241c}
# 7.889 user_cpu 0.089 kernel_cpu 0:07.98 total_time 99.7%CPU {958464M 0F 270R 0I 0O 0k 0w 209c}
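For anyone wanting to repeat this kind of measurement without the original harness, here is a rough sketch. It is not what produced the figures above: it uses Python's bundled zlib module (whichever zlib the interpreter was built against) and a synthetic payload rather than the 262 MB tarball. The helper name time_compress and the payload are made up for illustration; level 8 is simply the level mentioned earlier in the thread.

```python
import time
import zlib


def time_compress(data: bytes, level: int = 8, repeats: int = 3):
    """Return (best CPU seconds, compression ratio) for zlib at the given level."""
    timings = []
    compressed_size = 0
    for _ in range(repeats):
        start = time.process_time()          # CPU time, comparable to user_cpu
        compressed_size = len(zlib.compress(data, level))
        timings.append(time.process_time() - start)
    return min(timings), len(data) / compressed_size


# Synthetic, source-like payload (an assumption; use real data for real numbers).
payload = b"".join(b"line %d of a synthetic source file\n" % i for i in range(50000))

best, ratio = time_compress(payload)
print(f"level 8: best of 3 = {best:.3f}s CPU, ratio = {ratio:.2f}")
```

Running the same script against interpreters linked to different zlib builds would give a like-for-like comparison, though the absolute numbers will not match a C-level benchmark.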


As to the issue with assembler you mentioned it's not a matter of forgetting to run Apple's `as` without the -q option? :)
Long shot: clang on Linux can generate object files for Mac (with the --target option). Suppose it eats the troublesome assembly in question in that configuration and those files can be used in a ZFS build on Mac. Couldn't those object files then be disassembled (on Mac) to get native assembly code?
RJVB
 
Posts: 54
Joined: Tue May 23, 2017 12:32 pm

Re: zlib-ng?

Postby lundman » Wed May 24, 2017 11:49 pm

We can't just add different compression algorithms ourselves, it has to be universally done by OpenZFS, or pools will be incompatible. Well, I mean, we could, but...

On the assembler thing, it can definitely be done, but not in the 30s I gave it when I tested the commit :)
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: zlib-ng?

Postby RJVB » Thu May 25, 2017 12:56 am

lundman wrote:We can't just add different compression algorithms ourselves, it has to be universally done by OpenZFS, or pools will be incompatible. Well, I mean, we could, but...


I'd never suggest that, evidently. Zlib-ng is still zlib; even when you don't build it in ABI-compatible mode (= as a *drop-in* replacement for "stock" zlib) it still generates zlib compression. Different implementations may not achieve exactly the same level of compression, but they can still de- and re-compress each other's files. So the point here is not to introduce a feature upstream doesn't have.
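The compatibility point can be illustrated with a small sketch. It does not use zlib-ng itself (that would need a separate build); instead it relies on Python's bundled zlib and varies the compression level, which likewise produces different byte streams that any conforming zlib decoder reads back to the same data. The function name roundtrip_levels and the sample data are hypothetical.

```python
import zlib


def roundtrip_levels(data: bytes, levels=(1, 6, 8, 9)):
    """Compress at several levels; check every stream decodes to the original."""
    sizes = {}
    for level in levels:
        stream = zlib.compress(data, level)
        if zlib.decompress(stream) != data:
            raise ValueError(f"round trip failed at level {level}")
        sizes[level] = len(stream)
    return sizes


sample = b"int main(void) { return 0; }\n" * 2000
sizes = roundtrip_levels(sample)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```

The compressed sizes may differ between levels (and between implementations), but the decoder does not need to know which encoder produced the stream.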

I have only scanned through the code once and very quickly, so I'm not even certain if ZFS includes its own zlib source copy, let alone whether that copy is modified.

TBH, re: introducing new features; the thought did cross my mind that it wouldn't hurt to have the possibility with ZFS like on HFS to run zlib compression "offline" on existing files. A shortcut for
    - set dataset compression to gzip-N (I like using level 8)
    - rewrite all files under the given directory/ies
    - set dataset compression back to what it was
but the only thing that could maybe really benefit from a low-level addition is the "rewrite this file" step. (With HFS you have to do that yourself too, to the resource fork at that.)
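The three steps above could be scripted roughly as follows. This is only a sketch under assumptions: the function names are invented, the "rewrite" is a plain copy-and-rename that does not preserve ownership or hard links, and setting the property back makes it a local setting even if it was inherited before. The only zfs commands used are `zfs get -H -o value compression` and `zfs set compression=...`; with dry_run=True nothing is executed and the planned commands are just returned.

```python
import os
import shutil
import subprocess
import tempfile


def rewrite_file(path):
    """Rewrite a file's blocks by copying it to a temp file and renaming it back."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        shutil.copy2(path, tmp)      # copies data, mode and timestamps
        os.replace(tmp, path)        # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def recompress_tree(dataset, root, level=8, dry_run=True):
    """Set gzip-N on the dataset, rewrite files under root, restore the old value."""
    get_cmd = ["zfs", "get", "-H", "-o", "value", "compression", dataset]
    set_cmd = ["zfs", "set", f"compression=gzip-{level}", dataset]
    if dry_run:
        # Only report what would be run; the restore value is unknown here.
        return [get_cmd, set_cmd]
    previous = subprocess.run(
        get_cmd, check=True, capture_output=True, text=True
    ).stdout.strip()
    subprocess.run(set_cmd, check=True)
    try:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                if os.path.isfile(full) and not os.path.islink(full):
                    rewrite_file(full)
    finally:
        # Restore the previous setting even if a rewrite failed.
        subprocess.run(
            ["zfs", "set", f"compression={previous}", dataset], check=True
        )
    return [get_cmd, set_cmd]


# Dry run with a made-up dataset name and path.
for cmd in recompress_tree("pool/src", "/Volumes/pool/src", dry_run=True):
    print(" ".join(cmd))
```

Note that files rewritten this way while snapshots exist will still hold the old blocks in those snapshots, so space is only reclaimed once they are gone.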


lundman wrote:On the assembler thing, it can definitely be done, but not in the 30s I gave it when I tested the commit :)


I hear you :) I took just a bit more than that to test my idea of compiling it for Mach-O with clang under Linux. That doesn't work, at least not for the unrelated example I tried, which contains gas/Linux-specific assembler directives. If you tell clang to generate Mach-O object code, it really behaves as if running on Mac. Guess that makes sense :)
RJVB
 
Posts: 54
Joined: Tue May 23, 2017 12:32 pm

Re: zlib-ng?

Postby lundman » Sat May 27, 2017 11:53 pm

RJVB wrote:
I'd never suggest that, evidently. Zlib-ng is still zlib; even when you don't build it in ABI-compatible mode (= as a *drop-in* replacement for "stock" zlib) it still generates zlib compression. Different implementations may not achieve exactly the same level of compression, but they can still de- and re-compress each other's files. So the point here is not to introduce a feature upstream doesn't have.


Oh, that is interesting: so it is compatible, just a speed-up. That could be worth testing. In the kernel, we just call XNU's built-in libz, which could already be optimised, who knows. But zlib is set up to let you redefine the names from the headers, so it would not be hard to work around.
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: zlib-ng?

Postby RJVB » Sun May 28, 2017 1:51 am

lundman wrote:Oh, that is interesting: so it is compatible, just a speed-up. That could be worth testing. In the kernel, we just call XNU's built-in libz, which could already be optimised, who knows. But zlib is set up to let you redefine the names from the headers, so it would not be hard to work around.


Indeed. For userland applications you get a nicely significant speed-up just by replacing the shared library; it's rarely that simple :) I've done exactly that with the /usr/lib/libz dylib; I have yet to run into issues.

I saw that ZoL also calls the in-kernel zlib implementation. On Linux it is apparently not a good idea to use SIMD code in kernel modules, but I think that limitation doesn't exist for kexts on Mac. The XNU code is open, btw, so it's not impossible to check whether its zlib implementation has any kind of optimisations.

Edit: I do have a PR out on zlib-ng that addresses a compatibility issue in the *ABI* but that shouldn't affect code that's built against the library and its headers.
RJVB
 
Posts: 54
Joined: Tue May 23, 2017 12:32 pm

Re: zlib-ng?

Postby lundman » Sun May 28, 2017 5:42 pm

It's in xnu libkern/zlib/

and looks like a regular "zlib 1.2.3" to me, no extra work on it. Could be a fun project for someone to try :)
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: zlib-ng?

Postby RJVB » Sun May 28, 2017 11:29 pm

lundman wrote:It's in xnu libkern/zlib/
and looks like a regular "zlib 1.2.3" to me, no extra work on it. Could be a fun project for someone to try :)


Hah, could also be a fun project to update the version in the kernel while at it :)

No idea what else it might be used for (nor how suicidal it is to build and use one's own kernel on Mac ...)
RJVB
 
Posts: 54
Joined: Tue May 23, 2017 12:32 pm


Return to OpenZFS on OS X Development
