offline (re)compression?

Postby RJVB » Fri Jun 01, 2018 5:58 am

I was comparing equivalent build directories on my Mac (HFS+) and on Linux (ZFS), noticing how the transparent LZ4 compression on the latter is certainly nice but doesn't really get close to the space gain I get with offline HFS compression (ZIP level 8) on the Mac.

The compression setting of a ZFS dataset applies to files (blocks?) being written; IIUC, if you change the dataset setting, existing files only change compression when they're rewritten. (What happens when you change the setting in the middle of a large file write?)

Apple's HFS compression as applied with a utility like afsctool is an interesting complement to transparent/online compression, as it lets you optimise disk usage at a convenient "offline" moment. If my assumption above is correct, it should be feasible to write a utility, or add a command to the `zfs` driver, that rewrites given files and the files in given directories with another compression setting. You'd then have the best of both worlds: transparent compression to keep disk usage down (and speeds up on slow media) for negligible cost, and targeted offline space optimisation.

In its simplest implementation this would just change the compression parameter temporarily, rewrite the selected files one way or another and then restore the parameter, but more fine-grained control of the parameter would be useful (if it doesn't already exist) so other files being (re)written continue to use the regular dataset compression type. Alternatively, a `zfs recompress` command could be provided that rewrites all dataset files that require updating with the current compression.
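
A rough sketch of that simplest variant, to make the idea concrete. This is not an existing `zfs` feature: the function names `rewrite_file` and `recompress_files` are made up for illustration, and it assumes that copying a file and renaming the copy over the original is enough to get its blocks written with the dataset's current setting.

```shell
#!/bin/sh
# Sketch only: rewrite_file and recompress_files are hypothetical helpers,
# not part of ZFS.

# Rewrite one file by copying it to a temporary file in the same directory
# (so it stays on the same dataset) and renaming the copy over the original.
rewrite_file() {
    src=$1
    tmp=$(mktemp "$(dirname "$src")/.recompress.XXXXXX") || return 1
    # cp -p keeps mode, ownership and timestamps where permitted
    if cp -p "$src" "$tmp"; then
        mv -f "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Temporarily switch the dataset's compression, rewrite the given files,
# then put the previous value back.
recompress_files() {
    dataset=$1
    level=$2
    shift 2
    old=$(zfs get -H -o value compression "$dataset") || return 1
    zfs set compression="$level" "$dataset" || return 1
    status=0
    for f in "$@"; do
        rewrite_file "$f" || status=1
    done
    # Note: this sets the old value locally, even if it was inherited before
    zfs set compression="$old" "$dataset" || status=1
    return $status
}
```

It would be called as something like `recompress_files pool/stabledata gzip-9 some/build/dir/*.o` (dataset and path are placeholders). Known weaknesses of this approach: hard links to the rewritten files are broken, existing snapshots keep referencing the old blocks so no space is freed until they are destroyed, and anything else written to the dataset while the setting is changed also gets the temporary compression.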

Thoughts?
RJVB
 
Posts: 17
Joined: Tue May 23, 2017 12:32 pm

Re: offline (re)compression?

Postby lundman » Sun Jun 03, 2018 7:39 pm

I believe the following procedure should work:

```
zfs snapshot pool/stabledata@now
zfs create -o compression=gzip-9 pool/longterm
zfs send pool/stabledata@now | zfs recv pool/longterm/stabledata
```


or some combination of these. I.e., set gzip-9 on a new dataset, then copy the files over with send/recv. Once we get a `zfs recv -o compression=` option, we can skip setting the value on the parent for the received dataset to inherit.
lundman
 
Posts: 454
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: offline (re)compression?

Postby RJVB » Sun Jun 03, 2018 11:48 pm

Yes, that should probably work, with the caveat that you recompress an entire dataset. From the looks of it, your proposal is the send/receive equivalent of

```
> zfs create -o compression=gzip-9 pool/longterm
> rsync -aAXH <stabledata_mp>/. <longterm_mp>
```

That's not in-place compression, and it assumes you have the free space available to hold an additional copy of that entire dataset.
RJVB
 
Posts: 17
Joined: Tue May 23, 2017 12:32 pm

Re: offline (re)compression?

Postby RJVB » Mon Aug 06, 2018 3:12 am

RJVB
 
Posts: 17
Joined: Tue May 23, 2017 12:32 pm

