Auto snapshots

From OpenZFS on OS X

Setting up an automatic snapshot regimen

One of the great strengths of ZFS is its ability to create and manage snapshots of your filesystem. Although snapshots can be managed manually, it is much easier to use one of the automated snapshot solutions. This article focuses on bdrewery's zfstools Ruby scripting package, which provides the zfs-auto-snapshot script.

Obtaining and installing the zfstools script

Download the zfstools package, which contains the zfs-auto-snapshot Ruby script, as a zip file from GitHub:

https://github.com/bdrewery/zfstools

Unzip the file by double-clicking it, which will create a folder called zfstools-master. Drag that folder to someplace convenient using the Finder, like your Documents folder.

In Terminal, cd into the zfstools-master directory. Build the gem (can be done as normal user):

cd ~/Documents/zfstools-master
gem build zfstools.gemspec

Install the gem as root. The version number in the file name may be later than the one shown here; check the output of the previous command for the exact name:

sudo gem install --local zfstools-0.3.6.gem

Set up properties on pools

The zfstools package doesn't rely on a configuration file. Instead, it accomplishes its purpose using a combination of user properties embedded in the datasets or pools, and the timing of how it is called. You will need to set up a custom user property for each pool in advance; the property is com.sun:auto-snapshot, and it can be either true or false. So for a pool where you want snapshots by default, issue this command:

sudo zfs set com.sun:auto-snapshot=true MyPool

You can, of course, set the property to true or false on individual child datasets to control whether snapshots are made of them (the zfs-auto-snapshot script is recursive when it runs, and tries to snapshot every child dataset). Any child dataset without its own setting inherits whatever value you set on the top-level pool.
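
The per-dataset control described above can be sketched with a few small shell helpers. This is an illustration only: the function names are made up for this article, and MyPool/scratch in the usage note is a hypothetical child dataset. The zfs subcommands themselves (set, inherit, get) are standard.

```shell
# Sketch: helpers around the com.sun:auto-snapshot user property.
# The function names are hypothetical; pass your own dataset names.

# Turn automatic snapshots off for one child dataset.
exclude_from_snapshots() {
  sudo zfs set com.sun:auto-snapshot=false "$1"
}

# Drop the local setting so the dataset inherits from its parent again.
reset_to_inherited() {
  sudo zfs inherit com.sun:auto-snapshot "$1"
}

# Show the effective value, and where it comes from, for every
# filesystem and volume in a pool.
show_snapshot_settings() {
  sudo zfs get -r -t filesystem,volume -o name,value,source com.sun:auto-snapshot "$1"
}
```

For example, exclude_from_snapshots MyPool/scratch followed by show_snapshot_settings MyPool should list MyPool/scratch with the value false and a source of local, while the other children show a source of inherited.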

How the zfs-auto-snapshot script works

The basic idea of the script is that you give it two command line parameters, a label and a count. The label may be anything you like, as long as it is consistent from one invocation of the script to the next. By convention you might use the label hourly for an invocation that occurs every hour, and daily for one that runs once a day. The script incorporates the label into the snapshot name, along with the time the snapshot was taken. The count tells the script how many snapshots with that label to keep; if you specify a count of 24, it keeps only the 24 most recent snapshots carrying that label.
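
To make the label-and-count rule concrete, here is a minimal sketch of keep-the-newest-N retention. It is not the zfstools source, only an illustration of the idea. It assumes the snapshot names belong to a single dataset and follow the zfs-auto-snap_LABEL-TIMESTAMP pattern, so that sorting by name also sorts by time. The function name is hypothetical.

```shell
# Sketch of keep-the-newest-N retention; not the zfstools implementation.
# Reads snapshot names on stdin and prints the ones that fall outside
# the count, oldest first.
# usage: snapshots_to_prune LABEL KEEP
snapshots_to_prune() {
  prune_label=$1
  prune_keep=$2
  # Keep only names carrying this label, order them oldest to newest,
  # then print everything except the last KEEP entries.
  grep -F "@zfs-auto-snap_${prune_label}-" |
    sort |
    awk -v keep="$prune_keep" '
      { names[NR] = $0 }
      END { for (i = 1; i <= NR - keep; i++) print names[i] }'
}
```

Given three hourly snapshots and a count of 2, the function prints only the oldest hourly name; snapshots with a different label, such as daily, are ignored and so are never counted against the hourly limit.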

At this point you can run the script from the command line, and you should probably run it that way at least once to test it. Make sure that your pool is mounted, and then issue the Terminal command:

sudo zfs-auto-snapshot hourly 24

Then check to see if it created a snapshot:

sudo zfs list -r -t snapshot MyPool
MyPool@zfs-auto-snap_hourly-2019-05-10-08h05   72K  -  19.8M  /Volumes/MyPool/.zfs/snapshot/zfs-auto-snap_hourly-2019-05-10-08h05
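
Once several labels are in use, it can help to list only the snapshots that belong to one of them. The helper below is a sketch with a hypothetical name; it relies on the standard -H (no header) and -o name (names only) options of zfs list and on the naming pattern shown in the output above.

```shell
# Sketch: list the automatic snapshots in a pool that carry one label.
# usage: list_auto_snapshots POOL LABEL
list_auto_snapshots() {
  sudo zfs list -H -o name -t snapshot -r "$1" |
    grep -F "@zfs-auto-snap_$2-"
}
```

Running list_auto_snapshots MyPool hourly | wc -l after the scheduled job has been active for a while is a quick way to confirm that the number of hourly snapshots per dataset stays at or below the count you passed to the script.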

Creating a launchd job to run the script on a schedule

Now what is needed is a way to run the script on a schedule without user intervention. In a Linux or FreeBSD environment this would be accomplished with the utility ‘cron’, but on Mac OS X we have to work with the launchd daemon. To set up a launchd job we need a plist file to describe it, and we need to store that plist in a directory where launchd expects to find it. Because the job needs to run as root, its plist belongs in /Library/LaunchDaemons. This procedure temporarily turns off System Integrity Protection (SIP) before installing the job. SIP normally protects /System/Library/LaunchDaemons rather than /Library/LaunchDaemons, so sudo alone may be enough on your system; if it is, you can leave SIP enabled and skip the Recovery steps. To turn SIP off, save any work that needs saving, and then reboot into the Recovery Partition by holding down the Command and R keys while rebooting. Once it reboots, select Terminal from the Utilities menu, and then issue the command:

csrutil disable

Then reboot back into your normal system. The details of launchd property list creation would warrant their own complete web site, so they won’t be repeated here. Instead, we’re just going to start with a template file and modify it for our purposes. You will need a separate plist file for each label that you put into use, and you should use a consistent naming convention to keep them straight. The file name should match the Label within the plist.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Disabled</key>
	<false/>
	<key>EnvironmentVariables</key>
	<dict>
		<key>PATH</key>
		<string>/usr/local/bin/</string>
	</dict>
	<key>Label</key>
	<string>local.zfs-auto-snapshot.daily</string>
	<key>Nice</key>
	<integer>1</integer>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/local/bin/zfs-auto-snapshot</string>
		<string>daily</string>
		<string>10</string>
	</array>
	<key>RunAtLoad</key>
	<true/>
	<key>StandardErrorPath</key>
	<string>/var/log/zfstools_errors.log</string>
	<key>StandardOutPath</key>
	<string>/var/log/zfstools_output.log</string>
	<key>StartCalendarInterval</key>
	<dict>
		<key>Hour</key>
		<integer>0</integer>
		<key>Minute</key>
		<integer>2</integer>
	</dict>
</dict>
</plist>

The first thing you may need to tweak is the Label property. This is the identifier that launchd uses, and it is not the same thing as the interval label passed to the script that we talked about earlier. It has to be unique within the launchd namespace, but it should agree with the interval label so that the job is easy to recognize. This one is set to local.zfs-auto-snapshot.daily; if you copy this file to make an hourly job, change the ending of the property so that it reads local.zfs-auto-snapshot.hourly.

There is an array of strings under ProgramArguments; the second string is the interval label that gets passed to the script, so make sure that agrees with the launchd Label, and also with what you put into the StartCalendarInterval section. This is the script that runs once a day, so the interval label we are using is daily. The third string in the ProgramArguments group tells zfs-auto-snapshot how many of the matching snapshots to keep; in this example, the script will keep snapshots going back 10 days.

The StartCalendarInterval is what determines when and how often the script will get run; the description in the plist above is read as ‘at two minutes past the zeroth hour (midnight), run this script.’ Since the calendar interval doesn't mention a day, it is assumed that the script will run every day. If we added a key for Weekday, and gave it a numeric value, then the script would run on that day only at 2 minutes past midnight, effectively turning it into a weekly script. The rules for formulating a StartCalendarInterval property aren’t hard once you get the hang of it; for some examples I recommend visiting this blog post by Alvin Alexander.

There is nothing magic about the file names I chose for the standard output and standard error; I just picked names that reflected the purpose of what they hold, but you can modify them to suit your taste.
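
Because the launchd Label, the interval label, and the schedule all have to agree, one option is to generate each plist from a single function rather than editing copies by hand. The sketch below reproduces the template above and varies only those fields. The daily schedule is the one from the template; the hourly and weekly schedules are assumed values chosen for illustration, and the function name is hypothetical.

```shell
# Sketch: print a launchd plist for one interval label to stdout.
# Only the daily schedule comes from the template; hourly (minute 0 of
# every hour) and weekly (Sunday at 00:02) are assumptions.
# usage: make_snapshot_plist LABEL KEEP > local.zfs-auto-snapshot.LABEL.plist
make_snapshot_plist() {
  plist_label=$1
  plist_keep=$2
  case $plist_label in
    hourly) plist_when='<key>Minute</key><integer>0</integer>' ;;
    daily)  plist_when='<key>Hour</key><integer>0</integer><key>Minute</key><integer>2</integer>' ;;
    weekly) plist_when='<key>Weekday</key><integer>0</integer><key>Hour</key><integer>0</integer><key>Minute</key><integer>2</integer>' ;;
    *) echo "no schedule defined for label: $plist_label" >&2; return 1 ;;
  esac
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Disabled</key>
  <false/>
  <key>EnvironmentVariables</key>
  <dict>
    <key>PATH</key>
    <string>/usr/local/bin/</string>
  </dict>
  <key>Label</key>
  <string>local.zfs-auto-snapshot.$plist_label</string>
  <key>Nice</key>
  <integer>1</integer>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/zfs-auto-snapshot</string>
    <string>$plist_label</string>
    <string>$plist_keep</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>StandardErrorPath</key>
  <string>/var/log/zfstools_errors.log</string>
  <key>StandardOutPath</key>
  <string>/var/log/zfstools_output.log</string>
  <key>StartCalendarInterval</key>
  <dict>
    $plist_when
  </dict>
</dict>
</plist>
EOF
}
```

For example, make_snapshot_plist hourly 24 > local.zfs-auto-snapshot.hourly.plist writes an hourly job that keeps 24 snapshots. Running plutil -lint on the resulting file is a quick way to confirm that it is a well-formed property list before you install it.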

Once you have created or edited your plist file you need to copy it into /Library/LaunchDaemons to have it run as root:

sudo cp local.zfs-auto-snapshot.daily.plist /Library/LaunchDaemons

It's probably a good idea to confirm that your plist file is only writable by root. The output of 'ls -l' should look like this, with only a single w in the third character position of the line, no x's, and three r's:

ls -l /Library/LaunchDaemons | grep local
-rw-r--r--@ 1 root  wheel   904 May 10 21:35 local.zfs-auto-snapshot.daily.plist

If you see a different pattern of r, w, -, and possibly x, you can fix it with the chmod command:

sudo chmod 644 /Library/LaunchDaemons/local.*

Then go back and run the ls -l command to make sure again.
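
If you would rather not read the permission string by eye, the check can be scripted. This is a small sketch with a hypothetical function name; it compares the first ten characters of the ls -l output against the expected pattern, which also ignores the trailing @ that macOS adds for files with extended attributes.

```shell
# Sketch: succeed only if the file's mode is exactly rw-r--r-- (644).
# usage: plist_mode_ok FILE
plist_mode_ok() {
  mode=$(ls -l "$1" | awk '{ print substr($1, 1, 10) }')
  [ "$mode" = "-rw-r--r--" ]
}
```

For example, plist_mode_ok /Library/LaunchDaemons/local.zfs-auto-snapshot.daily.plist && echo "permissions look right" prints the message only when the mode matches.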

Finally, you have to load it into the launchd system to have it noticed and run automatically. To do that you use the launchctl command:

cd /Library/LaunchDaemons
sudo launchctl load local.zfs-auto-snapshot.daily.plist
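
To confirm that launchd actually picked up the job, you can look for its Label in the output of launchctl list, which prints one line per loaded job with the label in the last column. The helper below is a sketch with a hypothetical name, and it assumes the local.zfs-auto-snapshot naming convention used in this article.

```shell
# Sketch: succeed if a snapshot job for the given interval label is loaded.
# usage: snapshot_job_loaded LABEL
snapshot_job_loaded() {
  sudo launchctl list | grep -q -F "local.zfs-auto-snapshot.$1"
}
```

For example, snapshot_job_loaded daily && echo "daily job is loaded". Because the template sets RunAtLoad, the script also runs once as soon as the job is loaded, so /var/log/zfstools_errors.log is a good first place to look if no new snapshot appears.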

When you're sure that everything is working as intended, and if you turned SIP off earlier, boot back into Recovery mode by restarting with the Command and R keys held down again. Open the Terminal, and issue the command:

csrutil enable
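
After rebooting into your normal system, you can confirm the SIP state with csrutil status, which reports whether protection is enabled or disabled. The wrapper below is a sketch with a hypothetical function name; it assumes the status line contains the text "status: enabled" when SIP is on.

```shell
# Sketch: succeed if System Integrity Protection reports as enabled.
sip_enabled() {
  csrutil status | grep -q 'status: enabled'
}
```

For example, sip_enabled && echo "SIP is back on". The same check works after the first reboot to confirm that SIP was in fact turned off.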