
Rebooting loses one pool

PostPosted: Mon May 23, 2016 3:17 am
by fustbariclation
I've got two pools. They seem to be working well - performance is great.

The problem is that, when I reboot, one disappears.

When I try to import the missing pool again, it has problems reading the cache disk. I couldn't get it back, so I had to re-create it:

# zpool status jupiter
  pool: jupiter
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        jupiter     ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            disk8   ONLINE       0     0     0
            disk9   ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            disk11  ONLINE       0     0     0
            disk14  ONLINE       0     0     0
        cache
          disk13    ONLINE       0     0     0

errors: No known data errors

What should I do to prevent this from happening in the future? Would it make sense to export both pools at shutdown and re-import them?

Or is there another solution?

Re: Rebooting loses one pool

PostPosted: Mon May 23, 2016 5:05 pm
by lundman
You should be very careful with cache disks on OS X when using /dev/disk names. If the disks get renumbered, ZFS will just use whatever disk happens to have the old name.

It is recommended that you use the InvariantDisks paths in your import, i.e. zpool import -d /var/run/disk/by-id
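For example, to switch an existing pool over, something along these lines should work (a sketch, using the pool name from the first post; repeat for the second pool):

# zpool export jupiter
# zpool import -d /var/run/disk/by-id jupiter

/var/run/disk/by-id is maintained by the InvariantDisks helper, with one stable name per device, so it no longer matters how the /dev/diskN numbers get shuffled at boot.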

Re: Rebooting loses one pool

PostPosted: Mon May 23, 2016 9:56 pm
by fustbariclation
Thank you! That fits perfectly, and makes sense.

I wonder how I can change the existing pools to be identified by ID...

Re: Rebooting loses one pool

PostPosted: Mon May 23, 2016 10:24 pm
by Brendon
What lundman said -> zpool import -d /var/run/disk/by-id

Re: Rebooting loses one pool

PostPosted: Tue May 24, 2016 3:47 am
by fustbariclation
Yes, I understand that - for the future.

However, I have a running pool at the moment. It takes quite a long time to move all the data out and then move it all back again. There must be a config file somewhere where this information is kept, and it would be a lot quicker to edit that.

I think this is a bug, actually, and it should be easy to fix: all that's necessary is for zpool to convert the /dev device names to unique IDs, and the problem would disappear.
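Presumably something like zdb would show which paths a pool's configuration currently records (a sketch; I'm assuming zdb is installed alongside zpool):

# zdb -C jupiter | grep path

If I've read this right, the stale entries should come back as /dev/diskNsM-style strings rather than by-id references.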

Re: Rebooting loses one pool

PostPosted: Tue May 24, 2016 5:20 am
by fustbariclation
I think I can see the problem.

In the file:

/etc/zfs/zpool.cache

It refers to devices as either:

"
disk
guid
path
F/private/var/run/disk/by-id/media-38B45B4D-5617-754A-B307-A305948EE3A5
whole_disk
create_txg
"

or

"
disk
guid
path
/dev/disk14s1
whole_disk
"

This is what's wrong. All that should be necessary is to replace the /dev/disk14s1-style entries with the correct references in /private/var/run/disk/by-id/.

It's probably best to do this from the recovery system, just to be sure nothing gets confused, though it shouldn't really: the zpool.cache file looks like just a flat file, not XML or anything nastier, so it should just be read sequentially. I can't see any evidence of a checksum being kept for this file.

It'll save a day of copying, with just a quick reboot, so it should do the trick.

Has anybody here done this? It would be good to know whether it's safe to change it on a live system. I've got the source, so I could check, but it would be nice if somebody here knows whether zpool.cache is usually left alone when no changes are being made (my guess is that it is).

Re: Rebooting loses one pool

PostPosted: Tue May 24, 2016 12:42 pm
by Brendon
If you do the import, the cache file will be rewritten.
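No hand-editing needed. Per pool, something like this should do it (a sketch, reusing the pool name from this thread):

# zpool export jupiter
# zpool import -d /var/run/disk/by-id jupiter

After that, zdb -C jupiter should show the by-id paths in the rewritten cache file.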

- Brendon