Macintosh- zfs "gotchas"


Post by zslg01 » Tue Jan 22, 2013 2:02 pm

I have been using zfs and doing some experimentation. I have hit an issue or two that bear thinking about.
- First, the whole Finder/zfs interaction. Since both a zpool and a sub-pool created under it appear to Mac OS and the Finder as separate mount points, any attempt to move a file from one to the other results in a copy - not a move.
For example, assume a zpool named "tank" with a subpool (created via the zfs command) named "subtank". Subtank and tank both mount under /Volumes/ and appear to the Finder as separate volumes. Indeed, a drag and drop, or a mv command in Terminal, results in duplication of data (even though the data is still on the same pool). Thus a command of "mv xxx subtank/xxx" results in a copy - not a move - even when done with tank as the current directory. Since subtank appears as a sub-directory, this makes things very difficult when trying to simply move a file within a pool.

- Watch out for USB drives! I will repeat it - watch out for USB drives. Many USB enclosures may report a write as done before it really is. This can cause a lot of issues when a crash (or a dog hitting the drive cable) happens at an inopportune time.

- I am a bit worried about the status of Zevo and zfs in general on Macs ... Oracle and Apple are not on the best of terms and no one knows when Larry will pitch his next fit. I hope GreenBytes has some solid legal advice on the IP in zfs before they get too far down the road. I would also hope GreenBytes doesn't simply see Zevo as a way to sell more of their product.

- How about some futures info or a roadmap? Unless I can put a pin in the map I am reluctant to commit time and effort to implementing a large MySQL farm. I know where Ubuntu/zfs is and where it is going ... how about Zevo?

-- Now for the good stuff.

- A zfs drive can be moved from Zevo to a current kernel-level zfs Linux implementation and back - no problems. Don't even try to use the Linux FUSE implementation - it's way backlevel.
- MySQL seems to run ok on Zevo - not great but ok. I haven't looked at failure scenarios yet. The memory usage is an issue - I had to up my mini from 4 GB to 8 GB.
- Zevo works fine on Firewire 400.

Moves work as expected

Post by grahamperrin » Tue Jan 22, 2013 4:51 pm

zslg01 wrote:… any attempt to move a file from one to the other results in a copy - not a move. …


Finder – Edit menu – Move Item Here

  • works as expected with Mountain Lion

mv(1)

  • works as expected with Mountain Lion

Finder drag and drop from one volume to another is normally a copy. For a move, modify with the Command key.

IP

Post by grahamperrin » Tue Jan 22, 2013 5:00 pm

Not specific to Mac: Intellectual property

Link

Post by grahamperrin » Tue Jan 22, 2013 5:02 pm

zslg01 wrote:… futures info or a roadmap… 


editions and versions of ZEVO, community and open source

Re: Macintosh- zfs "gotchas"

Post by zslg01 » Tue Jan 22, 2013 8:38 pm

Graham - try this .... create a pool on a disk, see it mounted in Finder. Now use the zfs command to make a zfs "dataset" as a subfolder of that pool. Now try mv or drag and drop a file from the root of the pool to that folder (the "dataset" shows up as both a folder in the pool and as a separate mount point). For me a mv or drag and drop from pool to that sub folder results in a copy not a move.
Maybe I am doing it wrong but it works as expected on linux (mv moves, drag and drop also moves the file).
From the zfs man page ...
Creating a ZFS file system is a simple operation, so the number of file systems per system is likely to be numerous. To cope with this, ZFS automatically manages mounting and unmounting file systems without the need to edit the /etc/fstab file. All automatically managed file systems are mounted by ZFS at boot time.
and ...

zfs create [-p] [-o property=value] ... filesystem

Creates a new ZFS file system. The file system is automatically mounted according to the mountpoint property inherited from the parent.

mv(1) works as expected

Post by grahamperrin » Wed Jan 23, 2013 3:49 pm

Code:
sh-3.2$ ls /Volumes/gjp22/casesensitive
GIMP
sh-3.2$ touch /Volumes/gjp22/Desktop/touched
sh-3.2$ ls -l /Volumes/gjp22/Desktop/touched
-rw-r--r--  1 gjp22  wheel  0 23 Jan 20:47 /Volumes/gjp22/Desktop/touched
sh-3.2$ chgrp 20 /Volumes/gjp22/Desktop/touched
sh-3.2$ mv /Volumes/gjp22/Desktop/touched /Volumes/gjp22/casesensitive
sh-3.2$ ls -l /Volumes/gjp22/Desktop/touched
ls: /Volumes/gjp22/Desktop/touched: No such file or directory
sh-3.2$ ls /Volumes/gjp22/casesensitive
GIMP   touched
sh-3.2$

Cross reference

Post by grahamperrin » Tue Jan 29, 2013 1:33 am


Re: Macintosh- zfs "gotchas"

Post by mk01 » Thu Jan 31, 2013 12:46 pm

@zslg01, but this is the point. ZFS is pool-based storage. A pool is like a playground in which you can create independent filesystems (with completely different properties). And because there is one pool underneath, you no longer have to fight with individual devices (disks).

A move between filesystems is a copy-and-destroy operation; it is not just a change of location. One should think of the operations differently.

Why move files when you can rename filesystems (move them within the hierarchy)? Why make a copy when you can snapshot and clone within a second?
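A minimal sketch of those two alternatives, using the standard zfs rename, zfs snapshot and zfs clone subcommands. The helper names and all dataset and snapshot names are hypothetical and only serve as illustration:

```shell
# Hypothetical helpers; dataset names are supplied by the caller.

# Relocate a whole dataset within the pool hierarchy.
# This is a metadata operation: no file data is copied.
move_dataset() {
    zfs rename "$1" "$2"
}

# Make a writable copy of a dataset: snapshot it, then clone the snapshot.
# The clone shares blocks with the snapshot until either side changes.
clone_dataset() {
    zfs snapshot "$1@$2" && zfs clone "$1@$2" "$3"
}

# Usage (hypothetical names):
#   move_dataset  tank/subtank tank/renamed
#   clone_dataset tank/subtank before-test tank/subtank-copy
```

This only helps when the unit you want to move or copy is a whole dataset; individual files still go through the copy-and-destroy path.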

As for MySQL on ZFS: set recordsize to 8k, use skip-innodb-doublewrite, and set sync to disabled.
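As a rough sketch, those settings could be applied like this, assuming the MySQL data directory lives on its own dataset and that the installed ZFS version supports the sync property. The helper name and the dataset name tank/mysql are hypothetical:

```shell
# Hypothetical helper: apply the suggested properties to the dataset
# that holds the MySQL data directory.
tune_dataset_for_mysql() {
    zfs set recordsize=8k "$1" &&
    zfs set sync=disabled "$1"
}

# Usage (hypothetical dataset name):
#   tune_dataset_for_mysql tank/mysql
#
# And in my.cnf, under the [mysqld] section:
#   skip-innodb-doublewrite
```

Two things to keep in mind: a changed recordsize only affects files written afterwards, and with sync=disabled a crash or power loss can lose the most recent transactions, because synchronous writes are acknowledged before they reach stable storage.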

You can toggle the visibility of mount points in the ZEVO preference pane.

Re: Macintosh- zfs "gotchas"

Post by BjoKa » Sat Feb 02, 2013 6:29 pm

Hi zslg01,

Unfortunately, your assumption and Graham's test are wrong.

zslg01 wrote: Graham - try this .... create a pool on a disk, see it mounted in Finder. Now use the zfs command to make a zfs "dataset" as a subfolder of that pool. Now try mv or drag and drop a file from the root of the pool to that folder (the "dataset" shows up as both a folder in the pool and as a separate mount point). For me a mv or drag and drop from pool to that sub folder results in a copy not a move.
Maybe I am doing it wrong but it works as expected on linux (mv moves, drag and drop also moves the file).


Really moving a file or directory between ZFS datasets cannot work. ZFS datasets are full-blown file systems from the point of view of the kernel's VFS (virtual file system) layer.

What happens behind the scenes (slightly simplified and not entirely accurate) when you really move a file within a file system is the following:
  1. mv checks that the source exists and the target does not
  2. mv calls the libc function rename("oldname", "newname")
  3. rename constructs a syscall to the kernel
  4. the kernel's VFS looks up the file system holding "oldname"
  5. the kernel's VFS looks up the handle (vnode) of the parent of "oldname", let's call it "op_vnode"
  6. the VFS calls the responsible file system's vnop_rename() of "op_vnode" with "oldname" and "newname" remapped to be relative to op_vnode
  7. the file system does what needs to be done; in the case of ZFS it remaps internally to a link() of the new name, followed by an unlink() of the old name.

Since each file system only knows about its own content, the final (7th) step can only succeed if "newname" is actually within the namespace of the file system holding "oldname".
Actually, in the case of Mac OS X, the VFS does more in steps 4 and 5: among other things, it already looks up the vnode (the kernel-internal handle uniquely identifying every file, directory, device node etc.) for the parent of "oldname" and for the parent of "newname", and returns an error if the two belong to different file systems.

So the attempted "move" never reaches ZFS. Instead, both Finder (for drag and drop) and mv receive an error and fall back to a copy followed by a delete.

The fact that both ZFS datasets belong to the same pool does not matter, since they are independent file systems as far as the Mac OS X kernel is concerned.
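
One way to check this from the shell is to compare the device IDs that stat reports for the two paths. The sketch below is only an illustration: the function names dev_id and same_fs are made up, and the /Volumes/tank paths in the usage comment are the hypothetical pool from the first post.

```shell
# Print the device ID of the file system holding a path.
# GNU stat takes -c, BSD/macOS stat takes -f; try one, then the other.
dev_id() {
    stat -c %d "$1" 2>/dev/null || stat -f %d "$1"
}

# Succeed only if both paths report the same device ID.
same_fs() {
    a=$(dev_id "$1") && b=$(dev_id "$2") && [ "$a" = "$b" ]
}

# Usage (hypothetical pool and dataset):
#   same_fs /Volumes/tank /Volumes/tank/subtank ||
#       echo "different file systems: mv will copy and delete"
```

If the IDs differ, rename() cannot succeed and mv has to fall back to copying, no matter which pool the data sits on.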

The same holds true for Linux:
try "strace mv oldname newname" with both files on different datasets in the same pool. You will see:
Code:
#> zfs create zfs-res/td1
#> zfs create zfs-res/td1/td2
#> echo test >/zfs-res/td1/td2/ft
#> strace mv /zfs-res/td1/td2/ft /zfs-res/td1/ft

...
stat("/zfs-res/td1/ft", 0x7fff3e87ac30) = -1 ENOENT (No such file or directory)
lstat("/zfs-res/td1/td2/ft", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
lstat("/zfs-res/td1/ft", 0x7fff3e87a800) = -1 ENOENT (No such file or directory)

rename("/zfs-res/td1/td2/ft", "/zfs-res/td1/ft") = -1 EXDEV (Invalid cross-device link)

unlink("/zfs-res/td1/ft")               = -1 ENOENT (No such file or directory)
open("/zfs-res/td1/td2/ft", O_RDONLY|O_NOFOLLOW) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
open("/zfs-res/td1/ft", O_WRONLY|O_CREAT|O_EXCL, 0600) = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
...
read(3, "test\n", 131072)               = 5
write(4, "test\n", 5)                   = 5
read(3, "", 131072)                     = 0
utimensat(4, NULL, {{1359843740, 769679000}, {1359843740, 770013000}}, 0) = 0
flistxattr(3, (nil), 0)                 = 0
flistxattr(3, 0x7fff3e87a5b0, 0)        = 0
fchmod(4, 0100644)                      = 0
close(4)                                = 0
close(3)                                = 0
...
unlinkat(AT_FDCWD, "/zfs-res/td1/td2/ft", 0) = 0



Here mv falls back to a copy, sets the access and modification times of the new file to those of the old one, and then removes the original with unlink().
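
Without strace, a rough way to tell a real rename from a copy plus delete is to compare the file's inode number before and after the mv. The sketch below runs in a scratch directory on a single file system, where the inode should stay the same; the directory layout and variable names are made up for illustration.

```shell
# Scratch directory (one file system) with two subdirectories.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
echo test > "$tmp/a/ft"

# Inode number before and after the move.
before=$(ls -i "$tmp/a/ft" | awk '{print $1}')
mv "$tmp/a/ft" "$tmp/b/ft"
after=$(ls -i "$tmp/b/ft" | awk '{print $1}')

if [ "$before" = "$after" ]; then
    echo "same inode: renamed in place"
else
    echo "new inode: copied and deleted"
fi
rm -r "$tmp"
```

To try it against ZFS, point the source and target at directories on two different datasets instead of the scratch directory. Inode numbers are per file system, so a changed number there indicates that a new file was created rather than the old one renamed.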

Hope this clarifies what you see.
Björn

Thanks

Post by grahamperrin » Sat Feb 02, 2013 9:54 pm

Very interesting.

In my test, the move succeeded. Can you explain what is wrong?

I omitted to mention,

Code:
macbookpro08-centrim:~ gjp22$ zfs get type,mountpoint gjp22/casesensitive
NAME                 PROPERTY    VALUE                         SOURCE
gjp22/casesensitive  type        filesystem                    -
gjp22/casesensitive  mountpoint  /Volumes/gjp22/casesensitive  default
