Editing Development
Latest revision | Your text | ||
Line 13: | Line 13: | ||
<syntaxhighlight lang="bash"> | <syntaxhighlight lang="bash"> | ||
− | $ sudo nvram boot-args="-v keepsyms= | + | $ sudo nvram boot-args="-v keepsyms=y debug=0x144" |
</syntaxhighlight> | </syntaxhighlight> | ||
Line 59: | Line 59: | ||
(lldb) addkext -F /tmp/zfs.kext/Contents/MacOS/zfs 0xffffff7f8ebbf000 | (lldb) addkext -F /tmp/zfs.kext/Contents/MacOS/zfs 0xffffff7f8ebbf000 | ||
</syntaxhighlight> | </syntaxhighlight> | ||
Then follow the guide for GDB above. | Then follow the guide for GDB above. | ||
Line 215: | Line 210: | ||
Our strategy was to determine how much of the Illumos allocator could be implemented on OS X. After a series of experiments in which we implemented significant portions of the kmem code from Illumos on top of bmalloc, we had learned enough to take the final step of essentially copying the entire kmem/vmem allocator stack from Illumos. Some portions of the kmem code, such as logging and hot-swap CPU support, have been disabled due to architectural differences between OS X and Illumos. | Our strategy was to determine how much of the Illumos allocator could be implemented on OS X. After a series of experiments in which we implemented significant portions of the kmem code from Illumos on top of bmalloc, we had learned enough to take the final step of essentially copying the entire kmem/vmem allocator stack from Illumos. Some portions of the kmem code, such as logging and hot-swap CPU support, have been disabled due to architectural differences between OS X and Illumos. | ||
− | By default kmem/vmem require a certain level of performance from the OS page allocator. It is easy to overwhelm the OS X page allocator. We tuned vmem to use | + | By default kmem/vmem require a certain level of performance from the OS page allocator, and the OS X page allocator is easy to overwhelm. We tuned vmem to request 512 KB chunks of memory from the page allocator rather than the smaller allocations that vmem prefers. This is less than ideal, as it reduces vmem's ability to release memory smoothly to the page allocator when the machine is under pressure. While we have an adequately performing solution now, there will always be tension between our allocator and OS X itself. OS X provides only minimal mechanisms to observe and respond to memory pressure in the machine, so we are somewhat limited in what can be achieved in this regard. |
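The effect of the larger import size can be illustrated with a little arithmetic. The following is only a sketch: the helper names are hypothetical and are not taken from the SPL source, and the 4 KB page size used in the comparison is an assumption for illustration. It shows how a request rounds up to whole 512 KB chunks, and how many calls to the page allocator a given amount of memory costs at each granularity.

```c
#include <stddef.h>

/* 512 KB: the tuned import size described in the text. */
#define CHUNK_SIZE ((size_t)512 * 1024)

/* Hypothetical helper: round a request up to a whole number of chunks. */
size_t
chunk_round_up(size_t bytes)
{
	return (((bytes + CHUNK_SIZE - 1) / CHUNK_SIZE) * CHUNK_SIZE);
}

/*
 * Hypothetical helper: number of page-allocator calls needed to import
 * `bytes` when each call returns `unit` bytes.
 */
size_t
import_calls(size_t bytes, size_t unit)
{
	return ((bytes + unit - 1) / unit);
}
```

Under these assumptions, importing 8 MB takes 2048 calls at a 4 KB granularity but only 16 calls with 512 KB chunks. The flip side is the one noted above: memory can only be handed back in 512 KB units, so a chunk with a single live allocation in it cannot be released.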
References: | References: | ||
Line 277: | Line 268: | ||
14/08/2015 5:09:47.000 PM kernel[0]: SPL: mach_kernel : _call_continuation + 0x17 | 14/08/2015 5:09:47.000 PM kernel[0]: SPL: mach_kernel : _call_continuation + 0x17 | ||
− | You can clearly see the kind of memory corruption, the actual corrupted data, which kmem cache was involved, the relative time that the last action occurred, and the stack trace for the last action (which was a call to zfs_kmem_free()) - indicating that spl_start() was implicated in the fault. This event would have | + | You can clearly see the kind of memory corruption, the actual corrupted data, which kmem cache was involved, the relative time at which the last action occurred, and the stack trace for the last action (which was a call to zfs_kmem_free()), indicating that spl_start() was implicated in the fault. This event would have been logged on the next allocation after the free and modification occurred. |
=== Compiling to lower OSX versions === | === Compiling to lower OSX versions === | ||
Line 389: | Line 380: | ||
Run the test suite | Run the test suite | ||
− | sudo make | + | sudo make test |
== Iozone == | == Iozone == | ||
Line 498: | Line 484: | ||
This section outlines how the OS X version differs from ZFS on other platforms. It is intended to help developers new to the Apple platform who wish to assist with, or understand, development of the O3X version. | This section outlines how the OS X version differs from ZFS on other platforms. It is intended to help developers new to the Apple platform who wish to assist with, or understand, development of the O3X version. | ||
− | === | + | === Reclaim === |
− | + | One of the biggest hassles with OS X is the VFS layer's handling of reclaim. First, it is worth noting that "struct vnode" is an opaque type, so we are not allowed to see or modify the contents of a vnode. |
− | + | (Of course, we could craft a mirror struct of vnode and tailor it to each OS X version in which vnode changes, but that is rather hacky.) |
− | + | ||
− | + | ||
− | + | Following from that, the '''only''' place where you can set the '''vtype''' (VREG, VDIR), '''vdata''' (user pointer to hold the ZFS znode), '''vfsops''' (list of filesystem calls, "vnops"), etc. is '''in the call to vnode_create()'''. |
− | we | + | So there is no way to "allocate an empty vnode and set its values later". The FreeBSD method of pre-allocating vnodes to avoid reclaim cannot be used. |
+ | ZFS starts a new dmu_tx and then calls zfs_mknode, which eventually calls vnode_create, so we cannot do anything with dmu_tx in any vnop that vnode_create may trigger. | ||
+ | |||
+ | The problem is that if vnode_create decides to reclaim, it does so directly, in the same thread. It ends up in vclean(), which can call vnop_fsync, vnop_pageout, vnop_inactive and vnop_reclaim. In the first three of these calls, we can | ||
+ | use the API call vnode_isrecycled() to detect whether the vnop was called "the normal way" or from vclean. If we come from vclean and the vnode is doomed, we do as little as possible. We cannot open a new TX, and | ||
+ | we cannot use mutex locks (panic: locking against ourselves). | ||
+ | |||
+ | Nor is there any way to defer or delay a doomed vnode. If vnop_reclaim returns anything but 0, you hit this lovely XNU code | ||
+ | if (VNOP_RECLAIM(vp, ctx)) | ||
+ | panic("vclean: cannot reclaim"); | ||
+ | in vfs_subr.c | ||
+ | |||
+ | |||
+ | So, at the moment there is some extra logic in '''zfs_vnop_reclaim''' to handle the case where we have been re-entered on the '''vnode_create''' thread. | ||
+ | |||
+ | exception = ((zp->z_sa_hdl != NULL) && | ||
+ | zp->z_unlinked) ? B_TRUE : B_FALSE; | ||
+ | fastpath = zp->z_fastpath; | ||
+ | |||
+ | If both exception and fastpath are FALSE, we can reclaim directly right there, since in those cases no final dmu_tx is needed. For cases that would follow | ||
+ | the zfs_rmnode->zfs_purgedir->zget and similar paths, exception is set to TRUE. | ||
+ | |||
+ | If exception is TRUE, we add the zp to the reclaim_list, and the separate reclaim_thread will call zfs_rmnode(zp). Because it runs as a separate thread, it can safely open a | ||
+ | dmu_tx. | ||
+ | |||
+ | If fastpath is TRUE, we do nothing further in zfs_vnop_reclaim. See below. | ||
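The three outcomes described above can be summarised as a small decision function. This is a minimal userspace sketch, not the actual zfs_vnop_reclaim code: struct znode_model, enum reclaim_action and reclaim_decide() are hypothetical names introduced for illustration, and only the three fields shown in the snippet above are modelled. The text does not say which flag wins when both are TRUE, so checking fastpath first is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the znode fields consulted during reclaim. */
struct znode_model {
	void *z_sa_hdl;		/* non-NULL while the SA handle is attached */
	bool z_unlinked;	/* file was unlinked; removal needs a dmu_tx */
	bool z_fastpath;	/* set by the fastpath */
};

enum reclaim_action {
	RECLAIM_DIRECT,		/* reclaim in the calling thread, no dmu_tx */
	RECLAIM_DEFERRED,	/* queue on reclaim_list for reclaim_thread */
	RECLAIM_NOTHING		/* fastpath: nothing further to do here */
};

enum reclaim_action
reclaim_decide(const struct znode_model *zp)
{
	bool exception = (zp->z_sa_hdl != NULL) && zp->z_unlinked;
	bool fastpath = zp->z_fastpath;

	/* Assumption: fastpath takes precedence if both flags are set. */
	if (fastpath)
		return (RECLAIM_NOTHING);
	if (exception)
		return (RECLAIM_DEFERRED);
	return (RECLAIM_DIRECT);
}
```

Whatever the outcome, the vnop itself still has to return 0, because any other return value leads straight to the vclean panic quoted above. That is why the deferred case hands the znode to another thread instead of failing.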
=== Fastpath vs Recycle === | === Fastpath vs Recycle === | ||
Line 540: | Line 549: | ||
There are two calls to vn_rdwr() in OSX's SPL. The '''spl_vn_rdwr()''' call needs to be used when zfs_onexit is in use. For example, dmu_send.c (zfs recv/send) and zfs_ioc_diff (zfs diff). The XNU implementation of | There are two calls to vn_rdwr() in OSX's SPL. The '''spl_vn_rdwr()''' call needs to be used when zfs_onexit is in use. For example, dmu_send.c (zfs recv/send) and zfs_ioc_diff (zfs diff). The XNU implementation of | ||
− | zfs_onexit (as in calls to ''' getf ''' and ''' releasef ''' ) need to place the internal XNU ''struct fileproc'' in the wrapper ''struct spl_fileproc'' , so that '''spl_vn_rdwr()''' can use it to do IO. | + | zfs_onexit (as in calls to '''getf''' and '''releasef''') needs to place the internal XNU ''struct fileproc'' in the wrapper ''struct spl_fileproc'', so that '''spl_vn_rdwr()''' can use it to do IO. |
This is the only way to do IO on a non-file based vnode (i.e., a pipe or socket). Other places that call vn_rdwr(), for example vdev_file.c, need to call the regular vn_rdwr. | This is the only way to do IO on a non-file based vnode (i.e., a pipe or socket). Other places that call vn_rdwr(), for example vdev_file.c, need to call the regular vn_rdwr. | ||
+ | |||
=== getattr === | === getattr === | ||
XNU has a whole bunch of items that it can ask for in vnop_getattr, including VA_NAME, which is used heavily by Finder (especially in the vfs_vget path). Care is needed here to return the correct name, | XNU has a whole bunch of items that it can ask for in vnop_getattr, including VA_NAME, which is used heavily by Finder (especially in the vfs_vget path). Care is needed here to return the correct name, | ||
− | including for link (hard links) targets. | + | including for link (hard links) targets. VNOP_LOOKUP records the name that was used in the lookup, so that a following stat call (vnop_getattr) on the vnode will return the correct name if VA_NAME is requested. |
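The idea of remembering the name used at lookup time, so that a later getattr can answer a VA_NAME request, can be sketched in plain C. This is a userspace model only: struct znode_name_model, record_lookup_name() and fill_va_name() are hypothetical names, the buffer size is arbitrary, and the real vnops work with XNU's vnode_attr structures rather than a bare character buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_CACHE_LEN 256	/* arbitrary size chosen for this sketch */

/* Hypothetical per-znode state holding the most recent lookup name. */
struct znode_name_model {
	char name_cache[NAME_CACHE_LEN];
};

/* Called from the lookup path: remember the name the caller used. */
void
record_lookup_name(struct znode_name_model *zp, const char *name)
{
	/* snprintf truncates safely and always NUL-terminates. */
	(void) snprintf(zp->name_cache, sizeof (zp->name_cache), "%s", name);
}

/*
 * Called from the getattr path when the name is requested.
 * Returns true if a name was copied into `out`.
 */
bool
fill_va_name(const struct znode_name_model *zp, char *out, size_t outlen)
{
	if (outlen == 0 || zp->name_cache[0] == '\0')
		return (false);
	(void) snprintf(out, outlen, "%s", zp->name_cache);
	return (true);
}
```

With hard links, two names resolve to the same znode, so the name returned is whichever one the most recent lookup used. That is the behaviour the text describes for a stat call following a lookup.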