Your configuration is reasonable; I have a couple of similar setups on recent mac minis.
Gigabytes of L2ARC will take a long time to fill, even with fairly heavy use -- l2arc_feed_thread() runs periodically and does not move a huge volume of data onto cache vdevs by design. As noted in the lonnnnng "Level 2 ARC" comment block in the standard zfs/arc.c, this is to avoid "clogging" the cache vdev with writes and to avoid churn. It also allows for better scheduling of writes onto the device. The hottest blocks always stay near the head of the MFU queue, so are unlikely to be copied into the L2ARC; L2ARC's existence encourages this result. Additionally, sequential prefetches and writes are not L2ARC eligible unless they are rapidly reused (and therefore are in the MFU queue rather than just the MRU one).
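If you want to see just how slowly the cache vdevs fill, watching them for a while is enough (the pool name below is a placeholder; the cache section at the bottom of the -v output shows per-device allocation and write bandwidth):
Code:
# sample pool and per-vdev activity every 60 seconds; the cache devices at
# the bottom of the listing show how little each feed pass actually writes
zpool iostat -v tank 60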
Moreover, the problem with a huge L2ARC lies in the number of objects stored in the cache vdevs: each object -- which may range in size from 512 bytes to 256 KiB -- consumes ~256 bytes of RAM for its header, and in ZEVO there is a low cap on the amount of RAM the entire zfs subsystem will use. If the objects are all large, a huge L2ARC is not a problem, but it may not be adding any value if it's not serving up many read IOPS (zpool iostat -v will tell you that; zstat's "ARC overall" hit percentage is the important figure -- if it's anywhere over 95%, you likely have enough ARC+L2ARC, and adding more will only bring diminishing returns). If there are many small objects, you will end up with less space in the main ARC and performance will tank in two ways: first, writes will throttle terribly; second, reads will be served by the slower L2ARC or even the storage vdevs.
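To put rough numbers on that (the 8 KiB average object size here is only an assumption for the arithmetic, not a measurement): a 115 GiB cache vdev full of 8 KiB objects holds roughly 15 million of them, and at ~256 bytes of header each that works out to about 3.6 GiB of RAM just to index the L2ARC.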
Write throttling is the worst of these, especially if the cache vdev devices have very low read latencies. Whenever *anything* is written through the ZPL, it goes into the ARC. (Synchronous writes also go into the ZIL or a log vdev.) If there is lots of space in the ARC, the transaction group stays open for several seconds, accumulating dirty ARC objects; the txg then transitions to the quiescing and syncing phases, and the writes are scheduled out in a large burst.

When there is very little ARC space, however, the txg transitions much more quickly to the quiescing phase, which blocks new writes; it may spend some time quiescing, depending on the write load, essentially waiting for a tiny number of threads (possibly even one) to finish sending a record's worth of data or to suspend the current write operation. Then, given write pressure and a tiny quiesced txg, the writes are synced. The act of writing out a txg can itself make further writes necessary in the same transaction group -- internal zfs metadata, POSIX metadata ([amc]times, directory data updates), and so forth -- and each of those may in turn have to wait for an ARC block to become available through eviction or through further sync activity. L2ARC metadata, however, is *not* fully releasable. If there is simply too little space in the ARC because of things that cannot be evicted or released, a pattern of salvage IO sets in (txg_time is reduced and ARC occupancy is clamped) in which the pool's storage vdevs are absolutely hammered with tiny IOPS, which generally brings a system to its knees. It may not be possible to recover from that pattern in generic zfs implementations.
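You can watch the healthy version of this from userland: with plenty of ARC headroom, async writes are absorbed quietly and then flushed as each txg syncs. Something like this (pool name again a placeholder) makes the pattern visible:
Code:
# one-second samples; with a healthy ARC the write bandwidth arrives in
# bursts every few seconds as each txg syncs, rather than as a constant
# dribble of tiny IOPS to the storage vdevs
zpool iostat tank 1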

Again, you will only see this pathological pattern given sufficient system uptime and a usage pattern that favours L2ARC eligibility, leading to many arc_buf_hdr_t objects and few free arc_buf_t ones, as reported by zstat.
It is, however, easy to avoid this particular problem simply by using a smaller set of cache vdevs. Adding many GiB of L2ARC is not likely to hugely improve your Mac's performance if you're not using it as a server for dozens of busy clients. A few extra GiB per pool is likely to push your "ARC overall" hits to somewhere above 95%, and you are unlikely to do much better than that.
With large SSDs it is better to make a pool out of slices of two or more of them and to use that pool for the datasets with the most latency-sensitive workloads.
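For example (the device names are purely illustrative; yours will differ), a mirrored pool built from one slice of each of two SSDs is just:
Code:
# one partition from each SSD, mirrored, reserved for latency-sensitive datasets
zpool create ssdpool mirror /dev/disk3s2 /dev/disk4s2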
I'm pretty sure your 2x115GiB cache vdevs are essentially wasted, with only hundreds of MiB of occupancy after a day or two of uptime.
The workload you describe is highly sequential and your raidz2 vdev will handle that fine, although there is likely a speed mismatch across the devices in the vdev that won't help you.
After an uptime of a week or two, check the occupancy of the cache vdevs (first column of "zpool iostat -v") and size the cache vdevs to that, or even smaller, since the "alloc" figure includes stale and unreachable entries that will never be read from the device, and those tend to be a substantial percentage of a large cache vdev.
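Resizing downwards is cheap and non-destructive, since cache contents are throwaway; once you have noted the alloc figure, something along these lines (device names made up) does it:
Code:
zpool iostat -v tank                 # note the alloc column for the cache devices
zpool remove tank /dev/disk5s2       # cache vdevs can be removed at any time
zpool add tank cache /dev/disk5s4    # re-add a much smaller partition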
The space not used for cache, I'd put towards a small, fast, low-latency pool for whatever data you have that can fit there.
On one of my workstations, that includes my $HOME (with separate datasets for ~/Library/Safari and ~/Library/Saved Application State, so that I can snapshot them frequently and recover an old ~/Library/Safari/LastSession.plist and the like, thanks to local snapdir=on settings), my macports /opt (which requires a LaunchDaemon plist to cope with /opt taking its time to be mounted -- something that freaks out launchd, thanks to macports' use of symlinks in /Library/LaunchDaemons), and my squid3 cache_dir. A rough sketch of creating that sort of layout follows the listing below.
Code:
NAME                                          USED   AVAIL   REFER  MOUNTPOINT
...
ssdpool/DATA/opt                            32.7Gi  82.6Gi  27.3Gi  /opt
ssdpool/DATA/squidcache                     21.7Gi  8.34Gi  21.2Gi  /Volumes/ssdpool/DATA/squidcache
ssdpool/xxx                                  117Gi  33.3Gi   110Gi  /Users/xxx
ssdpool/xxx/Library                         1.49Gi  33.3Gi   272Ki  none
ssdpool/xxx/Library/Safari                  1.09Gi  33.3Gi   118Mi  /Users/xxx/Library/Safari
ssdpool/xxx/Library/SavedApplicationState   411Mi   33.3Gi  37.0Mi  /Users/xxx/Library/Saved Application State
...
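A rough sketch of how that kind of layout gets created and used (paths as in the listing above; the snapshot name is just an example):
Code:
zfs create -o mountpoint=none ssdpool/xxx/Library
zfs create -o mountpoint=/Users/xxx/Library/Safari ssdpool/xxx/Library/Safari
zfs snapshot ssdpool/xxx/Library/Safari@before-cleanup
# with snapdir set locally, an old LastSession.plist can be copied straight
# out of the snapshot directory:
cp /Users/xxx/Library/Safari/.zfs/snapshot/before-cleanup/LastSession.plist ~/Desktop/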
And on another machine:
Code:
NAME                     AVAIL    USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
homepool                32.6Gi   159Gi    6.64Mi  45.4Mi              0      159Gi
...
homepool/DATA/XPLANE10  32.6Gi  60.4Gi    1.09Gi  59.3Gi              0          0
As the ARC heats up, performance is noticeably better than when the same data was in a softraid mirror on the same two SSDs.
I have regularly shrunk the boot volume and increased the pool on paired SSDs in my Macintoys that have them.
Likewise, I moved the cache vdevs for my spinning-rust pools (which were 40-60 GiB slices of the SATA 3 SSDs -- a total waste of space) onto fast 128 GiB USB3 flash sticks, and then, after some analysis of actual use, settled on per-pool pairs of 5 GiB partitions living on two of those sticks, which has -- perhaps counterintuitively -- dramatically improved overall performance and system robustness.
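Concretely, that layout is just one small slice from each stick per pool (the pool and device names below are placeholders):
Code:
# each pool gets a pair of 5 GiB cache partitions, one from each USB3 stick
zpool add tank  cache /dev/disk6s2 /dev/disk7s2
zpool add media cache /dev/disk6s3 /dev/disk7s3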
My big frustration of the moment is how hard it is to get all of /Library/Server off the boot volume in 10.8.3 Server without running into enormous problems at system startup time.
Like you, I am tired of *actual* data loss and corruption in JHFS+ volumes, especially corruption that went undetected for some time (and has thus been propagated to backups). ZEVO CE has been great for that; otherwise it would all be mounted across the network from OI or FreeBSD 9.1 servers, with the attendant performance degradation, administration hassles, security hazards, and so forth.