Contemplating ZFS


Post by LaMosca » Wed Dec 12, 2012 8:35 pm

ZFS newbie here, looking for a more reliable filesystem after a failed Time Machine restore, and concerned about the large number of media files I have.

I'm in the process of moving from a 2006 MacPro to a hackintosh server running Mountain Lion 10.8.2. As Murphy would dictate, the MacPro lost its root disk while I was working on building the hackintosh. I attempted a full restore from my Time Machine backup and it failed at various points in the restore (as I tried different dates to get it to work). I was able to reinstall the OS and then do a restore of much of what I needed, but it made me think about the reliability of HFS+. I have a large iTunes database of movies and music (1.83TB currently) that I would hate to become silently corrupted over time.

My current setup:

ReadyNAS NV+ with 4x1TB disks in RAID5
  • primary source for shared documents, pictures, and software install sources (disk images, etc.)
  • backup source for the iTunes database

Server (mostly now on the hackintosh)
  • Time Machine Server (for the other Macs in the house)
  • primary source for the iTunes database (OWC Qx2 external enclosure with 4x2TB disks set up in RAID5)
  • backup source for documents and pictures (WD external enclosure with 2x1TB disks that are mirrored)

I have backup jobs that run on the NAS to mount/copy files to/from the server via SMB nightly.

On the new server, I'm using a 128GB SSD for the boot disk and have a 2TB WD Caviar Black (which I am planning to use for user home directories, and possibly as the new target for the backup of documents and pictures) along with a 3TB Hitachi Deskstar (which I am planning to use for Time Machine backups of both the server boot disk and the other Macs in the house).

I am contemplating using ZFS and have been trying to do as much reading on here as I can. I have a couple questions and concerns:

1) Would it be possible (and/or wise) to reformat the OWC RAID5 as ZFS? If not, would setting it to JBOD and handling each disk individually be better (e.g. setting up a RAID10, which would result in a ~4TB volume instead of the ~6TB I have now)?
2) I've read of SMB issues, and since SMB is currently the method I'm using to perform the backups between the NAS and the server, I'm wondering whether I can use another method such as FTP or rsync (which the ReadyNAS also supports).
3) I'm sharing a number of filesystems out via AFP using OS X Server's Sharing feature (in addition to SMB, so I can do backups via the NAS). I've read there's a fix ("hack"?) for sharing ZFS via AFP. How reliable is it?
4) Will Time Machine backups and restores work for remote Macs? I'm not sure how the remote Macs mount this, but I would guess AFP.

I know that for local restores to the server I would need the ZFS drivers installed to access the pool if I lose the boot disk (either by installing the ZFS drivers onto the flash drive used for the OS install, or by performing the OS install first, installing the ZFS drivers onto the new boot disk, and then restoring).

Anything else I may be missing or not have thought about yet?

Sure would be nice if Apple would move to some other filesystem like ZFS :)

link

Post by grahamperrin » Thu Dec 13, 2012 1:26 am

LaMosca wrote:… AFP … fix …


search.php?keywords=AFP&sf=titleonly

Aim for HOW TO: Share ZFS using AFP

OWC: link, food for thought

Post by grahamperrin » Thu Dec 13, 2012 1:32 am

LaMosca wrote:… Would it be possible (and/or wise) to reformat the OWC RAID5 as ZFS? …


search.php?keywords=OWC&sf=titleonly

Aim for problem with OWC Mercury Rack Pro
– not answered, but maybe food for thought if you're considering how to configure a 4-bay device from OWC.

OSx86 Time Machine Server, ZFS, AFP, compression

Post by grahamperrin » Thu Dec 13, 2012 2:07 am

LaMosca wrote:… Time Machine … remote …


Time Machine Network Interface Specification (TMNIS): Time Machine Server Requirements

If not already amongst your thoughts: at the server, disallow sleep of hard disks.
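
For what it's worth, on OS X something like this should do it (a sketch assuming the stock pmset utility; check man pmset on your build):

Code:
# never let hard disks spin down while on AC power (0 = never)
sudo pmset -c disksleep 0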

Consider (a rough command sketch follows this list):

  • a child file system for the dataset that is to store the JHFSX sparse bundle disk image for Time Machine
  • making the share point anything other than the root of that ZFS file system
  • depending on the data to be backed up, compression=gzip-9 for that file system.
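
A rough sketch of those three points, assuming a pool named tank and a child file system name tm that I'm inventing for illustration (the mount point shown is an assumption; adjust to whatever your pool actually uses):

Code:
# dedicated child file system for the Time Machine sparse bundle disk image
sudo zfs create tank/tm
# heavier compression; the bands inside the bundle often compress well
sudo zfs set compression=gzip-9 tank/tm
# share a folder below the root of the file system, not the root itself
sudo mkdir /Volumes/tank/tm/backups

The share point in OS X Server's Sharing would then be that backups folder rather than /Volumes/tank/tm itself.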

Compression of bands within the bundle

At http://www.wuala.com/grahamperrin/public/2012/09/22/a/ (Overview of, and goodbye to pool 'blocky-OS'; hello to 'flakylaciebde') the following file includes an analysis of sizes:

  • 2012-09-22 08-05 overview of blocky-OS.txt

– I assume that nearly all objects with a logical size of 8M were bands within the sparse bundle disk image that I (then) used for Time Machine writes to ZFS. At that time my experiments included copies=2, but please note that for all normal uses of ZFS with Time Machine we should prefer the copies=1 default.

----

LaMosca, I'll be interested to see how this topic pans out after replies from other readers …

Re: Contemplating ZFS

Post by LaMosca » Fri Dec 14, 2012 2:30 pm

So far, it's just you :)

I'm considering some other future changes as well. Right now I have a number of non-rack-mountable items: a ReadyNAS NV+, an OWC Qx2 4-disk RAID enclosure, a WD two-disk enclosure, and one other disk in an OWC single-disk enclosure. Since I have a wall-mount rack, it would be nice to move everything to rack-mount enclosures. I've been looking at a couple of 1U NAS enclosures (the Synology RS812 looks adequate and reasonably priced) as well as 4U 10-15 disk port-multiplier enclosures (e.g. the istarusa DAGE410U40DE; redundant power supplies are an available option).

My idea is to get two 2-port eSATA cards with port-multiplier support (low-profile, to fit in my 2U athena 2U2022S558 server chassis); one port on each card would connect to a port on the storage enclosure, each port supporting up to 5 disks. I would then add disks in twos, one going into each 5-disk cage, with a mirror created across the two disks. Can I build up a RAID10 pool this way (starting with a minimum of 4 disks)? Is it possible to keep adding mirrored pairs of disks to expand storage capacity until I've reached the port-multiplier limit of 5 disks per channel (10 total across the two ports)?

Eg: Starting storage set
Port Multiplier Card 1, Port 1 - Connects to eSATA port 1 on disk enclosure covering disk slots 1-5
Disk 1: 2TB - Member 1 of Mirror 1
Disk 2: 2TB - Member 1 of Mirror 2
Port Multiplier Card 2, Port 1 - Connects to eSATA port 2 on disk enclosure covering disk slots 6-10
Disk 1: 2TB - Member 2 of Mirror 1
Disk 2: 2TB - Member 2 of Mirror 2
RAID10 Pool = Stripe across Mirror 1 and Mirror 2

Later, expanding the pool:
Port Multiplier Card 1, Port 1
Disk 3: 2TB - Member 1 of Mirror 3
Port Multiplier Card 2, Port 1
Disk 3: 2TB - Member 2 of Mirror 3
Add Mirror 3 to RAID10

Would this work? Or do you need to provide the full number of disks when originally building the RAID10 set?

Also, once reaching full capacity (5 sets of mirrored disks, 10 disks total), could one start replacing the old sets of disks with higher-capacity disks and live-expand the RAID set/volume(s)? Obviously, one would need to replace a single disk in a mirror at a time, let it sync, then replace the other.

Re: Contemplating ZFS

Post by ghaskins » Sat Dec 15, 2012 8:33 pm

Hello LaMosca,

LaMosca wrote:I would then add disks in twos, one going into each 5-disk cage (and a mirror created across the two disks). Can I thus build up a RAID10 pool this way (starting with a minimum of 4 disks)?


Yep!

LaMosca wrote:Is it possible to keep adding sets of two disks (mirrored) to expand storage capacity until I've reached the port multiplier limit of 5 disks per channel (10 total across the two ports)?


Yep; in fact, this is one of ZFS's strong suits. What you want to look at specifically are "vdevs" of type "mirror". The thing with ZFS is that you cannot alter the basic composition of a vdev once you create it (e.g. a 2-disk mirrored vdev will always expect exactly 2 disks, a 7-disk raidz vdev will always expect 7 disks, etc.). However, you can add vdevs at will, now or in the future, and when you do it's kind of like a hybrid of striping and concatenation. It's like striping in that ZFS (IIUC) will distribute blocks across all vdevs as data is written. It's like concatenation in the sense that it will not redistribute your existing data when you add new vdevs. So, for instance, if you start with 2 vdevs it will stripe across those two; add a third, and your original data will remain on the first two while new data is written across all three, etc.

LaMosca wrote:Eg: Starting storage set
Port Multiplier Card 1, Port 1 - Connects to eSATA port 1 on disk enclosure covering disk slots 1-5
Disk 1: 2TB - Member 1 of Mirror 1
Disk 2: 2TB - Member 1 of Mirror 2
Port Multiplier Card 2, Port 1 - Connects to eSATA port 2 on disk enclosure covering disk slots 6-10
Disk 1: 2TB - Member 2 of Mirror 1
Disk 2: 2TB - Member 2 of Mirror 2
RAID10 Pool = Stripe across Mirror 1 and Mirror 2


So, for simplicity, let's call "Member 1 of Mirror 1" simply "1:1", "Member 2 of Mirror 1" -> "2:1", etc.

Your initial setup would be:
Code:
zpool create tank mirror 1:1 2:1
zpool add tank mirror 1:2 2:2


LaMosca wrote:Later, expanding the pool:
Port Multiplier Card 1, Port 1
Disk 3: 2TB - Member 1 of Mirror 3
Port Multiplier Card 2, Port 1
Disk 3: 2TB - Member 2 of Mirror 3
Add Mirror 3 to RAID10


You would then do this:
Code:
zpool add tank mirror 1:3 2:3
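
To double-check the result, something like the following (just the commands; I won't guess at the exact output formatting under Zevo):

Code:
zpool status tank   # the config should now list three mirror vdevs
zpool list tank     # total capacity grows by the size of the new mirror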



LaMosca wrote:Also, once reaching full capacity (5 sets of mirrored disks, 10 disks total), could one start replacing the old sets of disks with higher capacity disks and live expand the RAIDset/volume(s)? Obviously,one would need to replace a single disk in a mirror at a time, let it sync, then replace the other.


I believe the answer is "yes", though I have yet to try it myself. The mechanics of this are via "zpool replace", and my understanding is that the pool will expand to the larger size once the vdev has been fully upgraded such that all of its disks are the larger size.
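
If the platform supports it, the sequence would look roughly like this (a sketch only; the device names are the placeholders from above, and I haven't verified whether Zevo honors autoexpand):

Code:
# let the pool grow once every disk in a vdev has been enlarged
zpool set autoexpand=on tank
# swap one side of a mirror at a time, waiting for resilver in between
zpool replace tank 1:1 <larger-disk-a>
zpool status tank     # wait until resilvering completes
zpool replace tank 2:1 <larger-disk-b>
zpool list tank       # the extra capacity should show up here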

Re: Contemplating ZFS

Post by ghaskins » Sat Dec 15, 2012 8:36 pm

BTW: You may want to look into using a SAS HBA instead of an eSATA setup. The SAS HBA/enclosures will generally offer you much more non-blocking bandwidth, and SAS enclosures are backwards compatible with SATA drives. Personally, I am using an LSI 9207-8e together with a Sans Digital TR8X+ 8-bay enclosure. Very happy so far.

Re: Contemplating ZFS

Post by LaMosca » Sun Dec 16, 2012 7:37 pm

Boy, you add the words "rack mount" to items and the prices all seem to go up. This is a somewhat reasonably priced 1U miniSAS/SATA 4-disk enclosure I've found: http://www.istarusa.com/raidage/product ... 04U40BK-MS

I could use two of those, each connected to one port of the LSI 9207-8e. The base cost without any drives would be nearly $1000, for an 8-drive capacity.

The istarusa 15-disk capacity eSATA PM enclosure is $850 (+$400 for the redundant power supply option) and NewerTech 6G eSATA 2-port PM cards are $68 each. So about $1400 for 15-disk capacity with redundant power supplies and redundant eSATA cards (although if I use all three 5-disk cages in the enclosure, I would use three of the four ports between the two eSATA PCI cards). One thing is for sure: neither way is cheap!
http://www.istarusa.com/raidage/product ... 15U40DE-PM
http://eshop.macsales.com/item/NewerTech/MXPCIE6GRS/

How did you initially setup your TR8X+ and have you expanded it since then?

The other fun part is that I literally have 5U of space remaining in my rack to work with. 1U of that will be a new NAS; that leaves 4U for storage.

Thank you very much for the RAID10 / ZFS info. I'm trying to think ahead to a migration strategy and how I could do some testing (to see whether replacing drives in a mirror with larger ones will auto-expand once both are replaced) before I put things into production. I'm having some issues with my ReadyNAS, so getting that replaced may be the highest priority.

Re: Contemplating ZFS

Post by ghaskins » Sun Dec 16, 2012 9:28 pm

LaMosca wrote:One thing is for sure, neither way is cheap!


Yeah, and SAS will definitely be more expensive, port for port. If you end up using SATA drives anyway, you won't get all the SAS benefits like a deeper command queue (TCQ) or dual/wide-port operation. In this use case, the main advantage is that SAS HBAs tend to have a lot more options at the high end, which means you are much more likely to be able to scale the I/O up as you add drives and/or SSDs. A low-end eSATA chip might offer the connectivity, but it might not be able to drive all channels at full bandwidth simultaneously, take advantage of PCIe 3, etc. But in the end, if cost is more important than performance, SAS will have a hard time competing against eSATA.

LaMosca wrote:How did you initially setup your TR8X+ and have you expanded it since then?


I only bought all the gear in the last few weeks, so I am still getting it set up. Right now I have 6x2TB 7200rpm SATA drives in a raidz vdev, but I have played around with various options, like a stripe across three 2-way mirrors, simulating something closer to your proposed configuration. The vdev additions go smoothly, but I haven't done one after already having "production" data live on the system, like you would if you were doing the upgrade later in the lifecycle.

LaMosca wrote:I'm trying to think ahead to a migration strategy and how I could do some testing (to see if replacing drives in the mirror with larger ones will auto-expand once both are replaced) before I put things into production.


You can do all kinds of testing of this nature, either with something like a set of sparsebundles or with a VM running an alternate ZFS platform such as OpenIndiana, FreeNAS, etc. For instance, you could create a pair of, say, 200G sparsebundles and then replace them with 400G sparsebundles (they are thin-provisioned, so you wouldn't actually need 200+200 or 400+400 of space).
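
As one hypothetical example, in an OpenIndiana VM you could use sparse files as stand-in devices instead of sparsebundles (paths and sizes below are made up; on FreeBSD/FreeNAS you would use truncate -s rather than mkfile):

Code:
# sparse backing files consume almost no real disk space
mkdir /testdisks
mkfile -n 200g /testdisks/small1 /testdisks/small2
zpool create testpool mirror /testdisks/small1 /testdisks/small2
zpool set autoexpand=on testpool
# later: swap in larger backing files, one at a time
mkfile -n 400g /testdisks/big1 /testdisks/big2
zpool replace testpool /testdisks/small1 /testdisks/big1
zpool status testpool   # wait for the resilver to finish
zpool replace testpool /testdisks/small2 /testdisks/big2
zpool list testpool     # should now report the larger size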

Re: Contemplating ZFS

Post by ghaskins » Sun Dec 16, 2012 10:00 pm

ghaskins wrote:You can do all kinds of testing of this nature either with something like a set of sparsebundles


I just tried this and it does _not_ appear to work as I would have expected. My guess is that it might be a limitation of Zevo CE 1.1.1, but it could also be "operator error" ;) Any comments from Zevo experts?

Following this http://serverfault.com/questions/15290/how-to-upgrade-a-zfs-raid-z-array-to-larger-disks-on-opensolaris confirms that it should be possible in ZFS in general. However, I tried upgrading a pair of 200G sparsebundles to 400G, and the pool remained at 200G even after following the advice in the linked article to export/import. Also foreboding is the fact that the autoexpand property does not seem to exist in zfs get under CE 1.1.1. Those two points together lead me to believe that support for expanding the size might be missing in general.

Just FYI, in case this is a major requirement for you. Of course, by the time you actually need it, there's a chance this could be supported in Zevo.
