File-based vdevs

Post by grahamperrin » Sun Apr 14, 2013 9:38 am

With ZEVO Community Edition 1.1.1:

> … Please do not use file based pools for any important data.

Re: File-based vdevs

Post by raattgift » Sun Apr 14, 2013 5:34 pm

That's shorthand for "there are gotchas" beyond having to know how to import a pool with file devices. You need to know how the names of the files themselves will be treated if they are moved from the precise environment in which they were created, and you need to be aware of filesystems which compress or sparsify files, which do not offer POSIX semantics, or which are deficient with respect to durability guarantees. All of these can lead to data in the pool becoming temporarily or permanently unavailable, and in some cases may cause a system to crash.
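Two of these gotchas can be checked mechanically before a file is handed to zpool. What follows is a minimal Python sketch, not anything ZEVO ships: `backing_file_report` and `import_argv` are hypothetical helper names, and the 512-byte unit for `st_blocks` is the POSIX convention. An allocated size smaller than the logical size only hints that the host filesystem is storing the file sparsely or compressed; it is not proof.

```python
import os


def backing_file_report(path):
    """Collect basic facts about a file that might back a vdev."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems (not available on Windows).
    allocated = st.st_blocks * 512
    return {
        "absolute": os.path.isabs(path),  # zpool wants full paths for files
        "size": st.st_size,
        "allocated": allocated,
        "maybe_sparse_or_compressed": allocated < st.st_size,
    }


def import_argv(directory, pool):
    """Build (but do not run) a zpool import that searches `directory`."""
    if not os.path.isabs(directory):
        raise ValueError("use a full path for the device directory")
    return ["zpool", "import", "-d", directory, pool]
```

The second helper only assembles the argument list for `zpool import -d <dir> <pool>`, which is how an import is pointed at a directory of file devices; running it is left to the reader.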

Most of those issues affect all implementations of ZFS, which has always implemented files as first-class vdev components, but which has always warned people against using them without explaining why in great detail. Most of it boils down to: configuring is easy; configuring well, not so much; maintaining, even less so. Also, performance is not going to be very good, especially if the files live in a filesystem that does lots of metadata updates (atimes, ZFS internal metadata, etc.), that does not match the pool's ashift, or that does its own checksumming and compression.
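The ashift point is simple arithmetic: ashift is the base-2 logarithm of the pool's smallest allocation size, so ashift=9 means 512 bytes and ashift=12 means 4096 bytes. A rough Python sketch for comparing that with the block size the host filesystem reports is below. The helper names are hypothetical, and treating "does not match" as "the pool's smallest write is not a whole number of host blocks" is an assumption, not something spelled out in this thread.

```python
import os


def ashift_to_bytes(ashift):
    """ashift is log2 of the pool's smallest allocation size."""
    return 1 << ashift


def host_block_size(path):
    """Block size reported by the filesystem that holds `path` (Unix only)."""
    vfs = os.statvfs(path)
    # Prefer the fundamental block size; fall back to f_bsize if it is 0.
    return vfs.f_frsize or vfs.f_bsize


def mismatch(ashift, fs_block):
    """True if the pool's smallest write is not a whole number of host blocks.

    Assumption: this is one plausible reading of 'does not match'.
    """
    if fs_block <= 0:
        raise ValueError("fs_block must be positive")
    return ashift_to_bytes(ashift) % fs_block != 0
```

With these definitions, an ashift=9 pool on a host filesystem with 4096-byte blocks counts as a mismatch, while an ashift=12 pool on the same filesystem does not.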

ZEVO-specific caveats are likely limited to (a) the dynamic name allocation systems in Mac OS X and (b) the performance impact on the small ARC and vnode/dnode/znode spaces. (a) will probably bite someone doing a zpool import.

All this considered, the position stated on the wiki page is reasonable. Personally, I would say, "it works fine, we use it ourselves, but there are lots of ways to do the wrong thing, and if you do any of those it's your problem not ours; don't call us or call us names if you use file-based vdevs and hit problems, even if it works fine in OI or OS".

Incidentally, "file-based vdev" is an awkward way of putting it. Is a 3-sided mirror vdev with two physical disks and one file a "file-based vdev"? Or does "file-based vdev" mean a vdev that is only one file? (Guess: yes, no).

Terminology

Post by grahamperrin » Mon Apr 15, 2013 12:04 am

raattgift wrote:… "file-based vdev" is an awkward way of putting it. …


How about the phrase that's currently in the wiki?
    file for a pool device
or something like that …

(I'm not the best person to ask. I spent months confusing this stuff with ZVOL.)

Re: File-based vdevs

Post by ilovezfs » Mon Apr 15, 2013 12:37 am

Well, let's be Fundamentalists and quote from the Word:
zpool(1M) wrote: A virtual device describes a single device or a collection of devices organized according to certain performance and fault characteristics. The following virtual devices are supported:
...
file
A regular file. The use of files as a backing store is strongly discouraged. It is designed primarily for experimental purposes, as the fault tolerance of a file is only as good as the file system of which it is a part. A file must be specified by a full path.

http://docs.oracle.com/cd/E19253-01/816-5166/zpool-1m/


And if the "fault tolerance of a file is only as good as the file system of which it is a part," then that would seem to argue for storing your file vdevs (at least locally) on a ZFS file system composed solely of physical vdevs. Or is zfs(zfs()) just asking for trouble?

Re: File-based vdevs

Post by raattgift » Mon Apr 15, 2013 8:27 am

"fault tolerance of a file is only as good as the file system of which it is a part"

Correct, but you can use multiple files, spread them across multiple filesystems, and gain robustness through diversity. That is precisely the strategy ZFS uses with multiple physical disks (when not arranged as single-disk vdevs, as those have zero replication).

A single file on a zfs pool that has nonzero replication is pretty safe, so "as good as" is not such a big warning. However, if you want to move the file somewhere else -- across a network, onto tape or other offline storage media -- then you have to be aware of risks to the data. Multiple files on a zfs pool are also fine -- note my raidz2 example. That extra redundancy is superfluous while the backing files are stored on zfs (in a pool with replication!), but it means you can survive damage to the various files in transit, or while they sit on bit-rotting offline media, without data corruption or loss.
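A sketch of the multiple-file idea, in Python. It is not the raidz2 example mentioned above; the helper names, and the pool and file names in the usage note, are hypothetical. `group_by_filesystem` uses `st_dev` to show which backing files share a host filesystem, and `raidz2_create_argv` only builds a `zpool create` argument list, so nothing is actually run.

```python
import os
from collections import defaultdict


def group_by_filesystem(paths):
    """Map the device id of each file's parent directory to the files on it."""
    groups = defaultdict(list)
    for p in paths:
        parent = os.path.dirname(os.path.abspath(p))
        groups[os.stat(parent).st_dev].append(p)
    return dict(groups)


def raidz2_create_argv(pool, files):
    """Build (but do not run) a zpool create for one raidz2 vdev of files."""
    # raidz2 holds two devices' worth of parity, so it needs at least three.
    if len(files) < 3:
        raise ValueError("raidz2 needs at least three backing files")
    for f in files:
        if not os.path.isabs(f):
            raise ValueError("file vdevs must be given by full path: " + f)
    return ["zpool", "create", pool, "raidz2", *files]
```

For example, `raidz2_create_argv("backup", ["/Volumes/a/v0", "/Volumes/b/v1", "/Volumes/c/v2", "/Volumes/d/v3"])` yields the arguments for a four-file raidz2 pool. If `group_by_filesystem` puts more than two of those files under one key, losing that one host filesystem would take out more devices than raidz2 can tolerate.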

"zfs(zfs())" is fine, given the non-exhaustive list of caveats in my previous posting. As targets of zfs send/receive activity for offsite backup or the like, recursive pools are a good fit. They're not really suitable as pools that are meant to be always ONLINE and doing continuous IO.

Pools with vdevs using multiple files are a lot less scary to me than single-disk pools (or otherwise configured with no *pool-level* data replication), which seem fairly popular with users here. Also, a massive nesting of pools-using-files-on-pools-...-on-nonzero-replication-pools-on-SATA3-or-SAS-etc.-disks will still outperform any filesystem backed by a USB2 mass storage device.

Context

Post by grahamperrin » Wed Jul 10, 2013 8:45 am

Note to self: this topic arose from a mention of file-based vdevs under "Need Help with a Backup Solution".

