Contemplating ZFS

Moderators: jhartley, MSR734, nola

Re: Contemplating ZFS

Post by ghaskins » Mon Feb 04, 2013 12:22 am

mk01 wrote:@ghaskins: when you say "I think rsync has better unreliable connection recovery options than zfs send", I would not agree.


How so? Note that I said specifically "unreliable connection recovery options". Say you have 64GB of differential data to send (a common value for me when I unload my digital cameras). If you sent 63GB with zfs send and then the connection dies, you have to resend all 64GB on restart (IIUC). With rsync, it just picks up where it left off and sends the last 1GB.

Since my use case is across a 4Mb/s internet connection, where bandwidth is limited (~1.5 days to transfer 64GB) and interruptions are fairly probable, I see this as a giant advantage.
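For scale, that ~1.5-day figure is easy to sanity-check (decimal units, ignoring protocol overhead):

```shell
# 64 GB pushed through a 4 Mb/s link, decimal units, no protocol overhead.
bits=$((64 * 1000 * 1000 * 1000 * 8))   # payload in bits
rate=$((4 * 1000 * 1000))               # link speed, bits per second
secs=$((bits / rate))
echo "$secs seconds"                    # 128000 s, roughly a day and a half
```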

Aside from the inconvenience of starting a 1.5-day transfer over and over, this ignores the issue that many have reported of zpools becoming corrupted when a zfs-recv is interrupted. It also ignores the fact that with zfs-send you have to maintain some level of snapshot synchronization on both sides of the link, which can be problematic.

Maybe it looks scary on the first full send (a multi-TB filesystem with only a final snapshot),


I am not really sure what you are getting at here. Both zfs-send and rsync would be similar in that they both have a relatively large initial transfer, followed by relatively small incremental updates governed by the delta in the dataset. That is not "scary" on either front. It's just physics we have to live with.

but otherwise, if you think about what rsync does versus what zfs snapshot followed by send -I does, they are incomparable.

if you change one file on a FS with a few million files, many of them part of a package where a consistent state across all files is needed (like dBase databases, MySQL databases, even an iTunes library, iPhoto, Aperture, Logic), rsync would run for hours just to send one new file.


I am not sure you understand how rsync works. It only sends the delta, just like zfs-send -I. If there were changes, only the subset of data within the files that changed is transmitted. ZFS has the theoretical advantage that it already knows about the delta in the dataset ahead of time; rsync has to discover the delta using things like mtime/size comparisons and, when needed, checksums. In practice, however, it's very efficient and fast.

For instance, I have 3TB of data across approximately 3 million files, and a typical rsync runs for about 5 minutes, in which it figures out there are only a few dozen megabytes worth of changes to send. If the delta is larger, it takes longer proportional to the amount of data to send, gated by my internet bandwidth. But then again, so does zfs-send.

And if a transaction occurs in between (one that needs to update multiple files), you will never be able to reconstruct a consistent state.

a snapshot is a snapshot: it freezes the state and happens in a matter of seconds, and the same goes for send.


Now this is actually a valid point. Being able to atomically snapshot the filesystem gives you a much higher probability of maintaining file-level consistency for changes in flight. For me, this is not an option b/c I can't get reasonable performance out of Zevo CE 1.1.1 and was forced to fall back on JHFS+ (for now). This isn't a big deal for me, though, because the data I care the most about are static media files (photos and videos of my family) and wouldn't be subject to transient updates.

Once the snapshot is received and visible on the other side, you are happy and sure you are fine.


"Once .. received" are the operative words. See my above comments about unreliable connection restart above where I stated why I think this falls apart with zfs-send.

You can't tell that for rsync. It's impossible. Just go through the various --delete options: delete before, after, during, during in sequence, during fuzzy... it's scary. It's not that those options are badly implemented; they are just the result of all the dramatic risks they try to avoid.


It's not impossible; it's quite simple, actually. I simply set rsync to perform all of its operations "in place" and to synchronize deletes, etc. The client-side driven script then zfs-snapshots the backend storage at the conclusion. Therefore, if the rsync dies in the middle, the snapshot is never taken, since the client-side script doesn't complete.

In practice, this means that it doesn't really matter in what order you delete (before, after, during, etc.) or update files. You only care that each snapshot was generated when an rsync job believed it had successfully completed a replica. If an update dies in the middle, no snapshot is created, but the next connection will just continue where it left off.
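A minimal sketch of that control flow, with rsync and the remote snapshot call stubbed out as shell functions (the real commands, shown in the comments, are hypothetical examples): the snapshot only happens when rsync exits successfully, so an interrupted run leaves no snapshot and the next run simply resumes.

```shell
# Stand-ins for the real commands, which would look something like:
#   rsync -a --inplace --partial --delete /data/ backup-host:/tank/data/
#   ssh backup-host zfs snapshot tank/data@backup-$(date +%F)
do_rsync()      { return "${RSYNC_EXIT:-0}"; }   # transfer's exit status
take_snapshot() { snapshot_taken=yes; }

snapshot_taken=no
if do_rsync; then
    take_snapshot     # replica known-complete: mark it with a snapshot
fi
echo "snapshot_taken=$snapshot_taken"
```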

You might argue that zfs-send preserves data integrity end-to-end, but I speculate that with ZFS and ECC on both ends of the link, an rsync transfer should not be susceptible to introducing errors into your data, since TCP/SSH verify transmission integrity.

The biggest argument I can see for using zfs-send over rsync is if you are using dedup and/or compressible files, as they are likely to retain those properties in transmission. For me, I don't use dedup and my files are mostly incompressible anyway. However, my link is relatively slow, the data set is large, and the potential for interrupted transfers is relatively high. Using rsync over zfs-send is a no-brainer (for me) even when zfs-send becomes a viable option. But I can see someone running, say, an environment where the connection is more reliable, the bandwidth more abundant, and dedup is in use going the zfs-send route. It's just not for me.

Based on all this, and your valid point about the atomic-snapshot benefit, the ideal solution (for me, at least) would be to snapshot the local filesystem, rsync from the snapshot to the backup server, snapshot the backup server, and then delete the local snapshot. I think this would give me the best of both worlds and avoid any of the problems I mentioned with zfs-send.
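As commands, that pipeline might look like this (pool, dataset, and host names are all hypothetical, and I'm assuming the snapshot is exposed under a .zfs/snapshot directory for rsync to read from; adjust to your setup):

```shell
# 1. Freeze a consistent local view.
zfs snapshot tank/media@xfer
# 2. Replicate from the frozen view; safe to rerun as-is if interrupted.
rsync -a --inplace --partial --delete \
    /tank/media/.zfs/snapshot/xfer/ backup-host:/backup/media/
# 3. Only now mark the completed, consistent replica on the far side.
ssh backup-host zfs snapshot backup/media@"$(date +%F)"
# 4. Drop the local working snapshot.
zfs destroy tank/media@xfer
```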

Kind Regards,
-Greg
ghaskins Offline


 
Posts: 52
Joined: Sat Nov 17, 2012 9:37 am

Re: Contemplating ZFS

Post by mk01 » Mon Feb 04, 2013 8:51 am

ghaskins wrote:Kind Regards,
-Greg


oh man, I didn't want to open such a huge discussion, I just wanted to join it ;) so just a few points of clarification, and believe me, the goal is not to take rsync out of your operations. And believe me again, I know exactly how rsync works. The fact you pointed out as a benefit is what makes rsync so slow and dangerous (dangerous: my opinion; slow: fact). You will probably never notice this as an issue because of the data you are using. Large files are effectively like snapshots with a shorter time between taking them. If the transfer breaks, you can probably resume where you like; I have to start from the closest snapshot (what is 15 minutes, if it runs automatically and I only find out from reading the log file?).

And now the point. Rsync needs to start from the beginning of the list of files to transfer and do the operation. Rsync needs to work out what is missing, what is updated, what is the same. Over and over again. It's like an article I read somewhere on the internet claiming that comparing two files is really fast because you just need to compare their hashes. But how is the hash created? By reading the whole file. And a filesystem is an always-changing world. On the smallest Mac I have, a MacBook, it takes 20 minutes just to send the file list (850GB and circa 3 million files). Imagine your session breaking every 20 minutes. I know, I know, rsync has fuzzy traversal implemented, but that closes one hole and opens another. The design of simple file transfer is not for the 21st century. It works as long as you have one machine, one user, and coordination. I admit the authors of rsync know what they are doing, but this concept already had no place in the world a decade ago. And again, it's still good for some operations, the same way snail mail is.

And you stated at the beginning that rsync just syncs the difference, but that difference has to be discovered and checked. The answer to that? Copy on write? Oh, that's ZFS again? zfs send doesn't waste time with that; it just sends a stream of data. Believe me, the mathematical models and modern solutions for the low-level design of transactions, cloud filesystems, etc. are there for a reason.

And again, please: I'm not forcing my ideas and experience on you, just trying to add another angle to the view. And it will not change the fact that rsync is an ideal solution for you even when the design is, from scratch, really bad :)) The idea of a filesystem developed back in the 80's is bad... and still serves its purpose. Like HFS, NTFS, and the majority of others ;)

And you really made me smile (I mean a happy smile, not laughing at you), because exactly that is possible: you can create a snapshot, pipe it into a continuous file, and you are back to rsync. If it works for you? ...

Thanks for the interesting discussion... don't tell my wife. She hates me for this; she can't stand more than 15 minutes of it. Next time I will tell the stories about crashes and data loss.
mk01 Offline


 
Posts: 65
Joined: Mon Sep 17, 2012 1:16 am

Re: Contemplating ZFS

Post by ghaskins » Mon Feb 04, 2013 10:58 am

mk01 wrote:Rsync needs to work out what is missing, what is updated, what is the same. Over and over again. It's like an article I read somewhere on the internet claiming that comparing two files is really fast because you just need to compare their hashes. But how is the hash created? By reading the whole file. And a filesystem is an always-changing world.


Note that the default mode of rsync is to pre-qualify files for consideration based on their modification-time and size (unless you override with the --checksum option, which I do not). The default mode only computes the checksum when the time/size differ. There is no way my system could compute the actual checksums in about 5 minutes with 3TB of data within 3 million files. It would take several hours, at the least.
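To put a number on "several hours": rsync's default quick check only stats each file, while --checksum forces a full read of every byte. Assuming an optimistic 200 MB/s of sustained sequential read (an assumed, generous single-disk figure), checksumming 3 TB means:

```shell
bytes=$((3 * 1000 * 1000 * 1000 * 1000))   # 3 TB of file data, decimal
rate=$((200 * 1000 * 1000))                # assumed sustained read, bytes/s
secs=$((bytes / rate))
echo "$((secs / 3600)) hours minimum"      # ~4 hours, before any seek overhead
```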

The design of simple file transfer is not for the 21st century. It works as long as you have one machine, one user, and coordination. I admit the authors of rsync know what they are doing, but this concept already had no place in the world a decade ago. And again, it's still good for some operations, the same way snail mail is.


It seems this is more of an argument that you should snapshot your data atomically before sending it than it is an argument in favor of one method of replication over another. And it's a point I fully agree with (it's just not currently an option for me).

And you stated at the beginning that rsync just syncs the difference, but that difference has to be discovered and checked. The answer to that? Copy on write? Oh, that's ZFS again?


I am taking advantage of zfs cow/snapshots on the backend for this purpose. If your argument is that rsync is somehow unreliable at creating replicas, please provide more information. I believe rsync is used by many, many people for backups, so if it's broken there are many who would want to know about it.

zfs send doesn't waste time with that; it just sends a stream of data.


Like I said, this is a theoretical advantage of zfs-send. Since ZFS actively maintains the checksums and b-trees within the metadata, they don't need to be re-computed to generate a delta. However, in practice rsync ends up being fairly fast and efficient at this (at least for my 3TB/3M dataset), so it's not an issue.

Believe me, the mathematical models and modern solutions for the low-level design of transactions, cloud filesystems, etc. are there for a reason.


Again, this is just an argument in favor of snapshotting before you replicate (which I agree with). It has nothing to do with how you replicate.

And it will not change the fact that rsync is an ideal solution for you even when the design is, from scratch, really bad :)) The idea of a filesystem developed back in the 80's is bad... and still serves its purpose. Like HFS, NTFS, and the majority of others ;)


There is a ton that ZFS does well. I just don't think zfs-send when you have large deltas over an unreliable and slow link is one of them.

And you really made me smile (I mean a happy smile, not laughing at you), because exactly that is possible: you can create a snapshot, pipe it into a continuous file, and you are back to rsync. If it works for you? ...


I assume you are arguing that my statement about the ideal solution being snapshots+rsync is comical b/c that is the same thing zfs-send is doing? If so, I still argue it is not, because of all the points I brought up regarding the "unreliable connection recovery options" that started this subtopic.

Kind Regards,
-Greg

Re: Contemplating ZFS

Post by mk01 » Mon Feb 04, 2013 3:49 pm

I wanted to say the exact opposite; this must have been my poor English. I was smiling because I always like to see good and creative solutions.

And no, I never intended to say that I have information about bugs in rsync. No, on the contrary: it's a miracle that it works, to the happiness of many people. And again, I'm not questioning the quality of rsync's programming as a robust tool. But the design, the design.

The things which bother me, and which I was explaining: try using rsync for collaboration and in groups. Try to build a data center backup solution for regulated environments on such a system - I'm thinking of conformity and predictability. It's impossible.

How do you know that after a resync (resume) it will start at the correct location in the file? How do you know that part of the file on one side or the other has not already changed? Don't you think that needing to traverse the whole filesystem just to get the list of targets is greatly inefficient? Do you really mean that a changed timestamp means changed data? How can you check that the data meant for transfer has been delivered?

It's impossible to just send one huge structured photo library intact without blocking access first. How would you transfer your 200GB Aperture library, even just to a backup, and keep working with it at the same time? How can you manage that? How can you even take a pool of data and say it's really a backup from one exact day?

Async file transfer with resume and error correction existed in the times of ZyXEL modems 25 years ago. It was sufficient. Back then.

And I'm tempted to again open the topic of speed on a filesystem with millions of 1kB files. You will kill the system with IO operations, with nothing to show for it.

Re: Contemplating ZFS

Post by ghaskins » Tue Feb 05, 2013 12:13 am

mk01 wrote:The things which bother me, and which I was explaining: try using rsync for collaboration and in groups. Try to build a data center backup solution for regulated environments on such a system - I'm thinking of conformity and predictability. It's impossible.


I think you are confusing the basic concept of backing up from snapshots (which we both agree is a good thing) with the transport mechanism used to perform the backup. I fully agree that if you try to perform a backup of any kind against a _live_ filesystem that is actively changing, you run the risk of creating a backup of an inconsistent state. To prevent this, you should ideally only backup from data that has all IO quiesced, such as by taking the file-system offline or by taking a snapshot (ideally low-overhead snapshots, such as those offered by ZFS).

This isn't a specific problem with rsync, however. Any backup solution that operates against a continuously changing filesystem would have the same problem. The biggest difference between rsync and zfs-send in this regard is that zfs-send _requires_ (presumably by design) that you send from a snapshot. And if you are using zfs-send, you are by definition using ZFS, so you are guaranteed that a robust, low-overhead snapshot facility is available to you. Rsync, on the other hand, will operate in either mode, leaving the policy decision to the user and their choice of underlying filesystem, caveat emptor. However, as I previously noted, an rsync that is executed against a quiesced dataset (e.g. a snapshot) should be equally reliable in replicating the data. Likewise, an rsync that is executed against a live filesystem consisting primarily of files that do not change (as is the case for me) will be equally reliable, at least w.r.t. the stable files (which are the only ones I _really_ care about).

If you recall, my original statement was "I think even if I were running ZFS native on the primary filesystem, I might still be inclined to use the rsync+snapshot model because I think rsync has better unreliable connection recovery options than zfs send" and I stand by that. Zfs send, to my knowledge, is simply not a great solution if your dataset differential is large and your link is slow and unreliable. Rsync used in conjunction with zfs snapshots should be just as robust/reliable at backing up your data (at least for 3TB/3M; not sure about systems with petabytes ;), yet it would also address the shortcomings in the interrupted zfs-send recovery scenario. I only omit the snapshot step now simply b/c I don't have any viable options to do so on JHFS+, coupled with the fact that my particular dataset doesn't strictly require it.

Kind Regards,
-Greg

Usability of a snapshot of a live file system

Post by grahamperrin » Wed Feb 06, 2013 3:36 pm

I take the view that a well designed app should cope gracefully with its on-disk data after, say, an interruption to power.

So for example: if Mail.app or a related process is writing to any of the following files at midnight, when a ZFS snapshot is made – 

  • Envelope Index
  • Envelope Index-shm
  • Envelope Index-wal

– then a rollback to that point would be no worse (to Mail) than if a power cut had occurred at midnight.

Is that reasonable?

I guess that whilst it's preferable to minimise the number of open files before performing a snapshot, that's not always practical.

I should emphasise that I'm not working with business-critical data. My use case is ZEVO on a laptop.
grahamperrin Offline

 
Posts: 1596
Joined: Fri Sep 14, 2012 10:21 pm
Location: Brighton and Hove, United Kingdom

Re: Usability of a snapshot of a live file system

Post by ghaskins » Wed Feb 06, 2013 3:54 pm

grahamperrin wrote:I take the view that a well designed app should cope gracefully with its on-disk data after, say, an interruption to power.

So for example: if Mail.app or a related process is writing to any of the following files at midnight, when a ZFS snapshot is made – 

  • Envelope Index
  • Envelope Index-shm
  • Envelope Index-wal

– then a rollback to that point would be no worse (to Mail) than if a power cut had occurred at midnight.

Is that reasonable?


Yes. If the snapshot is atomic (which I believe ZFS snapshots are), and the application is suitably designed, the state of the snapshot should be no different than if power were cut at that same instant. Most applications that use some kind of underlying database fall into this category. This doesn't mean you are guaranteed to have all of your data persisted, of course (it could have been in the middle of writing an email, document, or photo at the moment the snapshot was made, and that partial update would be rolled back). What it does mean is that the on-disk state (within the snapshot) should be self-consistent. For applications that do not manage their persistence well, you may find some files inconsistent (for instance, copying a large jpg file in the middle of the snapshot could result in a half-completed jpg within the snapshot). But this is no different than if the power had gone out at that instant, either.

I guess that whilst it's preferable to minimise the number of open files before performing a snapshot, that's not always practical.


I don't think that is strictly necessary. If you are referring to the conversation about rsync, I think the issue raised was for the cases where you do not (or cannot) perform a snapshot. In those cases, you could work around the lack of a snapshot facility by taking the system offline. However, the real point of (well-designed) snapshots is to be able to take them against a live system so you can avoid the inconvenience of going offline.

Kind Regards,
-Greg
