Performance when creating large number of files (100,000s)

Postby abc123 » Sun Jun 25, 2017 5:42 am

I've occasionally had to deal with creating a large number of files (200k-400k) when checking out some git repositories, and O3X 1.6.1 (and 1.5.2) seems to grind to a halt when doing so. The same also happens when I try to copy Xcode.app from my HFS+ partition to my ZFS partition on my MBP. Both of those are on the same SSD, though, so I've done a slightly more thorough test on the Mac Pro.

It's a dual quad-core 2008 Mac Pro with 8GB RAM and 4 hard disks, running El Capitan but also set up to dual-boot FreeBSD. It has a raidz1 zpool (tank) consisting of 3 x 1TB WD hard disks. The 4th disk is the 500GB boot disk, split between HFS+ for El Capitan and a root zpool (zroot) for FreeBSD. This means I can test both OS X and FreeBSD performing operations on the same pool.

The pool was created with O3X 1.5.2 with ashift=12, normalization=formD, case-sensitive and checksum=skein. I've not got any settings in zsysctl.conf.
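For reference, the pool was created along these lines (the disk identifiers here are placeholders, from memory):

Code: Select all
# Sketch of the original zpool create; the diskN names are placeholders.
$ sudo zpool create -o ashift=12 \
      -O normalization=formD \
      -O casesensitivity=sensitive \
      -O checksum=skein \
      tank raidz1 disk1 disk2 disk3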

For the test, I've tar'd Xcode.app, which (according to find) has 215k+ files and 68k directories. I've stored the tar on the HFS+ boot partition in El Capitan.
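The counts and the tarball were produced roughly like this (exact paths are from memory):

Code: Select all
$ find /Applications/Xcode.app -type f | wc -l   # 215k+ files
$ find /Applications/Xcode.app -type d | wc -l   # ~68k directories
$ tar cf ~/Downloads/xcode.tar -C /Applications Xcode.app

Extracting it onto the raidz1 pool: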

Code: Select all
$ time tar xf ~/Downloads/xcode.tar

real   54m21.762s
user   0m6.074s
sys   2m12.978s


That's about what I was seeing when I tried to copy it via Finder. I gave up after the estimated time climbed from a few minutes to 2 hours after about 1GB had transferred.

To compare with FreeBSD (xcode.tar stored on its root zpool and extracted to tank):

Code: Select all
$ time tar xf ~/xcode.tar

real   5m4.308s
user   0m5.234s
sys   2m54.182s


I was quite surprised just how much quicker FreeBSD was than OS X. I'd expected a difference, and expected OS X to be slower, but not 10x slower to the same pool on the same hardware.

And just as an extra comparison, on OS X I unpacked it from the zpool to the HFS+ partition:

Code: Select all
$ time tar xf /data/temp/xcode.tar

real   3m41.661s
user   0m4.549s
sys   1m30.865s


That's quicker than the FreeBSD time, but then HFS+ is doing much less work when writing.

If I just copy the tar across from HFS+ to tank, then it behaves much as you'd expect:

Code: Select all
$ time cp ~/Downloads/xcode.tar .

real   6m32.625s
user   0m0.022s
sys   0m21.794s


Still slower than FreeBSD's un-tar, but perfectly usable.

I've not done any tuning on either system, so I'm not sure if this is a tuning issue or just the O3X implementation and the way it interacts with the El Capitan kernel.
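If it does turn out to be a tuning issue, my understanding is that O3X picks up settings from /etc/zfs/zsysctl.conf in sysctl.conf format; something like the following (illustrative only, not something I've tested):

Code: Select all
# /etc/zfs/zsysctl.conf -- illustrative example, not a recommendation.
# Cap the ARC at 4GB on this 8GB machine:
kstat.zfs.darwin.tunable.zfs_arc_max=4294967296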

The same disparity shows up if I then rm -rf the extracted Xcode.app: OS X took 50m31.783s, while FreeBSD took only 1m20.689s.

Any thoughts or suggestions? I don't do this sort of large file operation often, but it would be good to at least try to understand where the huge disparity comes from and whether there's anything that can be done about it, or whether I just need to leave these operations to run overnight.

For everything else, the performance of O3X 1.6.1 seems very good. I've got a lot of development projects (mostly C++) and Windows/Linux VMs running on it, and it all runs very well. It's only when I do something involving a huge number of files that it struggles.

Thanks

Russell
abc123
 
Posts: 63
Joined: Mon Jan 30, 2017 11:46 pm

Re: Performance when creating large number of files (100,000s)

Postby lundman » Sun Jun 25, 2017 4:03 pm

That certainly doesn't look right; it should not be that slow. We haven't got to the optimization phase yet, but even so, that seems slow. One thing that could be done is flamegraphs after it has run for a while, to see where it spends all the time.
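Roughly the usual dtrace recipe, run while the tar is going, would do (this assumes Brendan Gregg's FlameGraph scripts are checked out locally; dtrace may need SIP relaxed on El Capitan):

Code: Select all
# Sample kernel stacks at 997Hz for 60s, then fold and render.
$ sudo dtrace -x stackframes=100 \
      -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' \
      -o out.stacks
$ ./stackcollapse.pl out.stacks > out.folded
$ ./flamegraph.pl out.folded > zfs-flame.svg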
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Performance when creating large number of files (100,000s)

Postby abc123 » Sun Jun 25, 2017 8:29 pm

This was run using master on my MBP, where the internal SSD is split between HFS+ and a zpool. I can run it later this week on the raidz1 pool, but both systems exhibit similar behaviour: the first 1GB or so copies very quickly, then the copy grinds to a halt, near enough. (This was just a copy using Finder.)

https://www.dropbox.com/s/as5lp2mo0pltu ... s.svg?dl=0

Not sure what this shows though.

Thanks

Russell
abc123
 
Posts: 63
Joined: Mon Jan 30, 2017 11:46 pm

Re: Performance when creating large number of files (100,000s)

Postby Brendon » Mon Jun 26, 2017 1:14 am

For me: Mac Pro 2013, 6-core, 32GB RAM, Thunderbolt raidz2 of 5 disks. Source was the internal SSD.

Code: Select all
jerry:tmp brendy$ time tar -xf /Users/brendy_tmp/xcode.tar

real   8m17.961s
user   0m4.578s
sys   1m13.062s

The rm was slow though.

- Brendon
Brendon
 
Posts: 286
Joined: Thu Mar 06, 2014 12:51 pm

Re: Performance when creating large number of files (100,000s)

Postby abc123 » Mon Jun 26, 2017 5:00 am

Was this with master or 1.6.1? Would you expect any of the changes on master to have sped things up? Was your rm time a lot longer than the tar time?

I can move the machine to running master later this week and try again. I'll do some repeat tests.

Out of interest, can you do a copy with Finder directly to the zpool and see if that yields the same time as tar? I first noticed this when trying to copy Xcode.app to my pool (and when checking out the large git repository); I tar'd it so I could run the same operation on both FreeBSD and OS X and compare.

FWIW, this is a plain install of El Capitan with nothing else running on it other than whatever OS X does in the background anyway.

Thanks

Russell
abc123
 
Posts: 63
Joined: Mon Jan 30, 2017 11:46 pm

Re: Performance when creating large number of files (100,000s)

Postby Brendon » Mon Jun 26, 2017 12:41 pm

Yeah, similar results; it was heading for 20 minutes. I've also had a subsequent tar operation take approximately that long as well. I'm not trying to say there isn't a problem; it looks like it takes off like a rocket then slows to a crawl.

Cheers
Brendon
Brendon
 
Posts: 286
Joined: Thu Mar 06, 2014 12:51 pm

Re: Performance when creating large number of files (100,000s)

Postby abc123 » Mon Jun 26, 2017 8:11 pm

I guess it's hard to know whether the improvement you're seeing over my results is down to your newer hardware or not. If it starts off well and then slows to a crawl on both machines, then that does sound like similar behaviour.

I'll move to master later this week anyway, just to compare, and put up some flame graphs from it in case anyone is interested.

Thanks for testing it.
abc123
 
Posts: 63
Joined: Mon Jan 30, 2017 11:46 pm

Re: Performance when creating large number of files (100,000s)

Postby lundman » Wed Jun 28, 2017 9:55 pm

Brendon provided some initial flamegraphs, but they just showed that the system was mostly idle. Deeper exploration is required.
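Since the CPUs are mostly idle, the time is presumably spent blocked waiting rather than computing, so an off-CPU profile would be the next step. A sketch, assuming the dtrace sched provider behaves on El Capitan as it does on other DTrace platforms:

Code: Select all
# Sum blocked time per kernel stack for 30s (off-CPU analysis).
$ sudo dtrace -n '
    sched:::off-cpu { self->ts = timestamp; }
    sched:::on-cpu /self->ts/ {
        @[stack()] = sum(timestamp - self->ts);
        self->ts = 0;
    }
    tick-30s { exit(0); }'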
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

Re: Performance when creating large number of files (100,000s)

Postby abc123 » Wed Jun 28, 2017 10:05 pm

OK, thanks. I've just moved my Mac Pro over to running master, so if there's any more info I can provide, let me know.
abc123
 
Posts: 63
Joined: Mon Jan 30, 2017 11:46 pm

