How to work out why performance is slow?

All your general support questions for OpenZFS on OS X.


Postby zisper » Fri Oct 28, 2016 1:35 pm

Hi;

I have a pool of four 3TB disks arranged as two mirrored vdevs, attached via Thunderbolt. I'm getting very slow performance from it and am trying to figure out why. Here are the current speeds (the pool is scrubbing, so it should be pretty much maxed out?). These speeds are consistent with what I get when copying files around.
Code:
zpool iostat -v 5
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
PlayPool                                        1.96T  3.49T    123    136  12.2M  6.07M
  mirror                                        1.36T  1.37T    113     64  11.9M  2.50M
    media-7D5A5263-F73A-C841-8F1D-16AA3345F709      -      -     57     32  5.98M  1.25M
    media-6EA69111-A906-9B41-B8A9-33D457B225AF      -      -     56     31  5.90M  1.25M
  mirror                                         613G  2.12T      9     71   380K  3.57M
    media-FCAC5C92-96E6-C84B-A5CE-8ABA81176602      -      -      4     35   182K  1.78M
    media-4F980611-DE99-924D-9F5A-3FAB9FA8BDF5      -      -      4     36   198K  1.78M
----------------------------------------------  -----  -----  -----  -----  -----  -----
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
PlayPool                                        1.96T  3.49T     85    149  8.68M  7.70M
  mirror                                        1.36T  1.37T     81     71  8.54M  3.82M
    media-7D5A5263-F73A-C841-8F1D-16AA3345F709      -      -     40     36  4.10M  1.91M
    media-6EA69111-A906-9B41-B8A9-33D457B225AF      -      -     41     35  4.44M  1.91M
  mirror                                         613G  2.12T      4     77   141K  3.88M
    media-FCAC5C92-96E6-C84B-A5CE-8ABA81176602      -      -      1     38  79.5K  1.94M
    media-4F980611-DE99-924D-9F5A-3FAB9FA8BDF5      -      -      2     38  61.6K  1.94M
----------------------------------------------  -----  -----  -----  -----  -----  -----
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
PlayPool                                        1.96T  3.49T    100    154  10.6M  10.6M
  mirror                                        1.36T  1.37T     98     69  10.5M  4.15M
    media-7D5A5263-F73A-C841-8F1D-16AA3345F709      -      -     52     35  5.77M  2.08M
    media-6EA69111-A906-9B41-B8A9-33D457B225AF      -      -     45     34  4.76M  2.08M
  mirror                                         613G  2.12T      2     84  73.2K  6.41M
    media-FCAC5C92-96E6-C84B-A5CE-8ABA81176602      -      -      1     42  32.9K  3.21M
    media-4F980611-DE99-924D-9F5A-3FAB9FA8BDF5      -      -      1     42  40.4K  3.21M
----------------------------------------------  -----  -----  -----  -----  -----  -----
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
PlayPool                                        1.96T  3.49T    130    131  14.2M  6.69M
  mirror                                        1.36T  1.37T    128     53  14.2M  2.49M
    media-7D5A5263-F73A-C841-8F1D-16AA3345F709      -      -     63     26  7.14M  1.25M
    media-6EA69111-A906-9B41-B8A9-33D457B225AF      -      -     64     26  7.05M  1.25M
  mirror                                         613G  2.12T      1     78  35.0K  4.20M
    media-FCAC5C92-96E6-C84B-A5CE-8ABA81176602      -      -      1     38  5.39K  2.10M
    media-4F980611-DE99-924D-9F5A-3FAB9FA8BDF5      -      -      0     39  29.6K  2.10M
----------------------------------------------  -----  -----  -----  -----  -----  -----


Here's the current settings:
Code:
zfs get all PlayPool
NAME      PROPERTY               VALUE                  SOURCE
PlayPool  type                   filesystem             -
PlayPool  creation               Mon Jul 18 20:14 2016  -
PlayPool  used                   1.96T                  -
PlayPool  available              3.32T                  -
PlayPool  referenced             3.12M                  -
PlayPool  compressratio          1.03x                  -
PlayPool  mounted                yes                    -
PlayPool  quota                  none                   default
PlayPool  reservation            none                   default
PlayPool  recordsize             128K                   default
PlayPool  mountpoint             /Volumes/PlayPool      default
PlayPool  sharenfs               off                    default
PlayPool  checksum               on                     default
PlayPool  compression            lz4                    local
PlayPool  atime                  on                     local
PlayPool  devices                on                     default
PlayPool  exec                   on                     default
PlayPool  setuid                 on                     default
PlayPool  readonly               off                    default
PlayPool  zoned                  off                    default
PlayPool  snapdir                hidden                 default
PlayPool  aclmode                passthrough            default
PlayPool  aclinherit             restricted             default
PlayPool  canmount               on                     default
PlayPool  xattr                  on                     default
PlayPool  copies                 1                      default
PlayPool  version                5                      -
PlayPool  utf8only               on                     -
PlayPool  normalization          formD                  -
PlayPool  casesensitivity        insensitive            -
PlayPool  vscan                  off                    default
PlayPool  nbmand                 off                    default
PlayPool  sharesmb               off                    default
PlayPool  refquota               none                   default
PlayPool  refreservation         none                   default
PlayPool  primarycache           all                    default
PlayPool  secondarycache         all                    default
PlayPool  usedbysnapshots        0                      -
PlayPool  usedbydataset          3.12M                  -
PlayPool  usedbychildren         1.96T                  -
PlayPool  usedbyrefreservation   0                      -
PlayPool  logbias                latency                default
PlayPool  dedup                  off                    default
PlayPool  mlslabel               none                   default
PlayPool  sync                   standard               default
PlayPool  refcompressratio       1.90x                  -
PlayPool  written                3.12M                  -
PlayPool  logicalused            2.03T                  -
PlayPool  logicalreferenced      2.76M                  -
PlayPool  filesystem_limit       none                   default
PlayPool  snapshot_limit         none                   default
PlayPool  filesystem_count       none                   default
PlayPool  snapshot_count         none                   default
PlayPool  snapdev                hidden                 default
PlayPool  com.apple.browse       on                     default
PlayPool  com.apple.ignoreowner  off                    default
PlayPool  com.apple.mimic_hfs    off                    default
PlayPool  shareafp               off                    default
PlayPool  redundant_metadata     all                    default
PlayPool  overlay                off                    default


And the original creation command:
Code:
zpool create -f -o ashift=12 -O casesensitivity=insensitive -O normalization=formD -O compression=lz4 PlayPool disk5


It's obviously been built up quite a bit with additional disks since then.

When I first boot the computer it actually runs about as fast as the hardware will allow, but then it rapidly (within a minute?) decays back to its current state, averaging maybe 10MB/s. (The scrub tells me it's averaging 4.42M/s.)

Any ideas on what I should investigate to fix this?
zisper
 
Posts: 16
Joined: Wed Jul 20, 2016 2:48 am

Re: How to work out why performance is slow?

Postby Sharko » Sat Oct 29, 2016 9:25 am

I don't want to start a scrub right now, or I would try to give you comparable numbers from my machine (Mac Pro w/ quad 2.8GHz Nehalem, similar zpool of 4 x 2TB encrypted disks, 24GB RAM, L2ARC limited to 8GB, internal SATA II bus). I do know that general performance for large complex copy operations (like cloning to a fresh disk with Carbon Copy Cloner) runs at about 20MB/sec on average, similar to performance across a USB 2 interface. In everyday use, however, it doesn't feel that slow because of all the disk caching that ZFS does (I use it as the home directory for my daily-driver user).

You might want to post details of your CPU and RAM and whether the underlying disks are encrypted, just as context.

Kurt
Sharko
 
Posts: 230
Joined: Thu May 12, 2016 12:19 pm

Re: How to work out why performance is slow?

Postby zisper » Sat Oct 29, 2016 2:08 pm

Sure, it's a mid-2011 Mac mini, 2.3GHz i5, 8GB memory. I haven't set any memory limits on ZFS usage, but it's currently only using about 1.5GB. No encryption.

Code:
sysctl kstat.spl.misc.spl_misc.os_mem_alloc
kstat.spl.misc.spl_misc.os_mem_alloc: 1531445248

sysctl kstat | grep arc_max
kstat.zfs.darwin.tunable.zfs_arc_max: 0
kstat.zfs.darwin.tunable.l2arc_max_block_size: 16777216

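(For reference, and as an assumption on my part rather than something I've tested: I believe OpenZFS on OS X reads sysctl-style tunables from /etc/zfs/zsysctl.conf at startup, so on an 8GB machine the ARC could be capped with a fragment like the one below. The 2GiB figure is purely illustrative.)

```
# /etc/zfs/zsysctl.conf (assumed location; value is illustrative)
# Cap the ARC at 2 GiB on an 8 GB machine:
kstat.zfs.darwin.tunable.zfs_arc_max=2147483648
```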

I seem to recall that when I first set it up with just one mirrored vdev, it was running at ~80MB/sec, which wasn't fast for the capabilities of the interfaces but was faster than the internal drive, so more than good enough. I thought that adding a second vdev should theoretically improve things, but it seemed to slow things down _slightly_. Now, however, it's basically crawling. See the current scrub statistics:

Code:
zpool status
  pool: PlayPool
 state: ONLINE
  scan: scrub in progress since Thu Oct 27 20:06:40 2016
    560G scanned out of 2.25T at 2.36M/s, 210h34m to go
    0 repaired, 24.30% done
config:

   NAME                                            STATE     READ WRITE CKSUM
   PlayPool                                        ONLINE       0     0     0
     mirror-0                                      ONLINE       0     0     0
       media-7D5A5263-F73A-C841-8F1D-16AA3345F709  ONLINE       0     0     0
       media-6EA69111-A906-9B41-B8A9-33D457B225AF  ONLINE       0     0     0
     mirror-1                                      ONLINE       0     0     0
       media-FCAC5C92-96E6-C84B-A5CE-8ABA81176602  ONLINE       0     0     0
       media-4F980611-DE99-924D-9F5A-3FAB9FA8BDF5  ONLINE       0     0     0

errors: No known data errors

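As a sanity check on that ETA (just arithmetic from the numbers zpool printed, taking 1T = 1024G):

```shell
# (total - scanned) / rate, from the zpool status figures above:
# 2.25T to scan, 560G scanned so far, at 2.36M/s.
awk 'BEGIN {
    remaining_mb = (2.25 * 1024 - 560) * 1024   # GB left, converted to MB
    secs = remaining_mb / 2.36                  # at 2.36 MB/s
    printf "%dh%02dm to go\n", secs / 3600, (secs % 3600) / 60
}'
```

That lands close to the 210h34m zpool reported (the small gap is just rounding in the displayed figures), and 2.36M/s really is dismal for four spinning disks.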

Looking at memory and CPU usage, it doesn't look like it's bound by either of those, but I wasn't sure of the best way to check that; at the moment I'm just looking at the history in iStat Menus.
zisper
 
Posts: 16
Joined: Wed Jul 20, 2016 2:48 am

Re: How to work out why performance is slow?

Postby Brendon » Sat Oct 29, 2016 2:52 pm

Hi,

Seems to me that there is certainly something going on with your setup. Since you're reporting degradation over time, I would start by checking the SMART status of your drives to make sure they're healthy. Out of interest, what enclosure(s) are you using? Anything unusual about your drives (small, slow, old)?
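A quick way to do that from the command line, as a sketch: this assumes smartmontools is installed (e.g. via Homebrew), and the disk identifiers are placeholders that will differ on your machine.

```shell
#!/bin/sh
# Print the SMART overall-health verdict for one disk; tolerate a
# missing smartctl binary so the loop below degrades gracefully.
check_disk() {
    if command -v smartctl >/dev/null 2>&1; then
        smartctl -H "/dev/$1"
    else
        echo "smartctl not installed; skipping $1"
    fi
}

# Placeholder identifiers; list your actual disks with: diskutil list
for d in disk1 disk2 disk3 disk4; do
    check_disk "$d"
done
```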

Scrub operations typically start slow and speed up. My last scrub (~1.5TB) took 8 hours.

I would expect that for copying large files you should see peaks well over 200MB/sec. I can well exceed that with a 5 x 7200rpm-disk Thunderbolt 1 raidz2 (LaCie 5big); in fact the fastest I/O I've seen has approached 500MB/sec peak. That's hanging off a Mac Pro, but it was just as quick when driven by a 2011 27" iMac.

Cheers
Brendon
Brendon
 
Posts: 286
Joined: Thu Mar 06, 2014 12:51 pm

Re: How to work out why performance is slow?

Postby zisper » Sun Oct 30, 2016 1:24 pm

Well, I've been doing some tinkering (probably more than I should have). I created a new pool (using some of the same disks) of two disks in a mirrored configuration (only one vdev). They're both 7200RPM Toshibas, and in its current state I've seen peak transfer rates of 130-140MB/sec (or slightly higher), averaging ~80MB/sec.
However, if I log into a user whose home directory is on that pool, speed becomes _very_ variable and often drops back to the slow transfer rates. (I probably should've checked that before splitting up the original pool. I still have that pool, just with its mirrors stripped out, so I might rebuild it and try logging in with both accounts, though I guess it'll show the same behaviour.) So something (I'm looking at you, iPhoto!) must be requesting a lot of data from the pool and hurting the transfer rates? I hadn't expected the raw rate shown by iostat to drop that much, though. Overall it still seems much faster with the single vdev than with the original two.
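A back-of-envelope number supports that guess. Assuming a 7200rpm disk manages on the order of 100 random IOPS (a typical rule-of-thumb figure, not a measurement), random reads at the 128K recordsize cap per-disk throughput at roughly IOPS times record size:

```shell
# ~100 random IOPS x 128K recordsize = ceiling on random-read throughput:
awk 'BEGIN { printf "%.1f MB/s\n", 100 * 128 / 1024 }'
```

That works out to about 12.5MB/s, which is right in the range iostat was showing for the struggling pool, so a background process turning the workload into random I/O would explain the collapse.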

Does it make sense that something else accessing the pool could affect the raw transfer rates shown by iostat so much?
zisper
 
Posts: 16
Joined: Wed Jul 20, 2016 2:48 am

