Rapidly deteriorating performance

Rapidly deteriorating performance

Post by leafmuncher » Fri Nov 09, 2012 12:12 am

I have the following pool set up:

Code: Select all
  pool: Jungle
 state: ONLINE
 scan: scrub canceled on Thu Nov  8 11:13:51 2012
config:

   NAME                                           STATE     READ WRITE CKSUM
   Jungle                                         ONLINE       0     0     0
     mirror-0                                     ONLINE       0     0     0
       GPTE_BC0E811C-8DE7-4A95-8A71-9E7A525494DD  ONLINE       0     0     0  at disk4s2
       GPTE_50EB154C-79D7-4C2B-9653-A6BA9CF465C7  ONLINE       0     0     0  at disk2s2
     mirror-1                                     ONLINE       0     0     0
       GPTE_868ECDB0-5D93-48D8-BDFD-59BB25C1D2E2  ONLINE       0     0     0  at disk5s2
       GPTE_B23AFB99-6F8B-4C6D-9758-D92787E79D74  ONLINE       0     0     0  at disk6s2
   cache
     GPTE_3394835C-C6B9-4715-A5EA-65D7E46A0B93    ONLINE       0     0     0  at disk3s2


When I first boot, Xbench gives awesome Disk Test numbers, around 2000.

However, after a few hours of uptime, performance degrades massively for no apparent reason, as you can see below.

[Image: Xbench Disk Test result showing the degraded score]

This is zstat during an Xbench run:

Code: Select all
v2012.09.23     89 threads        1 mount        99072 vnodes     19:11:38
____________________________________________________________________________
             KALLOC      KERNEL/MAPS        TOTAL         EQUITY
  WIRED     197 MiB    1867 MiB/1876         2065 MiB      10.09%
  PEAK      236 MiB    2012 MiB              2249 MiB
  VMPAGE     207596 (IN)      37406 (OUT)      37405 (SYNC)      11616 (MDS)
____________________________________________________________________________
                     HITS                  MISSES
  ARC overall:        72% (34571162)          28% (13060311)
  ARC demand data:    93% (11118398)           7% (828994)
  ARC demand meta:    87% (19260831)          13% (2754167)
  ARC prefetch data:  26% (3056934)           74% (8537543)
  ARC prefetch meta:  54% (1134999)           46% (939607)
  DMU zfetch:         92% (46887000)           8% (3818889)
____________________________________________________________________________
     SIZE     SLAB    AVAIL    INUSE    TOTAL     PEAK  KMEM CACHE NAME
       72     4096    92783    23762   116545   149930  kmem_slab_cache
       24     4096   222783   527381   750164   819469  kmem_bufctl_cache
       88     4096      214     1001     1215    25200  taskq_ent_cache
      360     4096        4       18       22       22  taskq_cache
      824     8192        5        4        9    75141  zio_cache
       48     4096       81        2       83    76526  zio_link_cache
       80     4096     7428    99072   106500   125050  sa_cache
      840     8192     3511   104516   108027   202572  dnode_t
      216     4096   108069   112323   220392   383724  dmu_buf_impl_t
      200     4096    69858  3018162  3088020  3200760  arc_buf_hdr_t
      104     4096    84190     9214    93404   263834  arc_buf_t
      192     4096       19        1       20      820  zil_lwb_cache
      400     4096     1678    99072   100750   125020  znode_t


Any ideas?

Re: Rapidly deteriorating performance

Post by leafmuncher » Fri Nov 09, 2012 12:21 am

I just exported the pool and then re-imported it, and performance is back up to normal:

[Image: Xbench Disk Test result back up to the usual score]

I have no clue what's causing it, but something is making it slow way down.
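
For reference, the round trip was just a standard export and import, roughly:

Code: Select all
sudo zpool export Jungle
sudo zpool import Jungle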

Re: Rapidly deteriorating performance

Post by grahamperrin » Fri Nov 09, 2012 6:47 am

Output please from diskutil list

Thanks

Re: Rapidly deteriorating performance

Post by leafmuncher » Fri Nov 09, 2012 1:39 pm

Code: Select all
$ diskutil list
/dev/disk0
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *80.0 GB    disk0
   1:                        EFI                         209.7 MB   disk0s1
   2:                        ZFS                         79.7 GB    disk0s2
/dev/disk1
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *479.9 GB   disk1
   1:                        EFI                         209.7 MB   disk1s1
   2:                  Apple_HFS Fast Boot               479.1 GB   disk1s2
   3:                 Apple_Boot Recovery HD             650.0 MB   disk1s3
/dev/disk2
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *3.0 TB     disk2
   1:                        EFI                         209.7 MB   disk2s1
   2:                        ZFS                         3.0 TB     disk2s2
/dev/disk3
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *120.0 GB   disk3
   1:                        EFI                         209.7 MB   disk3s1
   2:                        ZFS                         119.7 GB   disk3s2
/dev/disk4
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *3.0 TB     disk4
   1:                        EFI                         209.7 MB   disk4s1
   2:                        ZFS                         3.0 TB     disk4s2
/dev/disk5
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *2.0 TB     disk5
   1:                        EFI                         209.7 MB   disk5s1
   2:                        ZFS                         2.0 TB     disk5s2
/dev/disk6
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *2.0 TB     disk6
   1:                        EFI                         209.7 MB   disk6s1
   2:                        ZFS                         2.0 TB     disk6s2
/dev/disk7
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:             zfs_pool_proxy Jungle                 *5.0 TB     disk7

Re: Rapidly deteriorating performance

Post by leafmuncher » Fri Nov 09, 2012 1:42 pm

It survived overnight without performance loss. The only thing I can think of is that I didn't run Time Machine, but I have no clue whether that's related.

Re: Rapidly deteriorating performance

Post by grahamperrin » Fri Nov 09, 2012 2:24 pm

Thanks for the output. I was looking first to see whether you had both Apple_HFS and ZFS slices on any one disk.

Please, which OS?

How much memory?

Output please from:

Code: Select all
zfs get available Jungle


Has the dataset in that pool ever had less than twenty percent free?

With Lion or greater, if/when the performance issue recurs, capture a snapshot with Apple-provided sysdiagnose, which:

> gathers system-wide diagnostic information helpful in
> investigating system performance issues.

See the sysdiagnose(1) OS X manual page, or view it in Terminal.
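
For example (run it with sudo to gather everything; if I recall correctly the archive is written under /var/tmp):

Code: Select all
man sysdiagnose
sudo sysdiagnose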

Re: Rapidly deteriorating performance

Post by leafmuncher » Fri Nov 09, 2012 3:23 pm

Mountain Lion 10.8.2, 20GB of RAM

Code: Select all
$ sudo zfs get available Jungle
NAME    PROPERTY   VALUE   SOURCE
Jungle  available  3.74Ti  -


Never has it had less than 20% free.

I'm a developer, so if that's helpful, let me know :)

When I was looking at the issue with dd & iostat yesterday, I noticed something which might be interesting. The tps (transactions per second) was high, but the KB/t was very small (compared to when it was fast). This might have nothing to do with it, of course...
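
For the record, it was roughly this sort of thing (the test file path is just an example):

Code: Select all
# watch the pool's member disks; columns are KB/t, tps and MB/s
iostat -d -w 1 disk2 disk4 disk5 disk6

# in another Terminal window, a simple sequential read through ZFS
dd if=/Volumes/Jungle/testfile of=/dev/null bs=1m count=4096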

Re: Rapidly deteriorating performance

Post by grahamperrin » Sat Nov 10, 2012 1:22 am

leafmuncher wrote: … I'm a developer …


Everyone loves a good developer :) if you have time to spare from ZEVO, the MacZFS project is recruiting.

leafmuncher wrote: … tps (transactions per second) was high, but the KB/t was very small (compared to when it was fast). …


Code: Select all
zpool get ashift Jungle

A value of 12 may be sane and proper, but with ZFS in its current form some uses of RAID-Z are suboptimal. Some links in this topic, which you might have seen already: Performance Observation. Whether any of the linked material can explain the tps that you observe, I don't know.

Also FYI RAID-Z on-disk format

Postscript: does any of that translate to your mirror configuration? (Note to self: don't post this type of stuff when you've been up since 04:00. Tinnitus and face-grabbing insomniac cats are no excuse for careless reading of zpool stuff!)

I just noticed the cache device for your pool. Which bus is it on, and are other devices on the same bus? Food for thought: Performance issue with FW800 connection order?
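
If it helps, something like this should show the connection type for the cache device (disk3, going by your diskutil list):

Code: Select all
diskutil info disk3 | grep -i protocol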

Re: Rapidly deteriorating performance

Post by leafmuncher » Sat Nov 10, 2012 2:33 pm

Unfortunately I'm tied up 125% of my time with Plex (www.plexapp.com) already :)

Are there plans for MacZFS to update to v28?

The performance issues I was seeing (I haven't seen them in two days) appeared to be more pathological and sporadic than anything related to, say, ashift. Given that just exporting and importing the pool "fixed" the issue, my hypothesis is that it has something to do with *wild guess* memory fragmentation or something of that ilk.
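
To have something concrete to look at next time, I may log some basic numbers every few minutes, something along these lines (the log path is just an example):

Code: Select all
while true; do
    date
    vm_stat | egrep 'free|wired'       # free vs. wired memory pages
    iostat -d disk2 disk4 disk5 disk6  # one KB/t / tps / MB/s sample per disk
    echo
    sleep 300
done >> ~/jungle-perf.log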

I'll keep an eye on it and see if it recurs. I appreciate all your insight and advice :)

cross reference

Post by grahamperrin » Sun Nov 11, 2012 6:46 am

leafmuncher wrote:… Are there plans for MacZFS to update to v28? …


That's a question for developers of MacZFS. From what I can gather today:

editions and versions of ZEVO, community and open source: MacZFS: towards ZFS pool version 28