Mirror not using both disks for read - 1.6.1

Postby zenomt » Mon Jul 10, 2017 6:50 pm

this past weekend i upgraded my computer to Sierra (from Mavericks), and to 1.6.1 from 1.5.2.

after i finished with the checklist items for my OS upgrade, i noticed that when doing reads on my zpool (2-disk mirror + ZIL & cache on SSD), only one of the disks was being read instead of the usual both. it wasn't entirely one-sided, but the ratio was something like 100:1. i tried rebooting a few times; the favored disk wasn't consistent boot-to-boot, but it did stick for the duration of each boot. i didn't see any tunables that sounded likely. i tried several parallel large reads and they were all funneled through the one favored drive, while the other stayed mostly idle.

writes still went to both disks simultaneously, as expected.

a "zpool scrub" also read from both disks simultaneously, as expected. interestingly, after running a scrub for about 30 seconds and then stopping it, ZFS started reading from both disks for ordinary reads, like the good old days.

today i ran a scrub to completion (since it was due), and afterward the read behavior had reverted to one favored disk. running a scrub again for 30ish seconds and stopping it restored read-from-both-disks.
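for reference, the start/stop sequence i'm using is just the following (assuming a pool named "tank"; substitute your own pool name):

```shell
# start a scrub, let it run briefly, then cancel it;
# afterward, ordinary reads are spread across both mirror disks again
sudo zpool scrub tank
sleep 30
sudo zpool scrub -s tank   # -s stops the in-progress scrub
```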

has anyone else seen this?
zenomt
 
Posts: 7
Joined: Tue Feb 21, 2017 7:35 pm
Location: Santa Cruz, CA US

Re: Mirror not using both disks for read - 1.6.1

Postby FadingIntoBlue » Mon Jul 10, 2017 11:45 pm

Yes, I noticed the same problem a month or two ago. Pool consisting of two mirrors in a single 4-disk enclosure, Thunderbolt 2. I initially thought it was a disk on its last legs, swapped it out, and saw the same issue with the new disk. Completely rebuilt the pool, still the same issue. After further work, it's not tied to a particular slot in the enclosure either.

Code:
sysctl -a | grep kext
spl.kext_version: 1.6.1-1
zfs.kext_version: 1.6.1-1


Code:
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
TankPool-New                                     203G  6.15T      9  1.03K  42.8K   106M
  mirror                                        79.7G  2.64T      4    435  21.2K  41.6M
    media-B38B1AB5-BA64-9345-8554-C7721A105A4B      -      -      4    217  19.7K  20.8M
    media-E0FAAF85-5E4E-D741-8B10-0F0B3D27E86C      -      -      0    217  1.50K  20.8M
  mirror                                         123G  3.50T      5    619  21.7K  64.0M
    media-EE4F73F8-7246-B94D-B3EA-64347531FE0B      -      -      2    310  10.6K  32.0M
    media-2936BC29-2580-E947-ACE8-DF72C96AC86E      -      -      2    308  11.1K  32.0M
----------------------------------------------  -----  -----  -----  -----  -----  -----



Haven't tried the start/stop scrub trick, will give that a go.
FadingIntoBlue
 
Posts: 106
Joined: Tue May 27, 2014 12:25 am

Re: Mirror not using both disks for read - 1.6.1

Postby zenomt » Sun Jul 23, 2017 12:16 pm

does anyone familiar with the code (especially with the changes since 1.5.2) have an idea what could be causing this?

my disks are also in a Thunderbolt 2 enclosure (OWC Thunderbay IV).
zenomt
 
Posts: 7
Joined: Tue Feb 21, 2017 7:35 pm
Location: Santa Cruz, CA US

Re: Mirror not using both disks for read - 1.6.1

Postby FadingIntoBlue » Mon Jul 24, 2017 2:15 am

My enclosure is also an OWC Thunderbay IV.

Haven't had time to test the start/stop scrub yet, should have this coming weekend.
FadingIntoBlue
 
Posts: 106
Joined: Tue May 27, 2014 12:25 am

Re: Mirror not using both disks for read - 1.6.1

Postby rottegift » Sun Jul 30, 2017 6:32 pm

Only one side of the mirror will get used for reads if it can satisfy all the reads promptly.

Promptness can be gauged by looking at the output of zpool iostat -vl and zpool iostat -vw. If latency is sufficiently low, you probably should not care about reading from both halves of the mirror. Also, you may not get much improvement from spreading out reads: if they are sequential (even if short) reads, it is probably lower-latency to stream from one (TCQ- or NCQ-capable) device than to deal with the protocol chatter of setting up transfers between the host and several devices, or of scheduling who talks on the bus at any given moment.
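Concretely, something like the following (substituting your own pool name for "tank"):

```shell
# per-vdev average latencies (total/disk/queue waits), sampled every 5 seconds
zpool iostat -vl tank 5
# per-vdev latency histograms
zpool iostat -vw tank
```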

You could try experimenting with the mirror scheduling tunables: the ones ending in _inc in "man zfs-module-parameters" (assuming your MANPATH covers where zfs is installed). Beware, however, that the manpage is from zfsonlinux and is neither wholly complete nor wholly accurate; additionally, we have not made all of the parameters tunables in o3x.

The defaults (from sysctl -h, with a suitable LC_ environment) are:

Code:
kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_offset: 1,048,576
kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_inc: 0
kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc: 5
kstat.zfs.darwin.tunable.zfs_vdev_mirror_non_rotating_inc: 0
kstat.zfs.darwin.tunable.zfs_vdev_mirror_non_rotating_seek_inc: 1


In particular, try increasing the appropriate zero values. UTSL, vdev_mirror.c: vdev_mirror_child_select().

If you don't already know how to go about changing these values dynamically, then you probably should not try. :-)
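(For the record, on o3x these are ordinary sysctls, so inspecting and changing one would look roughly like this; the value 1 below is only an example, not a recommendation:)

```shell
# read the current value
sysctl kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc
# set a new value for the running kernel (reverts on reboot)
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc=1
```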

Moreover, if you go about changing *any* tunables and blow off your own toes (e.g. wreck performance forcing a reboot, get a kernel panic, or lose data), it's probably not o3x's fault. :-)

ETA: if all your reads are coming from one side of a multi-sided (two-or-more-way) mirror, then that side will take writes last, since it can rely on the DTL (dirty time log) on the other sides, i.e., sequential reading (and even random reading if the drive is low-enough latency) imposes less slow-down when committing writes into durable storage.
rottegift
 
Posts: 26
Joined: Fri Apr 25, 2014 12:00 am

Re: Mirror not using both disks for read - 1.6.1

Postby zenomt » Sat Aug 12, 2017 11:43 am

i'll take a look at these parameters next time my pool is in this state, in case they're somehow different. right now, with distributed reads happening, these parameters seem to have the values you posted.

there's definitely something kooky going on when all reads are stuck on one device, because it happens even if i do 2+ parallel reads of different big files (i tried up to 5 parallel reads). in that case there's no way it's faster to funnel them all through one device.
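(the test i'm running is roughly this, with hypothetical file names, while watching "zpool iostat -v" in another terminal to see which mirror side services the reads:)

```shell
# read several large files in parallel, discarding the data;
# file names and mount point are placeholders for my test files
for f in big1 big2 big3 big4 big5; do
  dd if="/Volumes/tank/$f" of=/dev/null bs=1m &
done
wait
```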
zenomt
 
Posts: 7
Joined: Tue Feb 21, 2017 7:35 pm
Location: Santa Cruz, CA US

Re: Mirror not using both disks for read - 1.6.1

Postby zenomt » Thu Oct 12, 2017 10:31 am

setting kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc=1 seems to have fixed this issue for me. i made the change on aug 16, and both disks have been used for reads ever since, including after a reboot.
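(for anyone wanting to make it stick across reboots: my understanding is that o3x applies settings from /etc/zfs/zsysctl.conf at startup, so a line like the one below should do it. treat the file path and mechanism as my assumption; check the o3x wiki to confirm.)

```shell
# /etc/zfs/zsysctl.conf — applied by o3x at startup (assumed mechanism)
kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc=1
```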
zenomt
 
Posts: 7
Joined: Tue Feb 21, 2017 7:35 pm
Location: Santa Cruz, CA US

Re: Mirror not using both disks for read - 1.6.1

Postby FadingIntoBlue » Fri Oct 13, 2017 12:05 am

zenomt wrote:
setting kstat.zfs.darwin.tunable.zfs_vdev_mirror_rotating_seek_inc=1 seems to have fixed this issue for me. i made the change on aug 16 and both disks get used including after a reboot.


Thanks for that, just gave it a whirl, rebooted, and can confirm even spread of reads across the mirror.
FadingIntoBlue
 
Posts: 106
Joined: Tue May 27, 2014 12:25 am

