unexpected force when adding a cache vdev

unexpected force when adding a cache vdev

Post by grahamperrin » Sun Mar 24, 2013 10:32 am

First:

Code: Select all
macbookpro08-centrim:~ gjp22$ zpool add gjp22 cache /dev/disk3
could not create pool: permission denied
macbookpro08-centrim:~ gjp22$ sudo zpool add gjp22 cache /dev/disk3
Password:


Then, on entry of the password, the file systems of the pool were forcibly unmounted.

The forced unmount took me by surprise:


Code: Select all
2013-03-24 10:29:51.678 sudo[26827]:    gjp22 : TTY=ttys003 ; PWD=/Volumes/gjp22 ; USER=root ; COMMAND=/usr/sbin/zpool add gjp22 cache /dev/disk3
2013-03-24 10:29:53.000 kernel[0]: ZFSLabelScheme:willTerminate: this 0xffffff802eaa0900 provider 0xffffff802dfda600 '%noformat%'
2013-03-24 10:29:53.000 kernel[0]: ZFSLabelScheme:stop: 0xffffff802eaa0900 goodbye '%noformat%'
2013-03-24 10:29:57.000 kernel[0]: ZFSLabelScheme:probe: label '???', vdev 17043972105408950164
2013-03-24 10:29:57.000 kernel[0]: zfsx_unmount: '/opt' forced (umount)
2013-03-24 10:30:08.000 kernel[0]: zfsvfs_teardown: '/opt' (txg_wait_synced in 11017 ms)
2013-03-24 10:30:08.000 kernel[0]: zfsx_unmount: '/Volumes/gjp22/casesensitive' forced (umount)
2013-03-24 10:30:12.000 kernel[0]: ZFSLabelScheme::terminate(kIOServiceSynchronous) timeout
2013-03-24 10:30:12.000 kernel[0]: zfsvfs_teardown: '/Volumes/gjp22/casesensitive' (txg_wait_synced in 4547 ms)
2013-03-24 10:30:21.000 kernel[0]: zfsx_unmount: '/Volumes/gjp22' forced (umount)


gjp22 is my home directory, so the unmount was disruptive. But not disastrous: I used the key chord for sysdiagnose, then over ssh I watched a full send and receive (involving two other pools), which had begun before the unmount, complete without error. Then sudo shutdown -r now got as far as it could, and I forced a restart (Control-Command-Power).

Question

Is it normal for addition of a cache vdev to require unmount of the pool's file systems?

Re: unexpected force when adding a cache vdev

Post by raattgift » Tue Mar 26, 2013 2:45 pm

No, that is not correct behaviour.

Unless you do something that causes the health property (zpool get health poolname) to become neither ONLINE nor DEGRADED, the availability of the datasets should not change.

That is, if they were mounted, they should stay mounted; if they were unmounted, they should stay unmounted, even as you add or remove vdevs, attach or detach vdev components, (auto)expand the pool, split the pool, and so forth.

Adding or removing a cache vdev should not even result in a transition from ONLINE to DEGRADED, much less out of both of those.

You should check the logs to see if there was an error that took the pool out of one of those states. Perhaps the new USB configuration or traffic caused some I/O or probe errors. That happens, unfortunately, and is outside the scope of ZFS (as a filesystem/subsystem), and would just as likely affect the availability of other storage.
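A minimal check, assuming the pool is gjp22 and a 10.8-era system log location, might look like this:

Code: Select all
zpool get health gjp22
zpool status -v gjp22
# ZEVO's kernel messages land in the system log; look for probe failures around the time of the add
grep -i -e 'probe_failure' -e 'ZFS WARNING' /var/log/system.log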

FWIW, I lived with a cache vdev that was a SoftRAID volume on a Thunderbolt SSD (other volumes were used for SoftRAID mirroring, and for ZFS log vdevs). Unfortunately SoftRAID's IOKit attach dance leads to a new UUID being assigned to the volume at each reboot, so I would regularly have to do a zpool remove poolname GPTE_olduid; zpool add poolname cache /dev/diskXX. I have also added and removed USB3 superspeed-capable memory sticks as cache vdevs (their UUIDs persist) from time to time. I have not seen any filesystems be unmounted as a result.

I have, however, seen errors posted by a WD USB3 My Book 1140 (now known to be flaky) when adding the memory-stick cache vdevs on the same bus; those did in fact cause the device to be marked FAULTED, but without a change in the availability of the pool's datasets, because sufficient replication remained on the other devices in the vdev. Yay, ZFS!

If the faulting device took the vdev offline (or was the only storage device in the pool, which is what you have done from time to time), data unavailability would be certain, but the symptom should follow the setting of the pool's failmode property (zpool get failmode poolname). The default, wait, is unlikely to lead to an unmount, but it is possible because of corner cases in state machines in the non-ZFS parts of the operating system.

The risk of an expanding "stalled" process state bubble is another good reason never to run a pool whose vdevs have zero replication (a dataset copies property greater than 1 does not count). It's probably a good idea to set the pool's failmode property to continue if one really has to use such a pool. wait is really meant to buy some grace time in the worst possible case, while you are scrambling around to find a suitable spare disk with which to begin immediate resilvering. It's somewhere between worthless and counterproductive for a pool that cannot recover by attaching a new device to a vdev.
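For reference, checking and changing the property is one command each way (the pool name is just an example; wait and continue are the values discussed here):

Code: Select all
zpool get failmode poolname
# for a pool with no replication, returning errors promptly may beat stalling every dependent process
sudo zpool set failmode=continue poolname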

Re: unexpected force when adding a cache vdev

Post by raattgift » Tue Mar 26, 2013 2:52 pm

Oops, I said you should look in the logs. The following is from your previous posting:

"2013-03-24 10:29:53.000 kernel[0]: ZFSLabelScheme:willTerminate: this 0xffffff802eaa0900 provider 0xffffff802dfda600 '%noformat%'
2013-03-24 10:29:53.000 kernel[0]: ZFSLabelScheme:stop: 0xffffff802eaa0900 goodbye '%noformat%'
2013-03-24 10:29:57.000 kernel[0]: ZFSLabelScheme:probe: label '???', vdev 17043972105408950164"

That's sufficient to guess (though a zpool status -v would be needed to confirm) that something went wrong in the "diskutil erasedisk" (which zpool execs), leaving the cache vdev unusable. Subsequently, the storage vdev failed a probe, which probably made your pool transition to UNAVAILABLE. The subsequent unmounts are a cleanup (what's your pool's failmode property?).
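If it recurs, a quick look at how the stick was left, alongside the pool state, would help to confirm or rule that out (disk3 as in your transcript; the slice number is a guess):

Code: Select all
diskutil list disk3
diskutil info disk3s2
zpool status -v gjp22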

Best guess as to the underlying cause: something hung the USB bus. Insufficient electrical power, perhaps?

A little more detail

Post by grahamperrin » Wed Mar 27, 2013 3:04 am

Code: Select all
macbookpro08-centrim:68f896193f gjp22$ zpool get failmode gjp22
NAME   PROPERTY  VALUE     SOURCE
gjp22  failmode  continue  default


Code: Select all
macbookpro08-centrim:~ gjp22$ zpool status -v gjp22
  pool: gjp22
 state: ONLINE
 scan: scrub repaired 0 in 20h27m with 0 errors on Mon Mar 25 18:52:26 2013
config:

   NAME                                         STATE     READ WRITE CKSUM
   gjp22                                        ONLINE       0     0     0
     GPTE_71B8BDA2-3EBA-4B91-9E1C-2AE2B1DAAD06  ONLINE       0     0     0  at disk8s2
   cache
     GPTE_2605CCB0-67B7-4C93-A4B1-83EF764CE617  ONLINE       0     0     0  at disk2s2

errors: No known data errors


Code: Select all
macbookpro08-centrim:~ gjp22$ zpool iostat -v gjp22
                                                 capacity       operations       bandwidth
pool                                          alloc    free    read   write    read   write
-------------------------------------------  ------  ------  ------  ------  ------  ------
gjp22                                         476Gi  95.5Gi      20      18   287Ki   476Ki
  GPTE_71B8BDA2-3EBA-4B91-9E1C-2AE2B1DAAD06   476Gi  95.5Gi      20      18   287Ki   476Ki
cache                                            -      -      -      -      -      -
  GPTE_2605CCB0-67B7-4C93-A4B1-83EF764CE617  7.13Gi     8Mi       8       2   244Ki   121Ki
-------------------------------------------  ------  ------  ------  ------  ------  ------


More on that cache vdev today under Benchmark/check L2ARC performances.

(Yes, very low end: an 8 GB Kingston USB flash drive, a thumb drive, not recommended as L2ARC for production purposes ;-) … and whilst my approach to testing could fairly be described as ropey, I shouldn't rush to describe the Kingston device itself as ropey.)
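For anyone following along, I watch the stick at work with interval sampling, something like:

Code: Select all
# five-second samples; reads against the cache device correspond to L2ARC hits being served
zpool iostat -v gjp22 5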

I'll use zdb to see what happened, a few weeks or months ago, when I first added the same drive to the same pool.

(I did unexpectedly 'lose' my home directory whilst in use, once before, but I can't recall the situation. Watch this space …)

Links, and looking back

Post by grahamperrin » Wed Mar 27, 2013 9:58 pm

grahamperrin wrote:… (I did unexpectedly 'lose' my home directory whilst in use, once before …


Actually, twice before. The first of these two was my careless/uneducated use of the gpt command:


I'll use zdb to see what happened, a few weeks or months ago, when I first added the same drive to the same pool.


Code: Select all
macbookpro08-centrim:~ gjp22$ sudo zdb -h gjp22 | grep "zpool add gjp22 cache"
2012-12-16.20:58:54 zpool add gjp22 cache /dev/disk4
2013-03-24.10:30:23 zpool add gjp22 cache /dev/disk3
macbookpro08-centrim:~ gjp22$ clear


Now focusing on 2012-12-16 around 20:58, a few minutes beforehand:

Code: Select all
Dec 16 20:53:49 macbookpro08-centrim.home sudo[7136]:    gjp22 : TTY=ttys003 ; PWD=/Volumes/gjp22/Documents/com/github/jollyjinx/ZFS-TimeMachine/028ba53 ; USER=root ; COMMAND=/usr/sbin/zpool add gjp22 cache /dev/disk3
Dec 16 20:54:41 localhost bootlog[0]: BOOT_TIME 1355691281 0


I don't know what went wrong there but, from the history presented by zdb, it's clear that that addition failed.

Then, a few minutes later … reminding myself that I was probably not logged in as gjp22 (probably not using the pool) when the cache vdev was added:

Code: Select all
Dec 16 20:57:51 macbookpro08-centrim.home sudo[404]: bbsadmin-l : TTY=ttys001 ; PWD=/Users/bbsadmin-l ; USER=root ; COMMAND=/usr/sbin/zpool status -v
Dec 16 20:58:24 macbookpro08-centrim.home sudo[411]: bbsadmin-l : TTY=ttys001 ; PWD=/Users/bbsadmin-l ; USER=root ; COMMAND=/usr/sbin/zpool import
Dec 16 20:58:41 macbookpro08-centrim.home sudo[415]: bbsadmin-l : TTY=ttys001 ; PWD=/Users/bbsadmin-l ; USER=root ; COMMAND=/usr/sbin/zpool add gjp22 cache /dev/disk4
Dec 16 20:59:24 macbookpro08-centrim.home sudo[421]: bbsadmin-l : TTY=ttys001 ; PWD=/Users/bbsadmin-l ; USER=root ; COMMAND=/usr/sbin/zpool status -v


Selected messages from kernel:

Code: Select all
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: -------------------------------------------------------------------
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: | Z F S
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: |
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: | Copyright (c) 2005, 2010, Oracle and/or its affiliates.
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: | Portions, Copyright (c) 2012 GreenBytes, Inc.
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: | Portions, Copyright (c) 2011 Delphix
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: | Portions Copyright (c) 2011 Nexenta Systems, Inc.
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: -------------------------------------------------------------------
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: zevo build 2012.09.23
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: max heap_arena_size: 3221225472, max maps 3092
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: maximum znodes: 99072
Dec 16 20:55:23 macbookpro08-centrim kernel[0]: zfsx_sysctl: hide children 1 --> 0
Dec 16 20:55:31 macbookpro08-centrim kernel[0]: ZFSLabelScheme:probe: label 'gjp22', vdev 18001602949135038910
Dec 16 20:55:31 macbookpro08-centrim kernel[0]: ZFSLabelScheme:start: 'gjp22' critical mass with 1 vdev(s) (importing)
Dec 16 20:55:31 macbookpro08-centrim kernel[0]: zfsx_kev_importpool:'gjp22' (13173756898033222216)
Dec 16 20:55:34 macbookpro08-centrim kernel[0]: zfsx_vdm_open: 'gjp22' disk2s2
Dec 16 20:55:57 macbookpro08-centrim kernel[0]: zfsx_mount: '/Volumes/gjp22'
Dec 16 20:55:58 macbookpro08-centrim kernel[0]: zfsx_mount: '/Volumes/gjp22/casesensitive'
Dec 16 20:58:48 macbookpro08-centrim kernel[0]: zfsx_vdm_open: couldn't find vdevMedia for 'disk4s2'
Dec 16 20:58:50 macbookpro08-centrim kernel[0]: ZFSLabelScheme:probe: label '???', vdev 17043972105408950164
Dec 16 20:58:50 macbookpro08-centrim kernel[0]: zfsx_vdm_open: 'gjp22' disk4s2
Dec 16 20:59:58 macbookpro08-centrim kernel[0]: ZFSLabelScheme:willTerminate: this 0xffffff80358cb400 provider 0xffffff8034f8a900 '%noformat%'
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: zfs_vdm_completion: media 0xffffff8034f8a900, media is not present (6), 0 bytes (of 512)
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: zfs_vdm_completion: media 0xffffff8034f8a900, media is not present (6), 0 bytes (of 2560)
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: zfs_vdm_completion: media 0xffffff8034f8a900, media is not present (6), 0 bytes (of 8192)
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: ________________________________________
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: ZFS WARNING: 'error from: fs.zfs.probe_failure'
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: pool: 'gjp22'
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: vdev_type: 'disk'
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: vdev_path: '/dev/dsk/GPTE_64F61AFF-9EBC-4661-9520-7803CD1B8EE4'
Dec 16 21:00:08 macbookpro08-centrim kernel[0]: prev_state: 0
Dec 16 21:00:13 macbookpro08-centrim kernel[0]: DataTraveler 400::terminate(kIOServiceSynchronous) timeout
Dec 16 21:01:11 macbookpro08-centrim kernel[0]: zfsx_search_start: holding 'gjp22/casesensitive@...searching...'
Dec 16 21:01:11 macbookpro08-centrim kernel[0]: search opts: 0x8000004d, 1024 max matches
Dec 16 21:01:11 macbookpro08-centrim kernel[0]: search attr: 0x02000800 0x0000 0x0000
Dec 16 21:01:11 macbookpro08-centrim kernel[0]: return attr: 0x02038800 0x0000 0x0000
Dec 16 21:01:15 macbookpro08-centrim kernel[0]: 65 matches from 4041 files, 424 directories in 4 sec
Dec 16 21:01:15 macbookpro08-centrim kernel[0]: zfsx_search_harvest: job -544626327320 all done
Dec 16 21:01:16 macbookpro08-centrim kernel[0]: zfsx_search_start: holding 'gjp22@...searching...'
Dec 16 21:01:17 macbookpro08-centrim kernel[0]: search opts: 0x8000004d, 1024 max matches
Dec 16 21:01:17 macbookpro08-centrim kernel[0]: search attr: 0x02000800 0x0000 0x0000
Dec 16 21:01:17 macbookpro08-centrim kernel[0]: return attr: 0x02038800 0x0000 0x0000
Dec 16 21:01:22 macbookpro08-centrim kernel[0]: considerRebuildOfPrelinkedKernel prebuild rebuild has expired
Dec 16 21:01:22 macbookpro08-centrim kernel[0]: USBMSC Identifier (non-unique): 5B9108000040 0x951 0x1614 0x110
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: ZFSLabelScheme:probe: label '???', vdev 17043972105408950164
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: zfs_vdm_completion: media 0xffffff8034f8a900, media is not present (6), 0 bytes (of 8192)
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: ________________________________________
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: ZFS WARNING: 'error from: fs.zfs.probe_failure'
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: pool: 'gjp22'
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: vdev_type: 'disk'
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: vdev_path: '/dev/dsk/GPTE_64F61AFF-9EBC-4661-9520-7803CD1B8EE4'
Dec 16 21:01:26 macbookpro08-centrim kernel[0]: prev_state: 0
Dec 16 21:01:34 macbookpro08-centrim kernel[0]: Sandbox: sandboxd(442) deny mach-lookup com.apple.coresymbolicationd
Dec 16 21:02:05 macbookpro08-centrim kernel[0]: traverse_snapshot: job -544568400064 time expired
Dec 16 21:02:05 macbookpro08-centrim kernel[0]: 85 matches from 51382 files, 6310 directories in 48 sec
Dec 16 21:02:06 macbookpro08-centrim kernel[0]: search_free_bins: free 1 bins...
Dec 16 21:02:21 macbookpro08-centrim kernel[0]: zfsx_search_start: holding 'gjp22@...searching...'
Dec 16 21:02:21 macbookpro08-centrim kernel[0]: search opts: 0x8000004d, 1024 max matches
Dec 16 21:02:21 macbookpro08-centrim kernel[0]: search attr: 0x02000800 0x0000 0x0000
Dec 16 21:02:21 macbookpro08-centrim kernel[0]: return attr: 0x02038800 0x0000 0x0000
Dec 16 21:04:46 macbookpro08-centrim kernel[0]: ZFSLabelScheme:willTerminate: this 0xffffff80358cb400 provider 0xffffff8038e70a00 '%noformat%'
Dec 16 21:04:46 macbookpro08-centrim kernel[0]: ZFSLabelScheme:stop: 0xffffff80358cb400 goodbye '%noformat%'


Again, I'm not sure what happened on that date. It's possible that the stick was then in the old USB 2.0 hub, which I'm planning to replace. It's equally possible that I intentionally removed the stick without taking it offline, just to see how the Mac behaved.

Re: unexpected force when adding a cache vdev

Post by raattgift » Thu Mar 28, 2013 7:22 am

"zpool history poolname | grep add" is likely to be faster than invoking zdb...

ZEVO's default failmode pool property is continue; I didn't realize that until now. Sun's was wait. There are tradeoffs, but ZEVO's choice may be the right one for its port. However, since continue causes the ZFS subsystem to report I/O errors, it is entirely understandable that the disk management subsystem would unmount the affected filesystems when it concludes they are unreachable.

Given that you seem to be plagued by errors on your pools that are not replication-protected, you might consider experimenting with the failmode setting to see whether stalling apps (and other subsystems) is better for you than returning errors to them. You should probably consider Growl or the like to draw attention to failures, if you don't already.

(Unlike you, I have not yet had a catastrophic pool fault with ZEVO; not coincidentally, I have also made no use of a single-USB2-device storage vdev. But I personally prefer the failmode=wait semantics, and have consequently now set that on all the pools hanging off the Macs I control.)
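Doing that across several pools is simple to script; a rough sketch:

Code: Select all
# set failmode=wait on every imported pool
zpool list -H -o name | while read pool; do
  sudo zpool set failmode=wait "$pool"
done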

I believe you use the copies=n (n > 1) dataset property to try to recover from errors, but this may not help when the data is actually correct on the drive and is being mangled by, e.g., a USB<>ATA bridge chipset, and it may even be counterproductive (at the very least, an unnecessary COW of an already-correct data block and its Merkle subtree is scheduled).
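To see where copies is actually in effect, a recursive get over the pool is enough (pool name is an example):

Code: Select all
zfs get -r copies poolname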

Another option you could experiment with is the checksum=off dataset property. This only affects new data written to the dataset in question, and not metadata (in this context that's the Merkle tree, not POSIX directory data, which is treated as ordinary data). You could then use userland tools to analyse corruption in the memory->disk,disk->memory paths.
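A sketch of that kind of experiment, with a throwaway dataset (names and mountpoint are examples only; checksum=off forgoes ZFS's own detection for the new writes, so keep it to scratch data):

Code: Select all
sudo zfs set checksum=off poolname/scratch
# write a test file and record its digest
dd if=/dev/urandom of=/Volumes/poolname/scratch/testfile bs=1m count=64
shasum -a 256 /Volumes/poolname/scratch/testfile
# flush caches so the re-read is more likely to come from the drive, then compare digests
sudo purge
shasum -a 256 /Volumes/poolname/scratch/testfile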

(I would do neither; I'd attach another side to the storage vdev to make it a mirror, and if both drives return errors during or after the resulting resilver, that would be useful information. If only the first drive is erroring, that'd also be useful information, mainly informing me that it should be retired from use.)
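In commands, that would be roughly the following (placeholders throughout; take the existing GPTE identifier from zpool status):

Code: Select all
# attach a second device to the single-disk storage vdev, turning it into a mirror
sudo zpool attach poolname GPTE_existinguuid /dev/diskN
# watch the resilver and note which side, if either, reports errors
zpool status -v poolname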

Re: unexpected force when adding a cache vdev

Post by grahamperrin » Thu Mar 28, 2013 11:01 am

raattgift wrote:… dataset copies=n | n > 1 property to try to recover from errors …


That's a different pool, with just one hard disk drive – the LaCie 1 TB Big Disk Extreme (300797EK) listed under my devices with known imperfections or suspected issues.

… seem to be plagued by errors on your pools that are not replication-protected …


I don't feel plagued – the issue in this topic, affecting pool gjp22, seems exceptional. The pool is otherwise very reliable for me.

There's underlying Core Storage, which I have learnt to trust (as far as possible) as an approach to encryption with ZEVO. Whilst I don't imagine Core Storage as a contributing factor to this issue … 

I wonder whether diskutil, or a related framework or daemon, is bugged; and whether the bug is exposed only with the combination of two pool technologies. I'll try to find a post from a few months ago.

I have Growl, but don't always use it. More often I have HardwareGrowler running.

I'll consider a change of failmode – thanks.
Last edited by grahamperrin on Fri Mar 29, 2013 11:37 am, edited 1 time in total.

Re: unexpected force when adding a cache vdev

Post by grahamperrin » Thu Mar 28, 2013 9:55 pm

grahamperrin wrote:… I'll try to find a post from a few months ago. …


Found, but I don't know whether it's relevant to this issue: discussions (topics 5 and 74) of a diskutil bug involving %noformat%

Link

Post by grahamperrin » Fri Mar 29, 2013 5:38 am

raattgift wrote:… I personally prefer the failmode=wait semantics …


zpool failmode in ZEVO Community Edition 1.1.1