Cannot remove log device in encrypted pool

Here you can discuss every aspect of OpenZFS on OS X. Note: not for support requests!

Post by DanielSmedegaardBuus » Mon May 13, 2019 12:06 pm

Hi :)

I'm running my pool on Linux ATM, but I had this same problem when it was on macOS, so I'm hoping it's okay to ask here.

I had my pool set up with both a cache and a log device, using partitions on my main SSD at the time. This was on a mini that eventually had severe issues running the pool, so I exported the pool and imported it on a MacBook Pro, which was much more powerful. The mini's SSD was then wiped, including the partitions with the log and cache.

This shouldn't matter, and WRT data it doesn't: all my data is fine. I removed the cache device without any issues, but whenever I try to remove the log device, I get this error:

Code:
daniel@titanic ~ sudo zpool remove titanic 3470644392946588961
cannot remove 3470644392946588961: Mount encrypted datasets to replay logs.


Thing is, the encrypted datasets are mounted and happy. Actually there's just the one, "titanic" itself, as I haven't yet created any other filesystems.

AFAICT, there's no "force" flag I can use here, so I'm wondering what to do. Soon I'll want to add a new log device on the SSD in this new setup, but I fear that if I can't remove the old one, chances are it won't let me "replace" it either, given that it's absent...

Any ideas?

Thanks :)

Re: Cannot remove log device in encrypted pool

Post by DanielSmedegaardBuus » Tue May 14, 2019 1:17 pm

Hmmm... This gets more interesting.

I was now at the point where I was ready to add back a log and a cache device. So I thought, given that I'm facing this issue of the missing log device that I cannot remove, I'd try replacing it first.

So, I create a partition on my Air's SSD, and do

Code:
daniel@titanic ~ sudo zpool replace -f titanic 3470644392946588961 /dev/disk/by-partlabel/slog
cannot replace 3470644392946588961 with /dev/disk/by-partlabel/slog: new device has a different optimal sector size; use the option '-o ashift=N' to override the optimal size


Huh? That's weird, I think. So I googled, and it seems (please do correct me if I'm wrong) that when you add or replace devices in a pool, it defaults back to the detrimental ashift=9 value that practically no one uses? Please, please do tell me that I'm wrong, because I've replaced other drives in this pool without any explicit ashift option, and of course they're 4k drives, as all non-old drives are. If they've been inserted as ashift=9, that's soooo sucky. I cannot currently check it, as the ZoL sources compile to a semi-broken state on at least Ubuntu, so I have to link some libs from folders to other folders, and apparently zdb is still broken even with those hacks, so I cannot see the info from there :/ ...
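For reference, ashift is the base-2 exponent of the sector size ZFS assumes for a vdev, which is why ashift=9 is a worry on 4k drives. A minimal shell sketch of that arithmetic follows; the function name is made up for illustration, and it only does the conversion, it does not read anything from the pool:

```shell
#!/bin/sh
# ashift is the base-2 exponent of the sector size ZFS assumes for a vdev.
# ashift_to_bytes is a hypothetical helper name, not part of any ZFS tool.
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 9    # 512-byte sectors
ashift_to_bytes 12   # 4096-byte (4k) sectors
```

So a 4k drive attached with ashift=9 would be addressed in 512-byte units, eight to each physical sector.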

And even with an ashift=9 replacement, the pool has now started resilvering, which to me makes no sense as it's a log device, so what's there to resilver?

Code:
  pool: titanic
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 14 23:01:45 2019
   1.04T scanned at 1.63G/s, 1.04T issued at 1.63G/s, 22.4T total
   0B resilvered, 4.62% done, 0 days 03:43:47 to go
config:

   NAME                                        STATE     READ WRITE CKSUM
   titanic                                     DEGRADED     0     0     0
     raidz3-0                                  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX31DA------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX51D4------  ONLINE       0     0     0
       wwn-0x5000000000000001                  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX61D5------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX31D6------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX31D6------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WXK1E9------  ONLINE       0     0 17.1K
       scsi-SWD_Elements_SE_25FF_WX31D6------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX31D6------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX31D6------  ONLINE       0     0     0
       scsi-SWD_Elements_SE_25FF_WX71D6------  ONLINE       0     0     0
   logs
     replacing-1                               DEGRADED     0     0     0
       slog                                    ONLINE       0     0     0


(And no, it's not the 17.1K checksum device causing the resilver, that's a different story :D )

Or is this simply a metadata-only resilver? (Given the very fast resilver speeds, it looks like it)

Well, either way, my intent was to replace the old log device that I was not allowed to remove, in the hope that I'd then be allowed to remove the replacement and add a smaller log device (the old one was 16G; it seems like I should go for 1 or 2G instead). But I'm not allowed to do that either:

Code:
daniel@titanic ~ sudo zpool remove titanic /dev/disk/by-partlabel/slog
cannot remove /dev/disk/by-partlabel/slog: operation not supported on this type of pool


I don't get it. What type of pool? Why? Why would you not be allowed to remove a log device from any type of pool that allows you to add it, and what types of pools are there? OMG, someone help me :D

Re: Cannot remove log device in encrypted pool

Post by lundman » Wed May 15, 2019 3:47 pm

You probably already tried removing it by GUID, like you did in the first post, right? It is most peculiar that it says replacing, and yet only shows one device. Have you had any luck at all with 1.9rc1?

Re: Cannot remove log device in encrypted pool

Post by DanielSmedegaardBuus » Thu May 16, 2019 10:24 am

Hmm... Good point about it not showing the device being replaced. I'm certain that that's a copy/paste error — it also doesn't show the "errors: No known data errors" at the bottom.

I actually created a report on ZoL's github here: https://github.com/zfsonlinux/zfs/issues/8748

That issue lists all the remove/detach commands I've tried. And indeed, adding a log device without specifying an ashift value does initialise it with the default ashift=9.
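Given that, one way to avoid the default would be to pass the ashift explicitly when adding the log device. A minimal sketch, assuming the pool name and partition label used earlier in this thread, and assuming 12 (4k sectors) is the right value for the SSD. The helper name is made up, and it only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: prints a zpool add command for a log vdev with an
# explicit ashift. It does not touch the pool; run the printed command
# yourself (with sudo) once it looks right.
log_add_cmd() {
    pool=$1      # pool name, e.g. titanic
    dev=$2       # device path for the new log vdev
    ashift=$3    # assumed value; 12 corresponds to 4096-byte sectors
    echo "zpool add -o ashift=$ashift $pool log $dev"
}

log_add_cmd titanic /dev/disk/by-partlabel/slog 12
```

Adding -n to the printed zpool add command should show the resulting layout without actually changing the pool, which seems worth doing first given how this pool has behaved.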

Cheers :)

