Pool I/O currently suspended

Postby AnWa » Thu Nov 21, 2024 9:19 am

Hm.

I am trying to access a ZFS pool that has been cold for some time, to transfer the data out of it. After several attempts, the pool shows up in Finder, but I still can't access its contents, not even through the terminal.

When running "zpool status -v" I get the error message "pool I/O is currently suspended". It also says that resilvering is in progress on one HDD, but that it has been going on since 1 August 2023 and is still only 0.05% done.

What do I do?
AnWa
 
Posts: 22
Joined: Tue Nov 04, 2014 9:57 am

Re: Pool I/O currently suspended

Postby Haravikk » Thu Dec 12, 2024 3:39 pm

Pool I/O currently suspended means one or more required disks are currently unavailable, so all I/O to the pool has been frozen until the missing disk(s) come back.

This happens when a pool doesn't have enough redundancy left to run in a degraded state. For example, take a pool consisting of a mirrored pair of disks (two disks containing identical data). If one of those disks fails (or gets disconnected), the pool becomes "degraded": a disk is missing, but there should still be enough copies of the data to continue operating. If you lose the second disk as well, the pool will be suspended, as there's nowhere for I/O to go. If the disk(s) are only temporarily unavailable due to a connection issue, then after they are reconnected you should be able to use zpool clear to resume operation.

However, I've had cases where a disk was reconnected and then "disconnected" again immediately upon reading the same bad sector. In that case you're in trouble, because without another redundant disk to read from you may not be able to recover enough of the data to continue using the pool.

If you have a disk that's continuing a resilver that was never completed, then it seems likely you may have also lost the disk it was trying to resilver from, which would result in the suspended state (the resilvering disk isn't ready for use, and there's no other disk to read from/write to).

We'd need more information to confirm though. Can you give the output of `zpool status -v`? If any of your disks are showing as offline/unavailable, you might want to look at them in Disk Utility to verify they are connected.
Haravikk
 
Posts: 99
Joined: Tue Mar 17, 2015 4:52 am

Re: Pool I/O currently suspended

Postby AnWa » Thu Jan 16, 2025 6:23 am

Here is the output:

server:~ username$ sudo zpool import
   pool: array_2
     id: 7027660346350053619
  state: ONLINE
 status: One or more devices were being resilvered.
 action: The pool can be imported using its name or numeric identifier.
 config:

        array_2                                         ONLINE
          raidz2-0                                      ONLINE
            media-A455D448-62F3-4C1B-9B2B-4A15467676FE  ONLINE
            media-872CDBD9-55E9-488E-BA9D-57FFF4CF3256  ONLINE
            media-48193237-9CFD-41DA-ADA1-8A9C816BB599  ONLINE
            media-961225AB-F504-4D57-BA2E-A5E5DD5886BD  ONLINE
            media-170340EC-B30F-47E9-B6FA-FD57D3A81BC2  ONLINE
            media-F730C181-CF39-4187-BCE1-1E2CA0CEC63B  ONLINE
            media-28D9CCAF-B327-40A6-B40A-73DADB56B952  ONLINE
            media-4CE8920B-BB57-4BAB-85D7-32CAB1A9853C  ONLINE
server:~ username$ sudo zpool status -v
no pools available


I am now attempting to reimport the array.

If one or two discs are lost: is there any way to extract the files that are stored on the remaining "good" discs?
AnWa
 
Posts: 22
Joined: Tue Nov 04, 2014 9:57 am

Re: Pool I/O currently suspended

Postby Haravikk » Thu Jan 16, 2025 9:28 am

AnWa wrote: If one or two discs are lost: is there any way to extract the files that are stored on the remaining "good" discs?

In a raidz2 you should be able to lose two disks without any data loss (or going into a suspended state), as any six complete (not resilvering) disks should be enough to continue operating. If you've lost two disks and have one that's resilvering, then you don't have enough redundancy, and you would need to somehow bring back one of the missing disks.
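As a rough illustration of that counting, here is a minimal shell sketch. The function name raidz_state and its three labels are made up for this example and are not ZFS commands or official pool states. It treats a resilvering disk the same as a missing one, as described above, which is a simplification.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): the redundancy arithmetic only.
#   $1 = total disks in the raidz vdev
#   $2 = parity level (1, 2 or 3 for raidz1/raidz2/raidz3)
#   $3 = disks that are missing or still resilvering
raidz_state() {
    total=$1 parity=$2 incomplete=$3
    complete=$((total - incomplete))   # disks holding a full copy of their data
    needed=$((total - parity))         # minimum complete disks to keep operating
    if [ "$incomplete" -eq 0 ]; then
        echo "healthy"
    elif [ "$complete" -ge "$needed" ]; then
        echo "degraded"
    else
        echo "insufficient"
    fi
}

raidz_state 8 2 2   # two disks lost: prints "degraded"
raidz_state 8 2 3   # two lost plus one resilvering: prints "insufficient"
```

For the eight-disk raidz2 in this thread, six complete disks is the threshold, so the third incomplete disk is what tips it over.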

Your import command is showing all of your disks as online, but not how many of these were resilvering – do you know how many disks are being resilvered?

The fact that they're all online right now makes me wonder if you're experiencing some kind of connection fault, e.g. loose cables or something wrong with the controller. How are you attaching all of these disks to your computer? Are they in one or more enclosures of some kind?

One thing I should have mentioned: if ZFS is showing disks as unavailable but you think they are available (they appear in Disk Utility), another command you can try is "zpool reopen array_2" (minus quotes). This tells ZFS to try reconnecting to the disks, which may unsuspend I/O.

If you just want to recover data, one other option you might try is to import the pool as read-only using "zpool import -o readonly=on array_2". This makes the entire pool read-only, which might make it easier to copy data off the datasets if some kind of writing is what is causing your faults.
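Put together, the commands mentioned so far could be tried in roughly this order. This is only a sketch: the run wrapper, the DRY_RUN variable and the two function names are made up for the example, and by default it only prints the commands instead of running them, so nothing touches the pool until you set DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical wrapper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# For a pool that is imported but suspended.
try_resume() {
    run sudo zpool reopen "$1"   # ask ZFS to try reconnecting to the disks
    run sudo zpool clear "$1"    # then try to resume operation
}

# For a pool that is not imported: bring it in read-only to copy data off.
import_readonly() {
    run sudo zpool import -o readonly=on "$1"
}

try_resume array_2
import_readonly array_2
```

The pool name array_2 is the one from the import output above; substitute your own.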
Last edited by Haravikk on Tue Jan 21, 2025 8:46 am, edited 1 time in total.
Haravikk
 
Posts: 99
Joined: Tue Mar 17, 2015 4:52 am

Re: Pool I/O currently suspended

Postby AnWa » Tue Jan 21, 2025 2:04 am

The discs are in an external disc cabinet. It's connected to the computer with a cable that is fixed in place at both ends, so there shouldn't be a physical connection issue.

I am currently trying the zpool import -f command. It might take a while.
AnWa
 
Posts: 22
Joined: Tue Nov 04, 2014 9:57 am

Re: Pool I/O currently suspended

Postby AnWa » Thu Jan 23, 2025 10:53 pm

The pool mounted, but I can't access its contents. In both Finder and the terminal, nothing shows up. (The terminal operation never finishes.) I also can't unmount the pool, since it's busy.

On the good side, the resilvering seems to be moving forward. It's now at 4.43 %, so it should be done any decade now...
AnWa
 
Posts: 22
Joined: Tue Nov 04, 2014 9:57 am

Re: Pool I/O currently suspended

Postby gea » Fri Jan 24, 2025 1:33 am

A pool that is very slow, up to the point of I/O errors, indicates one or more bad or half-dead disks, or bad cables.
If the pool is mounted and working, run
sudo zpool iostat -vly 5 1

This will give a list of all disks with their bandwidth and wait values.
In a pool all disks should behave similarly. If a disk is much slower than the rest, it is probably bad.
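If you prefer not to compare the numbers by eye, a small filter like the following could flag the outlier. It is a sketch under assumptions: flag_slow_disks is a made-up name, the input is a hand-prepared list of "device wait" pairs (one per line, copied from the wait columns, all in the same unit) and not the raw zpool iostat output, and the factor of 3 times the median is an arbitrary threshold.

```shell
#!/bin/sh
# Hypothetical filter: reads "device wait" pairs on stdin and prints the
# devices whose wait is more than FACTOR times the median of all devices.
flag_slow_disks() {
    factor=${1:-3}
    sort -k2,2n | awk -v factor="$factor" '
        { name[NR] = $1; w[NR] = $2 }
        END {
            if (NR == 0) exit
            # Input is sorted, so the median is the middle value
            # (or the mean of the two middle values for an even count).
            if (NR % 2) median = w[(NR + 1) / 2]
            else        median = (w[NR / 2] + w[NR / 2 + 1]) / 2
            for (i = 1; i <= NR; i++)
                if (w[i] > factor * median) print name[i]
        }'
}

# Made-up sample values, in milliseconds.
printf '%s\n' 'disk2 12' 'disk3 11' 'disk4 14' 'disk5 950' 'disk6 13' |
    flag_slow_disks 3   # prints "disk5"
```

Using the median rather than the average keeps one very slow disk from dragging the baseline up and hiding itself.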

As the pool will work in a degraded state with up to two disks removed, you can disconnect such disks and check the result. ZFS tolerates this. If you unplug three disks the pool goes offline, but it comes back when you reconnect them.

If you can identify a bad or weak disk, replace it or run an intensive check, e.g. boot from a Hiren's boot stick (Win PE) and use WD Data Lifeguard for an intensive surface check with a possible repair.
gea
 
Posts: 27
Joined: Tue Jan 23, 2024 9:56 am

Re: Pool I/O currently suspended

Postby tangles » Sat Jan 25, 2025 3:31 pm

Yep
Identify problematic disk(s)
Shut down
Remove them
Start up
Expect normal pool mounting again
If so, copy data off quick smart
Count your ZFS blessings

To suss whether you have a bay/cabling issue:
Shut down again
Move each disk along to the neighbouring bay
Start up
If the pool mounts and is accessible/functional as you'd expect (even though it is still missing disk(s)), then it suggests a disk issue rather than cabling.
If you encounter similar problems as before, then it is likely a bay/cabling issue and your disks are probably fine
tangles
 
Posts: 203
Joined: Tue Jun 17, 2014 6:54 am

Re: Pool I/O currently suspended

Postby Haravikk » Sun Apr 27, 2025 2:39 am

Did you ever get anywhere with this?

I recently had to do a disk recovery for someone, so I wanted to add some notes on that: if you have a pool without enough redundancy to repair itself, then cloning one of the failed disks may be your only option after trying everything else.

If you find that one of your disks is consistently failing, you may want to try cloning it onto a new disk of the same size to see if you can get ZFS to accept the clone as a replacement. To do this you'll want to use the tool ddrescue – it's possible to just use the regular dd tool but ddrescue has extra features that make it better for recovering problem drives.

To use ddrescue you'll run the command something like so:

Code:
ddrescue /dev/diskX /dev/rdiskY ~/Desktop/ddrescue.mapfile


Where /dev/diskX is replaced with the correct path for the failing disk (e.g. /dev/disk4). You want to use the whole disk device (not a partition), and you want to use the buffered device to reduce the risk of kernel panics (it's just safer for a disk you suspect is failing). Meanwhile /dev/rdiskY is the path for your replacement disk (e.g. /dev/rdisk5). I'm using the unbuffered device (rdiskY instead of diskY) in this case because it will be a lot faster than buffered writes, and ddrescue tracks its progress so there's limited risk from interruptions. Lastly, the ~/Desktop/ddrescue.mapfile argument tells ddrescue where to store its progress. You want this because ddrescue can take a long time, and if it gets interrupted for any reason it can use the mapfile to resume.

Since the clone target is a device, you'll probably need to add the -f flag (ddrescue -f /dev/diskX etc.), but I prefer to keep that off until I've triple-checked that I've entered the correct devices. You may also want to add the -s flag with the size of the source disk in bytes so you can get an accurate progress reading. You can get the size in bytes using: diskutil info diskX
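To show how the -f and -s flags fit together, here is a small sketch that extracts the byte count and prints (but does not run) the resulting command line. The function names size_in_bytes and ddrescue_cmd are made up for the example, and the sample "Disk Size" line is an assumption about what diskutil info prints; the label and layout may differ between macOS versions, so check your own output.

```shell
#!/bin/sh
# Hypothetical helper: pulls the byte count out of a diskutil "Disk Size" line.
size_in_bytes() {
    sed -n '/Disk Size/s/.*(\([0-9][0-9]*\) Bytes).*/\1/p'
}

# Hypothetical helper: prints the ddrescue command line for review.
#   $1 = source (failing disk), $2 = target, $3 = size in bytes, $4 = mapfile
ddrescue_cmd() {
    echo "ddrescue -f -s $3 $1 $2 $4"
}

# Assumed sample of the relevant line; on a real system you would use:
#   diskutil info disk4 | size_in_bytes
sample='   Disk Size:   4.0 TB (4000787030016 Bytes) (exactly 7814037168 512-Byte-Units)'
bytes=$(printf '%s\n' "$sample" | size_in_bytes)

ddrescue_cmd /dev/disk4 /dev/rdisk5 "$bytes" "$HOME/Desktop/ddrescue.mapfile"
```

Printing the command first gives you one more chance to check the source and target devices before running it yourself.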

For safety it is probably best to do this without attempting to import your pool (or restart if you already tried and got stuck). This may make it trickier to work out which disk you need to specify, so triple-check you've got the correct source disk (and target replacement disk). And always triple-check the device IDs if you ever need to resume ddrescue, as they can change after a restart or disk disconnection.

Once you've cloned the disk (or as much as ddrescue can manage, as there may be sectors it simply cannot read), you can shut down, remove the old disk, and then try to import your pool with the cloned replacement (ZFS should recognise it in place of the old disk, but you can try zpool online if it doesn't).
Haravikk
 
Posts: 99
Joined: Tue Mar 17, 2015 4:52 am

