As promised, here is my report now that the pool expansion is complete.
I replaced the 6x2TB disks in a raidz2 pool with 8TB drives, one by one.
The short version: this took more than a month.
The replacements took this many hours:
1st disk: 52.5
2nd disk: 50.1
3rd disk: 190.5
4th disk: 137.1 + 70
5th disk: 169
6th disk: 42.8 + 78.1
The 4th and 6th disks were resilvered twice, because the replacement did not complete when the resilvering finished, and exporting and reimporting the pool triggered a fresh resilvering.
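For anyone who wants to check the arithmetic, here is a small sketch that adds up the resilver times listed above. The function name is made up for illustration, and the figure covers only the time spent resilvering, not the time spent swapping disks or recovering from crashes.

```shell
# Resilver durations in hours, copied from the list above
# (the 4th and 6th disks each needed two passes).
total_resilver_hours() {
    printf '%s\n' 52.5 50.1 190.5 137.1 70 169 42.8 78.1 |
        awk '{ sum += $1 } END { printf "%.1f hours = %.1f days\n", sum, sum / 24 }'
}

total_resilver_hours
```

That works out to roughly 790 hours, or about 33 days of pure resilvering.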
Here comes the long version:
The installed ZFS version was 1.3.0, and I had a raidz2 array of 6x2TB disks in a Promise Pegasus R6 enclosure connected via Thunderbolt.
- Code: Select all
Filesystem   Size   Used  Avail  Capacity    iused       ifree  %iused  Mounted on
zfsraid     7.1Ti  6.3Ti  863Gi       89%  3234860  1809126368      0%  /Volumes/zfsraid
I intended to expand the pool by replacing the 2TB disks with 8TB disks. I did this one disk at a time in order to maintain some level of redundancy during the replacement.
The first disk was replaced the naive way: I physically swapped one disk, imported the pool, and invoked the zpool replace command, thus resilvering from a degraded state.
The result was:
- Code: Select all
pool: zfsraid
state: ONLINE
scan: resilvered 1.60T in 52h28m with 0 errors on Thu Apr 16 21:33:21 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk7   ONLINE       0     0     0
    disk4   ONLINE       0     0     0
    disk5   ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk3   ONLINE       0     0     0
    disk8   ONLINE       0     0     0
errors: No known data errors
For the second disk, I bought an external USB3 dock (similar to an external HDD case, but more like a docking station).
I connected the new disk via the external dock and invoked zpool replace with old and new disks connected, thus resilvering from a completely healthy pool.
The result was:
- Code: Select all
Sun Apr 19 03:01:56 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.60T in 50h8m with 0 errors on Sun Apr 19 03:00:58 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk7   ONLINE       0     0     0
    disk9   ONLINE       0     0     0
    disk2   ONLINE       0     0     0
    disk10  ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk8   ONLINE       0     0     0
errors: No known data errors
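Replacing a disk while both the old and the new one are attached boils down to a single zpool replace call. Here is a minimal sketch, not the exact commands from this run: the device names disk4 and disk9 are placeholders, and the run wrapper is a made-up helper that only prints the commands unless RUN=1 is set.

```shell
POOL=zfsraid
OLD_DISK=disk4   # placeholder: the 2TB disk being retired
NEW_DISK=disk9   # placeholder: the 8TB disk sitting in the USB3 dock

# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

replace_disk() {
    # With both disks online, ZFS copies onto the new disk
    # while the pool keeps its full redundancy.
    run sudo zpool replace "$POOL" "$OLD_DISK" "$NEW_DISK"
    run zpool status "$POOL"
}

replace_disk
```

Check zpool status for the real device names before running anything with RUN=1.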
For the third disk, I replaced the disk within the Pegasus case, but connected the old disk via the USB3 dock, thus resilvering from a healthy pool as well.
The result was:
- Code: Select all
Mon Apr 27 04:46:28 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.60T in 190h29m with 0 errors on Mon Apr 27 02:41:28 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk8   ONLINE       0     0     0
    disk5   ONLINE       0     0     0
    disk7   ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk4   ONLINE       0     0     0
    disk9   ONLINE       0     0     0
errors: No known data errors
For the fourth disk, I connected the new one to the USB3 dock and resilvered from a healthy pool. I thought that the heat from running the disks non-stop for over a week immediately before the third replacement might have been the reason it took over a week.
I set the resilver delay to 0 by invoking
- Code: Select all
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_resilver_delay=0
The result was:
- Code: Select all
Sat May 2 22:25:35 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.34T in 137h8m with 0 errors on Sat May 2 22:22:52 2015
config:
NAME             STATE     READ WRITE CKSUM
zfsraid          ONLINE       0     0     0
  raidz2-0       ONLINE       0     0     0
    disk8        ONLINE       0     0     0
    disk5        ONLINE       0     0     0
    disk6        ONLINE       0     0     0
    disk7        ONLINE       0     0     0
    disk4        ONLINE       0     0     0
    replacing-5  ONLINE       0     0     0
      disk9      ONLINE       0     0     0
      disk2      ONLINE       0     0     0
errors: No known data errors
I have to add that the computer crashed in between with a kernel panic (caused by ZFS), which cost me some time (<20h). You might also notice that the "(resilvering)" remark is gone, but the status still lists the replacing-5 vdev, so the replacement itself had not completed. I wasn't aware of that and physically swapped the old disk for the resilvered replica; as soon as the pool was imported again, resilvering started from the beginning.
I provided the old disk via the USB3 dock.
Moreover, ZFS started to cause trouble in the form of kernel panics at random times.
In between, I migrated to version 1.3.2 RC1 (available here https://openzfsonosx.org/forum/viewtopic.php?f=20&t=2262) after consulting ilovezfs. This was a real success: no further kernel panics occurred during this replacement.
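One way to guard against pulling a disk too early is to check whether the status output still lists a replacing-N vdev. Below is a minimal sketch of such a check. The function name is made up for illustration, and it only inspects text on standard input, so it assumes the usual zpool status layout.

```shell
# Succeeds (exit status 0) if the status text on stdin still contains a
# "replacing-N" vdev, i.e. ZFS has not yet detached the old disk.
replacement_pending() {
    grep -Eq '^[[:space:]]*replacing-[0-9]+[[:space:]]'
}

# Usage sketch (pool name taken from this post):
#   if zpool status zfsraid | replacement_pending; then
#       echo "replacement not finished, do not pull the disk yet"
#   fi
```

The check says nothing about why a replacement is stuck; it only makes the state easier to spot than by eyeballing the output.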
The result of that second (unnecessary) resilvering of the fourth disk was:
- Code: Select all
Tue May 5 21:17:39 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.60T in 70h1m with 0 errors on Tue May 5 21:14:41 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk4   ONLINE       0     0     0
    disk2   ONLINE       0     0     0
    disk5   ONLINE       0     0     0
    disk3   ONLINE       0     0     0
    disk7   ONLINE       0     0     0
errors: No known data errors
The fifth disk did not yield any new insights; it just took really long again. The disk was physically replaced before resilvering, with the old disk connected via the USB3 dock:
- Code: Select all
Tue May 12 22:59:59 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.60T in 169h3m with 0 errors on Tue May 12 22:38:59 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk4   ONLINE       0     0     0
    disk2   ONLINE       0     0     0
    disk5   ONLINE       0     0     0
    disk3   ONLINE       0     0     0
    disk7   ONLINE       0     0     0
errors: No known data errors
The sixth and last disk was physically replaced before the resilvering again, and the old disk was connected via the USB3 dock.
Moreover, I set kernel parameters:
- Code: Select all
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_top_maxinflight=128
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_resilver_delay=0
This went fine, at first:
- Code: Select all
Wed May 13 20:23:29 CEST 2015
pool: zfsraid
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
2.72T scanned out of 9.43T at 72.2M/s, 27h4m to go
465G resilvered, 28.88% done
config:
NAME             STATE     READ WRITE CKSUM
zfsraid          ONLINE       0     0     0
  raidz2-0       ONLINE       0     0     0
    disk9        ONLINE       0     0     0
    disk4        ONLINE       0     0     0
    disk6        ONLINE       0     0     0
    disk5        ONLINE       0     0     0
    replacing-4  ONLINE       0     0     0
      disk2      ONLINE       0     0     0
      disk10     ONLINE       0     0     0  (resilvering)
    disk7        ONLINE       0     0     0
errors: No known data errors
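The "to go" estimate in the scan line is just the remaining data divided by the current scan rate. Here is a small sketch that reproduces it, assuming the T and M suffixes are binary units (TiB and MiB/s); the function name is made up for illustration.

```shell
# eta SCANNED_TIB TOTAL_TIB RATE_MIB_PER_S
# Prints the estimated remaining time as <hours>h<minutes>m.
eta() {
    awk -v scanned="$1" -v total="$2" -v rate="$3" 'BEGIN {
        remaining_mib = (total - scanned) * 1024 * 1024
        seconds = remaining_mib / rate
        printf "%dh%dm\n", int(seconds / 3600), int((seconds % 3600) / 60)
    }'
}

# Values from the status output above.
eta 2.72 9.43 72.2
```

For the snapshot above this prints 27h4m, which matches the status output. Because the displayed values are rounded, other snapshots can be off by a minute or so, and the estimate swings wildly whenever the scan rate changes.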
But then ZFS started to go nuts:
- Code: Select all
Wed May 13 23:21:31 CEST 2015
pool: zfsraid
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
3.45T scanned out of 9.43T at 71.9M/s, 24h14m to go
588G resilvered, 36.54% done
config:
NAME             STATE     READ WRITE CKSUM
zfsraid          ONLINE       0     0     0
  raidz2-0       ONLINE       0     0     0
    disk9        ONLINE       0     0     0
    disk4        ONLINE       0     0     4  (resilvering)
    disk6        ONLINE       0     0     7  (resilvering)
    disk5        ONLINE       0     0     8  (resilvering)
    replacing-4  ONLINE       0     0     4
      disk2      ONLINE       0     0     0  (resilvering)
      disk10     ONLINE       0     0     0  (resilvering)
    disk7        ONLINE       0     0     0
errors: No known data errors
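The checksum errors are easy to miss in a long status listing. Here is a rough sketch of a filter that prints only the vdevs with a non-zero CKSUM column. The function name is made up, it assumes the column order shown above, and it ignores abbreviated counts such as 1.2K.

```shell
# Reads zpool status text on stdin and prints "<vdev> <cksum>" for every
# vdev whose CKSUM column (5th field) is a plain number greater than zero.
cksum_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ &&
         $5 ~ /^[0-9]+$/ && $5 + 0 > 0 { print $1, $5 }'
}

# Usage sketch (pool name taken from this post):
#   zpool status zfsraid | cksum_errors
```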
Eventually resulting in another kernel panic (the resulting reboot reset the kernel parameters):
- Code: Select all
Anonymous UUID: 42E5B6EB-0C42-8F48-CBE4-05F96D8E3409
Thu May 14 00:00:09 2015
*** Panic Report ***
panic(cpu 5 caller 0xffffff8028617cc2): Kernel trap at 0xffffff7fa8d046a7, type 14=page fault, registers:
CR0: 0x000000008001003b, CR2: 0x0000000000000048, CR3: 0x000000002bbf1000, CR4: 0x00000000001626e0
RAX: 0x0000000000000000, RBX: 0xffffff821bd03e60, RCX: 0x0000000000000248, RDX: 0x0000000000000000
RSP: 0xffffff821bd03dd0, RBP: 0xffffff821bd03e90, RSI: 0x0000000000000000, RDI: 0x0000000000000000
R8: 0x0000000000988d64, R9: 0xffffff82072e0068, R10: 0x00027f9cc2458afc, R11: 0x00027f9cc1acfd98
R12: 0xffffff8244f6e218, R13: 0x0000000000000003, R14: 0xffffff8244f6ddc8, R15: 0xffffff8244f6de80
RFL: 0x0000000000010246, RIP: 0xffffff7fa8d046a7, CS: 0x0000000000000008, SS: 0x0000000000000010
Fault CR2: 0x0000000000000048, Error code: 0x0000000000000000, Fault CPU: 0x5
Backtrace (CPU 5), Frame : Return Address
0xffffff821bd03a80 : 0xffffff802852bda1
0xffffff821bd03b00 : 0xffffff8028617cc2
0xffffff821bd03cc0 : 0xffffff8028634b73
0xffffff821bd03ce0 : 0xffffff7fa8d046a7
0xffffff821bd03e90 : 0xffffff7fa8d04cb1
0xffffff821bd03ec0 : 0xffffff7fa8d419dd
0xffffff821bd03ef0 : 0xffffff7fa8d3ef6f
0xffffff821bd03f50 : 0xffffff7fa8c809a0
0xffffff821bd03fb0 : 0xffffff80286125b7
Kernel Extensions in backtrace:
net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7fa8c76000->0xffffff7fa8caafff
net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7fa8cab000->0xffffff7fa8e5efff
dependency: com.apple.iokit.IOStorageFamily(2.0)[76E50D45-C97B-3ED1-97C5-94E6E0EB4514]@0xffffff7fa8c47000
dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7fa8c76000
BSD process name corresponding to current thread: kernel_task
Mac OS version:
14D136
Kernel version:
Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
Kernel UUID: 4B3A11F4-77AA-3D27-A22D-81A1BC5B504D
Kernel slide: 0x0000000028200000
Kernel text base: 0xffffff8028400000
__HIB text base: 0xffffff8028300000
System model name: MacBookPro10,1 (Mac-C3EC7CD22292981F)
System uptime in nanoseconds: 703261194582040
last loaded kext at 692612093051717: com.apple.driver.AppleUSBCDC 4.3.3b1 (addr 0xffffff7fab5af000, size 20480)
last unloaded kext at 692736590201020: com.apple.driver.AppleUSBCDC 4.3.3b1 (addr 0xffffff7fab5af000, size 16384)
loaded kexts:
com.promise.driver.stex 5.2.10
org.virtualbox.kext.VBoxNetAdp 4.3.16
org.virtualbox.kext.VBoxNetFlt 4.3.16
org.virtualbox.kext.VBoxUSB 4.3.16
org.virtualbox.kext.VBoxDrv 4.3.16
com.delantis.kext.tcpblocknke 4.0.0
net.lundman.zfs 1.3.1
net.lundman.spl 1.3.1
com.Cycling74.driver.Soundflower 1.6.6
com.apple.iokit.SCSITaskUserClient 3.7.5
com.apple.filesystems.smbfs 3.0.1
com.apple.filesystems.udf 2.5
com.apple.driver.AppleHWSensor 1.9.5d0
com.apple.filesystems.autofs 3.0
com.apple.driver.ApplePlatformEnabler 2.2.0d4
com.apple.driver.AGPM 110.19.5
com.apple.driver.X86PlatformShim 1.0.0
com.apple.iokit.IOBluetoothSerialManager 4.3.4f4
com.apple.driver.AppleMikeyHIDDriver 124
com.apple.driver.AppleOSXWatchdog 1
com.apple.driver.AppleMikeyDriver 272.18
com.apple.driver.AppleHDA 272.18
com.apple.driver.AudioAUUC 1.70
com.apple.iokit.IOUserEthernet 1.0.1
com.apple.driver.AppleUpstreamUserClient 3.6.1
com.apple.driver.AppleIntelHD4000Graphics 10.0.6
com.apple.Dont_Steal_Mac_OS_X 7.0.0
com.apple.GeForce 10.0.2
com.apple.driver.AppleThunderboltIP 2.0.2
com.apple.driver.AppleHWAccess 1
com.apple.driver.AppleHV 1
com.apple.driver.AppleSMCLMU 2.0.7d0
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport 4.3.4f4
com.apple.driver.AppleSMCPDRC 1.0.0
com.apple.driver.AppleLPC 1.7.3
com.apple.driver.AppleMuxControl 3.10.22
com.apple.driver.AppleIntelSlowAdaptiveClocking 4.0.0
com.apple.driver.AppleMCCSControl 1.2.11
com.apple.driver.AppleIntelFramebufferCapri 10.0.6
com.apple.driver.AppleUSBTCButtons 240.2
com.apple.driver.AppleUSBTCKeyboard 240.2
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless 1.0.0d1
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.BootCache 36
com.apple.driver.XsanFilter 404
com.apple.iokit.IOAHCIBlockStorage 2.7.1
com.apple.driver.AppleUSBHub 705.4.2
com.apple.driver.AppleSDXC 1.6.5
com.apple.driver.AirPort.Brcm4360 930.37.3
com.apple.driver.AppleAHCIPort 3.1.2
com.apple.driver.AppleUSBEHCI 705.4.14
com.apple.driver.AppleUSBXHCI 710.4.11
com.apple.driver.AppleSmartBatteryManager 161.0.0
com.apple.driver.AppleACPIButtons 3.1
com.apple.driver.AppleRTC 2.0
com.apple.driver.AppleHPET 1.8
com.apple.driver.AppleSMBIOS 2.1
com.apple.driver.AppleACPIEC 3.1
com.apple.driver.AppleAPIC 1.7
com.apple.driver.AppleIntelCPUPowerManagementClient 218.0.0
com.apple.nke.applicationfirewall 161
com.apple.security.quarantine 3
com.apple.security.TMSafetyNet 8
com.apple.driver.AppleIntelCPUPowerManagement 218.0.0
com.apple.iokit.IOSCSIParallelFamily 3.0.0
com.apple.driver.AppleThunderboltPCIUpAdapter 2.0.2
com.apple.driver.AppleThunderboltDPOutAdapter 4.0.6
com.apple.iokit.IOUSBMassStorageClass 3.7.2
com.apple.iokit.IOSCSIBlockCommandsDevice 3.7.5
com.apple.kext.triggers 1.0
com.apple.iokit.IOSerialFamily 11
com.apple.driver.DspFuncLib 272.18
com.apple.kext.OSvKernDSPLib 1.15
com.apple.iokit.IOSurface 97.4
com.apple.nvidia.driver.NVDAGK100Hal 10.0.2
com.apple.nvidia.driver.NVDAResman 10.0.2
com.apple.iokit.IOBluetoothHostControllerUSBTransport 4.3.4f4
com.apple.iokit.IOBluetoothFamily 4.3.4f4
com.apple.driver.AppleHDAController 272.18
com.apple.iokit.IOHDAFamily 272.18
com.apple.iokit.IOAudioFamily 203.3
com.apple.vecLib.kext 1.2.0
com.apple.iokit.IOUSBUserClient 705.4.0
com.apple.driver.AppleSMBusPCI 1.0.12d1
com.apple.driver.AppleBacklightExpert 1.1.0
com.apple.driver.AppleGraphicsControl 3.10.22
com.apple.iokit.IOSlowAdaptiveClockingFamily 1.0.0
com.apple.driver.X86PlatformPlugin 1.0.0
com.apple.driver.AppleSMC 3.1.9
com.apple.driver.IOPlatformPluginFamily 5.9.1d7
com.apple.driver.AppleSMBusController 1.0.13d1
com.apple.iokit.IOAcceleratorFamily2 156.14
com.apple.AppleGraphicsDeviceControl 3.10.22
com.apple.iokit.IONDRVSupport 2.4.1
com.apple.iokit.IOGraphicsFamily 2.4.1
com.apple.driver.AppleUSBMultitouch 245.2
com.apple.iokit.IOUSBHIDDriver 705.4.0
com.apple.driver.AppleThunderboltDPInAdapter 4.0.6
com.apple.driver.AppleThunderboltDPAdapterFamily 4.0.6
com.apple.driver.AppleThunderboltPCIDownAdapter 2.0.2
com.apple.driver.AppleUSBMergeNub 705.4.0
com.apple.driver.AppleUSBComposite 705.4.9
com.apple.driver.CoreStorage 471.20.7
com.apple.iokit.IOSCSIArchitectureModelFamily 3.7.5
com.apple.driver.AppleThunderboltNHI 3.1.7
com.apple.iokit.IOThunderboltFamily 4.2.2
com.apple.iokit.IO80211Family 730.60
com.apple.driver.mDNSOffloadUserClient 1.0.1b8
com.apple.iokit.IONetworkingFamily 3.2
com.apple.iokit.IOAHCIFamily 2.7.5
com.apple.iokit.IOUSBFamily 720.4.4
com.apple.driver.AppleEFINVRAM 2.0
com.apple.driver.AppleEFIRuntime 2.0
com.apple.iokit.IOHIDFamily 2.0.0
com.apple.iokit.IOSMBusFamily 1.1
com.apple.security.sandbox 300.0
com.apple.kext.AppleMatch 1.0.0d1
com.apple.driver.AppleKeyStore 2
com.apple.driver.AppleMobileFileIntegrity 1.0.5
com.apple.driver.AppleCredentialManager 1.0
com.apple.driver.DiskImages 396
com.apple.iokit.IOStorageFamily 2.0
com.apple.iokit.IOReportFamily 31
com.apple.driver.AppleFDEKeyStore 28.30
com.apple.driver.AppleACPIPlatform 3.1
com.apple.iokit.IOPCIFamily 2.9
com.apple.iokit.IOACPIFamily 1.4
com.apple.kec.corecrypto 1.0
com.apple.kec.pthread 1
com.apple.kec.Libm 1
Model: MacBookPro10,1, BootROM MBP101.00EE.B07, 4 processors, Intel Core i7, 2.3 GHz, 16 GB, SMC 2.3f36
Graphics: Intel HD Graphics 4000, Intel HD Graphics 4000, Built-In
Graphics: NVIDIA GeForce GT 650M, NVIDIA GeForce GT 650M, PCIe, 1024 MB
Memory Module: BANK 0/DIMM0, 8 GB, DDR3, 1600 MHz, 0x80AD, 0x484D5434314753364D465238432D50422020
Memory Module: BANK 1/DIMM0, 8 GB, DDR3, 1600 MHz, 0x80AD, 0x484D5434314753364D465238432D50422020
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0xEF), Broadcom BCM43xx 1.0 (7.15.166.24.3)
Bluetooth: Version 4.3.4f4 15601, 3 services, 18 devices, 1 incoming serial ports
Network Service: Thunderbolt Ethernet, Ethernet, en4
Network Service: Wi-Fi, AirPort, en0
PCI Card: pci105a,8760, RAID Controller, Thunderbolt@194,0,0
PCI Card: pci105a,8760, RAID Controller, Thunderbolt@10,0,0
Serial ATA Device: APPLE SSD SM256E, 251 GB
USB Device: USB-SATA Bridge
USB Device: Hub
USB Device: FaceTime HD Camera (Built-in)
USB Device: Hub
USB Device: My Book
USB Device: Hub
USB Device: BRCM20702 Hub
USB Device: Bluetooth USB Host Controller
USB Device: Apple Internal Keyboard / Trackpad
Thunderbolt Bus: MacBook Pro, Apple Inc., 23.4
Thunderbolt Device: Pegasus2-R, Promise Technology, Inc., 1, 19.2
Thunderbolt Device: Pegasus R-Series, Promise Technology, Inc., 3, 22.2
After the crash, resilvering continued normally again:
- Code: Select all
Thu May 14 00:16:59 CEST 2015
pool: zfsraid
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
3.53T scanned out of 9.43T at 47.6M/s, 36h6m to go
595G resilvered, 37.43% done
config:
NAME                                              STATE     READ WRITE CKSUM
zfsraid                                           DEGRADED     0     0     0
  raidz2-0                                        DEGRADED     0     0     0
    disk8                                         ONLINE       0     0     0
    disk5                                         ONLINE       0     0     0
    disk6                                         ONLINE       0     0     0
    disk4                                         ONLINE       0     0     0
    replacing-4                                   DEGRADED     0     0     0
      media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
      4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
    disk7                                         ONLINE       0     0     0
errors: No known data errors
But as soon as the CPU came under load, the new disk was no longer being resilvered; other disks were instead:
- Code: Select all
Thu May 14 00:26:50 CEST 2015
pool: zfsraid
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
3.75T scanned out of 9.43T at 176M/s, 9h24m to go
595G resilvered, 39.77% done
config:
NAME                                              STATE     READ WRITE CKSUM
zfsraid                                           DEGRADED     0     0     0
  raidz2-0                                        DEGRADED     0     0     0
    disk8                                         ONLINE       0     0     6  (resilvering)
    disk5                                         ONLINE       0     0     0
    disk6                                         ONLINE       0     0     0
    disk4                                         ONLINE       0     0     0
    replacing-4                                   DEGRADED     0     0     0
      media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
      4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
    disk7                                         ONLINE       0     0     2  (resilvering)
errors: No known data errors
Then, at some point, the resilvering of the disk that was actually being replaced continued:
- Code: Select all
Thu May 14 00:28:20 CEST 2015
pool: zfsraid
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
3.79T scanned out of 9.43T at 187M/s, 8h46m to go
595G resilvered, 40.13% done
config:
NAME                                              STATE     READ WRITE CKSUM
zfsraid                                           DEGRADED     0     0     0
  raidz2-0                                        DEGRADED     0     0     0
    disk8                                         ONLINE       0     0     6  (resilvering)
    disk5                                         ONLINE       0     0     0
    disk6                                         ONLINE       0     0     0
    disk4                                         ONLINE       0     0     0
    replacing-4                                   DEGRADED     0     0     4
      media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0  (resilvering)
      4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
    disk7                                         ONLINE       0     0     2  (resilvering)
errors: No known data errors
But it still touched other disks, too. So I rebooted the system and kept the CPU load low. Now it was resilvering both the old and the new disk:
- Code: Select all
pool: zfsraid
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue May 12 23:34:27 2015
3.88T scanned out of 9.43T at 4.80M/s, 336h32m to go
595G resilvered, 41.17% done
config:
NAME                                              STATE     READ WRITE CKSUM
zfsraid                                           ONLINE       0     0     0
  raidz2-0                                        ONLINE       0     0     0
    disk7                                         ONLINE       0     0     0
    disk5                                         ONLINE       0     0     0
    disk6                                         ONLINE       0     0     0
    disk4                                         ONLINE       0     0     0
    replacing-4                                   ONLINE       0     0     4
      media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0  (resilvering)
      disk9                                       ONLINE       0     0     0  (resilvering)
    disk8                                         ONLINE       0     0     0
errors: No known data errors
Just when I thought it was done, I realized that it had happened again: the resilvering was done, but the replacement did not complete.
Exporting and reimporting started another resilvering. I tried several things, including a downgrade, as you can read in this thread (viewtopic.php?f=26&t=2276).
- Code: Select all
Thu May 14 18:27:02 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.51T in 42h46m with 0 errors on Thu May 14 18:20:42 2015
config:
NAME                                              STATE     READ WRITE CKSUM
zfsraid                                           ONLINE       0     0     0
  raidz2-0                                        ONLINE       0     0     0
    disk7                                         ONLINE       0     0     0
    disk5                                         ONLINE       0     0     0
    disk6                                         ONLINE       0     0     0
    disk4                                         ONLINE       0     0     0
    replacing-4                                   ONLINE       0     0     4
      media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
      disk9                                       ONLINE       0     0     0
    disk8                                         ONLINE       0     0     0
errors: No known data errors
Finally, I upgraded to the 1.3.2 RC1 version again and let the additional resilvering do its job.
- Code: Select all
Fri May 22 11:38:23 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 1.57T in 78h9m with 0 errors on Fri May 22 06:42:47 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk9   ONLINE       0     0     0
    disk2   ONLINE       0     0     0
    disk7   ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk10  ONLINE       0     0     0
    disk8   ONLINE       0     0     0
errors: No known data errors
Both times that the replacement did not complete although the resilvering had, I had modified kernel parameters beforehand. Maybe there is a correlation, maybe not. I don't know.
But this time (with the kernel parameters reset by the crash/restart), it was successful, and I could offline one disk and online it again with the -e flag set in order to expand the pool, which worked like a charm:
- Code: Select all
Fri May 22 11:40:28 CEST 2015
pool: zfsraid
state: ONLINE
scan: resilvered 116K in 0h0m with 0 errors on Fri May 22 11:40:20 2015
config:
NAME        STATE     READ WRITE CKSUM
zfsraid     ONLINE       0     0     0
  raidz2-0  ONLINE       0     0     0
    disk9   ONLINE       0     0     0
    disk2   ONLINE       0     0     0
    disk7   ONLINE       0     0     0
    disk6   ONLINE       0     0     0
    disk10  ONLINE       0     0     0
    disk8   ONLINE       0     0     0
errors: No known data errors
- Code: Select all
Filesystem   Size   Used  Avail  Capacity    iused        ifree  %iused  Mounted on
zfsraid      29Ti   10Ti   18Ti       36%  3995554  39312909752      0%  /Volumes/zfsraid
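The final expansion step described above (offline one disk, then online it again with -e) can be sketched as follows. The device name disk9 is a placeholder, and the run wrapper is a made-up helper that only prints the commands unless RUN=1 is set.

```shell
POOL=zfsraid
DISK=disk9   # placeholder: any one member disk of the pool

# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

expand_pool() {
    run sudo zpool offline "$POOL" "$DISK"
    # -e asks ZFS to expand the device to use all available space.
    run sudo zpool online -e "$POOL" "$DISK"
    run zpool list "$POOL"
}

expand_pool
```

The extra capacity only shows up once every disk in the raidz2 vdev has been replaced with a larger one.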