Replacing all disks of a raidz2 for pool expansion: A report

Post by rattlehead » Wed May 27, 2015 6:08 am

Hello everybody,

I said I would report as soon as the pool expansion was completed.
That is now the case.

I replaced 6x2TB disks in a raidz2 pool with 8TB drives one by one.

The short version is: This took more than a month.
The replacements took this many hours:
1st disk: 52.5
2nd disk: 50.1
3rd disk: 190.5
4th disk: 137.1 + 70
5th disk: 169
6th disk: 42.8 + 78.1

The 4th and 6th disks were resilvered twice: the replacement did not complete in software when the resilver finished, and exporting and reimporting the pool triggered a fresh resilver.
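
As a rough cross-check of the "more than a month" figure, the hours listed above can simply be added up. This is only a sketch: the variable names are made up for illustration, and the inputs are the rounded hours from the list, so the total is approximate.

```python
# Resilver durations in hours, copied from the list above.
# Disks 4 and 6 were resilvered twice, so they have two entries each.
hours_per_disk = {
    1: [52.5],
    2: [50.1],
    3: [190.5],
    4: [137.1, 70.0],
    5: [169.0],
    6: [42.8, 78.1],
}

total_hours = sum(sum(runs) for runs in hours_per_disk.values())
total_days = total_hours / 24

# Time spent on the second runs of the disks that were resilvered twice.
repeated_hours = sum(runs[1] for runs in hours_per_disk.values() if len(runs) > 1)

print(f"total: {total_hours:.1f} h = {total_days:.1f} days")
print(f"second runs of disks 4 and 6: {repeated_hours:.1f} h")
```

That comes to roughly 790 hours, or about 33 days of pure resilver time, of which about 148 hours went into the two repeated runs.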


Here comes the long version:

The installed ZFS version was 1.3.0, and the pool was a raidz2 of 6x2TB disks in a Promise Pegasus R6 enclosure connected via Thunderbolt.
Code: Select all
Filesystem   Size   Used  Avail Capacity iused      ifree %iused  Mounted on
zfsraid     7.1Ti  6.3Ti  863Gi    89% 3234860 1809126368    0%   /Volumes/zfsraid


I intended to expand the pool by replacing the 2TB disks with 8TB disks. I did this one disk at a time in order to keep some level of redundancy during each replacement.
The first disk was replaced the naive way: I physically replaced one disk, imported the pool and invoked the zpool replace command, thus resilvering from a degraded state.
The result was:
Code: Select all
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.60T in 52h28m with 0 errors on Thu Apr 16 21:33:21 2015
config:

   NAME        STATE     READ WRITE CKSUM
   zfsraid     ONLINE       0     0     0
     raidz2-0  ONLINE       0     0     0
       disk7   ONLINE       0     0     0
       disk4   ONLINE       0     0     0
       disk5   ONLINE       0     0     0
       disk6   ONLINE       0     0     0
       disk3   ONLINE       0     0     0
       disk8   ONLINE       0     0     0

errors: No known data errors


For the second disk, I bought an external USB3 dock (similar to an external HDD case, but more like a docking station).
I connected the new disk via the external dock and invoked zpool replace with old and new disks connected, thus resilvering from a completely healthy pool.
The result was:
Code: Select all
Sun Apr 19 03:01:56 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.60T in 50h8m with 0 errors on Sun Apr 19 03:00:58 2015
config:

   NAME        STATE     READ WRITE CKSUM
   zfsraid     ONLINE       0     0     0
     raidz2-0  ONLINE       0     0     0
       disk7   ONLINE       0     0     0
       disk9   ONLINE       0     0     0
       disk2   ONLINE       0     0     0
       disk10  ONLINE       0     0     0
       disk6   ONLINE       0     0     0
       disk8   ONLINE       0     0     0

errors: No known data errors


For the third disk, I replaced the disk within the Pegasus case, but connected the old disk via the USB3 dock, thus resilvering from a healthy pool as well.
The result was:
Code: Select all
Mon Apr 27 04:46:28 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.60T in 190h29m with 0 errors on Mon Apr 27 02:41:28 2015
config:

   NAME        STATE     READ WRITE CKSUM
   zfsraid     ONLINE       0     0     0
     raidz2-0  ONLINE       0     0     0
       disk8   ONLINE       0     0     0
       disk5   ONLINE       0     0     0
       disk7   ONLINE       0     0     0
       disk6   ONLINE       0     0     0
       disk4   ONLINE       0     0     0
       disk9   ONLINE       0     0     0

errors: No known data errors
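
To compare the runs so far, the `scan:` line can be turned into an average rate. The following is a sketch in Python that parses lines of the form shown in the outputs above. The helper name and the regular expression are illustrative assumptions, not part of any ZFS tool, and it assumes that the `T` suffix means TiB.

```python
import re

# Matches e.g. "scan: resilvered 1.60T in 52h28m with 0 errors ..."
SCAN_RE = re.compile(r"resilvered\s+([\d.]+)T\s+in\s+(\d+)h(\d+)m")

def average_rate_mib_s(scan_line: str) -> float:
    """Average rate in MiB/s of a finished resilver (hypothetical helper)."""
    m = SCAN_RE.search(scan_line)
    if m is None:
        raise ValueError("not a completed-resilver scan line")
    tib, hours, minutes = float(m.group(1)), int(m.group(2)), int(m.group(3))
    seconds = hours * 3600 + minutes * 60
    return tib * 1024 * 1024 / seconds  # TiB -> MiB, then per second

for line in (
    "scan: resilvered 1.60T in 52h28m with 0 errors on Thu Apr 16 21:33:21 2015",
    "scan: resilvered 1.60T in 50h8m with 0 errors on Sun Apr 19 03:00:58 2015",
    "scan: resilvered 1.60T in 190h29m with 0 errors on Mon Apr 27 02:41:28 2015",
):
    print(f"{average_rate_mib_s(line):.2f} MiB/s")
```

Under these assumptions the first two disks averaged roughly 9 MiB/s written to the new disk, while the third dropped to about 2.4 MiB/s.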


For the fourth disk, I connected the new one to the USB3 dock and resilvered from a healthy pool. I thought that the heat from running the disks non-stop for over a week immediately before the third replacement might have been the reason it took over a week.
I set the resilver delay to 0 by invoking
Code: Select all
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_resilver_delay=0


The result was:
Code: Select all
Sat May  2 22:25:35 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.34T in 137h8m with 0 errors on Sat May  2 22:22:52 2015
config:

   NAME             STATE     READ WRITE CKSUM
   zfsraid          ONLINE       0     0     0
     raidz2-0       ONLINE       0     0     0
       disk8        ONLINE       0     0     0
       disk5        ONLINE       0     0     0
       disk6        ONLINE       0     0     0
       disk7        ONLINE       0     0     0
       disk4        ONLINE       0     0     0
       replacing-5  ONLINE       0     0     0
         disk9      ONLINE       0     0     0
         disk2      ONLINE       0     0     0

errors: No known data errors


I have to say, though, that the computer crashed in between with a kernel panic (caused by ZFS), and I lost some time (<20h). You might also notice that the "(resilvering)" remark is gone, yet the status still shows the disk as being replaced. I wasn't aware of that and physically replaced the old disk with the resilvered replica; as soon as the pool was imported again, resilvering started from the beginning.
I provided the old disk via the USB3 dock.
Moreover, ZFS started to cause trouble in the form of kernel panics at random times.
I migrated to version 1.3.2 RC1 (available here https://openzfsonosx.org/forum/viewtopic.php?f=20&t=2262) in between, after consultation with ilovezfs. This was a real success: no further kernel panics occurred during this replacement.
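
The state described here, a `replacing-N` entry that is still listed after the "(resilvering)" remark has disappeared, is easy to overlook. One way to check for it before pulling a disk is to scan the status text. The sketch below is an illustration in Python with a hypothetical function name; it only inspects text in the format shown above and does not run any ZFS command itself.

```python
import re

def replacement_in_progress(status_text: str) -> bool:
    """Return True if the status text still lists a 'replacing-N' vdev.

    Hypothetical helper: it looks only at the text passed in.
    """
    pattern = r"^\s*replacing-\d+\s"
    return re.search(pattern, status_text, re.MULTILINE) is not None

# Excerpt of the config section from the output above.
sample = """\
     raidz2-0       ONLINE       0     0     0
       disk4        ONLINE       0     0     0
       replacing-5  ONLINE       0     0     0
         disk9      ONLINE       0     0     0
         disk2      ONLINE       0     0     0
"""

if replacement_in_progress(sample):
    print("replacement not finished yet, do not swap disks")
else:
    print("no replacing vdev listed")
```

For the excerpt above the check reports that the replacement is not finished, which is the situation that led to the second resilver of the fourth disk.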
The result of that 2nd (unnecessary) resilvering of the fourth disk was:
Code: Select all
Tue May  5 21:17:39 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.60T in 70h1m with 0 errors on Tue May  5 21:14:41 2015
config:

   NAME        STATE     READ WRITE CKSUM
   zfsraid     ONLINE       0     0     0
     raidz2-0  ONLINE       0     0     0
       disk6   ONLINE       0     0     0
       disk4   ONLINE       0     0     0
       disk2   ONLINE       0     0     0
       disk5   ONLINE       0     0     0
       disk3   ONLINE       0     0     0
       disk7   ONLINE       0     0     0

errors: No known data errors


The fifth disk yielded no new insights; it just took really long again. The disk was physically replaced before resilvering, with the old disk connected via the USB3 dock:
Code: Select all
Tue May 12 22:59:59 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.60T in 169h3m with 0 errors on Tue May 12 22:38:59 2015
config:

   NAME        STATE     READ WRITE CKSUM
   zfsraid     ONLINE       0     0     0
     raidz2-0  ONLINE       0     0     0
       disk6   ONLINE       0     0     0
       disk4   ONLINE       0     0     0
       disk2   ONLINE       0     0     0
       disk5   ONLINE       0     0     0
       disk3   ONLINE       0     0     0
       disk7   ONLINE       0     0     0

errors: No known data errors


The sixth and last disk was physically replaced before the resilvering again, and the old disk was connected via the USB3 dock.
Moreover, I set kernel parameters:
Code: Select all
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_top_maxinflight=128
sudo sysctl -w kstat.zfs.darwin.tunable.zfs_resilver_delay=0

This went fine, at first:
Code: Select all
Wed May 13 20:23:29 CEST 2015
  pool: zfsraid
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    2.72T scanned out of 9.43T at 72.2M/s, 27h4m to go
    465G resilvered, 28.88% done
config:

   NAME             STATE     READ WRITE CKSUM
   zfsraid          ONLINE       0     0     0
     raidz2-0       ONLINE       0     0     0
       disk9        ONLINE       0     0     0
       disk4        ONLINE       0     0     0
       disk6        ONLINE       0     0     0
       disk5        ONLINE       0     0     0
       replacing-4  ONLINE       0     0     0
         disk2      ONLINE       0     0     0
         disk10     ONLINE       0     0     0  (resilvering)
       disk7        ONLINE       0     0     0

errors: No known data errors
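
The "to go" estimate in the output above appears to be the remaining data divided by the current scan rate. A sketch of that arithmetic in Python, assuming that `T` means TiB and `M/s` means MiB/s (the function name is hypothetical):

```python
def hours_to_go(scanned_tib: float, total_tib: float, rate_mib_s: float) -> float:
    """Remaining time in hours at a constant scan rate (hypothetical helper)."""
    remaining_mib = (total_tib - scanned_tib) * 1024 * 1024
    return remaining_mib / rate_mib_s / 3600

# Values from the status output above:
# "2.72T scanned out of 9.43T at 72.2M/s"
h = hours_to_go(2.72, 9.43, 72.2)
whole_hours = int(h)
minutes = round((h - whole_hours) * 60)
print(f"{whole_hours}h{minutes}m to go")
```

With these inputs the result is about 27 hours, which agrees with the 27h4m reported by `zpool status`. The estimate only holds while the scan rate stays constant, which it clearly did not in the later outputs.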


But then ZFS started to go nuts:
Code: Select all
Wed May 13 23:21:31 CEST 2015
  pool: zfsraid
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    3.45T scanned out of 9.43T at 71.9M/s, 24h14m to go
    588G resilvered, 36.54% done
config:

   NAME             STATE     READ WRITE CKSUM
   zfsraid          ONLINE       0     0     0
     raidz2-0       ONLINE       0     0     0
       disk9        ONLINE       0     0     0
       disk4        ONLINE       0     0     4  (resilvering)
       disk6        ONLINE       0     0     7  (resilvering)
       disk5        ONLINE       0     0     8  (resilvering)
       replacing-4  ONLINE       0     0     4
         disk2      ONLINE       0     0     0  (resilvering)
         disk10     ONLINE       0     0     0  (resilvering)
       disk7        ONLINE       0     0     0

errors: No known data errors


This eventually resulted in another kernel panic (the subsequent reboot reset the kernel parameters):
Code: Select all
Anonymous UUID:       42E5B6EB-0C42-8F48-CBE4-05F96D8E3409

Thu May 14 00:00:09 2015

*** Panic Report ***
panic(cpu 5 caller 0xffffff8028617cc2): Kernel trap at 0xffffff7fa8d046a7, type 14=page fault, registers:
CR0: 0x000000008001003b, CR2: 0x0000000000000048, CR3: 0x000000002bbf1000, CR4: 0x00000000001626e0
RAX: 0x0000000000000000, RBX: 0xffffff821bd03e60, RCX: 0x0000000000000248, RDX: 0x0000000000000000
RSP: 0xffffff821bd03dd0, RBP: 0xffffff821bd03e90, RSI: 0x0000000000000000, RDI: 0x0000000000000000
R8:  0x0000000000988d64, R9:  0xffffff82072e0068, R10: 0x00027f9cc2458afc, R11: 0x00027f9cc1acfd98
R12: 0xffffff8244f6e218, R13: 0x0000000000000003, R14: 0xffffff8244f6ddc8, R15: 0xffffff8244f6de80
RFL: 0x0000000000010246, RIP: 0xffffff7fa8d046a7, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000000048, Error code: 0x0000000000000000, Fault CPU: 0x5

Backtrace (CPU 5), Frame : Return Address
0xffffff821bd03a80 : 0xffffff802852bda1
0xffffff821bd03b00 : 0xffffff8028617cc2
0xffffff821bd03cc0 : 0xffffff8028634b73
0xffffff821bd03ce0 : 0xffffff7fa8d046a7
0xffffff821bd03e90 : 0xffffff7fa8d04cb1
0xffffff821bd03ec0 : 0xffffff7fa8d419dd
0xffffff821bd03ef0 : 0xffffff7fa8d3ef6f
0xffffff821bd03f50 : 0xffffff7fa8c809a0
0xffffff821bd03fb0 : 0xffffff80286125b7
     Kernel Extensions in backtrace:
        net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7fa8c76000->0xffffff7fa8caafff
        net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7fa8cab000->0xffffff7fa8e5efff
           dependency: com.apple.iokit.IOStorageFamily(2.0)[76E50D45-C97B-3ED1-97C5-94E6E0EB4514]@0xffffff7fa8c47000
           dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7fa8c76000

BSD process name corresponding to current thread: kernel_task

Mac OS version:
14D136

Kernel version:
Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
Kernel UUID: 4B3A11F4-77AA-3D27-A22D-81A1BC5B504D
Kernel slide:     0x0000000028200000
Kernel text base: 0xffffff8028400000
__HIB  text base: 0xffffff8028300000
System model name: MacBookPro10,1 (Mac-C3EC7CD22292981F)

System uptime in nanoseconds: 703261194582040
last loaded kext at 692612093051717: com.apple.driver.AppleUSBCDC   4.3.3b1 (addr 0xffffff7fab5af000, size 20480)
last unloaded kext at 692736590201020: com.apple.driver.AppleUSBCDC   4.3.3b1 (addr 0xffffff7fab5af000, size 16384)
loaded kexts:
com.promise.driver.stex   5.2.10
org.virtualbox.kext.VBoxNetAdp   4.3.16
org.virtualbox.kext.VBoxNetFlt   4.3.16
org.virtualbox.kext.VBoxUSB   4.3.16
org.virtualbox.kext.VBoxDrv   4.3.16
com.delantis.kext.tcpblocknke   4.0.0
net.lundman.zfs   1.3.1
net.lundman.spl   1.3.1
com.Cycling74.driver.Soundflower   1.6.6
com.apple.iokit.SCSITaskUserClient   3.7.5
com.apple.filesystems.smbfs   3.0.1
com.apple.filesystems.udf   2.5
com.apple.driver.AppleHWSensor   1.9.5d0
com.apple.filesystems.autofs   3.0
com.apple.driver.ApplePlatformEnabler   2.2.0d4
com.apple.driver.AGPM   110.19.5
com.apple.driver.X86PlatformShim   1.0.0
com.apple.iokit.IOBluetoothSerialManager   4.3.4f4
com.apple.driver.AppleMikeyHIDDriver   124
com.apple.driver.AppleOSXWatchdog   1
com.apple.driver.AppleMikeyDriver   272.18
com.apple.driver.AppleHDA   272.18
com.apple.driver.AudioAUUC   1.70
com.apple.iokit.IOUserEthernet   1.0.1
com.apple.driver.AppleUpstreamUserClient   3.6.1
com.apple.driver.AppleIntelHD4000Graphics   10.0.6
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.GeForce   10.0.2
com.apple.driver.AppleThunderboltIP   2.0.2
com.apple.driver.AppleHWAccess   1
com.apple.driver.AppleHV   1
com.apple.driver.AppleSMCLMU   2.0.7d0
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.driver.AppleSMCPDRC   1.0.0
com.apple.driver.AppleLPC   1.7.3
com.apple.driver.AppleMuxControl   3.10.22
com.apple.driver.AppleIntelSlowAdaptiveClocking   4.0.0
com.apple.driver.AppleMCCSControl   1.2.11
com.apple.driver.AppleIntelFramebufferCapri   10.0.6
com.apple.driver.AppleUSBTCButtons   240.2
com.apple.driver.AppleUSBTCKeyboard   240.2
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless   1.0.0d1
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib   1.0.0d1
com.apple.BootCache   36
com.apple.driver.XsanFilter   404
com.apple.iokit.IOAHCIBlockStorage   2.7.1
com.apple.driver.AppleUSBHub   705.4.2
com.apple.driver.AppleSDXC   1.6.5
com.apple.driver.AirPort.Brcm4360   930.37.3
com.apple.driver.AppleAHCIPort   3.1.2
com.apple.driver.AppleUSBEHCI   705.4.14
com.apple.driver.AppleUSBXHCI   710.4.11
com.apple.driver.AppleSmartBatteryManager   161.0.0
com.apple.driver.AppleACPIButtons   3.1
com.apple.driver.AppleRTC   2.0
com.apple.driver.AppleHPET   1.8
com.apple.driver.AppleSMBIOS   2.1
com.apple.driver.AppleACPIEC   3.1
com.apple.driver.AppleAPIC   1.7
com.apple.driver.AppleIntelCPUPowerManagementClient   218.0.0
com.apple.nke.applicationfirewall   161
com.apple.security.quarantine   3
com.apple.security.TMSafetyNet   8
com.apple.driver.AppleIntelCPUPowerManagement   218.0.0
com.apple.iokit.IOSCSIParallelFamily   3.0.0
com.apple.driver.AppleThunderboltPCIUpAdapter   2.0.2
com.apple.driver.AppleThunderboltDPOutAdapter   4.0.6
com.apple.iokit.IOUSBMassStorageClass   3.7.2
com.apple.iokit.IOSCSIBlockCommandsDevice   3.7.5
com.apple.kext.triggers   1.0
com.apple.iokit.IOSerialFamily   11
com.apple.driver.DspFuncLib   272.18
com.apple.kext.OSvKernDSPLib   1.15
com.apple.iokit.IOSurface   97.4
com.apple.nvidia.driver.NVDAGK100Hal   10.0.2
com.apple.nvidia.driver.NVDAResman   10.0.2
com.apple.iokit.IOBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.iokit.IOBluetoothFamily   4.3.4f4
com.apple.driver.AppleHDAController   272.18
com.apple.iokit.IOHDAFamily   272.18
com.apple.iokit.IOAudioFamily   203.3
com.apple.vecLib.kext   1.2.0
com.apple.iokit.IOUSBUserClient   705.4.0
com.apple.driver.AppleSMBusPCI   1.0.12d1
com.apple.driver.AppleBacklightExpert   1.1.0
com.apple.driver.AppleGraphicsControl   3.10.22
com.apple.iokit.IOSlowAdaptiveClockingFamily   1.0.0
com.apple.driver.X86PlatformPlugin   1.0.0
com.apple.driver.AppleSMC   3.1.9
com.apple.driver.IOPlatformPluginFamily   5.9.1d7
com.apple.driver.AppleSMBusController   1.0.13d1
com.apple.iokit.IOAcceleratorFamily2   156.14
com.apple.AppleGraphicsDeviceControl   3.10.22
com.apple.iokit.IONDRVSupport   2.4.1
com.apple.iokit.IOGraphicsFamily   2.4.1
com.apple.driver.AppleUSBMultitouch   245.2
com.apple.iokit.IOUSBHIDDriver   705.4.0
com.apple.driver.AppleThunderboltDPInAdapter   4.0.6
com.apple.driver.AppleThunderboltDPAdapterFamily   4.0.6
com.apple.driver.AppleThunderboltPCIDownAdapter   2.0.2
com.apple.driver.AppleUSBMergeNub   705.4.0
com.apple.driver.AppleUSBComposite   705.4.9
com.apple.driver.CoreStorage   471.20.7
com.apple.iokit.IOSCSIArchitectureModelFamily   3.7.5
com.apple.driver.AppleThunderboltNHI   3.1.7
com.apple.iokit.IOThunderboltFamily   4.2.2
com.apple.iokit.IO80211Family   730.60
com.apple.driver.mDNSOffloadUserClient   1.0.1b8
com.apple.iokit.IONetworkingFamily   3.2
com.apple.iokit.IOAHCIFamily   2.7.5
com.apple.iokit.IOUSBFamily   720.4.4
com.apple.driver.AppleEFINVRAM   2.0
com.apple.driver.AppleEFIRuntime   2.0
com.apple.iokit.IOHIDFamily   2.0.0
com.apple.iokit.IOSMBusFamily   1.1
com.apple.security.sandbox   300.0
com.apple.kext.AppleMatch   1.0.0d1
com.apple.driver.AppleKeyStore   2
com.apple.driver.AppleMobileFileIntegrity   1.0.5
com.apple.driver.AppleCredentialManager   1.0
com.apple.driver.DiskImages   396
com.apple.iokit.IOStorageFamily   2.0
com.apple.iokit.IOReportFamily   31
com.apple.driver.AppleFDEKeyStore   28.30
com.apple.driver.AppleACPIPlatform   3.1
com.apple.iokit.IOPCIFamily   2.9
com.apple.iokit.IOACPIFamily   1.4
com.apple.kec.corecrypto   1.0
com.apple.kec.pthread   1
com.apple.kec.Libm   1
Model: MacBookPro10,1, BootROM MBP101.00EE.B07, 4 processors, Intel Core i7, 2.3 GHz, 16 GB, SMC 2.3f36
Graphics: Intel HD Graphics 4000, Intel HD Graphics 4000, Built-In
Graphics: NVIDIA GeForce GT 650M, NVIDIA GeForce GT 650M, PCIe, 1024 MB
Memory Module: BANK 0/DIMM0, 8 GB, DDR3, 1600 MHz, 0x80AD, 0x484D5434314753364D465238432D50422020
Memory Module: BANK 1/DIMM0, 8 GB, DDR3, 1600 MHz, 0x80AD, 0x484D5434314753364D465238432D50422020
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0xEF), Broadcom BCM43xx 1.0 (7.15.166.24.3)
Bluetooth: Version 4.3.4f4 15601, 3 services, 18 devices, 1 incoming serial ports
Network Service: Thunderbolt Ethernet, Ethernet, en4
Network Service: Wi-Fi, AirPort, en0
PCI Card: pci105a,8760, RAID Controller, Thunderbolt@194,0,0
PCI Card: pci105a,8760, RAID Controller, Thunderbolt@10,0,0
Serial ATA Device: APPLE SSD SM256E, 251 GB
USB Device: USB-SATA Bridge
USB Device: Hub
USB Device: FaceTime HD Camera (Built-in)
USB Device: Hub
USB Device: My Book
USB Device: Hub
USB Device: BRCM20702 Hub
USB Device: Bluetooth USB Host Controller
USB Device: Apple Internal Keyboard / Trackpad
Thunderbolt Bus: MacBook Pro, Apple Inc., 23.4
Thunderbolt Device: Pegasus2-R, Promise Technology, Inc., 1, 19.2
Thunderbolt Device: Pegasus R-Series, Promise Technology, Inc., 3, 22.2


After the crash, resilvering continued normally again:
Code: Select all
Thu May 14 00:16:59 CEST 2015
  pool: zfsraid
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    3.53T scanned out of 9.43T at 47.6M/s, 36h6m to go
    595G resilvered, 37.43% done
config:

   NAME                                              STATE     READ WRITE CKSUM
   zfsraid                                           DEGRADED     0     0     0
     raidz2-0                                        DEGRADED     0     0     0
       disk8                                         ONLINE       0     0     0
       disk5                                         ONLINE       0     0     0
       disk6                                         ONLINE       0     0     0
       disk4                                         ONLINE       0     0     0
       replacing-4                                   DEGRADED     0     0     0
         media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
         4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
       disk7                                         ONLINE       0     0     0

errors: No known data errors


But as soon as the CPU came under load, the new disk stopped resilvering and other disks were resilvered instead:
Code: Select all
Thu May 14 00:26:50 CEST 2015
  pool: zfsraid
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    3.75T scanned out of 9.43T at 176M/s, 9h24m to go
    595G resilvered, 39.77% done
config:

   NAME                                              STATE     READ WRITE CKSUM
   zfsraid                                           DEGRADED     0     0     0
     raidz2-0                                        DEGRADED     0     0     0
       disk8                                         ONLINE       0     0     6  (resilvering)
       disk5                                         ONLINE       0     0     0
       disk6                                         ONLINE       0     0     0
       disk4                                         ONLINE       0     0     0
       replacing-4                                   DEGRADED     0     0     0
         media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
         4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
       disk7                                         ONLINE       0     0     2  (resilvering)

errors: No known data errors

Then, at some point, resilvering of the disks actually being replaced continued:
Code: Select all
Thu May 14 00:28:20 CEST 2015
  pool: zfsraid
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    3.79T scanned out of 9.43T at 187M/s, 8h46m to go
    595G resilvered, 40.13% done
config:

   NAME                                              STATE     READ WRITE CKSUM
   zfsraid                                           DEGRADED     0     0     0
     raidz2-0                                        DEGRADED     0     0     0
       disk8                                         ONLINE       0     0     6  (resilvering)
       disk5                                         ONLINE       0     0     0
       disk6                                         ONLINE       0     0     0
       disk4                                         ONLINE       0     0     0
       replacing-4                                   DEGRADED     0     0     4
         media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0  (resilvering)
         4566845490322943837                         UNAVAIL      0     0     0  was /dev/disk10s1
       disk7                                         ONLINE       0     0     2  (resilvering)

errors: No known data errors


But it still touched other disks, too. So I rebooted the system and kept CPU load low. Now it was resilvering both the old and the new disk:
Code: Select all
  pool: zfsraid
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May 12 23:34:27 2015
    3.88T scanned out of 9.43T at 4.80M/s, 336h32m to go
    595G resilvered, 41.17% done
config:

   NAME                                              STATE     READ WRITE CKSUM
   zfsraid                                           ONLINE       0     0     0
     raidz2-0                                        ONLINE       0     0     0
       disk7                                         ONLINE       0     0     0
       disk5                                         ONLINE       0     0     0
       disk6                                         ONLINE       0     0     0
       disk4                                         ONLINE       0     0     0
       replacing-4                                   ONLINE       0     0     4
         media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0  (resilvering)
         disk9                                       ONLINE       0     0     0  (resilvering)
       disk8                                         ONLINE       0     0     0

errors: No known data errors


Just when I thought it was done, I realized that it had happened again: the resilvering was done, but the replacement did not complete.
Exporting and reimporting started another resilvering. I tried several things, including a downgrade, as you can read in this thread (viewtopic.php?f=26&t=2276).

Code: Select all
Thu May 14 18:27:02 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.51T in 42h46m with 0 errors on Thu May 14 18:20:42 2015
config:

   NAME                                              STATE     READ WRITE CKSUM
   zfsraid                                           ONLINE       0     0     0
     raidz2-0                                        ONLINE       0     0     0
       disk7                                         ONLINE       0     0     0
       disk5                                         ONLINE       0     0     0
       disk6                                         ONLINE       0     0     0
       disk4                                         ONLINE       0     0     0
       replacing-4                                   ONLINE       0     0     4
         media-4DFD3E7B-CDA3-CF4C-B364-17B142373380  ONLINE       0     0     0
         disk9                                       ONLINE       0     0     0
       disk8                                         ONLINE       0     0     0

errors: No known data errors


Finally, I upgraded to the 1.3.2 RC1 version again and let the additional resilvering do its job.
Code: Select all
Fri May 22 11:38:23 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 1.57T in 78h9m with 0 errors on Fri May 22 06:42:47 2015
config:

    NAME        STATE     READ WRITE CKSUM
    zfsraid     ONLINE       0     0     0   
      raidz2-0  ONLINE       0     0     0   
        disk9   ONLINE       0     0     0   
        disk2   ONLINE       0     0     0   
        disk7   ONLINE       0     0     0   
        disk6   ONLINE       0     0     0   
        disk10  ONLINE       0     0     0   
        disk8   ONLINE       0     0     0   

errors: No known data errors

Both times that the replacement did not complete even though the resilvering had, I had modified kernel parameters. Maybe there is a correlation, maybe not. I don't know.
But this time (with kernel parameters reset due to the crash/restart) it was successful, and I could offline one disk and online it again with the -e flag set in order to expand the pool, which worked like a charm:
Code: Select all
Fri May 22 11:40:28 CEST 2015
  pool: zfsraid
 state: ONLINE
  scan: resilvered 116K in 0h0m with 0 errors on Fri May 22 11:40:20 2015
config:

    NAME        STATE     READ WRITE CKSUM
    zfsraid     ONLINE       0     0     0   
      raidz2-0  ONLINE       0     0     0   
        disk9   ONLINE       0     0     0   
        disk2   ONLINE       0     0     0   
        disk7   ONLINE       0     0     0   
        disk6   ONLINE       0     0     0   
        disk10  ONLINE       0     0     0   
        disk8   ONLINE       0     0     0   

errors: No known data errors

Code: Select all
Filesystem   Size   Used  Avail Capacity iused     ifree %iused  Mounted on
zfsraid      29Ti   10Ti   18Ti    36% 3995554 39312909752    0%   /Volumes/zfsraid
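
As a plausibility check on the sizes reported by `df`, the usable capacity of a raidz2 vdev can be approximated as the capacity of all disks minus two disks' worth of parity. The sketch below uses a hypothetical helper and ignores metadata, padding and reserved space, so the real figures will be somewhat lower.

```python
def raidz2_usable_tib(num_disks: int, disk_tb: float) -> float:
    """Rough usable capacity of one raidz2 vdev in TiB (hypothetical helper).

    Assumes two disks' worth of parity and no other overhead.
    """
    data_disks = num_disks - 2
    usable_bytes = data_disks * disk_tb * 1e12  # vendor TB = 10^12 bytes
    return usable_bytes / 2**40                 # TiB = 2^40 bytes

print(f"before: {raidz2_usable_tib(6, 2):.1f} TiB")
print(f"after:  {raidz2_usable_tib(6, 8):.1f} TiB")
```

This gives roughly 7.3 TiB before and 29.1 TiB after the expansion, in line with the 7.1Ti and 29Ti shown by `df` at the start and end of this report.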

Crashes while resilvering

Post by erico » Sun Jun 21, 2015 7:33 am

Rattlehead's report was quite illuminating. I wish I had read it before I started, two weeks ago, trying to expand a pool of 4gb raidz disks with 6tb ones. It seems there are some severe problems with resilvering in 1.3.1 and, unfortunately, also in the 1.3.2 RC1 release. I was able to zpool replace one drive and have it resilver with just a few crashes. But the second drive has been absolutely terrible: it crashes every hour with a kernel panic. My pool is unfortunately stuck in a half-resilvered, degraded state, and I don't know how I'll get it out of that.

I'm happy to give more details but here are a couple of crash logs. I've seen no real pattern as to when it crashes.

Any suggestions anyone has would be most welcome!

cheers,
erico

Code: Select all
Anonymous UUID:       F3499F50-9FAB-DC38-31E1-E04168EBCD9C

Sat Jun 20 01:21:56 2015

*** Panic Report ***
panic(cpu 0 caller 0xffffff8008617cc2): Kernel trap at 0xffffff7f88d046a7, type 14=page fault, registers:
CR0: 0x000000008001003b, CR2: 0x0000000000000048, CR3: 0x000000000b2ae000, CR4: 0x0000000000002660
RAX: 0x0000000000000000, RBX: 0xffffff843c1f3e60, RCX: 0x0000000000000248, RDX: 0x0000000000000000
RSP: 0xffffff843c1f3dd0, RBP: 0xffffff843c1f3e90, RSI: 0x0000000000000000, RDI: 0x0000000000000000
R8:  0x000000000097f2f0, R9:  0xffffff8057d63798, R10: 0x0000030e359b7b66, R11: 0x0000030e35038876
R12: 0xffffff8437cf1698, R13: 0x0000000000000003, R14: 0xffffff8437cf1248, R15: 0xffffff8437cf1300
RFL: 0x0000000000010246, RIP: 0xffffff7f88d046a7, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000000048, Error code: 0x0000000000000000, Fault CPU: 0x0

Backtrace (CPU 0), Frame : Return Address
0xffffff843c1f3a80 : 0xffffff800852bda1
0xffffff843c1f3b00 : 0xffffff8008617cc2
0xffffff843c1f3cc0 : 0xffffff8008634b73
0xffffff843c1f3ce0 : 0xffffff7f88d046a7
0xffffff843c1f3e90 : 0xffffff7f88d04cb1
0xffffff843c1f3ec0 : 0xffffff7f88d419dd
0xffffff843c1f3ef0 : 0xffffff7f88d3ef6f
0xffffff843c1f3f50 : 0xffffff7f88c809a0
0xffffff843c1f3fb0 : 0xffffff80086125b7
      Kernel Extensions in backtrace:
         net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f88c76000->0xffffff7f88caafff
         net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7f88cab000->0xffffff7f88e5efff
            dependency: com.apple.iokit.IOStorageFamily(2.0)[76E50D45-C97B-3ED1-97C5-94E6E0EB4514]@0xffffff7f88c47000
            dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f88c76000

BSD process name corresponding to current thread: kernel_task
Boot args: ktext-dev-mode=1

Mac OS version:
14D136

Kernel version:
Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
Kernel UUID: 4B3A11F4-77AA-3D27-A22D-81A1BC5B504D
Kernel slide:     0x0000000008200000
Kernel text base: 0xffffff8008400000
__HIB  text base: 0xffffff8008300000
System model name: MacPro5,1 (Mac-F221BEC8)

System uptime in nanoseconds: 3359553945778
last loaded kext at 20983764025: com.apple.driver.AudioAUUC   1.70 (addr 0xffffff7f89cd3000, size 32768)
last unloaded kext at 232083552930: com.apple.driver.AppleFileSystemDriver   3.0.1 (addr 0xffffff7f8a698000, size 8192)
loaded kexts:
net.lundman.zfs   1.3.1
net.lundman.spl   1.3.1
at.obdev.nke.LittleSnitch   4212
com.highpoint-tech.kext.HighPointRR   4.3.3
com.apple.driver.AudioAUUC   1.70
com.apple.filesystems.autofs   3.0
com.apple.driver.AppleTyMCEDriver   1.0.2d2
com.apple.driver.AGPM   110.19.5
com.apple.iokit.IOBluetoothSerialManager   4.3.4f4
com.apple.driver.AppleOSXWatchdog   1
com.apple.driver.AppleHWSensor   1.9.5d0
com.apple.driver.AppleUpstreamUserClient   3.6.1
com.apple.driver.AppleMCCSControl   1.2.11
com.apple.driver.AppleMikeyHIDDriver   124
com.apple.kext.AMDFramebuffer   1.3.2
com.apple.driver.AppleHDA   272.18
com.apple.iokit.IOUserEthernet   1.0.1
com.apple.driver.AppleMikeyDriver   272.18
com.apple.ATIRadeonX2000   10.0.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.driver.AppleHWAccess   1
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.driver.AppleLPC   1.7.3
com.apple.driver.AppleHV   1
com.apple.kext.AMD4800Controller   1.3.2
com.apple.driver.ACPI_SMC_PlatformPlugin   1.0.0
com.apple.driver.AppleIntelSlowAdaptiveClocking   4.0.0
com.apple.iokit.SCSITaskUserClient   3.7.5
com.apple.driver.XsanFilter   404
com.apple.iokit.IOAHCIBlockStorage   2.7.1
com.apple.driver.AppleFWOHCI   5.5.2
com.apple.driver.AppleUSBHub   705.4.2
com.apple.driver.Intel82574L   2.6.8b1
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless   1.0.0d1
com.apple.driver.AppleUSBXHCI   710.4.11
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib   1.0.0d1
com.apple.BootCache   36
com.apple.driver.AppleAHCIPort   3.1.2
com.apple.driver.AppleUSBEHCI   705.4.14
com.apple.driver.AppleUSBUHCI   656.4.1
com.apple.driver.AppleRTC   2.0
com.apple.driver.AppleHPET   1.8
com.apple.driver.AppleACPIButtons   3.1
com.apple.driver.AppleSMBIOS   2.1
com.apple.driver.AppleACPIEC   3.1
com.apple.driver.AppleAPIC   1.7
com.apple.driver.AppleIntelCPUPowerManagementClient   218.0.0
com.apple.nke.applicationfirewall   161
com.apple.security.quarantine   3
com.apple.security.TMSafetyNet   8
com.apple.driver.AppleIntelCPUPowerManagement   218.0.0
com.apple.kext.triggers   1.0
com.apple.driver.IOBluetoothHIDDriver   4.3.4f4
com.apple.iokit.IOSerialFamily   11
com.apple.iokit.IOUSBUserClient   705.4.0
com.apple.driver.DspFuncLib   272.18
com.apple.kext.OSvKernDSPLib   1.15
com.apple.iokit.IOSurface   97.4
com.apple.iokit.IONDRVSupport   2.4.1
com.apple.iokit.IOUSBHIDDriver   705.4.0
com.apple.driver.AppleSMBusController   1.0.13d1
com.apple.iokit.IOBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.iokit.IOBluetoothFamily   4.3.4f4
com.apple.driver.AppleHDAController   272.18
com.apple.iokit.IOHDAFamily   272.18
com.apple.iokit.IOAudioFamily   203.3
com.apple.vecLib.kext   1.2.0
com.apple.driver.AppleSMBusPCI   1.0.12d1
com.apple.kext.AMDSupport   1.3.2
com.apple.AppleGraphicsDeviceControl   3.10.22
com.apple.iokit.IOGraphicsFamily   2.4.1
com.apple.driver.AppleSMC   3.1.9
com.apple.driver.IOPlatformPluginLegacy   1.0.0
com.apple.driver.IOPlatformPluginFamily   5.9.1d7
com.apple.iokit.IOFireWireIP   2.2.6
com.apple.iokit.IOSlowAdaptiveClockingFamily   1.0.0
com.apple.driver.AppleUSBMergeNub   705.4.0
com.apple.driver.AppleUSBComposite   705.4.9
com.apple.iokit.IOSCSIMultimediaCommandsDevice   3.7.5
com.apple.iokit.IOBDStorageFamily   1.7
com.apple.iokit.IODVDStorageFamily   1.7.1
com.apple.iokit.IOCDStorageFamily   1.7.1
com.apple.driver.CoreStorage   471.20.7
com.apple.iokit.IOFireWireFamily   4.5.6
com.apple.iokit.IOAHCISerialATAPI   2.6.1
com.apple.iokit.IOAHCIFamily   2.7.5
com.apple.iokit.IOUSBFamily   720.4.4
com.apple.iokit.IONetworkingFamily   3.2
com.apple.iokit.IOSCSIParallelFamily   3.0.0
com.apple.iokit.IOSCSIArchitectureModelFamily   3.7.5
com.apple.driver.AppleEFINVRAM   2.0
com.apple.driver.AppleEFIRuntime   2.0
com.apple.iokit.IOHIDFamily   2.0.0
com.apple.iokit.IOSMBusFamily   1.1
com.apple.security.sandbox   300.0
com.apple.kext.AppleMatch   1.0.0d1
com.apple.driver.AppleKeyStore   2
com.apple.driver.AppleMobileFileIntegrity   1.0.5
com.apple.driver.AppleCredentialManager   1.0
com.apple.driver.DiskImages   396
com.apple.iokit.IOStorageFamily   2.0
com.apple.iokit.IOReportFamily   31
com.apple.driver.AppleFDEKeyStore   28.30
com.apple.driver.AppleACPIPlatform   3.1
com.apple.iokit.IOPCIFamily   2.9
com.apple.iokit.IOACPIFamily   1.4
com.apple.kec.corecrypto   1.0
com.apple.kec.Libm   1
com.apple.kec.pthread   1
Model: MacPro5,1, BootROM MP51.007F.B03, 8 processors, Quad-Core Intel Xeon, 2.26 GHz, 37 GB, SMC 1.39f5
Graphics: ATI Radeon HD 4870, ATI Radeon HD 4870, PCIe, 512 MB
Memory Module: DIMM 1, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 2, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 3, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 4, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 5, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 6, 2 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54324743373242385041304E462D424520
Memory Module: DIMM 7, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 8, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Bluetooth: Version 4.3.4f4 15601, 3 services, 27 devices, 1 incoming serial ports
Network Service: Ethernet 1, Ethernet, en0
PCI Card: pci1103,645, RAID Controller, Slot-2
PCI Card: ATI Radeon HD 4870, Display Controller, Slot-1
PCI Card: pci1b21,612, AHCI Controller, Slot-3
PCI Card: pci1b73,1100, USB eXtensible Host Controller, Slot-4
Serial ATA Device: HL-DT-ST DVD-RW GH41N
Serial ATA Device: OCZ-SOLID3, 480.1 GB
Serial ATA Device: ST4000VN000-1H4168, 4 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD40EFRX-68WT0N0, 4 TB
Serial ATA Device: M4-CT512M4SSD2, 512.11 GB
SCSI Device: SCSI Target Device @ 32
USB Device: Composite Device
USB Device: USB3.0 Hub
USB Device: USB2.0 Hub
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: DYMO LabelWriter DUO
USB Device: BRCM2046 Hub
USB Device: Bluetooth USB Host Controller
USB Device: Hub
USB Device: Sony Quick Scroll Mouse (USB)
FireWire Device: built-in_hub, Up to 800 Mb/sec
Thunderbolt Bus:




Code: Select all
Anonymous UUID:       F3499F50-9FAB-DC38-31E1-E04168EBCD9C

Fri Jun 19 23:59:45 2015

*** Panic Report ***
panic(cpu 4 caller 0xffffff8003e17cc2): Kernel trap at 0xffffff7f861d56a7, type 14=page fault, registers:
CR0: 0x000000008001003b, CR2: 0x0000000000000048, CR3: 0x00000000068c0000, CR4: 0x0000000000002660
RAX: 0x0000000000000000, RBX: 0xffffff8437803e60, RCX: 0x0000000000000248, RDX: 0x0000000000000000
RSP: 0xffffff8437803dd0, RBP: 0xffffff8437803e90, RSI: 0x0000000000000000, RDI: 0x0000000000000000
R8:  0x0000000000988fb7, R9:  0xffffff84290c6068, R10: 0x0000078b0377a3bd, R11: 0x0000078b02df1406
R12: 0xffffff8462afa998, R13: 0x0000000000000003, R14: 0xffffff8462afa548, R15: 0xffffff8462afa600
RFL: 0x0000000000010246, RIP: 0xffffff7f861d56a7, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000000048, Error code: 0x0000000000000000, Fault CPU: 0x4

Backtrace (CPU 4), Frame : Return Address
0xffffff8437803a80 : 0xffffff8003d2bda1
0xffffff8437803b00 : 0xffffff8003e17cc2
0xffffff8437803cc0 : 0xffffff8003e34b73
0xffffff8437803ce0 : 0xffffff7f861d56a7
0xffffff8437803e90 : 0xffffff7f861d5cb1
0xffffff8437803ec0 : 0xffffff7f862129dd
0xffffff8437803ef0 : 0xffffff7f8620ff6f
0xffffff8437803f50 : 0xffffff7f861519a0
0xffffff8437803fb0 : 0xffffff8003e125b7
      Kernel Extensions in backtrace:
         net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f86147000->0xffffff7f8617bfff
         net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7f8617c000->0xffffff7f8632efff
            dependency: com.apple.iokit.IOStorageFamily(2.0)[76E50D45-C97B-3ED1-97C5-94E6E0EB4514]@0xffffff7f84447000
            dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f86147000

BSD process name corresponding to current thread: kernel_task
Boot args: ktext-dev-mode=1

Mac OS version:
14D136

Kernel version:
Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
Kernel UUID: 4B3A11F4-77AA-3D27-A22D-81A1BC5B504D
Kernel slide:     0x0000000003a00000
Kernel text base: 0xffffff8003c00000
__HIB  text base: 0xffffff8003b00000
System model name: MacPro5,1 (Mac-F221BEC8)

System uptime in nanoseconds: 8293630115391
last loaded kext at 334768776555: net.lundman.zfs   1.3.1 (addr 0xffffff7f8617c000, size 1781760)
last unloaded kext at 514486319023: com.apple.driver.AppleFileSystemDriver   3.0.1 (addr 0xffffff7f85cab000, size 8192)
loaded kexts:
net.lundman.zfs   1.3.1
net.lundman.spl   1.3.1
at.obdev.nke.LittleSnitch   4212
com.highpoint-tech.kext.HighPointRR   4.3.3
com.apple.driver.AudioAUUC   1.70
com.apple.driver.AppleTyMCEDriver   1.0.2d2
com.apple.driver.AGPM   110.19.5
com.apple.iokit.IOBluetoothSerialManager   4.3.4f4
com.apple.filesystems.autofs   3.0
com.apple.driver.AppleOSXWatchdog   1
com.apple.driver.AppleHWSensor   1.9.5d0
com.apple.driver.AppleMikeyHIDDriver   124
com.apple.driver.AppleUpstreamUserClient   3.6.1
com.apple.driver.AppleMCCSControl   1.2.11
com.apple.iokit.IOUserEthernet   1.0.1
com.apple.kext.AMDFramebuffer   1.3.2
com.apple.driver.AppleHDA   272.18
com.apple.driver.AppleMikeyDriver   272.18
com.apple.ATIRadeonX2000   10.0.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.driver.AppleHWAccess   1
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.driver.AppleHV   1
com.apple.driver.ACPI_SMC_PlatformPlugin   1.0.0
com.apple.driver.AppleLPC   1.7.3
com.apple.kext.AMD4800Controller   1.3.2
com.apple.driver.AppleIntelSlowAdaptiveClocking   4.0.0
com.apple.iokit.SCSITaskUserClient   3.7.5
com.apple.driver.XsanFilter   404
com.apple.iokit.IOAHCIBlockStorage   2.7.1
com.apple.driver.AppleFWOHCI   5.5.2
com.apple.driver.AppleUSBHub   705.4.2
com.apple.driver.Intel82574L   2.6.8b1
com.apple.driver.AppleUSBXHCI   710.4.11
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless   1.0.0d1
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib   1.0.0d1
com.apple.BootCache   36
com.apple.driver.AppleAHCIPort   3.1.2
com.apple.driver.AppleUSBEHCI   705.4.14
com.apple.driver.AppleUSBUHCI   656.4.1
com.apple.driver.AppleRTC   2.0
com.apple.driver.AppleHPET   1.8
com.apple.driver.AppleACPIButtons   3.1
com.apple.driver.AppleSMBIOS   2.1
com.apple.driver.AppleACPIEC   3.1
com.apple.driver.AppleAPIC   1.7
com.apple.driver.AppleIntelCPUPowerManagementClient   218.0.0
com.apple.nke.applicationfirewall   161
com.apple.security.quarantine   3
com.apple.security.TMSafetyNet   8
com.apple.driver.AppleIntelCPUPowerManagement   218.0.0
com.apple.driver.IOBluetoothHIDDriver   4.3.4f4
com.apple.iokit.IOSerialFamily   11
com.apple.kext.triggers   1.0
com.apple.iokit.IOUSBUserClient   705.4.0
com.apple.iokit.IOUSBHIDDriver   705.4.0
com.apple.iokit.IOSurface   97.4
com.apple.driver.DspFuncLib   272.18
com.apple.kext.OSvKernDSPLib   1.15
com.apple.iokit.IONDRVSupport   2.4.1
com.apple.iokit.IOBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.iokit.IOBluetoothFamily   4.3.4f4
com.apple.driver.AppleSMBusController   1.0.13d1
com.apple.driver.AppleHDAController   272.18
com.apple.iokit.IOHDAFamily   272.18
com.apple.iokit.IOAudioFamily   203.3
com.apple.vecLib.kext   1.2.0
com.apple.driver.IOPlatformPluginLegacy   1.0.0
com.apple.driver.IOPlatformPluginFamily   5.9.1d7
com.apple.kext.AMDSupport   1.3.2
com.apple.AppleGraphicsDeviceControl   3.10.22
com.apple.iokit.IOGraphicsFamily   2.4.1
com.apple.driver.AppleSMC   3.1.9
com.apple.iokit.IOSlowAdaptiveClockingFamily   1.0.0
com.apple.driver.AppleSMBusPCI   1.0.12d1
com.apple.iokit.IOFireWireIP   2.2.6
com.apple.driver.AppleUSBMergeNub   705.4.0
com.apple.driver.AppleUSBComposite   705.4.9
com.apple.iokit.IOSCSIMultimediaCommandsDevice   3.7.5
com.apple.iokit.IOBDStorageFamily   1.7
com.apple.iokit.IODVDStorageFamily   1.7.1
com.apple.iokit.IOCDStorageFamily   1.7.1
com.apple.driver.CoreStorage   471.20.7
com.apple.iokit.IOAHCISerialATAPI   2.6.1
com.apple.iokit.IOFireWireFamily   4.5.6
com.apple.iokit.IOAHCIFamily   2.7.5
com.apple.iokit.IONetworkingFamily   3.2
com.apple.iokit.IOUSBFamily   720.4.4
com.apple.iokit.IOSCSIParallelFamily   3.0.0
com.apple.iokit.IOSCSIArchitectureModelFamily   3.7.5
com.apple.driver.AppleEFINVRAM   2.0
com.apple.driver.AppleEFIRuntime   2.0
com.apple.iokit.IOHIDFamily   2.0.0
com.apple.iokit.IOSMBusFamily   1.1
com.apple.security.sandbox   300.0
com.apple.kext.AppleMatch   1.0.0d1
com.apple.driver.AppleKeyStore   2
com.apple.driver.AppleMobileFileIntegrity   1.0.5
com.apple.driver.AppleCredentialManager   1.0
com.apple.driver.DiskImages   396
com.apple.iokit.IOStorageFamily   2.0
com.apple.iokit.IOReportFamily   31
com.apple.driver.AppleFDEKeyStore   28.30
com.apple.driver.AppleACPIPlatform   3.1
com.apple.iokit.IOPCIFamily   2.9
com.apple.iokit.IOACPIFamily   1.4
com.apple.kec.corecrypto   1.0
com.apple.kec.Libm   1
com.apple.kec.pthread   1
Model: MacPro5,1, BootROM MP51.007F.B03, 8 processors, Quad-Core Intel Xeon, 2.26 GHz, 37 GB, SMC 1.39f5
Graphics: ATI Radeon HD 4870, ATI Radeon HD 4870, PCIe, 512 MB
Memory Module: DIMM 1, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 2, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 3, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 4, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 5, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 6, 2 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54324743373242385041304E462D424520
Memory Module: DIMM 7, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 8, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Bluetooth: Version 4.3.4f4 15601, 3 services, 27 devices, 1 incoming serial ports
Network Service: Ethernet 1, Ethernet, en0
PCI Card: ATI Radeon HD 4870, Display Controller, Slot-1
PCI Card: pci1103,645, RAID Controller, Slot-2
PCI Card: pci1b73,1100, USB eXtensible Host Controller, Slot-4
PCI Card: pci1b21,612, AHCI Controller, Slot-3
Serial ATA Device: HL-DT-ST DVD-RW GH41N
Serial ATA Device: OCZ-SOLID3, 480.1 GB
Serial ATA Device: ST4000VN000-1H4168, 4 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD40EFRX-68WT0N0, 4 TB
Serial ATA Device: M4-CT512M4SSD2, 512.11 GB
SCSI Device: SCSI Target Device @ 32
USB Device: Composite Device
USB Device: USB3.0 Hub
USB Device: USB2.0 Hub
USB Device: USB 2.0 Hub [MTT]
USB Device: DYMO LabelWriter DUO
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: BRCM2046 Hub
USB Device: Bluetooth USB Host Controller
USB Device: Hub
USB Device: Sony Quick Scroll Mouse (USB)
FireWire Device: built-in_hub, Up to 800 Mb/sec
Thunderbolt Bus:
 




Code: Select all
 Anonymous UUID:       F3499F50-9FAB-DC38-31E1-E04168EBCD9C

Sun Jun 21 08:08:03 2015

*** Panic Report ***
panic(cpu 8 caller 0xffffff7f95a76b37): "avl_find() succeeded inside avl_add()"@spl-avl.c:632
Backtrace (CPU 8), Frame : Return Address
0xffffff84747eb660 : 0xffffff801532bda1
0xffffff84747eb6e0 : 0xffffff7f95a76b37
0xffffff84747eb710 : 0xffffff7f95b04385
0xffffff84747eb740 : 0xffffff7f95b418d6
0xffffff84747eb790 : 0xffffff7f95b3e98a
0xffffff84747eb7f0 : 0xffffff7f95b05422
0xffffff84747eb8b0 : 0xffffff7f95b418fd
0xffffff84747eb900 : 0xffffff7f95b3e98a
0xffffff84747eb960 : 0xffffff7f95b03b6f
0xffffff84747eb9e0 : 0xffffff7f95b3e98a
0xffffff84747eba40 : 0xffffff7f95addc85
0xffffff84747ebb00 : 0xffffff7f95adb1d1
0xffffff84747ebbe0 : 0xffffff7f95adb8d9
0xffffff84747ebdf0 : 0xffffff7f95af3657
0xffffff84747ebee0 : 0xffffff7f95afb873
0xffffff84747ebfb0 : 0xffffff80154125b7
      Kernel Extensions in backtrace:
         net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f95a76000->0xffffff7f95aaafff
         net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7f95aab000->0xffffff7f95c5efff
            dependency: com.apple.iokit.IOStorageFamily(2.0)[76E50D45-C97B-3ED1-97C5-94E6E0EB4514]@0xffffff7f95a47000
            dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f95a76000

BSD process name corresponding to current thread: kernel_task
Boot args: ktext-dev-mode=1

Mac OS version:
14D136

Kernel version:
Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
Kernel UUID: 4B3A11F4-77AA-3D27-A22D-81A1BC5B504D
Kernel slide:     0x0000000015000000
Kernel text base: 0xffffff8015200000
__HIB  text base: 0xffffff8015100000
System model name: MacPro5,1 (Mac-F221BEC8)

System uptime in nanoseconds: 2238524338386
last loaded kext at 20666916229: com.apple.driver.AudioAUUC   1.70 (addr 0xffffff7f96ad3000, size 32768)
last unloaded kext at 198530681486: com.apple.driver.AppleFileSystemDriver   3.0.1 (addr 0xffffff7f97498000, size 8192)
loaded kexts:
net.lundman.zfs   1.3.1
net.lundman.spl   1.3.1
at.obdev.nke.LittleSnitch   4212
com.highpoint-tech.kext.HighPointRR   4.3.3
com.apple.driver.AudioAUUC   1.70
com.apple.filesystems.autofs   3.0
com.apple.iokit.IOBluetoothSerialManager   4.3.4f4
com.apple.driver.AppleTyMCEDriver   1.0.2d2
com.apple.driver.AGPM   110.19.5
com.apple.driver.AppleOSXWatchdog   1
com.apple.driver.AppleHWSensor   1.9.5d0
com.apple.driver.AppleUpstreamUserClient   3.6.1
com.apple.driver.AppleMikeyHIDDriver   124
com.apple.driver.AppleMCCSControl   1.2.11
com.apple.iokit.IOUserEthernet   1.0.1
com.apple.kext.AMDFramebuffer   1.3.2
com.apple.driver.AppleHDA   272.18
com.apple.driver.AppleMikeyDriver   272.18
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.ATIRadeonX2000   10.0.0
com.apple.driver.AppleHWAccess   1
com.apple.driver.AppleHV   1
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.driver.AppleIntelSlowAdaptiveClocking   4.0.0
com.apple.driver.AppleLPC   1.7.3
com.apple.kext.AMD4800Controller   1.3.2
com.apple.driver.ACPI_SMC_PlatformPlugin   1.0.0
com.apple.iokit.SCSITaskUserClient   3.7.5
com.apple.driver.XsanFilter   404
com.apple.iokit.IOAHCIBlockStorage   2.7.1
com.apple.driver.AppleFWOHCI   5.5.2
com.apple.driver.AppleUSBHub   705.4.2
com.apple.driver.Intel82574L   2.6.8b1
com.apple.driver.AppleUSBXHCI   710.4.11
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless   1.0.0d1
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib   1.0.0d1
com.apple.BootCache   36
com.apple.driver.AppleAHCIPort   3.1.2
com.apple.driver.AppleUSBEHCI   705.4.14
com.apple.driver.AppleUSBUHCI   656.4.1
com.apple.driver.AppleRTC   2.0
com.apple.driver.AppleHPET   1.8
com.apple.driver.AppleACPIButtons   3.1
com.apple.driver.AppleSMBIOS   2.1
com.apple.driver.AppleACPIEC   3.1
com.apple.driver.AppleAPIC   1.7
com.apple.driver.AppleIntelCPUPowerManagementClient   218.0.0
com.apple.nke.applicationfirewall   161
com.apple.security.quarantine   3
com.apple.security.TMSafetyNet   8
com.apple.driver.AppleIntelCPUPowerManagement   218.0.0
com.apple.kext.triggers   1.0
com.apple.iokit.IOSerialFamily   11
com.apple.iokit.IOUSBUserClient   705.4.0
com.apple.iokit.IOSurface   97.4
com.apple.iokit.IOUSBHIDDriver   705.4.0
com.apple.driver.DspFuncLib   272.18
com.apple.kext.OSvKernDSPLib   1.15
com.apple.iokit.IONDRVSupport   2.4.1
com.apple.driver.AppleSMBusController   1.0.13d1
com.apple.iokit.IOBluetoothHostControllerUSBTransport   4.3.4f4
com.apple.iokit.IOBluetoothFamily   4.3.4f4
com.apple.iokit.IOSlowAdaptiveClockingFamily   1.0.0
com.apple.driver.AppleHDAController   272.18
com.apple.iokit.IOHDAFamily   272.18
com.apple.iokit.IOAudioFamily   203.3
com.apple.vecLib.kext   1.2.0
com.apple.driver.AppleSMBusPCI   1.0.12d1
com.apple.kext.AMDSupport   1.3.2
com.apple.AppleGraphicsDeviceControl   3.10.22
com.apple.iokit.IOGraphicsFamily   2.4.1
com.apple.iokit.IOFireWireIP   2.2.6
com.apple.driver.AppleSMC   3.1.9
com.apple.driver.IOPlatformPluginLegacy   1.0.0
com.apple.driver.IOPlatformPluginFamily   5.9.1d7
com.apple.driver.AppleUSBMergeNub   705.4.0
com.apple.driver.AppleUSBComposite   705.4.9
com.apple.iokit.IOSCSIMultimediaCommandsDevice   3.7.5
com.apple.iokit.IOBDStorageFamily   1.7
com.apple.iokit.IODVDStorageFamily   1.7.1
com.apple.iokit.IOCDStorageFamily   1.7.1
com.apple.driver.CoreStorage   471.20.7
com.apple.iokit.IOFireWireFamily   4.5.6
com.apple.iokit.IOAHCISerialATAPI   2.6.1
com.apple.iokit.IOAHCIFamily   2.7.5
com.apple.iokit.IOUSBFamily   720.4.4
com.apple.iokit.IONetworkingFamily   3.2
com.apple.iokit.IOSCSIParallelFamily   3.0.0
com.apple.iokit.IOSCSIArchitectureModelFamily   3.7.5
com.apple.driver.AppleEFINVRAM   2.0
com.apple.driver.AppleEFIRuntime   2.0
com.apple.iokit.IOHIDFamily   2.0.0
com.apple.iokit.IOSMBusFamily   1.1
com.apple.security.sandbox   300.0
com.apple.kext.AppleMatch   1.0.0d1
com.apple.driver.AppleKeyStore   2
com.apple.driver.AppleMobileFileIntegrity   1.0.5
com.apple.driver.AppleCredentialManager   1.0
com.apple.driver.DiskImages   396
com.apple.iokit.IOStorageFamily   2.0
com.apple.iokit.IOReportFamily   31
com.apple.driver.AppleFDEKeyStore   28.30
com.apple.driver.AppleACPIPlatform   3.1
com.apple.iokit.IOPCIFamily   2.9
com.apple.iokit.IOACPIFamily   1.4
com.apple.kec.corecrypto   1.0
com.apple.kec.Libm   1
com.apple.kec.pthread   1
Model: MacPro5,1, BootROM MP51.007F.B03, 8 processors, Quad-Core Intel Xeon, 2.26 GHz, 37 GB, SMC 1.39f5
Graphics: ATI Radeon HD 4870, ATI Radeon HD 4870, PCIe, 512 MB
Memory Module: DIMM 1, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 2, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 3, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 4, 8 GB, DDR3 ECC, 1066 MHz, 0x857F, 0x463732314755363746393333334700000000
Memory Module: DIMM 5, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 6, 2 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54324743373242385041304E462D424520
Memory Module: DIMM 7, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Memory Module: DIMM 8, 1 GB, DDR3 ECC, 1066 MHz, 0x830B, 0x4E54314743373242383941304E462D424520
Bluetooth: Version 4.3.4f4 15601, 3 services, 27 devices, 1 incoming serial ports
Network Service: Ethernet 1, Ethernet, en0
PCI Card: ATI Radeon HD 4870, Display Controller, Slot-1
PCI Card: pci1103,645, RAID Controller, Slot-2
PCI Card: pci1b21,612, AHCI Controller, Slot-3
PCI Card: pci1b73,1100, USB eXtensible Host Controller, Slot-4
Serial ATA Device: HL-DT-ST DVD-RW GH41N
Serial ATA Device: OCZ-SOLID3, 480.1 GB
Serial ATA Device: ST4000VN000-1H4168, 4 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD60EZRX-00MVLB1, 6 TB
Serial ATA Device: WDC WD40EFRX-68WT0N0, 4 TB
Serial ATA Device: M4-CT512M4SSD2, 512.11 GB
SCSI Device: SCSI Target Device @ 32
USB Device: Composite Device
USB Device: USB3.0 Hub
USB Device: USB2.0 Hub
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: USB 2.0 Hub [MTT]
USB Device: DYMO LabelWriter DUO
USB Device: BRCM2046 Hub
USB Device: Bluetooth USB Host Controller
USB Device: Hub
USB Device: Sony Quick Scroll Mouse (USB)
FireWire Device: built-in_hub, Up to 800 Mb/sec
Thunderbolt Bus:
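
The raw addresses in the three backtraces above differ from boot to boot because the kernel slide and the kext load addresses change, which makes the reports hard to compare by eye. Below is a minimal sketch, not an official tool, that maps each return address to the kext listed under "Kernel Extensions in backtrace" and prints the offset from that kext's load address. The function names `parse_panic` and `resolve` are made up for illustration, and the sample text is a shortened excerpt of the third report.

```python
import re

# One backtrace frame: "<frame pointer> : <return address>"
FRAME_RE = re.compile(r"^(0x[0-9a-f]+) : (0x[0-9a-f]+)$")
# One kext range line: "name(version)[UUID]@0xSTART->0xEND"
# "dependency:" lines have no "->END" part and are skipped on purpose.
KEXT_RE = re.compile(
    r"^([\w.\-]+)\([^)]*\)\[[0-9A-Fa-f\-]+\]@(0x[0-9a-f]+)->(0x[0-9a-f]+)$"
)


def parse_panic(text):
    """Return (return_addresses, kext_ranges) found in a panic report."""
    addresses, kexts = [], []
    for raw in text.splitlines():
        line = raw.strip()
        frame = FRAME_RE.match(line)
        if frame:
            addresses.append(int(frame.group(2), 16))
            continue
        kext = KEXT_RE.match(line)
        if kext:
            kexts.append(
                (kext.group(1), int(kext.group(2), 16), int(kext.group(3), 16))
            )
    return addresses, kexts


def resolve(addresses, kexts):
    """Pair each address with (owner, offset); owner is None outside listed kexts."""
    resolved = []
    for addr in addresses:
        owner, offset = None, None
        for name, start, end in kexts:
            if start <= addr <= end:
                owner, offset = name, addr - start
                break
        resolved.append((addr, owner, offset))
    return resolved


# Shortened excerpt of the third panic report in this post.
SAMPLE = """\
0xffffff84747eb660 : 0xffffff801532bda1
0xffffff84747eb6e0 : 0xffffff7f95a76b37
0xffffff84747eb710 : 0xffffff7f95b04385
      Kernel Extensions in backtrace:
         net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f95a76000->0xffffff7f95aaafff
         net.lundman.zfs(1.3.1)[C77DD3AA-EFD4-3799-BB77-8C5BA45AB071]@0xffffff7f95aab000->0xffffff7f95c5efff
            dependency: net.lundman.spl(1.3.1)[45D67A1B-34A8-3171-B495-055C50D8E249]@0xffffff7f95a76000
"""

for addr, owner, offset in resolve(*parse_panic(SAMPLE)):
    if owner is None:
        print(f"{addr:#x}  (not in a listed kext)")
    else:
        print(f"{addr:#x}  {owner} + {offset:#x}")
```

Since all three reports show the same kext UUIDs, identical offsets in two reports should point at the same code location, even though the absolute addresses differ.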
 
erico
 
Posts: 3
Joined: Mon Mar 31, 2014 11:11 am

Re: Replacing all disks of a raidz2 for pool expansion: A re

Postby erico » Mon Jun 22, 2015 6:58 pm

One small update:

After reading one of ilovzfs's posts, I downgraded to 1.3.1RC5. That version seems to be significantly slower, but it isn't crashing! I can't quite understand why this is the case, but I notice that all the sysctl variables differ between 1.3.1RC5 and 1.3.1-r2/1.3.2RC2. Would anyone like a bug report for the resilvering crashes under the later versions? They are quite nasty! Or do the developers already understand what the bug is?
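
For anyone who wants to compare the sysctl variables between two versions, here is a minimal sketch. It assumes the `sysctl` output for each version has been saved as text, one `name: value` (or `name = value`) pair per line. The variable names in the sample are invented placeholders, not real O3X tunables, and `parse_sysctl` and `diff_sysctl` are hypothetical helper names.

```python
def parse_sysctl(text):
    """Parse saved sysctl output into a dict of name -> value strings."""
    values = {}
    for line in text.splitlines():
        # Accept both "name: value" and "name = value"; use whichever
        # separator appears first on the line.
        candidates = [(line.find(sep), sep) for sep in (": ", " = ") if sep in line]
        if not candidates:
            continue
        _, sep = min(candidates)
        name, _, value = line.partition(sep)
        values[name.strip()] = value.strip()
    return values


def diff_sysctl(old, new):
    """Return (added names, removed names, changed name -> (old, new))."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = {
        name: (old[name], new[name])
        for name in sorted(set(old) & set(new))
        if old[name] != new[name]
    }
    return added, removed, changed


# Placeholder dumps; the names below are made up for illustration.
OLD_DUMP = """\
example.zfs.tunable_a: 1
example.zfs.tunable_b: 4096
example.zfs.only_in_old: 0
"""
NEW_DUMP = """\
example.zfs.tunable_a: 1
example.zfs.tunable_b: 8192
example.zfs.only_in_new = 5
"""

added, removed, changed = diff_sysctl(parse_sysctl(OLD_DUMP), parse_sysctl(NEW_DUMP))
print("added:  ", added)
print("removed:", removed)
print("changed:", changed)
```

With real dumps, a long "added" and "removed" list with few "changed" entries would suggest the variables were renamed rather than retuned.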

cheers,
Erico
erico
 
Posts: 3
Joined: Mon Mar 31, 2014 11:11 am

Re: Replacing all disks of a raidz2 for pool expansion: A re

Postby lundman » Wed Jun 24, 2015 4:41 pm

Maybe stay with RC5 until we know better. It could be related to something we have already fixed in master, but as we have yet to push out a new version, it may be simpler for you to remain on RC5.
lundman
 
Posts: 1335
Joined: Thu Mar 06, 2014 2:05 pm
Location: Tokyo, Japan

