zfs hangs, requires reboot, on send/recv

Post by amgems » Fri Jan 04, 2013 10:42 pm

I successfully sent "puddle@move" to "sea". I needed to do this because I added what I thought was an additional mirror, but it turned out to just add it as storage, and I was unable to undo that operation.

Now I am trying to move "sea@move" back to the newly created and correctly mirrored "puddle".

I have tried twice now. Each time it runs briefly and then wedges, with the processes blocked in uninterruptible wait. "restart" will not complete, so a forced reboot is required.

hubris# zfs list
NAME             USED   AVAIL   REFER  MOUNTPOINT
puddle         4.00Gi  2.54Ti  4.00Gi  /zfs/puddle
sea             380Gi  1.42Ti  4.00Gi  /zfs/sea
sea/u           376Gi  1.42Ti   456Ki  /zfs/sea/u
sea/u/8wayUsr   310Gi  1.42Ti   307Gi  /zfs/sea/u/8wayUsr
sea/u/dap      66.2Gi  1.42Ti  64.4Gi  /zfs/sea/u/dap
sea/u/dap/cp   13.2Mi  1.42Ti  10.2Mi  /zfs/sea/u/dap/cp
hubris# zfs destroy -r puddle
hubris# zpool list
NAME      SIZE   ALLOC    FREE     CAP  HEALTH  ALTROOT
puddle  2.59Ti  4.00Gi  2.58Ti      0%  ONLINE  -
sea     1.82Ti   380Gi  1.45Ti     20%  ONLINE  -
hubris# zfs unmount -a ; zfs send -Rv sea@move | zfs receive -v -u -F puddle
sending from @ to sea@2012-12-27-093533
receiving full stream of sea@2012-12-27-093533 into puddle@2012-12-27-093533
sending from @2012-12-27-093533 to sea@2012-12-28-105956
sending from @2012-12-28-105956 to sea@2012-12-29-175752
sending from @2012-12-29-175752 to sea@2013-01-01-194822
sending from @2013-01-01-194822 to sea@2013-01-03-130247
sending from @2013-01-03-130247 to sea@2013-01-04
received 4.01GiB stream in 83 seconds (49.5MiB/sec)
receiving incremental stream of sea@2012-12-28-105956 into puddle@2012-12-28-105956


Here is the current state.

hubris% mount
/dev/disk17 on / (hfs, local, journaled)
devfs on /dev (devfs, local, nobrowse)
/dev/disk16 on /u/dap (hfs, local, nodev, nosuid, journaled)
map -hosts on /net (autofs, nosuid, automounted, nobrowse)
/dev/disk2s2 on /Volumes/Macintosh HD (hfs, local, journaled)
map -static on /home/dap/t410 (autofs, automounted, nobrowse)
/dev/disk15 on /Volumes/space (hfs, local, nodev, nosuid, journaled)
t410:/home/dap on /home/dap/t410 (nfs, nodev, nosuid, automounted, nobrowse)
/dev/disk19 on /zfs/sea (zfs, local, journaled)
hubris# dmesg | tail
zfsx_unmount: '/zfs/sea/u/8wayUsr' (umount)
316407 matches from 282262 files, 34526 directories in 158 sec
search_free_bins: free 2473 bins...
zfsx_unmount: '/zfs/sea/u' (umount)
zfsvfs_teardown: '/zfs/sea/u' (txg_wait_synced in 327 ms)
zfsx_unmount: '/zfs/sea' (umount)
zfsvfs_teardown: '/zfs/sea' (txg_wait_synced in 345 ms)
zfsx_mount: '/zfs/sea'
zfsvfs_teardown: online recv of /zfs/sea
hubris# zpool status
  pool: puddle
 state: ONLINE
 scan: none requested
config:

   NAME                                           STATE     READ WRITE CKSUM
   puddle                                         ONLINE       0     0     0
     mirror-0                                     ONLINE       0     0     0
       GPTE_7429BB89-84AD-4C9F-A26B-0FB8EFF44C26  ONLINE       0     0     0  at disk10s2
       GPTE_BC57718E-591D-4373-8F52-3462AB38393D  ONLINE       0     0     0  at disk12s2

errors: No known data errors

  pool: sea
 state: ONLINE
 scan: none requested
config:

   NAME                                         STATE     READ WRITE CKSUM
   sea                                          ONLINE       0     0     0
     GPTE_03B57AB6-4F37-46D1-8AEB-CE7C1EE5BD61  ONLINE       0     0     0  at disk18s2

errors: No known data errors
hubris# zfs list
NAME             USED   AVAIL   REFER  MOUNTPOINT
puddle         4.00Gi  2.54Ti  4.00Gi  /zfs/sea
sea             380Gi  1.42Ti  4.00Gi  /zfs/sea
sea/u           376Gi  1.42Ti   458Ki  /zfs/sea/u
sea/u/8wayUsr   310Gi  1.42Ti   307Gi  /zfs/sea/u/8wayUsr
sea/u/dap      66.2Gi  1.42Ti  64.4Gi  /zfs/sea/u/dap
sea/u/dap/cp   13.2Mi  1.42Ti  10.2Mi  /zfs/sea/u/dap/cp
hubris# zfs get all puddle &
NAME    PROPERTY              VALUE                  SOURCE
puddle  type                  filesystem             -
puddle  creation              Fri Jan  4 14:04 2013  -
puddle  used                  4.00Gi                 -
puddle  available             2.54Ti                 -
puddle  referenced            4.00Gi                 -
puddle  compressratio         1.00x                  -

... hangs here....
hubris# ps lax | grep zfs
    0   156     1   0  33  0  2490912   1180 -      Us     ??    0:00.02 /System/Library/Filesystems/zfs.fs/Contents/MacOS/zfs_delegate
 1001   449   224   0  33  0  2518748   5344 -      S      ??    0:00.03 /System/Library/Filesystems/zfs.fs/Contents/MacOS/zfs_notifier
    0   623   423   0  31  0  2433316   1384 -      S+   s000    0:03.49 zfs send -Rv sea
    0   624   423   0  31  0  2433316   1400 -      U+   s000    0:03.53 zfs receive -v -u -F puddle
    0   724   715   0  31  0  2433316   1304 -      U    s001    0:00.00 zfs get all puddle
    0   727   715   0  31  0  2433316   1304 -      U    s001    0:00.01 zfs get all sea
    0   757   715   0  31  0  2432768    620 -      R+   s001    0:00.00 grep zfs


I've been getting lots of similar hangs, but this is the first time I had a clean re-creation.

Running 10.8.2, Zevo 1.1.1, build 2012.09.23.

I now need to go and force reboot again....
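The wedged processes in the ps listing above are the ones whose state starts with "U". A minimal sketch for picking them out, assuming ps is asked for just the pid, state and command columns; the function name blocked_procs is made up for illustration, and "D" is included only because it is the equivalent state letter on Linux:

```shell
# Read "PID STAT COMMAND" lines on stdin and keep only the processes whose
# state begins with U (uninterruptible wait on OS X) or D (same idea on Linux).
# The header line is dropped automatically because "STAT" starts with S.
blocked_procs() {
  awk '$2 ~ /^[UD]/ { print }'
}

# Usage on the affected machine (not run here):
#   ps -axo pid,stat,command | blocked_procs
```

Run against the hang above, this should list zfs_delegate, the zfs receive and both zfs get commands, but not the zfs send, which is only sleeping.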

Re: zfs hangs, requires reboot, on send/recv

Post by grahamperrin » Sun Jan 06, 2013 6:22 am

Without being hands on, it sounds like a bus error.

Are any of the three devices on USB?

If you disallow sleep of hard disks (in the Energy Saver pane of System Preferences), can you reproduce the problem?
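The same setting can also be checked and changed from Terminal with pmset. A small sketch, assuming the usual `pmset -g` output in which one line reads "disksleep" followed by a number of minutes; the helper name disksleep_minutes is hypothetical:

```shell
# Read `pmset -g` output on stdin and print the disksleep value in minutes.
# A value of 0 means the hard disks are never put to sleep.
disksleep_minutes() {
  awk '$1 == "disksleep" { print $2 }'
}

# Usage on the Mac (not run here):
#   pmset -g | disksleep_minutes
#   sudo pmset -a disksleep 0     # disable hard disk sleep for all power sources
```

Note that some external USB enclosures spin down on their own timer regardless of this setting.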

Re: zfs hangs, requires reboot, on send/recv

Post by mk01 » Sun Jan 06, 2013 4:53 pm

Do you have the new pool on USB (or FireWire)?

As Graham is asking.

mk

Re: zfs hangs, requires reboot, on send/recv

Post by kerjo » Tue Feb 26, 2013 6:13 am

Having the same issue. It occurs intermittently with error:

kernel[0]: zfs_secpolicy_write_perms: error 89

while sending/receiving an incremental snapshot to a pool on an external USB disk. Hard disk sleep is OFF. Running 10.8.2, Zevo 1.1.1, build 2012.09.23.
Help is greatly appreciated. Thanks

code 89

Post by grahamperrin » Wed Feb 27, 2013 10:20 pm

Code 89 may be coincidental. I see it often but probably never with a requirement to restart the Mac.

Please see zfs_secpolicy_write_perms, 89 (debug messages from kernel).

Re: zfs hangs, requires reboot, on send/recv

Post by kerjo » Thu Feb 28, 2013 3:17 am

The occurrence of error 89 is indeed coincidental. All the more frustrating as I can find no other clue as to why this phenomenon is happening.

Judging from the USB drive indicator light, which stops flashing after a reasonable amount of time, zfs seems to be done with copying but somehow doesn't exit or complete the task.

Any suggestions on where to look further?
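One place to look is whether every snapshot actually arrived on the destination before the command stalled. A rough sketch, assuming the snapshot names of each pool have first been saved to a file; the function name missing_snapshots and the file names are made up for illustration, and the pool names are the ones from the first post:

```shell
# Print snapshots found in the source listing but not in the destination one.
# $1 and $2 are files with one snapshot name per line. The pool name (the text
# before the first '/' or '@') is stripped so that sea/u@x matches puddle/u@x.
missing_snapshots() {
  src_sorted=$(mktemp) && dst_sorted=$(mktemp) || return 1
  sed 's|^[^/@]*||' "$1" | LC_ALL=C sort > "$src_sorted"
  sed 's|^[^/@]*||' "$2" | LC_ALL=C sort > "$dst_sorted"
  LC_ALL=C comm -23 "$src_sorted" "$dst_sorted"   # lines only in the source
  rm -f "$src_sorted" "$dst_sorted"
}

# Usage on the affected machine (not run here):
#   zfs list -H -r -t snapshot -o name sea    > src.txt
#   zfs list -H -r -t snapshot -o name puddle > dst.txt
#   missing_snapshots src.txt dst.txt
```

Empty output would suggest the data transfer finished and only the exit is stuck. Bear in mind that in the hang shown in the first post even `zfs get` blocked, so the `zfs list` calls may need to be run after a reboot.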

attention to the drive

Post by grahamperrin » Fri Mar 01, 2013 4:04 pm

kerjo wrote:… USB drive indicator light which stops flashing …


Wonder whether the drive truly keeps the disk spinning.

What's the make and model?

Re: zfs hangs, requires reboot, on send/recv

Post by shuman » Fri Mar 01, 2013 5:19 pm

As far as I can tell by the stated issue, this is the same issue I'm having here:

http://zevo.getgreenbytes.com/forum/viewtopic.php?f=7&t=2015

Graham made a suggestion about a script to keep drives from spinning down:

http://zevo.getgreenbytes.com/forum/viewtopic.php?t=2017

The script uses software from http://jon.stovell.info/personal/Software.html to keep the drives spinning. I have applied this to all involved disks, but the issue persists.

I am also dealing with external USB drives.
- Mac Mini (Late 2012), 10.8.5, 16GB memory, pool - 2 Mirrored 3TB USB 3.0 External Drives

Re: attention to the drive

Post by mk01 » Sun Mar 03, 2013 9:40 pm

grahamperrin wrote:
kerjo wrote:… USB drive indicator light which stops flashing …


Wonder whether the drive truly keeps the disk spinning.

What's the make and model?


Graham, maybe you remember the topic about kernel panics, where I commented that I was seeing reboots (from the logs or the uptimes of the machines). Because most of the cases occurred while the backup server was inaccessible, I suggested that ZFS might be taking down the system (panic / reboot) during a started zfs send session that had not progressed for a long time (even hours). Surprisingly, the session did not fail but caused a panic.

zfs send is using Ethernet in this case, but the symptoms are the same. I no longer see it as a coincidence or a random case. It now points more to ZEVO itself than to the hardware.

Re: zfs hangs, requires reboot, on send/recv

Post by raattgift » Mon Mar 04, 2013 5:35 am

Install pv from MacPorts (or from http://www.ivarch.com/programs/pv.shtml) and put it in your zfs send/receive pipeline (the options -bratpe -s 400g would be useful) to see whether the pipeline fails at essentially the same byte count each time.

Compare that with "zfs send ... | pv -bratpe -s 400g > /dev/null". Does that finish without hanging or crashing?

Also, try all of the above without giving the replication stream (-R) option to zfs send and without the rollback (-F) option to zfs receive (but with the "-u" leave unmounted option).
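Spelled out, the three variants might look like the sketch below. It only prints the pipelines so that each can be copied into a root shell one at a time. The snapshot and pool names come from the first post; the function name pv_pipelines and the dataset puddle/recvtest are made up for illustration. The separate target is an assumption: receiving a full stream into a filesystem that already exists normally requires -F, so the variant without -F needs a destination that does not exist yet.

```shell
#!/bin/sh
SNAP="sea@move"            # source snapshot, from the first post
DEST="puddle"              # destination pool, from the first post
NEWFS="puddle/recvtest"    # hypothetical, not-yet-existing target for the run without -F
SIZE="400g"                # rough stream size, so pv can show progress and an ETA

# Print the pipelines to try, in order. Nothing is executed here.
pv_pipelines() {
  # 1. The original replication, with pv in the middle to show the byte count.
  echo "zfs send -Rv $SNAP | pv -bratpe -s $SIZE | zfs receive -v -u -F $DEST"
  # 2. Send only, discarding the stream: does zfs send finish on its own?
  echo "zfs send -Rv $SNAP | pv -bratpe -s $SIZE > /dev/null"
  # 3. No -R on the send and no -F on the receive, keeping -u.
  echo "zfs send -v $SNAP | pv -bratpe -s $SIZE | zfs receive -v -u $NEWFS"
}

pv_pipelines
```

If the second pipeline completes while the first hangs at a similar byte count each time, that would point at the receive side rather than the send.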