One bad sector

All your general support questions for OpenZFS on OS X.

Postby Tsur » Mon Jul 20, 2020 12:44 pm

I recently upgraded from 1.7.2 to 1.9.4. Though the upgrade went smoothly, I’ve been experiencing strange slowdowns, or hiccups, while writing files.

Using High Sierra, I’m running ZFS2 with 7 WD 4TB Reds, connected via internal SATA controllers (Marvell 88SE9215). During writes, whether from the network or from internal SSDs, I’ll be cruising along when things suddenly slow down to just a few KB/s. After several seconds, sometimes more than 10, things pick right back up. During large writes, this cycle can occur many times.
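
One way to put numbers on stalls like these is to time fixed-size writes and flag any block whose throughput drops below a floor. The sketch below is only an illustration: the function names, the 1 MiB block size, and the 100 KiB/s threshold are assumptions chosen for the example, not values from this setup.

```python
import os
import time


def timed_write(path, chunks=64, chunk_bytes=1024 * 1024):
    """Write `chunks` blocks to `path`; return the seconds taken per block."""
    # Random data, so a compressing dataset cannot shrink the writes away.
    block = os.urandom(chunk_bytes)
    durations = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            start = time.perf_counter()
            f.write(block)
            f.flush()
            # Ask the OS to flush, so the timing reflects storage, not cache.
            os.fsync(f.fileno())
            durations.append(time.perf_counter() - start)
    return durations


def find_stalls(durations, chunk_bytes=1024 * 1024, min_rate=100 * 1024):
    """Return (block_index, bytes_per_second) for blocks slower than min_rate."""
    stalls = []
    for i, seconds in enumerate(durations):
        rate = chunk_bytes / seconds if seconds > 0 else float("inf")
        if rate < min_rate:
            stalls.append((i, rate))
    return stalls
```

Calling find_stalls(timed_write("/path/on/the/pool/test.bin")) against a file on the pool (the path here is a placeholder) would list which blocks stalled and how slow they were, which makes it easier to compare behavior before and after a change.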

zpool status shows everything is fine, and a scrub was uneventful.

Thinking it’s possibly a failing drive, I ran GSmartControl and discovered one bad sector on one drive. I’ve been monitoring the drive for a couple of weeks, and the bad sector count has remained the same. The “Overall Health Self-Assessment Test” reads “Passed.” Running a "Short Self-Test" reports back “Completed with read failure 10%”.

I do have a replacement drive at the ready, but I don’t want to unnecessarily replace the questionable drive.

So my question is, would one bad sector explain my write hiccups?
Tsur
 
Posts: 22
Joined: Thu Jan 07, 2016 2:11 pm

Re: One bad sector

Postby Sharko » Wed Jul 22, 2020 12:38 pm

Is it possible that one of your drives was purchased fairly recently? Western Digital has, in the last year, started selling WD Red drives incorporating shingled magnetic recording (SMR) technology without informing the purchaser. In fact, people have had to dig around using heuristics such as the drive model number suffix to figure out whether they have an SMR drive. SMR drives do not play nice with ZFS: performance is abysmal, although it usually manages in the end to finish whatever write you're trying to accomplish. Jim Salter on Ars Technica has a good article on SMR drives, and Western Digital's lame-o defense of same.
Sharko
 
Posts: 230
Joined: Thu May 12, 2016 12:19 pm

Re: One bad sector

Postby Tsur » Wed Jul 22, 2020 2:57 pm

None are SMR. Six of them are three years old. One is about two years old.

It seems like one bad sector shouldn't be causing these write slowdowns. But I'm at a loss.
Tsur
 
Posts: 22
Joined: Thu Jan 07, 2016 2:11 pm

Re: One bad sector

Postby JasonBelec » Thu Jul 23, 2020 9:15 am

Actually you will see degradation if a drive is having issues in your pool. ZFS is all about quality and takes hissy fits when things are failing. Have you tried replacing the drive?
JasonBelec
 
Posts: 32
Joined: Mon Oct 26, 2015 1:07 pm

Re: One bad sector

Postby Tsur » Thu Jul 23, 2020 10:46 am

No, I haven't tried. Though I guess that's the next step, and I do have a drive at the ready. I just thought it was unlikely that one bad sector was causing the slowdowns. I've now been watching said drive for about three weeks and the bad sector count remains at one. So it doesn't seem like the drive is getting any worse.

I guess I'll replace the drive and report back.
Tsur
 
Posts: 22
Joined: Thu Jan 07, 2016 2:11 pm

Re: One bad sector

Postby Tsur » Wed Aug 05, 2020 9:13 am

Apparently, one bad sector can cause problems with your setup.
Finally got around to replacing the drive and, after waiting for the resilver, the hiccups are gone.
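
For anyone following along, the replacement comes down to a zpool replace followed by watching zpool status until the resilver finishes. Below is a minimal sketch, assuming a pool named "tank" and placeholder device names (all hypothetical; the actual names are not given in this thread). It only builds the command lines and inspects status text; it does not run anything.

```python
def replace_command(pool, old_device, new_device):
    """Build the argv for `zpool replace <pool> <old_device> <new_device>`."""
    return ["zpool", "replace", pool, old_device, new_device]


def status_command(pool):
    """Build the argv for `zpool status <pool>`."""
    return ["zpool", "status", pool]


def resilver_in_progress(status_text):
    """True while the `scan:` line of `zpool status` reports an active resilver."""
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("scan:"):
            return "resilver in progress" in stripped
    return False


# Hypothetical names, for illustration only.
print(" ".join(replace_command("tank", "disk3", "disk9")))
```

The argv lists can be handed to subprocess.run as they are. Polling the output of the status command through resilver_in_progress is one simple way to know when it is safe to pull the old drive.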

Interestingly, I zeroed out the "bad" drive and it now appears really healthy. The bad sector is now gone and SMART status reports it as a-okay. Does ZFS have a utility or mechanism for repairing bad sectors?
Tsur
 
Posts: 22
Joined: Thu Jan 07, 2016 2:11 pm

Re: One bad sector

Postby Sharko » Wed Aug 05, 2020 12:25 pm

The firmware of the drive has the capability to replace bad sectors with spare sectors kept in reserve on the drive. I believe that the re-allocation occurs when the drive is instructed to write to a sector that has been returning an error, so writing zeroes to the drive is probably what triggered the reallocation. If you run a SMART utility (I like DxDrive from Binary Fruit) you should see that the re-allocated count is now non-zero. Re-allocating sectors from spares to replace known bad sectors is usually taken as an early warning that a drive is failing, however.
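
To check this from the command line rather than a GUI, the attribute table printed by smartctl -A (from smartmontools, which GSmartControl wraps) can be scanned for the relevant counters. The following is a rough sketch, assuming the usual ten-column table layout; the sample rows are made up for illustration and are not from the drive in this thread.

```python
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text, names=WATCHED):
    """Return {attribute_name: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the name is column 2 and the raw value is column 10.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            found[fields[1]] = int(fields[9])
    return found


# Made-up sample in the layout `smartctl -A` prints (not real drive data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       1
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

print(parse_smart_attributes(SAMPLE))
```

A pending count that drops to zero while the reallocated count rises would be consistent with the firmware having remapped the sector during the zeroing pass. If both end up at zero, the sector may simply have been rewritten in place.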
Sharko
 
Posts: 230
Joined: Thu May 12, 2016 12:19 pm

