
The case of the 12TB URE: Explained and debunked

Spoiler: it’s a myth!
URE: Unrecoverable Read Error – an item on the spec sheet of hard drives and the source of a myth. Many claim that reading around 12 TB of data from a consumer-grade HDD will lead to an almost certain (unrecoverable) read error. Data will be lost. Or an entire RAID array. All nonsense, as explained below.

What is the URE?

URE is the acronym for Unrecoverable Read Error. It is also known under these variations of the name:

  • UBER – Unrecoverable Bit Error Rate
  • UBE – Unrecoverable Bit Error
  • UER – Unrecoverable (Bit) Error Rate

It is a hard drive specification that gives the rate of errors when reading data from said hard drive (from now on HDD, for Hard Disk Drive), or the statistical ratio between the number of read errors and the amount of data read. A typical value (in the decade leading up to 2020) is 10^-14 for consumer (read: cheap) drives and 10^-15 or more (err.. less) for enterprise (read: expensive) drives.

Example from the specs of a WD Ultrastar DC HC510 (enterprise class):

Error rate (non-recoverable, bits read): 1 in 10^15

Another name for the unrecoverable read error is:

  • LSE – Latent Sector Error
    Latent, because you don’t know it is there, until you try to read it.
  • bad sector – the colloquial name for the same thing

Those names will be used interchangeably in the following text: bad sector for the state of the sector, and LSE or URE for the event of an error when attempting to read such a sector.

A note about bad sectors:

(added on 31st of January 2022)

There are basically two types of them: soft and hard. Hard bad sectors are physically damaged surface and cannot be fixed. Soft bad sectors are just an error in the sector content; they can be successfully rewritten with good and valid content. Both types return a read error when they are read. See more in [7] https://www.howtogeek.com/173463/bad-sectors-explained-why-hard-drives-get-bad-sectors-and-what-you-can-do-about-it/

The myth: read 12TB of data, get the error

A popular interpretation of the URE spec is this:

If the amount of data you read from a HDD comes close to about 12 TB, an (unrecoverable) read error is imminent, almost certain.

With the corollary:

If that happens during a rebuild of a degraded RAID-5 array, the entire array will be lost.

(The value of 12 TB is for the 1 in 10^14 rate of consumer drives; for enterprise disks rated with a lower error rate it is accordingly higher, so 120 TB for UBER = 1 in 10^15 and 1200 TB for UBER = 1 in 10^16.)
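Where does the 12 TB figure come from? Here is a back-of-the-envelope sketch. It treats the spec naively as an independent per-bit error probability, which, as argued below, is not how real drives behave:

```python
# Naive reading of the URE spec as an independent per-bit error probability.
spec_rate = 1e-14        # consumer drives: 1 error per 10^14 bits read
bits_per_tb = 8e12       # 1 TB = 10^12 bytes = 8 * 10^12 bits

# Data read per expected error under that naive reading:
tb_per_error = 1 / (spec_rate * bits_per_tb)
print(f"{tb_per_error:.1f} TB per expected URE")           # ~12.5 TB

# Even under this naive model, 12 TB of reads gives roughly a 62% chance
# of at least one URE -- far from the near-certainty the myth claims.
p_none = (1 - spec_rate) ** (12 * bits_per_tb)
print(f"P(at least one URE in 12 TB) = {1 - p_none:.2f}")  # ~0.62
```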

For examples of the above claims, see [1], [2] and [3].

The practice

“Do not believe in anything simply because you have heard it. Do not believe in anything simply because it is spoken and rumored by many. Do not believe in anything simply because it is found written in your religious books. Do not believe in anything merely on the authority of your teachers and elders. Do not believe in traditions because they have been handed down for many generations. But after observation and analysis, when you find that anything agrees with reason and is conducive to the good and benefit of one and all, then accept it and live up to it.” – Gautama Buddha

Quite a mouthful. It’s so long that, while reading it, you could have run a long read operation on your HDD and seen for yourself whether a URE happens. Well, almost: reading 12 TB takes about a day on a single HDD, less if several drives are read in parallel.
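(A quick sanity check of the “about a day” estimate; the ~150 MB/s sustained throughput used below is an assumption, not a figure from any spec sheet.)

```python
# Rough time to read 12 TB sequentially from one HDD.
data_bytes = 12e12
throughput = 150e6                 # assumed ~150 MB/s sustained read speed
hours = data_bytes / throughput / 3600
print(f"~{hours:.0f} hours")       # ~22 hours, i.e. about a day
```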

Testing it

So you need a HDD or two and a single day to prove the 12TB URE theory, yet, for some reason, there is virtually no such testimony to be found. The opposite is true. For example, Microsoft ran and documented such a test in 2005 ([4] “Empirical Measurements of Disk Failure Rates and Error Rates”, Microsoft Research Technical Report MSR-TR-2005-166, December 2005) and the result was: “We moved 2 PB through low-cost hardware and saw five disk read error events, several controller failures, and many system reboots caused by security patches.”
Oh, they had 5 read errors!

But they go on: “The drive specifications of UER = 10^-14 suggest we should have seen 112 read errors.” So there were more than 20 times fewer errors than expected. Also, two of the errors were actually recoverable (the application that read the data and computed the checksum got all the data; the OS retried the read and succeeded). That leaves 3 actually unrecoverable errors where 112 were expected, about 37 times fewer. Also, the cheapest test system (office class, running in an office instead of an air-conditioned rack like the others) had only one error: “In the office setting (System 1) we lost one 10GB file in 35,000 tries, and no 100GB files in 7,560 tries.” That is one error in 350 TB read in the first group (10 GB test files) and no errors in 756 TB read in the second group (100 GB test files). Let’s compare that to the myth: 12 TB read -> (almost) certain error. Doesn’t seem so.
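For reference, the derived figures above can be recomputed directly from the numbers quoted in [4]:

```python
# Figures quoted from the Microsoft report [4].
expected_errors = 112              # errors the UER = 10^-14 spec predicts
observed_events = 5                # read error events actually seen
unrecoverable = 3                  # of those, truly unrecoverable

print(expected_errors / observed_events)   # ~22x fewer events than expected
print(expected_errors / unrecoverable)     # ~37x fewer unrecoverable errors

# System 1 (office setting): one lost 10 GB file, no lost 100 GB files.
print(35_000 * 10 / 1000, "TB read in 10 GB files")    # 350 TB, one error
print(7_560 * 100 / 1000, "TB read in 100 GB files")   # 756 TB, no errors
```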

Also, another study [5] found:
None of the clusters showed a correlation between either the number of reads or the number of writes that a drive sees (as reported by the drive’s SMART parameters) and the number of LSEs it develops. (Understanding latent sector errors and how to protect against them. ACM Trans. Storage, September 2010)

And another [6]:
Observation 9. Latent sector errors are not independent of each other. A disk with latent sector errors is more likely to develop additional latent sector errors than a disk without a latent sector error. (An Analysis of Latent Sector Errors in Disk Drives. In Proc. of SIGMETRICS’07, 2007)

Let’s analyze that in the next section:

The theory

“It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.” – Richard P. Feynman

So, what is wrong with the theory of “12 TB read -> (almost) certain error”?

Well, first: correlation does not imply causation. Just because the observed ratio of reads to errors is 1 in 10^14, it doesn’t mean one causes the other. Reading 12 TB does not cause a URE, and a URE does not cause reads. Obviously. If you have a new HDD, read 10 sectors and the 10th is a bad sector, does that mean you must do 12 TB more of (successful) reads to satisfy the ratio of “1 URE : 12 TB read”? Of course not. The reverse holds as well: making reads does not cause UREs. Try this thought experiment: a HDD has X good sectors and Y bad sectors. If you keep reading the good sectors, you drive the error ratio down; if you keep reading the bad ones, you push it up. You can control it! And no, reading the same sector again and again does not increase the likelihood of it becoming bad*. Otherwise your PC would not work, as the boot sector and other system files are read again and again, every day, on the cheapest consumer HDD, while a RAID rebuild reads each sector only once.

* actually, the opposite is true. Reading a sector often is checking it often and if it degrades, the HDD firmware will notice it, remap the sector to a good spare and write the content there. Before it becomes unreadable.
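The thought experiment above can be made concrete with a toy simulation. This is a sketch only, with made-up numbers, and it is not a model of real drive physics; its only point is that the observed error ratio depends on which sectors you read, not on how much you read:

```python
import random

SECTORS = 1_000_000                              # hypothetical small disk
bad = set(random.sample(range(SECTORS), 5))      # 5 fixed bad sectors

def observed_ratio(reads):
    """Fraction of read attempts that hit a bad sector."""
    errors = sum(1 for s in reads if s in bad)
    return errors / len(reads)

good = [s for s in range(SECTORS) if s not in bad]
print(observed_ratio(good))             # 0.0   -- keep reading good sectors
print(observed_ratio(list(bad) * 100))  # 1.0   -- keep re-reading bad ones
print(observed_ratio(range(SECTORS)))   # 5e-06 -- read everything once
```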

Anyway, per [5] as quoted above, there is not even a correlation between the amount of data read and the number of read errors.

What if the LSE is already there?

Yeah, what if one disk in the RAID fails and another already has at least one LSE (bad sector)? Well, if it is RAID-5 then you have a problem, since it protects against one failing disk, not two. But this is not caused by, or even correlated with, the array size, as explained before. It can just as well happen with allegedly safe array sizes, like 1 TB or less.

(The solution to the above scenario is scrubbing: regularly check the disk surface and proactively fix any read errors found. And replace the failing HDD, as those errors show strong locality: where there is one, there will soon be another.)

Summary

“Just to be clear I’m talking about a failed drive (i.e. all sectors are gone) plus an URE on another sector during a rebuild. With 12 TB of capacity in the remaining RAID 5 stripe and an URE rate of 10^14, you are highly likely to encounter a URE.” – Robin Harris [1]

No, you are not. First, the author ignores the fact that the failed drive already amounts to 1 TB or so of unreadable data, so there is no “need” for one more URE to “keep up with” the specced “one in 12 TB” URE ratio. Second, as explained above, there is no correlation between the amount of data read and the number of UREs.

If anyone disagrees, feel free to post a video of this URE (or a link to existing research that confirms it). After all, according to the myth, you just need a HDD and 24 hours (much less with a RAID that runs drives in parallel). You do have a HDD and a day, right?

Note: there is a discussion and comments about this article on reddit.

References

[1] Robin Harris. Why RAID 5 stops working in 2009. ZDnet Storage Bits. July 18, 2007 https://www.zdnet.com/article/why-raid-5-stops-working-in-2009

[2] Adam Leventhal. Triple-parity RAID and beyond. Commun. ACM 53, 58-63, December 2009. doi:10.1145/1629175.1629194
https://queue.acm.org/detail.cfm?id=1670144

[3] Scott Alan Miller. When No Redundancy Is More Reliable – The Myth of Redundancy. SMB IT Journal. May 2012
https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/

[4] Jim Gray, Catharine van Ingen. Empirical Measurements of Disk Failure Rates and Error Rates. Microsoft Research Technical Report MSR-TR-2005-166, December 2005 https://www.microsoft.com/en-us/research/wp-content/uploads/2005/12/tr-2005-166.pdf

[5] Bianca Schroeder, Sotirios Damouras, and Phillipa Gill. 2010. Understanding latent sector errors and how to protect against them. ACM Trans. Storage 6, 3, Article 9 (September 2010) https://www.usenix.org/legacy/event/fast10/tech/full_papers/schroeder.pdf

[6] L. N. Bairavasundaram, G. R. Goodson, S. Pasupathy, and J. Schindler. An Analysis of Latent Sector Errors in Disk Drives. In Proc. of SIGMETRICS’07, 2007. https://research.cs.wisc.edu/adsl/Publications/latent-sigmetrics07.pdf

[7] Chris Hoffman. Bad Sectors Explained: Why Hard Drives Get Bad Sectors and What You Can Do About It. How-to Geek. July 5, 2017 https://www.howtogeek.com/173463/bad-sectors-explained-why-hard-drives-get-bad-sectors-and-what-you-can-do-about-it/

Published on 25 August 2020


12 thoughts on “The case of the 12TB URE: Explained and debunked”

      1. Actually it kind of shows the bias. You are repeating something verified to be wrong. Like the read info. UREs are not based on straight reads and one of the STOCK tricks of people trying to refute UREs is to pretend that they will happen in contrived reads the same as real reads over time. This is fake and your approach shows that you are attempting to replicate the same smoke and mirrors that we see all the time. This is how vendors show high availability of storage systems in demos that don’t act reliably in production. You can make a demo appear to mimic a real world failure because it sounds reasonable. But in production we don’t have fresh drives being read sequentially all at once. We have drives being read randomly over a long period of time with a large degree of environmental factors. Unless you’ve accurately replicated that over a giant number of drives, any attempt to claim proof is just a lie. You provide a lot of trite quotes, but quotes are for marketing, not math.


scottntg wrote (it seems I cannot reply to that message directly, a WordPress limitation I guess):

        > Unless you’ve accurately replicated that over a giant number of drives, any attempt to claim proof is just a lie.

        There is a link in the article about a research over a giant number of drives (quote: “In total the collected data covers more than 1.5 million drives ” … “over a period of 32 months”) that was published in a peer reviewed scientific journal.

        Is that close enough?


  1. A mistake in thinking can be seen here. How great the influence is would still have to be examined. The error is that bad sectors and read errors are equated in this theory.

    Another error is to assume that the existing bad sectors of an HDD are present at the factory. It is clear that they may be present at the factory. Even if the sector of an HDD is not damaged, an unrecoverable bit rot can occur.

Otherwise the reallocated sectors and C5 and C6 would never increase after the first full write and read, but they do after a long time of operation.

    I think your debunking has been debunked.


    1. Hi,
      thank you for your comment!

      > The error is that bad sectors and read errors are equated in this theory.

The term “bad sector” is introduced as a better-known name for “sectors that return an error when read”.

      If we take a strict definition of “bad sector” as “permanently (physically) damaged sector, incapable of storing user data” then they are of course not equivalent. While every bad sector causes a read error, not all read errors are due to bad sectors. Also “bad sector” is a state, while “read error” is an event.

      > Another error is to assume that the existing bad sectors of an HDD are present at the factory.

Are you referring to the part “If you have a new HDD, read 10 sectors and the 10th is a bad sector”?
It does not assume the bad (or rather unreadable) sector existed at the factory. It can be a 2-year-old HDD that was unpacked today and the sector was damaged right now, during unpacking. It has no bearing on the story when the sector went bad.

      If the term “bad sector” is misleading, I can replace it with “(permanently) unreadable sector” or similar.


  2. Thanks for this! It’s good to see an article that takes a critical look at this myth. The specs for many enterprise drives include a 550TB/year workload. That’s a lot more than 12TB or even 120TB.

The “Why RAID 5 stops working in 2009” article is particularly bad. It claims that an error in a “12TB stripe is highly likely … almost certain”. Even if you were definitely going to get an error in 12TB of reading, rebuilding a 2TB drive could not require reading more than 2TB!


  3. Another interesting article:

    http://www.raidtips.com/raid5-ure.aspx “Bonus Tip – Unrecoverable Errors in RAID 5”

    These calculations are based on somewhat naive assumptions, making the problem look worse than it actually is. The silent assumptions behind these calculations are that:

    – read errors are distributed uniformly over hard drives and over time,
    – a single read error during the rebuild kills the entire array.

    Neither of these is true, which makes the result useless.

    I found the article mentioned here: https://www.high-rely.com/2012/08/13/why-raid-5-stops-working-in-2009-not/


  4. It’s weird that you talk about demonstrating this effect, and then act like we don’t demonstrate it all the time. Anyone who has run legacy drives in production with URE exposure sees this all the time. It’s BECAUSE it’s so commonly observed that it is so well known. I’m baffled by how easy it seems to be to claim that something so common and often observed first hand by anyone with any IT exposure doesn’t exist as if people don’t witness it all the time. Then make an example that doesn’t fit the case, like a single drive read all at once, and try to say that the URE average rate can be observed by a single example.

    This is like taking the failure rate of engines and using a single engine to examine it and assuming that the results of a single case can only come from the top of the bell curve. We don’t even have a reason to believe that there is a bell curve! Everything about this approach is wrong.

    URE are not uniform and no one suggests that they are. Storage is not something you measure with a single drive, but with thousands of drives. You need to start with basic statistics before trying to write an article about statistical failure rates.


    1. > you talk about demonstrating this effect, and then act like we don’t demonstrate it all the time.

      Link please and the case is closed.

      > Then make an example that doesn’t fit the case, like a single drive read all at once, and try to say that the URE average rate can be observed by a single example.

The myth talks about a very specific scenario: rebuilding a RAID-5 array, which is literally “a single drive read all at once”, another “single drive read all at once”, and a third drive being written to. And the myth claims the URE in this specific scenario is next to certain. So it can be observed.

Also, regarding “that the URE average rate can be observed by a single example”: I don’t claim that. I claim that by reading 120 TB and not seeing a single URE, you can cast serious doubt on the 1-URE-per-12TB claim.

      > Storage is not something you measure with a single drive, but with thousands of drives.

Sure, that’s why I refer to scientific research done on more than a million drives.

