Backblaze statistics 2Q/2021

I imported the Backblaze (BB) data into my data model.

Some open questions:
- Again, their datasets are messy, e.g. negative values for some disk capacities (some from HGST) and blank SMART values (even without temperature records).
- Again, an Unexpected Power Loss Count (ID 174) for SSDs (Seagate Barracudas)? I don't get it. Why?
- Again, they have 108 undefined "Seagate SSD" units in operation. No details.
- On June 21st the max temperature of 55 HDDs reached 86 °C! It was a mix of different capacities, different classes (desktop/enterprise) and different form factors (3.5"/2.5").
- They keep the maximum HDD temperature at 53 °C (median of the max temperatures), which is too high for a data centre.
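
The checks behind these questions can be reproduced with a few lines of Python. This is a minimal sketch, not the data model used in this post: it assumes the public Backblaze daily-CSV column names (`capacity_bytes`, `smart_194_raw` for temperature), and the sample rows and the helper name `scan` are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Made-up sample rows in the Backblaze daily-CSV layout (illustrative values only).
SAMPLE = """date,serial_number,model,capacity_bytes,failure,smart_194_raw
2021-06-20,A1,HGST HUH721212ALE600,12000138625024,0,31
2021-06-21,A1,HGST HUH721212ALE600,-1,0,33
2021-06-21,B2,ST12000NM0008,12000138625024,0,
2021-06-21,C3,ST4000DM000,4000787030016,0,53
"""

def scan(rows):
    """Return (negative-capacity rows, blank-temperature rows, median of per-drive max temps)."""
    bad_capacity, blank_temp = [], []
    max_temp = defaultdict(lambda: float("-inf"))
    for row in rows:
        key = (row["date"], row["serial_number"])
        cap = row["capacity_bytes"].strip()
        if cap and int(cap) < 0:
            bad_capacity.append(key)      # e.g. the -1 capacity records
        temp = row["smart_194_raw"].strip()
        if not temp:
            blank_temp.append(key)        # no temperature reported that day
            continue
        serial = row["serial_number"]
        max_temp[serial] = max(max_temp[serial], float(temp))
    # Median over drives of each drive's maximum temperature.
    med = median(max_temp.values()) if max_temp else None
    return bad_capacity, blank_temp, med

bad_capacity, blank_temp, median_of_max = scan(csv.DictReader(io.StringIO(SAMPLE)))
print(bad_capacity)    # [('2021-06-21', 'A1')]
print(blank_temp)      # [('2021-06-21', 'B2')]
print(median_of_max)   # 43.0
```

Drives that never report a temperature are left out of the median rather than counted as zero, which is a design choice of this sketch.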

I will continue with the Median failure index for disk-to-disk comparisons (not finished here). In this model, you can look at the complete context of individual events.

This is one of the reasons why you cannot blindly look at interpretations of data unless the author/source provides you with the context or more details.

As usual, I tried to contact the author directly, and again got no response.

1633447370924.png


HUS (helium-filled) drives (now the WDC HC5xx series) performed well! 90% of them are 12 TB and 14 TB models:
1633448732896.png


EXOS drives are at a similar level (75.23K HDDs there), with almost 2.2x more devices in operation than the HUS drives:
1633448894690.png


But without the specific operating conditions (workload, FS, BB custom pool levels, ...), no one except the architect of the data centre can build a clean picture of the reliability.
 

Interesting, good info. I am about to get new higher-capacity drives. I was going to AVOID helium-filled drives, as (1) helium escapes over time, leading to failure, and (2) it's impossible to repair/recover them if they fail... but is that not the case anymore?
 
Helium-filled HDD
Helium is about 7x lighter (less dense) than air, and this is the basis for using helium in HDDs.

Warning: This is a long shot. Useful only for those who want to be more than followers.

Pros:
- Helium creates less drag and turbulence when the HDD platters spin = less noise
- Less turbulence allows squeezing tracks closer together = more data tracks per disk = more data per HDD
- Thinner disks = more disks = more data per HDD
- Thinner disks + less drag = less power to spin
- Higher thermal conductivity of helium vs air = less overheating
- "Sufficiently" sealed drives keep helium in and contaminants out, unlike standard (air-filled) HDDs
- Helium-filled HDDs are therefore recommended for high-altitude operating environments

Cons:
- You can't repair HW parts (stuck heads, ...) of a He-filled HDD in a common domestic environment, unlike a standard HDD (open, repair, close)
- He leakage will be the biggest cause of major damage to the internal parts of the HDD = all the advantages named above turn into disadvantages

SMART ID 22 is the indicator of He leakage from the HDD.
The theory behind He leakage detection (what it means) is described in US patent #434987:
TL;DR:
A method to detect helium leakage from a disk drive enclosure is disclosed and claimed. A measurement electrical current is passed through a temperature sensor disposed within the disk drive enclosure. A reference electrical resistance corresponds to a reference temperature of the temperature sensor. A heating electrical current is passed through the temperature sensor. A heated electrical resistance of the temperature sensor, corresponding to a heated temperature of the temperature sensor that exceeds the reference temperature by at least 5° C., is determined. A value that corresponds to a quantity of helium within the disk drive enclosure is determined based on the reference electrical resistance and heated electrical resistance.
However, nowhere on the official HDD vendor sites (or associated sites) have I found what exactly a specific value in ID 22 means.

I have followed the data from Backblaze (BB) for a long time (I am interested in storage media from every fundamental point of view, because it is part of my education). According to the source of the dataset, BB (nothing has changed since 2018):
We have both HGST and Seagate helium-filled hard drives, but only the HGST drives currently report the SMART 22 attribute.
+ my note: they also have Toshiba He-filled HDDs in operation.
I can read SMART ID 22 only from the HGST/WDC models.
Note: SMART ID 22 is not even included among the most critical indicators that BB monitors.

He filled HDD in BB operation (Q2/2021 dataset):

model                 vendor           model name  model type
ST10000NM0086         Seagate          Exos        X10
ST12000NM0007         Seagate          Exos        X12
ST12000NM0008         Seagate          Exos        X14
ST14000NM0018         Seagate          Exos        X14
ST14000NM0138         Seagate          Exos        X14
ST10000NM001G         Seagate          Exos        X16
ST12000NM001G         Seagate          Exos        X16
ST14000NM001G         Seagate          Exos        X16
ST16000NM001G         Seagate          Exos        X16
ST16000NM005G         Seagate          Exos        X16
ST18000NM000J         Seagate          Exos        X18
TOSHIBA HDWE160       Toshiba          X300
TOSHIBA HDWF180       Toshiba          X300
TOSHIBA MG07ACA14TA   Toshiba          MG07
TOSHIBA MG07ACA14TEY  Toshiba          MG07
TOSHIBA MG08ACA16TA   Toshiba          MG08
TOSHIBA MG08ACA16TEY  Toshiba          MG08
HGST HUH721010ALE600  Western Digital  Ultrastar   DC HC510
HGST HUH721212ALE600  Western Digital  Ultrastar   DC HC510
HGST HUH721212ALE604  Western Digital  Ultrastar   DC HC510
HGST HUH721212ALN604  Western Digital  Ultrastar   DC HC510
HGST HUH728080ALE600  Western Digital  Ultrastar   DC HC510
WDC WUH721414ALE6L4   Western Digital  Ultrastar   DC HC510
WDC WUH721816ALE6L0   Western Digital  Ultrastar   DC HC510
WDC WUH721414ALE6L4   Western Digital  Ultrastar   DC HC530
WDC WUH721816ALE6L0   Western Digital  Ultrastar   DC HC530

ALL Helium-filled HDDs in BB operations. Filter: ALL Helium filled (2Q/2021 dataset source):

1633683823919.png

Based on Distinct Serial No, they have:
66.5% of ALL drives (HDD+SSD) are He-filled HDDs: 128.11K of 192.63K
or
98.7% of ALL HDDs are He-filled HDDs: 190.06K of 192.63K
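
The two percentages can be re-derived from the counts quoted above. A minimal sketch: the helper name `share` and the constant names are made up for illustration, and the counts are simply the figures from this post (in thousands of distinct serial numbers).

```python
def share(part_k, total_k):
    """Percentage of the total, rounded to one decimal place."""
    return round(100.0 * part_k / total_k, 1)

TOTAL_K = 192.63    # denominator quoted in both lines above
FIRST_K = 128.11    # numerator of the first line
SECOND_K = 190.06   # numerator of the second line

print(share(FIRST_K, TOTAL_K))    # 66.5
print(share(SECOND_K, TOTAL_K))   # 98.7
```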
Note: this is interesting because in their 2Q/2021 evaluation BB stated that they have up to 177,935 disks in operation. I have written to them many times that I don't like the regular discrepancies between the results in their official blog and what they export/publish as RAW datasets. I often encounter the problem that someone lets DB analysts query the data unsupervised, without knowing the context. The quality of their datasets also shows that they are far from following data taxonomy rules.
Just for comparison, a copy of their report from the BB Blog:
Q2-2021-Quarterly-All-Drives.jpg

In their dataset for Q2/2021 they have 3254 unique HGST HMS5C4040ALE640 drives (first row in their table),
but in their blog evaluation (table above) they have only 3209 drives.
Source:
Can someone explain to me why they would publish a different dataset than the one they use for the Blog? What is the reason?
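
Anyone who wants to check such a gap can count distinct serial numbers per model in two ways: over the whole quarter, and on a single day. A minimal sketch with made-up rows and hypothetical helper names; whether the blog table is based on a single-day snapshot is an assumption here, not something the dataset confirms.

```python
from collections import defaultdict

# Made-up rows: (date, serial_number, model). Real daily files carry many more columns.
ROWS = [
    ("2021-04-01", "S1", "HGST HMS5C4040ALE640"),
    ("2021-04-01", "S2", "HGST HMS5C4040ALE640"),
    ("2021-06-30", "S1", "HGST HMS5C4040ALE640"),
    ("2021-06-30", "S3", "HGST HMS5C4040ALE640"),
]

def distinct_over_period(rows):
    """Distinct serial numbers per model across all rows given."""
    seen = defaultdict(set)
    for _date, serial, model in rows:
        seen[model].add(serial)
    return {model: len(serials) for model, serials in seen.items()}

def distinct_on_day(rows, day):
    """Distinct serial numbers per model on one specific day."""
    return distinct_over_period(r for r in rows if r[0] == day)

print(distinct_over_period(ROWS))           # {'HGST HMS5C4040ALE640': 3}
print(distinct_on_day(ROWS, "2021-06-30"))  # {'HGST HMS5C4040ALE640': 2}
```

In the sample, a drive that was present only at the start of the period raises the whole-period count above the single-day count.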

BB dataset, filtered by He-filled HDDs and by non-blank ID 22 events:
- 34.29K HDDs (distinct SerialNo.) = 18% of ALL He-filled HDDs in operation (all of them carry a Failure flag = 0 (No) or 1 (Yes))
- Ultrastar models only, none older than 3.6 years (based on the Power on hours ID)
- you can see their Capacities, Models (Part numbers), ....

1633685292999.png
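
A filter of this kind can be sketched in plain Python. It assumes the public Backblaze column names `smart_22_raw` (helium level) and `smart_9_raw` (power-on hours); the rows, the SMART values and the helper name `drives_reporting_smart22` are made up for illustration, and a year is simplified to 8,760 hours.

```python
HOURS_PER_YEAR = 24 * 365  # simplification: ignores leap days

# Made-up daily rows; only the columns used below are shown.
ROWS = [
    {"serial_number": "H1", "model": "HGST HUH721212ALE600", "failure": "0",
     "smart_9_raw": "17520", "smart_22_raw": "100"},
    {"serial_number": "H1", "model": "HGST HUH721212ALE600", "failure": "0",
     "smart_9_raw": "17544", "smart_22_raw": "100"},
    {"serial_number": "E1", "model": "ST12000NM0008", "failure": "0",
     "smart_9_raw": "8760", "smart_22_raw": ""},
    {"serial_number": "W1", "model": "WDC WUH721414ALE6L4", "failure": "1",
     "smart_9_raw": "4380", "smart_22_raw": "100"},
]

def drives_reporting_smart22(rows):
    """Map serial -> (model, max age in years, failed?) for rows with a non-blank SMART 22."""
    out = {}
    for r in rows:
        if not r["smart_22_raw"].strip():
            continue  # drive does not report ID 22 on this day
        hours = r["smart_9_raw"].strip() or "0"
        age = float(hours) / HOURS_PER_YEAR
        model, prev_age, prev_failed = out.get(r["serial_number"], (r["model"], 0.0, False))
        out[r["serial_number"]] = (model, max(prev_age, age), prev_failed or r["failure"] == "1")
    return out

drives = drives_reporting_smart22(ROWS)
print(sorted(drives))    # ['H1', 'W1']
print(drives["W1"])      # ('WDC WUH721414ALE6L4', 0.5, True)
```

Keying the result by serial number gives the distinct-drive count directly as `len(drives)`.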


After filtering by Failure Flag = 1, you get a total of just 28 failures (ID 22) in Q2/2021, all for HDDs with Power on Hours within 0-2.5 years.
- 15 of the failures were for drives not older than 1 year, i.e. 54% of all failures (which supports the experience that most HDD failures occur in the first year of operation)
- no relation to overheating (max temp was about 40 °C)
- 66.67% of the failures were related to the 14 TB capacity ... WUH721414ALE6L4 (DC HC530)
more here:
1633685952505.png
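
The first-year share can be checked with simple arithmetic. A minimal sketch: the helper name `first_year_share` and the list of ages are made up, and only the 15-of-28 figure comes from the numbers above.

```python
def first_year_share(failure_ages_years):
    """Return (failures at <= 1 year, their share of all failures in whole percent)."""
    first_year = sum(1 for age in failure_ages_years if age <= 1.0)
    return first_year, round(100.0 * first_year / len(failure_ages_years))

# Check of the figure quoted above: 15 of 28 failures within the first year.
print(round(100.0 * 15 / 28))                      # 54

# Made-up ages (years of power-on time at failure), to show the bucketing.
print(first_year_share([0.2, 0.8, 1.5, 2.4]))      # (2, 50)
```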



Conclusion:
There is no fundamental association between He leakage and the Power on hours indicator for the HDDs within the dataset, except for a few ID 22 events on a negligible number of the disks in operation.
Of course, a huge amount of data from other disks is missing. Seagate also does not provide data. Since it is not at all clear what those numbers mean, there is no need to worry about He leaking from the disks. For now.

Cheers.
-- post merged: --

Just one last note: BB does not specify anywhere in the datasets which disks are used for booting. It is more than likely that these are not the ones with a capacity above 250 GB.
 
It would be interesting to get information on which disks were used in the same Pool.
And then compare how their values influence each other (how I hate that term).
The same goes for workload performance.
 
Thanks for looking at the data. So, we can't really tell right now, correct? I was looking to find the largest non-He-filled drive, which seems to be the WD Red 10TB, but I have to match the part# to be sure.

Is it worth digging to find one of those? Or do He-filled 12TB drives seem to be doing OK (albeit without much long-term data)? My 6TB drives are a few years old and still good.
 
