DSM 7.1 introduces new drive vendor-locking parameters

Hi all, herewith my first post here.

Well after some research, I had decided the DS920+ (or hopefully an upgraded DS922+ ?) would be the best NAS for me. I am a home user with a LOT of photos, high res scans and terabytes of videos spread across various desktops that need (for the first time) full RAID, proper backups, exposure to the internet etc. I am "just ok" with the cost totalling up the unit plus four Ironwolf drives and find the DSM 7 software suite very attractive.

However this thread puts the wind up me. If Synology turn around and de-legitimise other brands of hard drives, I could be stuck up a dry creek with a very expensive but largely useless paddle. For now I will have to take a watching brief and wait to see whether a "22" unit arrives and whether further changes are made to make life difficult/impossible for those seeking a powerful solution with open architecture and justifiable cost by using third party approved drives. The same argument goes for RAM modules, although that seems more fraught even now with very few documented third party options that actually work 100% (although I am not sure my use cases need a RAM upgrade).

So for now my plans are on hold (I had been on the point of purchase), and I will have to keep supporting some pretty ancient Windows hardware, with backups spread over a legion of portable consumer external drives.

cheers mobbus
 
I used two different lines of escalation:

1. Official support
Answer:
Usually, the SMART values are only read out of the manufacturer's EPROM and database. So you could contact the manufacturer and ask why this happened.

The lifespan of an SSD is tracked by the drive, and some manufacturers show this value as a percentage remaining or as hours used.
My point, as expected: Synology 1st line support is
- absolutely incompetent; they employ people for simple tasks, e.g. how to turn on the NAS.

Their answer implies that the disk itself has an error, even though three independent checks confirmed that the disk is OK:
- Synology integrated smartctl
- independent smartctl
- WD tool.
If Synology only passes on the data from the disk, then what DSM displays is just the disk's data, with no "translation/correction" step in between.
But if Synology displays something other than the disk's data (and the DSM logs), then it is effectively admitting that it manipulates the data. And that should be called by its proper name, whatever their reason.

2. Unofficial line - still on hold. Patience.
 
The recent pace of change and the refinement of the TrueNAS GUI does make it much more approachable though and I think the switch from using DSM to TrueNAS is considerably less steep than going from having never used a NAS to using DSM for the first time
This TrueNAS discussion and following comments is swaying me towards OMV. Seems like an "easy" entry point for Synology users, particularly those with PCs that can be repurposed... to OMV... and ultimately to Docker.... Thoughts?
 
All the platforms have their advantages and disadvantages. No doubt.
It seems that each of them has a different short-term strategy and a different plan for reaching it. What is unclear is the role of power users in all this, the people who know how to do more than just download torrents.
 
Finally, I found the real REASON for this strange behaviour. It is 100% certain that the bug is on Synology's side, and again it comes down to an amateur approach by the DEV team. On the other hand, I rule out a deliberate effort by Synology to discontinue support for WD Blue drives.

Background from the @Robbie smartctl output:

Power_On_Hours: 19959 h ≈ 2.28 years ................. it is OK.

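TBW written:
241: 41122 GiB / 1024 = 40.16 TiB
242: 187404 GiB / 1024 = 183.01 TiB
Sector Size: 512 bytes logical/physical
TBW = 241 + 242 = 223.17 TiB or 245 TB ............... also OK. Strictly speaking only 241 (host writes, about 44 TB) counts towards TBW, since 242 is host reads, but even the sum is far below the vendor's TBW rating of 600 TB, per the official source: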
https://documents.westerndigital.co...sd/product-brief-wd-blue-3d-nand-sata-ssd.pdf
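
The arithmetic above is easy to re-check. Below is a minimal Python sketch that redoes the conversions. It assumes, as the post does, that the raw values of attributes 241 and 242 are in GiB; the function and constant names are mine and purely illustrative.

```python
# Raw values as quoted in the smartctl output discussed above.
POWER_ON_HOURS = 19959
ATTR_241_GIB = 41122    # assumed unit: GiB, as in the post
ATTR_242_GIB = 187404   # assumed unit: GiB, as in the post
VENDOR_TBW_TB = 600     # endurance rating cited above

BYTES_PER_GIB = 2**30
BYTES_PER_TB = 10**12


def hours_to_years(hours):
    """Power-on hours to years, using 365-day years."""
    return hours / (24 * 365)


def gib_to_tib(gib):
    """GiB to TiB (both binary units)."""
    return gib / 1024


def gib_to_tb(gib):
    """GiB (binary) to TB (decimal, 10**12 bytes)."""
    return gib * BYTES_PER_GIB / BYTES_PER_TB


if __name__ == "__main__":
    total_gib = ATTR_241_GIB + ATTR_242_GIB
    print(f"power-on time: {hours_to_years(POWER_ON_HOURS):.2f} years")
    print(f"241: {gib_to_tib(ATTR_241_GIB):.2f} TiB")
    print(f"242: {gib_to_tib(ATTR_242_GIB):.2f} TiB")
    print(f"241 + 242: {gib_to_tib(total_gib):.2f} TiB = {gib_to_tb(total_gib):.1f} TB")
    print(f"below vendor rating: {gib_to_tb(total_gib) < VENDOR_TBW_TB}")
```

Whichever way the two attributes are counted, the result stays well under the 600 TB rating.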

What about 230 Media Wearout Indicator?

Yes - here it is:
230 Media_Wearout_Indicator 001
At first, I thought it was a mistake on the WD side. With the TBW and Power_On_Hours above, that value should be much higher; otherwise it would mean that only 1% of the disk's life remains = disk death, and DSM7's behaviour would be OK.

And

because Synology support engineers are the flagship support of a vendor that supplies NASes and not microwaves, they should know that SMART attributes are not standardised (the SMART READ DATA command was declared obsolete in ATA ACS-4), but that by convention the normalised VALUE should decrease as things get worse. So a reasonable "wearout" attribute would be VALUE = 100 - WEAROUT%.

However, they found nothing, neither from my ticket nor from @Robbie's (with the smartctl outputs attached). If they had forwarded it to 2nd level support instead of keeping it to themselves, maybe it could have been solved. And maybe not.

Discovery:
So I rummaged through the official smartmontools support tickets a bit. And I found this (I'll cut it short):
WD calculates 230 Media_Wearout_Indicator in reverse for certain firmware: not counting down from 100% to 0%, but counting up from 0% to 100%. Thus, the value 001 means that 1% of the disk's life has been used, not that 1% remains.
Confirmed by the firmware version found in @Robbie's smartctl outputs.
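
To make the two readings concrete, here is a small Python sketch of both conventions. It is only an illustration of the idea described above, not DSM's or smartmontools' actual code, and the function name and the `counts_up` flag are my own.

```python
def remaining_life_percent(normalized_value, counts_up):
    """Estimate remaining drive life from a 0-100 wear-out VALUE.

    counts_up=False: conventional reading, VALUE is the life remaining
                     (starts at 100 and decreases as the drive wears).
    counts_up=True:  reversed reading described above, VALUE is the wear
                     used so far (starts at 0 and increases).
    """
    if not 0 <= normalized_value <= 100:
        raise ValueError("expected a percentage between 0 and 100")
    if counts_up:
        return 100 - normalized_value
    return normalized_value


if __name__ == "__main__":
    value = 1  # attribute 230 shows VALUE 001 in the outputs above
    print("read as 'remaining':", remaining_life_percent(value, counts_up=False), "%")
    print("read as 'used':     ", remaining_life_percent(value, counts_up=True), "%")
```

The same VALUE of 001 gives either a nearly dead drive or a nearly new one, depending purely on which convention the software assumes.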

However, these disks (this firmware), with the corrected interpretation of the normalized SMART VALUE, were added to the smartmontools DB 2-3 years ago as part of its standard drive database maintenance. But given Synology's habit of updating every 3rd-party piece of DSM on a 100-year cycle, of course either it is not in the latest DSM7 upgrade, or it is, but:

someone from the Synology DEV team must have changed something, because in the previous version of DSM7 (before the upgrade) it worked OK (no bug). This means that the Synology DEV team, as usual, underestimated the preparation for the last DSM7 upgrade. I call it "no SW analysis", and we subsequently saw some of the results live: @Robbie's disks, which were actually OK, were labelled by DSM7 as having 1% life remaining = critical SMART value.

10 beers says DSM has the interpretation of the Media Wearout Indicator value hardcoded rather than as a variable.

Done

I asked Synology support for ASAP remedy and a professional approach.
@Robbie - you can write them also (your ticket)


Conclusion:
If Synology's management weren't so stupid and had accepted the offer that this forum could provide them with so many pro-grade independent ideas and support, it could have been a good system. They could put one person from 2nd level support here, just part-time (a few hours a month). So they would save costs. Great.

P.S.:
I prefer not to write about what I have read so far as answers from Syno support. I would be unnecessarily rude.
 
One more consideration:
- the WD Blue SSD firmware issue with the upside-down interpretation of the MWI has been known since 2020 (web research)
- the MWI value was the same before the last DSM7 upgrade (SMART logs)
- the MWI value wasn't taken into consideration by the DSM7 logic in the previous version
Then the only common-sense conclusion is that Synology, in the upgraded DSM7 version, started taking the MWI value into account when checking drive health status. Never before (otherwise it would have been discovered earlier). And as a hardcoded feature, without the smartmontools DB update (where this has been fixed for more than 2 years).
 
@Robbie
could you please upload the extracted TAR file from /var/log/smart_result/ with a date before the DSM7 upgrade, from when everything was still fine?

Just for the drive:
Device Model: WDC WDS400T2B0A-00SM50
Serial Number: 1926D7420114

No need for more drives. They will all be the same.

---------------------------------------------
It is clear enough NOW:

I remotely connected to my son's NAS, which still runs an older version of DSM7, 7.0.1 build 42218.2,
and there is:
Bash:
smartctl -V
smartctl 6.5 (build date Feb 20 2021) [x86_64-linux-4.4.180+] (local build)
After today's update to the newer DSM 7.0.1 build 42218.3, there is:
smartctl 6.5 (build date Feb 20 2021) [x86_64-linux-4.4.180+] (local build)
That means still the SAME build = no change in how SMART data is gathered from the disk. What is missing is just an updated smartmontools DB containing the correct "translation" of 230 Media_Wearout_Indicator for the WD Blue firmware mentioned above: a value of 001 does not mean that 1% of the media's health remains.

This differs from all of my DSM6 NASes, whose smartctl build was created 10 days later:
smartctl 6.5 (build date Mar 2 2021) [x86_64-linux-3.10.105] (local build)
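
For anyone comparing several boxes, the banner lines can be parsed rather than eyeballed. This is a rough Python sketch that only handles the banner format quoted above; the regular expression and the helper name are my own assumptions, not anything shipped with smartmontools.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as:
# smartctl 6.5 (build date Feb 20 2021) [x86_64-linux-4.4.180+] (local build)
BANNER_RE = re.compile(
    r"smartctl (?P<version>\S+) "
    r"\(build date (?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2}) (?P<year>\d{4})\) "
    r"\[(?P<platform>[^\]]+)\]"
)

DSM7_BANNER = "smartctl 6.5 (build date Feb 20 2021) [x86_64-linux-4.4.180+] (local build)"
DSM6_BANNER = "smartctl 6.5 (build date Mar 2 2021) [x86_64-linux-3.10.105] (local build)"


def parse_banner(line):
    """Return (version, build_date, platform) from a smartctl banner line."""
    match = BANNER_RE.search(line)
    if match is None:
        raise ValueError(f"not a recognised smartctl banner: {line!r}")
    build = date(int(match["year"]), MONTHS[match["mon"]], int(match["day"]))
    return match["version"], build, match["platform"]


if __name__ == "__main__":
    v7, built7, platform7 = parse_banner(DSM7_BANNER)
    v6, built6, platform6 = parse_banner(DSM6_BANNER)
    print("same smartctl version:", v7 == v6)
    print("DSM6 build is newer by", (built6 - built7).days, "days")
```

Both banners report smartctl 6.5; only the build date and the kernel in the platform string differ.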

To be sure, @Robbie DSM 7.1-42661.1 was correct = no 230 Media_Wearout_Indicator discovered by the DSM itself.

@Robbie - please check whether the native WD Dashboard offers a firmware upgrade for the drives (don't apply it, just check). Post the new firmware number here.
 
I have 7x of the same drive with the "1% life remaining" reporting issue on my FS1018. Attached is my smartctl output for one of them, if it helps.
 

Attachment: sdd.txt

I’m still waiting for an official statement from WD for this case (parallel escalation).

Smartmontools side is OK.

Synology support: delivering one pearl after another.

@foxgroo:
your drive, with about 1 year of running time in total, is in the expected condition.
 
@Robbie
@foxgroo
@Black-Xstar

could you please install WD Dashboard,

then:
provide screenshots of all available attribute information under the "Tools > S.M.A.R.T. Data" menu
for each drive. Please paste them into a single doc and save it as a PDF for easier distribution. Please post it here.

here is the official user manual for this (p 20-22):

I will aggregate this for WD 2nd level support, which is right now in communication with me.
 
@jeyare
Sorry, but I have not been well recently, so getting to my rack for diagnostics may take a while.

I did take some screen grabs and a SMART data list of 1 of my WD Blue drives, when removed from the NAS and connected to a Windows PC running WD Dashboard, earlier this month (4 May 22). Hopefully these may satisfy WD as all 6 drives behave this way on a PC with their tool (ie no issues at all).

Attached screenshots:
- 20220504-WD SSD Dashboard-Status.png
- 20220504-WD SSD Dashboard-Tools.png
- 20220504-WD SSD Dashboard-Drive Details-SN1926D7420114.png
- 20220504-WD SSD Dashboard-SMART.png
- 20220504-WD SSD Bay 1 Tested by WD Dashboard For Windows.png


I'll try to think of a way of extracting and testing all the drives if the above is not good enough. Not sure how or when yet.

Thanks again.

☕
 
Smartmontools uses its "drivedb.h" to read SMART data from drives properly,
version:
{ "VERSION: 7.3 $Id: drivedb.h 5387 2022-05-22 14:52:46Z chrfranke $",

Synology, as usual, ships this "drivedb.h" under a different name, "drivedb.db", available in:
/var/lib/smartmontools
version:
{ "VERSION: 7.3 $Id: drivedb.h 5382 2022-05-10 20:13:16Z dipohl $",

So during the last update of DSM 7, Synology must have made a "solid" modification of their own code, which now also interprets the MWI based on drivedb.db and then evaluates the data without understanding the data inputs.

Which caused all these pointless problems.
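
The two VERSION strings quoted above can also be compared mechanically. The following Python sketch just parses the two header lines as posted; the regular expression and the helper name are mine, and the revision numbers are taken at face value from the `$Id$` keyword.

```python
import re
from datetime import date

# Matches the header line of drivedb.h / drivedb.db as quoted above.
VERSION_RE = re.compile(
    r"VERSION: (?P<release>\S+) \$Id: drivedb\.h (?P<rev>\d+) "
    r"(?P<date>\d{4}-\d{2}-\d{2}) \S+ (?P<author>\S+) \$"
)

UPSTREAM = '{ "VERSION: 7.3 $Id: drivedb.h 5387 2022-05-22 14:52:46Z chrfranke $",'
SYNOLOGY = '{ "VERSION: 7.3 $Id: drivedb.h 5382 2022-05-10 20:13:16Z dipohl $",'


def parse_drivedb_version(line):
    """Extract release, revision and commit date from a drivedb header line."""
    match = VERSION_RE.search(line)
    if match is None:
        raise ValueError(f"no drivedb VERSION header found in: {line!r}")
    return {
        "release": match["release"],
        "revision": int(match["rev"]),
        "date": date.fromisoformat(match["date"]),
    }


if __name__ == "__main__":
    upstream = parse_drivedb_version(UPSTREAM)
    synology = parse_drivedb_version(SYNOLOGY)
    print("same release:", upstream["release"] == synology["release"])
    print("revision gap:", upstream["revision"] - synology["revision"])
    print("days behind: ", (upstream["date"] - synology["date"]).days)
```

On these two strings the Synology copy is the same 7.3 release and less than two weeks older than the upstream file, which fits the point that the database itself is reasonably current.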
 
I think you are right. Digging through my files I found a pre-DSM7.1 SMART Quick Test Result (25 Feb 2022) which showed (on the GUI part of DSM) that Attribute 230 was reporting the same values as now but the attribute itself was shown as Unknown_SSD_Attribute. Changing this parameter in DSM7.1, without correcting the value reporting issues, seems to have triggered this mess:

Last Quick Test result: Healthy (25/02/2022 03:30:02)

S.M.A.R.T. Attribute

Each threshold is defined by the drive manufacturer. Users will be notified when the attribute value is below the threshold value. If you receive this notification, please make sure to back up the data on this drive to prevent potential data loss caused by drive failure.

ID   Attribute                Value  Worst  Threshold  Raw Data
5    Reallocated_Sector_Ct    100    100    ---        0
9    Power_On_Hours           100    100    ---        18494
12   Power_Cycle_Count        100    100    ---        25
165  Unknown_Attribute        100    100    ---        8668513192
166  Unknown_Attribute        100    100    ---        4
167  Unknown_Attribute        100    100    ---        125
168  Unknown_Attribute        100    100    ---        47
169  Unknown_Attribute        100    100    ---        2899
170  Unknown_Attribute        100    100    ---        0
171  Unknown_Attribute        100    100    ---        0
172  Unknown_Attribute        100    100    ---        0
173  Unknown_Attribute        100    100    ---        12
174  Unknown_Attribute        100    100    ---        3
184  End-to-End_Error         100    100    ---        0
187  Reported_Uncorrect       100    100    ---        0
188  Command_Timeout          100    100    ---        74
194  Temperature_Celsius      064    048    ---        36
199  UDMA_CRC_Error_Count     100    100    ---        0
230  Unknown_SSD_Attribute    001    001    ---        1473191870807
232  Available_Reservd_Space  100    100    004        100
233  Media_Wearout_Indicator  100    100    ---        50424
234  Unknown_Attribute        100    100    ---        58482
241  Total_LBAs_Written       253    253    ---        39770
242  Total_LBAs_Read          253    253    ---        175265
244  Unknown_Attribute        000    100    ---        0
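
The notification rule quoted in that test result ("value is below the threshold value") can be applied to a few of the rows above. This Python sketch is only my reading of that one sentence, with made-up names; how DSM really evaluates the attributes is not known from this thread.

```python
from typing import NamedTuple, Optional


class SmartRow(NamedTuple):
    attr_id: int
    name: str
    value: int
    worst: int
    threshold: Optional[int]  # None where DSM shows "---"
    raw: int


# A few rows copied from the 25 Feb 2022 Quick Test result above.
ROWS = [
    SmartRow(194, "Temperature_Celsius", 64, 48, None, 36),
    SmartRow(230, "Unknown_SSD_Attribute", 1, 1, None, 1473191870807),
    SmartRow(232, "Available_Reservd_Space", 100, 100, 4, 100),
    SmartRow(233, "Media_Wearout_Indicator", 100, 100, None, 50424),
]


def below_threshold(row):
    """Apply the quoted rule: notify only if VALUE is below a defined threshold."""
    if row.threshold is None:
        return False  # no vendor threshold, so the rule cannot trip
    return row.value < row.threshold


def flagged(rows):
    """Return the IDs of attributes that would trigger a notification."""
    return [row.attr_id for row in rows if below_threshold(row)]


if __name__ == "__main__":
    print("flagged attribute IDs:", flagged(ROWS))
```

Under this rule alone, attribute 230 with VALUE 001 and no threshold would not raise anything, which matches the "Healthy" result of that date.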
 
The Unknown_Attribute entries in that "mess" are attributes not described in the drivedb.h that Synology uses as the source for its mirrored (tuned) /var/lib/smartmontools/drivedb.db (mentioned above).

When you use native smartctl, the result is obviously better, thanks to the native drivedb.h for this command.

So this is simple chemistry: SMART is used in DSM in two different ways.
 
