HDD Selection

3
1
NAS
DS918+
Operating system
  1. Windows
Mobile operating system
  1. Android
I have a 2-disc RAID 1 system. I know it writes to both HDDs, but which disc does it read from? Always 1, always 2, or does it alternate? (Assuming the system knows none of the HDDs are bad.)

If I suspected some files were corrupted on one HDD, is there a way I can force it to either read from 1 or 2 via the dashboard without taking out HDD 1 or HDD 2 to see results? Can't see any commands in the control panel to force operation from just one disc or the other.

This is a tech question for Synology, but I've wondered how it knows a HDD is "bad". Is there some background write/read/erase test program running on each HDD to determine health? Or does it read both discs each time and, if they don't match, start looking for problems in one of the discs?

All this stems from a very bad experience I had with a 5-drive Drobo 5N. With all the BeyondRAID BS they advertise, I had corrupted data (presumably from just one of the disc sets) that it knew nothing about and never flagged any warnings for. Numerous Word documents and PDF files looked "normal" as far as you could see in Windows File Explorer, but Word and Adobe reported them as corrupted and unable to open.
 
3
1
NAS
DS918+
Operating system
  1. Windows
Mobile operating system
  1. Android
I caught my DS918 doing something that tells me it is NOT always writing the same data to both discs. I modified a file, later went to read it, and got the original unmodified file! After reselecting the directory, I opened it again and the new file was there. The only explanation is that it did not write the same to both discs. I had this happen with another Synology NAS, so I think the operating system has some issue. Of course, Synology says not.
 

fredbert

Moderator
NAS Support
Subscriber
1,626
676
NAS
DS1520+, DS218+, DS215j
Router
  1. RT2600ac
  2. MR2200ac
Operating system
  1. macOS
Mobile operating system
  1. iOS
Hi, welcome to the forum.

I've no experience in this low-level issue you seem to be having but maybe a few questions wouldn't hurt?

Is the NAS powered up 24x7 or shutdown regularly?
When shutting down is it done using the NAS controls or a hard power-off?
What's the power supply like to your home?
Do you use a UPS with the NAS?
With your DS918+ do you use NVMe cache for read or read/write? Specifically the write part.
You say it's RAID1, so you opted not to use SHR?
The hard disks: how similar are they?
How are you accessing data? File sharing (SFTP, SMB, AFP, etc.), iSCSI LUN, folder sync (Syno Drive, indirectly by Cloud Sync of Dropbox/Box/etc.)?


As for which disk is prioritised for read or write ... not sure that's an easy thing to say. It would speed up performance to be reading from the other disk when writing, but there has to be time to sync the updated data.

That you've had two separate storage systems with a similar issue does sound odd but RAID is being managed by the storage enclosure and not client devices. The common denominators (you/client devices) are external to the RAID manager. Hmm.
 

jeyare

Subscriber
1,587
537
A short summary of the RAID 1 principle in your NAS:
- Synology uses a Linux software (SW) RAID controller.
- Write operation: the controller chooses one of the array members for the write. The choice is random and has nothing to do with bay positions. When the write to the first disk has finished, the data is written to the other array member.
When corruption happens during the write to the first disk, the controller doesn't allow the data to be written to the next drive.

There is an internal controller mechanism to correct such situations and keep the write going when possible. When there is a HW error on the first (randomly chosen) disk, that drive fails, the array becomes degraded, and the controller continues writing to the next drive in the group.
There is a long list of possible ways for data to become inconsistent. Among them are disk firmware errors and unexpected or hard shutdowns (an outage, an accident, ...).
If an outage hits before the controller has written its data to all of the disks in the array, some blocks in that stripe will be in disagreement with the others.

To be clear: SW RAID 1 controllers rely on the fact that HDDs use their own error detection and correction methods.

There is a way to check the consistency; see this guide.
As @fredbert asked, we need more details from you.
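
For readers wondering what a consistency check looks like underneath: on a generic Linux software RAID (md) array, the kernel exposes a `sync_action` file and a `mismatch_cnt` counter under `/sys/block/<array>/md/`. The following is only a minimal sketch that reads those two files. The array name `md2` is an assumption (check `/proc/mdstat` for the real one), the helper names are made up for illustration, and it does not replace whatever the linked guide or DSM's own scrubbing does.

```python
from pathlib import Path


def read_md_status(array="md2", sysfs_root="/sys/block"):
    """Return (sync_action, mismatch_cnt) for a Linux md array.

    `array` is an assumed device name; adjust it to your system.
    """
    md_dir = Path(sysfs_root) / array / "md"
    action = (md_dir / "sync_action").read_text().strip()
    mismatches = int((md_dir / "mismatch_cnt").read_text().strip())
    return action, mismatches


def describe(action, mismatches):
    """Turn the raw values into a short human-readable summary."""
    if action != "idle":
        return f"array is busy ({action}); counter not final yet"
    if mismatches == 0:
        return "last check found no mismatched sectors"
    return f"last check found {mismatches} mismatched sectors"


if __name__ == "__main__":
    try:
        print(describe(*read_md_status()))
    except OSError as exc:
        # Typical on machines without that array, or without permission.
        print(f"could not read md status: {exc}")
```

A non-zero counter only says that the mirror members disagreed somewhere; it does not say which member held the correct data.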
 

jeyare

SW RAID1/5 can’t protect you from silent bit/block data corruption if the corruption is done by the HDD. This is a really rare event.
It’s easy to detect that data has been corrupted in RAID1/5, but at that RAID level there is no way to determine which disk has the right data and which has the old/bad data. The reason is that block checksums are handled internally by each HDD, not by the RAID layer.

So the most important question is: do the disks in the RAID1 have an identical part number? Same vendor or same disk range isn’t sufficient.
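
The "easy to detect, impossible to attribute" point can be illustrated with a toy comparison of two mirror copies. This is a sketch under stated assumptions, not how md or DSM actually scrubs: `find_mismatches` and the 4096-byte block size are hypothetical names and values chosen for the example.

```python
def find_mismatches(copy_a: bytes, copy_b: bytes, block_size: int = 4096):
    """Return the indices of blocks that differ between two mirror copies."""
    if len(copy_a) != len(copy_b):
        raise ValueError("mirror copies must be the same length")
    bad = []
    for offset in range(0, len(copy_a), block_size):
        if copy_a[offset:offset + block_size] != copy_b[offset:offset + block_size]:
            bad.append(offset // block_size)
    return bad


# Four blocks of sample data, with one bit flipped in block 1 of one copy.
good = bytes(range(256)) * 64
flipped = bytearray(good)
flipped[5000] ^= 0x01

print(find_mismatches(good, bytes(flipped)))   # [1]
print(find_mismatches(bytes(flipped), good))   # [1]
```

Both calls report the same block, whichever copy is the damaged one. The comparison is symmetric, so without extra information (such as a checksum stored elsewhere) the mirror alone cannot say which side to trust.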
 
3
1
NAS
DS918+
Operating system
  1. Windows
Mobile operating system
  1. Android
Thanks for the insight. My current system has 4 SSD 1 G drives, set up as 3 in a RAID 1 mirror and one as hot standby. It is on 24/7 with a good backup power system.

After doing some forensics on the failed Drobo 5N, it was not HDD failure but apparently corrupted firmware. After I put in a set of new HDDs it kept giving a "failed mount" error that Drobo said was firmware, and they told me to reflash it via download. It did download new firmware but upon restart gave the fail message again, so I'm assuming something failed and the operating system went off into the weeds. Odd that it didn't flag a problem until the reset when I tried to fix it, but it took away my concerns about the system not figuring out that a disc was bad. It just didn't think anything was wrong, but the HW failure was making it write corrupted data to all the discs until I shut it down.

So far the DS918 looks good. I am disappointed at how many bugs they've fixed in the last DSM update, in particular pulling back some recent update on May 13. You'd think the operating system would have long ago been made perfect, but they keep adding new features and every one of those has its own issues not found until users start stressing it.

Thanks for help - now back to work.
 

jeyare

As I wrote:
“There is long list of possible issue when data consistency could be inconsistent. One of them is the disk firmware error,...”
In your case it doesn’t matter whether your setup is based on HDDs or SSDs. In such a case it is better to upgrade all the drives in the array to the same firmware, to keep the same consistency logic across all of them.

Second: as far as I know, Drobo doesn’t use Btrfs, which in Synology’s case is a really robust guard against inconsistency, thanks to its “Copy on Write” filesystem foundation.
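
Why a checksumming filesystem changes the picture can be sketched in a few lines. This is a toy model only: `pick_good_copy` is a hypothetical helper, `zlib.crc32` stands in for whatever checksum the filesystem really uses, and nothing here claims to mirror how DSM wires Btrfs and RAID together.

```python
import zlib


def pick_good_copy(copies, stored_checksum):
    """Return the index of the first copy matching the checksum, else None."""
    for index, data in enumerate(copies):
        if zlib.crc32(data) == stored_checksum:
            return index
    return None


block = b"quarterly report, final version"
checksum = zlib.crc32(block)            # recorded when the block was written

rotted = b"quarterly report, fiNal version"   # one flipped bit on one mirror

print(pick_good_copy([rotted, block], checksum))    # 1
print(pick_good_copy([rotted, rotted], checksum))   # None
```

With the checksum stored separately from the data, a mismatch between mirror members is no longer a coin toss: the copy that still matches is the one to keep, and if neither matches the damage is at least reported instead of silently served.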
 

jeyare

Finally, the filesystem can save your mental health, because it is the filesystem's responsibility to keep data consistent. It is the bottom level beneath the RAID logic in SW-based RAIDs. The next level may then be implemented in the same filesystem (FS RAID) or not. In Synology's case it is a mixed setup.
[Attached image: 1590642896367.png]


You can read more about this logic in our resource.
 

jeyare

Another point of view, for cases caused by power outages during a write operation.

Some HW-based RAID controllers have a setting for which disk is always designated as “first” to write and which one follows the first. That is nice until the first write hole issue happens, because:
- the disk drive can cache data itself, and the caching may violate the arrangement made by the controller. Just think of the basic SMR behaviour: cache first, then write during idle time.
- if the disk that was designated as first/authoritative fails, write holes may already be present on the second disk, and it would be impossible to find them without the first disk's data.
This is valid for all RAID types.
A CoW FS and a properly sized UPS can avoid such issues.
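
A toy model of the situation described above, where power is lost after the first mirror member was updated but before the second one was. Everything here is hypothetical (`mirrored_write`, `PowerLoss`, dictionaries standing in for disks) and only illustrates how the two members end up disagreeing; it is not a model of any real controller.

```python
class PowerLoss(Exception):
    """Raised to simulate the power going out mid-write."""


def mirrored_write(disks, block_no, data, fail_after=None):
    """Write `data` to each disk in turn.

    If `fail_after` is set, power is "lost" once that many disks
    have been written, leaving the remaining disks untouched.
    """
    for written, disk in enumerate(disks):
        if fail_after is not None and written == fail_after:
            raise PowerLoss(f"power lost after {written} disk(s)")
        disk[block_no] = data


disk_a = {0: b"old"}
disk_b = {0: b"old"}

try:
    mirrored_write([disk_a, disk_b], 0, b"new", fail_after=1)
except PowerLoss:
    pass

print(disk_a[0], disk_b[0])   # b'new' b'old'
```

After the simulated outage each disk holds a perfectly readable block, so neither drive reports an error, yet the two members no longer agree. That is the gap a UPS narrows and a checksumming CoW filesystem helps detect.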
 

jeyare

“So far the DS918 looks good. I am disappointed at how many bugs they've fixed in the last DSM update, in particular pulling back some recent update on May 13. You'd think the operating system would have long ago been made perfect, but they keep adding new features and every one of those has its own issues not found until users start stressing it.”
Welcome to the Syno world. But it’s better to see that they work on bugs than to get unexpected behaviour from only a single update per year.
 

fredbert

@Roger Duroid To add to @jeyare, regarding DSM updates: it's advisable to enable notifications for updates to DSM but not to automatically do the update (Control Panel -> Update & Restore). Check the release notes and see what's fixed [is it something you use, or not] and if there are security patches to vulnerabilities you really must plug. Otherwise, you can always sit out an upgrade.

Other observations on DSM + package upgrades
  • They don't appear everywhere at the same time: if your NAS doesn't alert you (like here in UK) you can click the Release Notes link to see what's the latest and do a manual update.
  • Recently it's been wise to wait before updating, as there has been more than one occasion when the X.Y.Z-ABCD Update N was reissued with a replacement Update N (same N), or Update N got pulled within a day or two.
  • Take the window between update release and whenever you find there is an update as your protection window for not introducing some unwanted excitement to NAS ownership.
  • Same notes for SRM too.
 
