Checksum mismatch on some files - Complete storage pool wipe-out necessary?

Hi there,

At the beginning of the month, my NAS reported a "Checksum mismatch" on some files. The NAS is a DS920+ with 3x 4 TB IronWolf disks, Btrfs, RAID 5. It also had an additional 8 GB Crucial memory stick installed at that time...

I followed all the diagnostic steps recommended by Synology (complete SMART test on the disks, several memory tests, etc.) but couldn't identify the origin of the issue, so I contacted Synology for further support.

Of course, they noticed the non-OEM memory and pointed to it as the most probable cause of the issue. They asked me to remove it, which I did. Now, they:
do not recommend continue using the current storage pool& volume because file system error has been found from the logs.

Though most of the files are visible and can be used, file system error may ruin the data in the future.

The safest way is to remove the current storage pool& file to wipe out the errors; therefore, we still recommend you wipe all out (both the storage pool and volume) and then restore all of the files from the backup.


Although I have some backups available, I am not very confident about doing this, because in the meantime most of my backups turned "recovery only" after corrupted files were detected in them. I am not sure whether these are the same files. Would restoring from these backups not reintroduce corrupted files and render the whole process useless?

I couldn't find a report identifying the files with checksum mismatches. Do you know where I can find this information, and whether it would be possible to recover only these files?

Is there a real risk in continuing to use this volume/pool now that the "bad" memory stick has been removed? Can the problem really spread further?

(And by the way: aren't Btrfs and RAID 5 supposed to be able to repair this kind of issue?)

Thanks for your support!
BR
 
Did you enable "data integrity" on your shared folders? If so, run a data scrubbing task; it should identify and, where it can, resolve any data mismatches that may exist.
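To make the idea concrete, here is a toy sketch in Python of what a scrub does in principle: verify each block against its stored checksum and, on a mismatch, fall back to a redundant copy. It is not how DSM or Btrfs is implemented. The names `checksum` and `scrub` are made up for illustration, `zlib.crc32` stands in for the real checksum algorithm, and a plain second list stands in for data rebuilt from RAID redundancy.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum (plain CRC32); the real filesystem uses its own algorithm.
    return zlib.crc32(block)


def scrub(blocks, checksums, redundant):
    """Verify every block against its stored checksum.

    blocks    -- list of bytes as currently read from the primary copy
    checksums -- list of checksums recorded when the data was written
    redundant -- list of bytes from a second copy (hypothetical stand-in
                 for data reconstructed from RAID redundancy)
    Repairs blocks in place when the redundant copy matches the checksum.
    """
    report = {"ok": 0, "repaired": [], "unrecoverable": []}
    for i, block in enumerate(blocks):
        if checksum(block) == checksums[i]:
            report["ok"] += 1
        elif checksum(redundant[i]) == checksums[i]:
            blocks[i] = redundant[i]  # good copy found: rewrite the bad block
            report["repaired"].append(i)
        else:
            # Neither copy matches: the mismatch can be detected but not fixed.
            report["unrecoverable"].append(i)
    return report


if __name__ == "__main__":
    original = [b"aaaa", b"bbbb", b"cccc"]
    stored = [checksum(b) for b in original]
    primary = [b"aaaa", b"bXbb", b"cccX"]    # blocks 1 and 2 damaged
    fallback = [b"aaaa", b"bbbb", b"cYcc"]   # block 2 damaged here as well
    print(scrub(primary, stored, fallback))
```

The "unrecoverable" case is the one that matters here: a scrub can only repair a block if some copy still matches the stored checksum.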

I would not pay much heed to the FUD from Synology. They are trying to protect their hardware market by suggesting that uncertainty comes from non-Synology memory (it's amazing that an adhesive sticker can have such a huge influence on data integrity). Even with a failed memory stick (and I'd guess Synology-labeled memory fails like all the others), a robust OS should manage most memory failures without disaster.
 
Hi Telos,

Thank you for your reply!

"Data integrity" was activated on the corrupted folder. However, I didn't know about the scrubbing task until a few days ago, so it had never been run before. It was actually the first run of this task that flagged the checksum mismatch. I guess this means it is now too late to repair the data... Am I correct, or would there be another way to recover it?

The log lists the following 2 corrupted files:
  • /volume1/@sharesnap/Off-site_Backup/GMT+02-2022.05.16-15.11.27/Off-Site_Backup.hbk/Pool/0/0/38.bucket
  • /volume1/Off-site_Backup/Off-site_Backup/Pool/0/0/38.bucket
As you can see, the corrupted folder is the target of a remote Hyper Backup task. Do you know what "@sharesnap" is? I cannot see it in File Station.

Luckily, this backup is not absolutely critical, so I could delete it and start the task from scratch. What would you recommend: try to restore only the corrupted "bucket" file, or delete the complete backup and start from scratch? I am not sure how Hyper Backup would handle restoring only the "bucket" file...

And my last question: what do you think about Synology's recommendation to completely wipe my volume? Is there really a "worm in the apple" that could damage other files, so that the only way to stay safe is to wipe all of it?

Thanks a lot for your support again, and have a nice weekend!
 
Hopefully others here will share their wisdom. I certainly don't want to misguide you and risk your content.

@sharesnap is related to Snapshot Replication.

"I guess it means it is now too late to repair the data... Am I correct, or would there be another way to recover it?"

If your shared folder "Off-site_Backup" has data integrity activated (presumably it was activated when the shared folder was created), I would presume that there is no corruption in the file that resides in the "Off-site_Backup" folder. I say that only because I'm uncertain whether there is protection for the version contained in the snapshot. Obviously the two don't match, hence the error. On that basis, I would delete the affected snapshot and stay the course.
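If you want to test that presumption before deleting anything, one rough check is to read each reported file from start to finish. On Btrfs, a block that fails its checksum and cannot be repaired normally shows up as an I/O error on read. Below is a minimal Python sketch, assuming you can run Python over SSH on the NAS and have permission to read the paths (the snapshot path will likely need root). The helper name `read_check` is made up for illustration.

```python
import hashlib
import sys


def read_check(path, chunk_size=1024 * 1024):
    """Read a file end to end.

    Returns (sha256_hex, None) if every byte could be read, or
    (None, error_message) if the read failed part-way (for example
    with an I/O error on a damaged block).
    """
    digest = hashlib.sha256()
    offset = 0
    try:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                digest.update(chunk)
                offset += len(chunk)
    except OSError as exc:
        return None, f"read failed near byte {offset}: {exc}"
    return digest.hexdigest(), None


if __name__ == "__main__":
    # Pass the file paths from the scrub log as command-line arguments.
    for path in sys.argv[1:]:
        sha, error = read_check(path)
        print(path, "->", error if error else f"readable, sha256={sha}")
```

If the copy in the shared folder reads cleanly and only the snapshot copy fails, that would support deleting just the snapshot. A clean read only shows that the stored checksums match; it cannot rule out data that was already wrong when it was written.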
 