Threat Prevention Router reliability with TP in SRM 1.3.1


fredbert

I have been having issues with my RT2600ac rebooting randomly and I'm sure it has something to do with Threat Prevention.

I've not made extensive notes about when it happens, but I have notifications and SRM log entries that suggest many reboots have occurred around the time that TP is scheduled to update its signatures. Then today, when I checked the ET Open version notes and wanted to check my settings on the much-updated Misc Attack (Medium) class, I found that loading this into the web view doesn't finish and most often causes the router to reboot: there are nearly 52k signatures in this class.

The CPU is always very high when TP is starting up / updating, yet total RAM usage is consistently at 60%.

I've been testing with System Database on external USB3 drives housing M2 SSD and HDD. Also enabling and disabling the downgrade USB3 to USB2 option. I haven't yet tried uninstalling TP completely and starting a new install. It could be a USB drive issue, with both of them, or it could be a USB ports issue on the router.

I noted today that three 3CORESec signatures have been removed from ET Open but I have these configured in my self-defined rules: I don't yet want to delete my rules but I wonder what checking is done for this situation.
 
Not seeing 2600 log reboots here. No V1.3.1-1 reboot schedule programmed at all. Both 2600 on UPS. Tell me how you determine that TP has rebooted separately from the 2600 and I’ll check to see if my two are doing the same. One is M.2 SSD to USBV3, other is SATA SSD to USBV3. Both 128GB, relatively no-name Mfg. Both here using same TP 160+ rule sets. Both at USBV3 speeds, formatted ext4. Both write cache. Both set to NOT pass unprocessed packets (yet to see anything strange). Both repeatedly report approx same speeds on command line speed tests. I am using a clip-on ferrite bead for cable management and the drives are Velcro attached at bottom of 2600, as “dusting” once locked up a 2600 due to movement of the USBV3 connection with a loose drive lying on the table top underneath.

I still have both 2600 running to verify repaired one is using same CPU & RAM as purchased one. Was about to wind down the purchased one and just run repaired one.
CPU average is 38-68%, RAM 52-58%
Yes short CPU peaks of 80%
 
TP signatures were set to update at 04:05 each day. The USB drive was used for log archiving and I would get an alert around 04:06 that the log archiving had stopped or hadn't enough capacity... the router uptime matched this alert time. I also extracted /var/log/messages and it confirmed the same time.
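The check described above (do reboots cluster just after the scheduled signature update?) can be sketched in a few lines. This is a minimal sketch only: the boot timestamps are hypothetical values typed in by hand, the 10-minute window is an arbitrary assumption, and nothing here parses the real format of /var/log/messages.

```python
from datetime import datetime, timedelta


def reboots_near_update(boot_times, update_hhmm="04:05", window_minutes=10):
    """Return the boot times that fall within window_minutes after the daily update time."""
    hour, minute = map(int, update_hhmm.split(":"))
    hits = []
    for boot in boot_times:
        # The scheduled update moment on the same calendar day as this boot.
        scheduled = boot.replace(hour=hour, minute=minute, second=0, microsecond=0)
        delta = boot - scheduled
        if timedelta(0) <= delta <= timedelta(minutes=window_minutes):
            hits.append(boot)
    return hits


# Hypothetical boot times, e.g. worked out from router uptime or log alerts.
boots = [
    datetime(2023, 11, 1, 4, 6, 30),
    datetime(2023, 11, 2, 15, 20, 0),
    datetime(2023, 11, 3, 4, 7, 10),
]
suspect = reboots_near_update(boots)
print(len(suspect), "of", len(boots), "boots happened shortly after the 04:05 update")
```

If most boots land in the window, that supports the update being the trigger; a boot at a random time of day would point elsewhere.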

What I'm saying is that TP is causing the RT2600ac to reboot, not just that TP is restarting. I may disable the System Database (thereby stopping TP), reformat the USB drive, and start TP install fresh. It could be a database corruption.
 
I SSH'ed into the router and extracted /var/log/messages. The notifications were related to Log Center. I didn't see many useful logs in Log Center itself, only that the USB drive had obviously been disconnected as part of the reboot and so Log Center archiving wasn't working.

I'm winging it a bit as my wife is working and will get annoyed if I cause another reboot, but I've disabled System Database and reformatted the M2 SSD drive. Also uninstalled TP inc. events and configuration, when asked, but not before taking a backup.

After reenabling System Database on the fresh M2 SSD I have installed TP. Once it finished starting up I see that the Misc Attacks class is down to a few thousand signatures. I'm now waiting while the configuration restore finishes. If this restores to a previous state then I'll repeat the TP uninstall/install and just rebuild by memory.

~~~~~~~~~~~~~~~

Looks to be working after the restore. The Misc Attack class now has 1290 signatures... so something really messed up the TP database, and I don't use write-cache support. The USB drives are always formatted Ext4 since this is their sole purpose.
 
Decided now is as good as any time😁

First repaired unit. TP Updates at 4:35A.
Logs general. Nothing log related, or 4:35 or later.
No connection or file transfer logs
Logs network. Nothing log related or time near or after 4:35A
Chased logs back to 10/22, when unit first booted after repaired, and 10/24 when drive was attached TP and logs were first archived.

Then did same with first 2600. Went back to the ext4 and write cache changeover.
Nothing there, either.

Obviously I was doing this as you posted!

Logs do show logs being enabled, and when a reboot occurs, and the reconnect to EXT Drive after...

I don’t see any reboots that I don’t remember doing though! I did a few in repaired unit, but not first one. Last Reboot happened on -1 update on that unit...

Both TP showing less than 600 events in past 7 days here
 
Thanks for checking. It didn't happen every night, which was the odd thing, but today's ET Open update had quite a few rule changes and may be why it took ages to finish: I did this manually so was keeping an eye on it and the CPU was high for around an hour.
Latest changes http://rules.emergingthreats.net/changelogs/open-daily-changes.txt
More info about updates: ET OPEN Ruleset Download Instructions

I'm going to see how things go with the new database. I'm hoping it was a corruption, which would suggest this isn't repeatable by others unless they too have a similar corruption. Not sure how it would happen: as I said, I don't use write-cache, plus I use a UPS and don't randomly switch the router off ... though those reboots may not have been the most data-friendly events, and they may have compounded the corruption.
 
Got a UPS on the Router? Power issues can cause corruption... (Here we have maybe 4-5 brief outages a week!) By the way, I have NOT seen the need for a sine wave unit, vs synthetic wave, on multiple devices here.... but each has "AVR" which can kick in during AC noise events without total switchover to UPS...
 
Yes, have it on a UPS shared with the DS1520+, cable modem/router, and main switch.

So now I am able to open rule classes and save signature action changes. BTW restoring the configuration backup looks to have removed the self-defined rules for signatures that have been removed from ET Open. So this process has cleaned up both the database and my rules too.
 
If so, this is a recent development, as I restored TP Backup to the repaired 2600 on 10-24 after adding external drive, and that brought in all my 160+ rules... ??? Could it be that your TP Backup was corrupt???
For Kicks, would you like a copy of my TP Backup to test?
 
Oh I still had lots of rules :)

But prior to the restore, and in my backup, were these three signatures that were removed in yesterday's ET Open update. These three were not restored in my Self-Defined Rules but the other 30 were restored: all set to drop.

Code:
[---]         Removed rules:         [---]

 2525030 - ET 3CORESec Poor Reputation IP group 31 (3coresec.rules)
 2525031 - ET 3CORESec Poor Reputation IP group 32 (3coresec.rules)
 2525032 - ET 3CORESec Poor Reputation IP group 33 (3coresec.rules)
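A rough way to spot self-defined rules that point at signatures ET Open has dropped is to parse the "Removed rules" section of the changelog and intersect the SIDs with your own list. This is a sketch under assumptions: it relies only on the line layout quoted above, it assumes other sections use the same bracketed header style, and the `my_sids` list (including the extra SID 2525001) is a hypothetical stand-in for whatever you have configured in TP.

```python
import re

# Section header such as "[---]   Removed rules:   [---]" (layout as quoted above).
SECTION = re.compile(r"^\[\S{3}\]\s+(.+?):\s+\[\S{3}\]\s*$")
# Rule line such as " 2525030 - ET 3CORESec ... group 31 (3coresec.rules)".
RULE_LINE = re.compile(r"^\s*(\d+)\s+-\s+(.+?)\s+\((\S+\.rules)\)\s*$")


def removed_sids(changelog_text):
    """Map SID -> signature name for every rule listed under 'Removed rules'."""
    found = {}
    in_removed = False
    for line in changelog_text.splitlines():
        header = SECTION.match(line)
        if header:
            in_removed = header.group(1).strip().lower() == "removed rules"
            continue
        if in_removed:
            rule = RULE_LINE.match(line)
            if rule:
                found[int(rule.group(1))] = rule.group(2)
    return found


sample = """\
[---]         Removed rules:         [---]

 2525030 - ET 3CORESec Poor Reputation IP group 31 (3coresec.rules)
 2525031 - ET 3CORESec Poor Reputation IP group 32 (3coresec.rules)
 2525032 - ET 3CORESec Poor Reputation IP group 33 (3coresec.rules)
"""

my_sids = [2525001, 2525030, 2525031, 2525032]  # hypothetical self-defined rule SIDs
removed = removed_sids(sample)
stale = sorted(set(my_sids) & set(removed))
for sid in stale:
    print(sid, removed[sid])
```

Anything printed is a self-defined rule whose signature no longer exists upstream, which is a candidate for cleanup (or for watching in case it gets re-added later).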
 
I have not examined TP in that detail.... let me check!

So I did a search on ET 3 and came up with this on Both Units.... As you can see, with date comment, some were added by me... Some without comment were TP added.... which would seem to confirm your suggestion that 31, 32, 33 were removed...

Now as an "End User who occasionally tweaks the rules--gets in trouble then figures my way out" .... I guess I'm at the mercy of TP as for files they update/remove, unless I add them as my own??

Do your rules contain 9, 17, 18 ???
ET 3CORES.jpg
 
I guess you are right. It may be worth making a calendar task to back up TP and restore so that old rules get flushed out.

I have a basic approach to these multi-signature groups. Mostly this is applied to Misc Attack class (Medium), filter these strings and drop (just added the last two today, figured it can't hurt): compromised; reputation; 3CORESec; ET CINS; ET TOR; spamhaus; ET Threatview. Same thing for Misc Activity class (Low) filtering on SCAN. Do the same in Detection of a Network Scan class (Low) and drop the lot. So I'd add going through these after the restore, then make a new backup.
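The string-filter approach above amounts to a case-insensitive substring match over signature names. A minimal sketch follows, assuming plain substring matching (how the TP filter box actually matches is not confirmed here); every sample name except the 3CORESec one is a hypothetical placeholder.

```python
# Filter strings taken from the post above.
DROP_KEYWORDS = [
    "compromised", "reputation", "3CORESec", "ET CINS",
    "ET TOR", "spamhaus", "ET Threatview",
]


def select_for_drop(signature_names, keywords=DROP_KEYWORDS):
    """Return the names containing any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [
        name for name in signature_names
        if any(k in name.lower() for k in lowered)
    ]


names = [
    "ET 3CORESec Poor Reputation IP group 31",
    "ET CINS example signature group 1",    # hypothetical name
    "ET TOR example signature group 1",     # hypothetical name
    "ET DROP Spamhaus example signature",   # hypothetical name
    "ET POLICY example signature",          # hypothetical name, should not match
]
to_drop = select_for_drop(names)
print(to_drop)
```

One caveat with substring matching: a short string such as "ET TOR" would also match a name containing "packet torrent", so it is worth eyeballing the filtered list before setting everything to drop.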

Given that I've just done this, then I have all these groups' current signatures as drop.

I don't see why I would want any suspicious IPs accessing my stuff, I'm not hosting a public service.
 
In past years, when I created a rule identical to an existing one, I’ve been prompted to confirm updating it. And I answered yes, Update. When I posted about this years ago I was told that rules get updated and keep their same identifiers.
But occasionally it didn’t flag an identical rule, and that caused a couple of identical rules to be added. Because I dated each of my rules (by dumb luck), a year ago I browsed through all my rules and, where I found identical ones, deleted all but the latest one. I haven’t done that recently, but as time goes on, fewer and fewer new rules are created.
Is this correct?? Frankly I don’t know, but it seems to make sense!
That’s why reading your comments about same is refreshing!

Edit: One last thing (regarding reboots):
I’m using a device to keep tabs on my ISP: it pings the router and the ISP server feeding the modem every 40 seconds, logging drops plus the date & time of outages longer than 3 min. I have not seen ping drops or outages logged for the router IP on either router (switched to repaired 2600 10/26).
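For anyone wanting to do the same kind of outage check from their own ping results, here is a minimal sketch. It assumes you already have time-ordered (timestamp, success) samples; the 40-second interval and 3-minute threshold come from the description above, while the sample data and start date are hypothetical.

```python
from datetime import datetime, timedelta


def find_outages(samples, min_duration=timedelta(minutes=3)):
    """Return (start, end) pairs for runs of failed pings lasting longer than min_duration.

    samples is a time-ordered list of (datetime, bool) where bool is ping success.
    A run that is still failing at the end of the data is not reported.
    """
    outages = []
    start = None
    for when, ok in samples:
        if not ok and start is None:
            start = when  # first failed ping of a run
        elif ok and start is not None:
            if when - start > min_duration:
                outages.append((start, when))
            start = None  # run ended, short or long
    return outages


# Hypothetical data: one ping every 40 s, a single dropped ping, then a longer gap.
t0 = datetime(2023, 1, 1, 0, 0, 0)
results = [True, True, False, True, True] + [False] * 6 + [True]
samples = [(t0 + timedelta(seconds=40 * i), ok) for i, ok in enumerate(results)]
outages = find_outages(samples)
for begin, end in outages:
    print("outage from", begin, "to", end, "lasting", end - begin)
```

The single dropped ping is ignored as a blip, while the six consecutive failures (240 seconds from first failure to recovery) are reported as an outage.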
 
As someone who doesn’t run TP, I can say that 1.3 definitely has issues in the long run. This is the 3rd time since upgrading to the latest version that my mesh has gone down.

By that I mean that the satellites lose connectivity (via cable) because the main router (2600) becomes unstable. To make things more strange, it works fine. No reboots, no reconnects, net works fine.

As soon as I try to log into the UI, it logs in with 0 issues, but opening Network Center results in “unable to….”.

Also, trying over the DSRouter app, I am stuck at the initial login.

Reboot of the main unit with a follow up of all the satellites restores everything just fine.

This happens once a month, or once in 6 weeks, and I can’t really connect it to anything specific. The first time it happened was in the middle of the night when there was no activity from a user perspective, other than services and hosting platforms doing their thing.

Eagerly awaiting a new patch
 
So, today after the automated TP signature update last night, I checked the ET Open change log and saw 3CORESec 31 and 32 have been re-added. I went back to set them to drop and noticed that the Misc Attack class has changed from yesterday's 1290 signatures to 2433 (1221 enabled to 2364).

One other class, A Network Trojan was Detected, has increased by seven from 7391 to 7398. OK, that's seven and not the eight new ET MALWARE signatures. But two new 3CORESec signatures shouldn't increase the class by 1143 (even with today's 179 changes to it). Only these classes have changed their total signatures, not the others where I changed event actions.

This seems very odd.
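The arithmetic behind those numbers can be double-checked with a few lines. This is just a sketch using the counts quoted in the post; the dictionary layout is made up for illustration.

```python
# (before, after) signature counts as quoted in the post.
counts = {
    "Misc Attack": {"total": (1290, 2433), "enabled": (1221, 2364)},
    "A Network Trojan was Detected": {"total": (7391, 7398)},
}


def deltas(class_counts):
    """Return after-minus-before for each counter in each class."""
    return {
        cls: {name: after - before for name, (before, after) in fields.items()}
        for cls, fields in class_counts.items()
    }


changes = deltas(counts)
for cls, fields in changes.items():
    print(cls, fields)

# Disabled signatures in Misc Attack = total - enabled, before and after.
misc = counts["Misc Attack"]
disabled_before = misc["total"][0] - misc["enabled"][0]
disabled_after = misc["total"][1] - misc["enabled"][1]
print("Misc Attack disabled:", disabled_before, "->", disabled_after)
```

Both the total and the enabled count for Misc Attack rise by the same 1143, so the number of disabled signatures stays at 69. In other words, every signature that appeared overnight arrived enabled.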
 
