~ 40-50 min event flow delays with ISS RealSecureDB Connector (MS SQL)
First of all, thanks for the great community. Lots of helpful information and really kind people here. Keep up the good work!
So, here is the problem that we've been facing for about 1 month.
Our configuration is 4 agents with 5-7 connectors on each. The most heavily loaded ones handle 1500-2000 EPS with no performance problems at all.
But, damn thing, the one we've connected to the SiteProtector RealSecureDB (based on Win2008 x64 + MSSQL) always has event delay problems. Basically, it starts with no delay, but then the difference between Manager Receipt Time (the time when ESM receives an event from the connector) and Event End Time (the time when the event actually ended and was registered in the IPS) becomes more and more noticeable.
In the end, here's what we have after 1.5-2 hours:
As a result, we can't analyze events in real time, and that's a huge problem. I should also say that the connector isn't heavily loaded: it handles only ~15 EPS, which is tiny compared to the others. So, logically, we shouldn't experience any performance or delay problems.
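To put numbers on the drift, here is a minimal Python sketch of the comparison described above: subtract Event End Time from Manager Receipt Time for a few events and check whether the gap keeps growing. The function names and the sample timestamps are made up for illustration, and both timestamps are assumed to be in the same time zone.

```python
from datetime import datetime, timedelta


def receipt_lag(end_time: datetime, manager_receipt_time: datetime) -> timedelta:
    """Delay between the event ending on the device and ESM receiving it."""
    return manager_receipt_time - end_time


def lag_is_growing(lags: list[timedelta],
                   tolerance: timedelta = timedelta(seconds=30)) -> bool:
    """True when the newest lag exceeds the oldest one by more than `tolerance`."""
    return len(lags) >= 2 and (lags[-1] - lags[0]) > tolerance


# Made-up sample data: (Event End Time, Manager Receipt Time) pairs,
# ordered by Event End Time and expressed in the same time zone.
samples = [
    (datetime(2000, 1, 1, 10, 0, 0), datetime(2000, 1, 1, 10, 0, 5)),
    (datetime(2000, 1, 1, 10, 30, 0), datetime(2000, 1, 1, 10, 40, 0)),
    (datetime(2000, 1, 1, 11, 30, 0), datetime(2000, 1, 1, 12, 15, 0)),
]

lags = [receipt_lag(end, received) for end, received in samples]
for (end, _), lag in zip(samples, lags):
    print(f"event ended {end:%H:%M:%S}, lag {lag}")
print("lag is growing:", lag_is_growing(lags))
```

A steadily growing lag at a low event rate suggests the connector is falling behind on reading the database rather than being overloaded by volume.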
What I tried:
1) Changed the batching settings (to 1 second) and set turbo mode to Faster and then Fastest in the connector configuration. No results.
2) Played with these parameters in the agent.properties file, setting them to maximum and minimum values and trying different combinations:
|initretrysleeptime||If/when database connection fails, this parameter controls the time the SmartConnector will wait before trying to reconnect.|
|jdbcquerytimeout||An internal parameter which controls the SQL query timeout. It is disabled (with a value of -1) by default.|
|jdbctimeout||Controls the timeout period of JDBC connection.|
|loopingenabled||Used as a performance testing utility, set to false by default. It has no impact on the behavior of the agent.|
|persistenceinterval||This is a Boolean flag, disabled by default, which tells the SmartConnector to store the last processed event or byte offset in the log files. If enabled, the SmartConnector will continue where it left off. If it is disabled, the SmartConnector starts with the first log.|
|preservestate||If set to true, the SmartConnector keeps track of the last event processed.|
|preservedstatecount||If preservestate is set to true, this is the count of events to process before writing the preserve state.|
|preservedstateinterval||If preservestate is set to true, time interval in ms before writing the preserve state.|
|useconnectionpool||If false, a connection will be opened every time a query is made on the database. Use of this parameter makes the SmartConnector process more resource intensive.|
The parameters below are used for JDBC connection pool settings:
|dbcpcachestatements||Option to enable/disable caching of statements.|
|dbcpcheckouttimeout||Maximum number of seconds a Thread can checkout a connection before it is closed and returned to the pool.|
|dbcpidletimeout||Maximum number of seconds a connection can remain idle before it is closed.|
|dbcpmaxcheckout||Number of times a connection may be checked out before it is closed.|
|dbcpmaxconn||Maximum number of connections to open.|
|dbcpreap||How many seconds to wait between reaping connections in the pool.|
No results either.
Now I'm stuck with this problem. Any help will be greatly appreciated. My configuration screenshots and agent.properties are included in the post.
We are in GMT+4, and all events in the SP console look good and arrive in real time. So I'm 100% sure that it is not an Event Collector issue.
Like I said before, our connector is 5.2.2 now (but we also tried 5.2.1).
We do not use Internet Scanner, and FusionModule is disabled.
I wonder how many EPS you get from that connector?
What I mean is that you need to check the ISS management system. In our case it was the single point of failure: it was totally overloaded and its performance was very bad.
The connector guide lists support for IBM SiteProtector DB 2.0 SP7 and SP8. https://protect724.arcsight.com/servlet/JiveServlet/previewBody/2487-102-3-3664/IBMSiteProtectorDBConfig.pdf
One question nobody asked is: which version and build of the IBM SiteProtector DB are you using?
The issue is we do not know what IBM might have changed between different builds that could be the reason for the events delay.
Any change at all could affect the behavior of the connector.
Hey Gbenga, nice to meet you here!
You know, we should've started with that version-info question...
BTW I think it's a shame that modern and expensive systems like ArcSight do not support half-year-old products (ISS 2.0 SP9 was released 6 months ago).
I hope that will be fixed as soon as possible and support for Service Pack 9 will be added.
Again, thanks to everyone who helped me solve this problem. Best wishes!
We had the same problem about 5 years ago with the ISS connector. I remember that at one point we actually trimmed the data retention on the ISS database to correct the problem. We were holding 30 days or more in the ISS database, which was constantly getting stuck in maintenance mode. We then changed the retention on the ISS DB to 2 weeks, and that eliminated the issue.
Have a look at the ID numbers in the SP DB. I have dealt with SP before, and one thing that tends to happen is big jumps in the numbering in the SP DB. The ArcSight connector used to handle that very badly. I am not sure whether this has been fixed, and since I do not have access to an SP DB setup I am unable to verify it.
I hope that this helps.
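A quick way to check for such jumps is to export the ID column and look for unusually large steps between consecutive values. Below is a minimal Python sketch; the `max_step` threshold and the sample IDs are made up, and how you pull the IDs out of the SP DB is left open because the table and column names are not given here.

```python
def find_id_jumps(ids, max_step=1000):
    """Return (previous_id, next_id, gap) for every step larger than max_step."""
    ordered = sorted(ids)
    return [
        (prev, nxt, nxt - prev)
        for prev, nxt in zip(ordered, ordered[1:])
        if nxt - prev > max_step
    ]


# Made-up sample: a run of consecutive IDs followed by a big jump.
sample_ids = [100, 101, 102, 50_000, 50_001]
for prev, nxt, gap in find_id_jumps(sample_ids):
    print(f"jump of {gap} between ID {prev} and ID {nxt}")
```

If the jumps line up in time with the moments the delay starts to grow, that would point at the ID handling rather than at load.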
I've recently upgraded the connector to 5.2.x (tried multiple minor releases) and experienced exactly the same issue you described here. Before that it worked absolutely fine.
After a rollback to connector version 188.8.131.5214.0 everything is good again.
I am not convinced that this is a SiteProtector-only (DB) issue. I didn't take the time to investigate further, though, since I have no direct access to the SiteProtector server (where the connector is running).
ArcSight support essentially trimmed the calls being made so that all database fields weren't being queried. This improved the performance significantly. I now see only a 1-3 minute delay. If you want more specifics, please let me know and I will be happy to look the case up and pass along instructions... however I did need to download an instruction file used by the connector (provided by ArcSight support).
I went back and reviewed the case. In my instance, ArcSight confirmed it was a bug and edited the parser. This should be included in the latest code (I upgraded to 184.108.40.20674 after implementing the fix below and the issue did not return, though the override was not removed during the upgrade and I did not confirm that the fix was included). Keep in mind that our particular issue surfaced after migrating to the latest ISS version.
Attached to the case is the file 2_1SP90.sdktbdatabase.properties; this is a parser override given by ArcSight Engineering to test on your connector.
Note that in the parser override, we (in the connector) now query only a limited set of data from SensorDataAVP (originally mapped to additional data), such as:
AttributeName='Message' and corresponding AttributeValue
AttributeName='code' and corresponding AttributeValue
AttributeName=':EventType' and corresponding AttributeValue
Is there a certain "AttributeName" that is important to you? Please let us know.
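To illustrate the idea of querying only a few attribute names, here is a small Python sketch against an in-memory SQLite table. It is not the actual parser override or the connector's real query: only the SensorDataAVP, AttributeName and AttributeValue names come from the note above, while the SensorDataID column and the sample rows are assumptions made up for the demo.

```python
import sqlite3

# Attribute names listed in the support note.
WANTED = ("Message", "code", ":EventType")

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: SensorDataID is an assumed key column.
conn.execute(
    "CREATE TABLE SensorDataAVP ("
    "SensorDataID INTEGER, AttributeName TEXT, AttributeValue TEXT)"
)
conn.executemany(
    "INSERT INTO SensorDataAVP VALUES (?, ?, ?)",
    [
        (1, "Message", "sample message"),
        (1, "code", "42"),
        (1, ":EventType", "sample type"),
        (1, "SomeOtherAttribute", "not fetched"),
    ],
)

# Fetch only the wanted attribute/value pairs for one event.
placeholders = ", ".join("?" for _ in WANTED)
rows = conn.execute(
    "SELECT AttributeName, AttributeValue FROM SensorDataAVP "
    f"WHERE SensorDataID = ? AND AttributeName IN ({placeholders})",
    (1, *WANTED),
).fetchall()

for name, value in rows:
    print(f"{name} = {value}")
```

Restricting the attribute names keeps the number of rows read per event small, which matches the described effect of the override.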
The steps to apply the parser override are as follows:
1. Create a folder named siteprotector_db under <connector_home_directory>\current\user\agent\fcp
2. Place the attached 2_1SP90.sdktbdatabase.properties under the folder <connector_home_directory>\current\user\agent\fcp
3. Clean up everything under the folder <connector_home_directory>\current\user\agent\agentdata
4. Clean up everything under the folder <connector_home_directory>\current\logs
5. Note the time and date
6. Restart the connector
7. Let the connector run for some time and let us know if you experience the issue again, where the Agent Receipt and Manager Receipt times drift from the Device Receipt time.
Please note that I did not attach the file, only provided the instructions. If you need the file and cannot obtain it through support, I will need to request permission from ArcSight before providing.
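For anyone who has to repeat the file-system part of the steps above on several machines, here is a rough Python sketch. It follows the steps literally as written (including placing the override file directly under fcp), assumes the connector is stopped while it runs, and leaves the restart to you. The function names are made up, and the cleanup deletes files permanently, so try it on a copy first.

```python
import shutil
from datetime import datetime
from pathlib import Path


def clear_directory(directory: Path) -> None:
    """Delete everything inside `directory` but keep the directory itself."""
    if not directory.is_dir():
        return
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def apply_parser_override(connector_home: Path, override_file: Path) -> datetime:
    """Apply the file-system steps and return the time they were applied."""
    agent_dir = connector_home / "current" / "user" / "agent"
    fcp_dir = agent_dir / "fcp"

    # Create the siteprotector_db folder under ...\user\agent\fcp.
    (fcp_dir / "siteprotector_db").mkdir(parents=True, exist_ok=True)

    # Place the override file under ...\user\agent\fcp, as the steps are written.
    shutil.copy2(override_file, fcp_dir / override_file.name)

    # Clean up everything under agentdata and under logs.
    clear_directory(agent_dir / "agentdata")
    clear_directory(connector_home / "current" / "logs")

    # Note the time and date so later drift can be compared against it.
    return datetime.now()
```

Call it as `apply_parser_override(Path(r"<connector_home_directory>"), Path("2_1SP90.sdktbdatabase.properties"))` with your real connector home, then restart the connector and watch the receipt times.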