Active Channel performance
We have been struggling with performance issues in the active channels for over a year now.
We receive about 100M events a day, and it takes about 40 minutes to load a simple active channel covering 24 hours.
Example: Timestamp: Manager receipt time, sort field: Manager receipt time, and attacker address is one of our internal IPs.
We would like to compare our "performance" with others, so feel free to share your stats.
I ran your test on 2 managers:
70 million events per day manager
took approximately 20 minutes to populate
270 million events per day manager
took approximately 40 minutes to populate
95,000,000 events per day, End Time, 1 external attacker address chosen at random, evaluate once at attach time, 1 day of data that is 5 days old (to avoid caching effects): 5 min 30 sec to load
same system, same filter, but for the last 24 hours of data ($Now - 1d): 9 min
exactly the same test as the second one, but run a couple of minutes later (a lot of events still in cache): 30 sec to load
These tests show how careful you must be when evaluating the performance of your system. Obviously, cached events have a huge impact on the time needed to load the AC, but many other factors also affect performance in this kind of test. It is still interesting to compare with other systems because it gives a rough idea of what is possible, but don't jump too quickly to definitive conclusions based on these tests.
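To put the three load times above on a common scale, they can be converted into an effective scan rate. This is only a rough sketch: it assumes each test scans one full day of data at the stated 95 million events per day, which may not be exactly what the manager reads.

```python
# Load times from the three tests above, in seconds.
# Assumption: each test covers one full day at the stated daily volume.
EVENTS_PER_DAY = 95_000_000

tests = {
    "1 day of data, 5 days old (cold)": 5 * 60 + 30,
    "last 24h": 9 * 60,
    "last 24h, rerun (cached)": 30,
}


def events_per_second(events, seconds):
    """Effective scan rate for one active channel load."""
    return events / seconds


rates = {name: events_per_second(EVENTS_PER_DAY, secs)
         for name, secs in tests.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} events/s")

# How much faster the cached rerun was than the first "last 24h" load.
cache_speedup = tests["last 24h"] / tests["last 24h, rerun (cached)"]
print(f"cache speedup: {cache_speedup:.0f}x")
```

The same filter on the same system differs by a factor of 18 depending only on what is in cache, which is why a single timing says little on its own.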
If you are trying to improve your performance, you must set up a reliable test protocol. By reliable I mean that running the same test multiple times gives very similar values. I was quite successful by setting up multiple reports for different time periods and removing all background tasks like trends. I also stopped all connectors so that no events were being processed. Before running the tests I stopped the DB and the manager to avoid any caching effect. Then you can try to optimize the system by modifying some parameters, restart the DB and the manager, and run your tests again. This should help in evaluating the impact of your changes on the system. Alternatively, you can disconnect the manager from the DB and run your queries directly on the DB; the time needed to generate a report from the command line is very similar to the time needed when the report is executed from the AS console if there is no load on the system.
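The "reliable" criterion above can be checked with a small timing harness. This is a minimal sketch, not anything ESM provides: `run_test` is a hypothetical callable that you would supply yourself (for example, something that launches your report or query and waits for it to finish), and the 10% threshold is an arbitrary assumption.

```python
import statistics
import time


def time_runs(run_test, repeats=5):
    """Run the same test several times and return each duration in seconds.

    run_test is a hypothetical callable that blocks until the test is done.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_test()
        durations.append(time.perf_counter() - start)
    return durations


def relative_spread(durations):
    """Coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(durations) / statistics.mean(durations)


def is_repeatable(durations, max_spread=0.10):
    """True if the runs agree to within max_spread (10% is an assumed default)."""
    return relative_spread(durations) <= max_spread


# Durations in seconds from five hypothetical runs of one test.
print(is_repeatable([100, 102, 98, 101, 99]))   # runs agree closely
# The three load times quoted earlier (330 s, 540 s, 30 s) do not.
print(is_repeatable([330, 540, 30]))
```

Only once the runs of an unchanged system pass a check like this does it make sense to change a parameter and compare before and after.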
ESM is pretty slow (as you are well aware) for loading such a large amount of data to an active channel. Is this your normal mode of operation? If you have many consoles and analysts all loading numerous channels with filters et al, your iowait is going to get pretty high on your DB box unless you have some whizz-bang storage system in place.
Are you using dashboards, trends and query viewers to help you visualise and drill down to the events of real interest without hitting your DB too hard?