Check Point SmartConnector Caching
We currently have a Check Point manager sending logs to ESM via a Check Point ad_opsec connector. We are already doing some field-based aggregation at the connector. As we are taking in more logs, I have noticed that the connector is beginning to cache more frequently during the day, most of the time in the 10,000 event range, but there have been spikes in the range of 25,000 - 40,000 events.
Would adding another opsec_lea object (in Check Point) to send logs to a second connector, and then applying filters to each SmartConnector so that each connector only processes events from certain firewalls, be a viable option?
What other recommendations do you have for handling the caching? Are there additional ways to optimize the logging to prevent the caching?
Thank you in advance for your time.
Below is a Recommendation that I found.
Have a Connector for each Firewall.
See if this helps (it is easier than setting up the filtering and having both connectors read the same data).
The JVM memory limitation caps the connector at a maximum of roughly 1,500-1,800 EPS. If the connector currently connects to a Provider-1 that consolidates firewall logs from multiple Firewall-1s, we would recommend running multiple connector instances that connect directly to the Firewall-1s. Alternatively, run multiple Provider-1 instances with one connector per instance. This will help distribute the load.
Are you able to set up a load balancer between ArcSight and the firewalls? If so, you could balance the firewall log traffic across those connector servers.
A good thing to remember is that all connectors write inbound data to an initial buffer cache before processing it. The connector then reads those files, parses the data, and writes it to the outbound queue(s). Finally, it takes the processed data and sends it on.
So if you think about it, the data is written once, read once, written again to the outbound queue, and then read again before being sent. It's the most reliable way to do this (and not lose data), but as a result it puts the connector under a lot of read/write load. That means it makes a lot of sense to ensure the connector has good local disk read and write performance.
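To make the store-and-forward flow above concrete, here is a rough Python sketch. This is purely illustrative (the names `cache_q`, `out_q`, and `forward` are mine, not ArcSight internals): inbound events are written to a cache queue first, a processor reads and parses them onto an outbound queue, and a sender reads them again and forwards them.

```python
import queue
import threading

def run_pipeline(events, forward):
    """Two-stage store-and-forward sketch: write, read, write, read."""
    cache_q = queue.Queue()  # first write: raw inbound buffer ("cache")
    out_q = queue.Queue()    # second write: parsed, ready to send

    def processor():
        while True:
            raw = cache_q.get()          # first read
            if raw is None:              # sentinel: no more input
                out_q.put(None)
                break
            out_q.put(raw.upper())       # stand-in for real parsing

    def sender():
        while True:
            parsed = out_q.get()         # second read
            if parsed is None:
                break
            forward(parsed)              # send the processed event on

    t1 = threading.Thread(target=processor)
    t2 = threading.Thread(target=sender)
    t1.start()
    t2.start()
    for e in events:
        cache_q.put(e)                   # inbound data lands in the cache first
    cache_q.put(None)
    t1.join()
    t2.join()
```

Each event crosses two queues, which is why local I/O performance matters so much in the real connector.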
I would also recommend running logfu on the connector to see if there are any issues with the processing load, and whether the connector slows on the outbound queue because of that processing. This will give you a good idea of where to concentrate any efficiency settings.
As an initial step, though, look to boost the connector's local memory and make sure it's not running low on memory and hence doing too many garbage collections (GC messages in the logs), which also slow things down considerably.
Increase the SmartConnector memory (java heap size) if required.
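As a hedged illustration only: connectors installed as a service typically take their JVM heap limits from the Tanuki wrapper configuration (often a file such as agent.wrapper.conf under the connector install). The exact path and appropriate values vary by install and connector version, so verify against your own deployment before editing. Raising the heap might look like:

```properties
# agent.wrapper.conf (path and values are examples, not recommendations)
# Initial Java heap size, in MB
wrapper.java.initmemory=256
# Maximum Java heap size, in MB
wrapper.java.maxmemory=1024
```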
Increase the batching to 1sec/600 events.
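On the batching point, here is a minimal Python sketch (purely illustrative, not the connector's actual implementation) of a time-or-size batcher like the 1 sec / 600 events setting: a batch is flushed when it reaches the size limit, or when a periodic poll finds it older than the age limit.

```python
import time

class Batcher:
    """Flush a batch at max_size events or max_age seconds, whichever comes first."""

    def __init__(self, max_size=600, max_age=1.0, now=time.monotonic):
        self.max_size = max_size
        self.max_age = max_age
        self.now = now
        self.batch = []
        self.started = None  # timestamp of the first event in the current batch

    def add(self, event):
        """Add an event; return a full batch if the size limit was hit, else None."""
        if not self.batch:
            self.started = self.now()
        self.batch.append(event)
        if len(self.batch) >= self.max_size:
            return self.flush()
        return None

    def poll(self):
        """Call periodically; flush a partial batch once it is old enough."""
        if self.batch and self.now() - self.started >= self.max_age:
            return self.flush()
        return None

    def flush(self):
        out, self.batch = self.batch, []
        return out
```

Larger batches mean fewer, bigger writes to the outbound queue, which is exactly the trade-off the recommendation is making.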
Try these settings in agent.properties to increase throughput by adding additional threads and increasing the queue size: