Syslog NG Connector is creating a large number of queue files in the [/arcsight/connectors/connector_name/current/user/agent/agentdata] folder

Hello,

We have a Syslog NG connector installed in our ArcSight environment.

This connector (Connector1) is sending the logs as CEF Syslog to another connector (Connector2), which is forwarding the logs on to ArcSight ESM/Logger.

We have access to Connector1.

Recently we got a notification that the 300GB /arcsight partition on the connector server was full.

When we checked, we found that the [/arcsight/connectors/connector_name/current/user/agent/agentdata] directory contained around 20,000 queue files and around 50 cache files, and that it was using 250 GB of the 300 GB allocated to the /arcsight partition.

We have checked the connection between Connector1 and Connector2 using telnet/netcat, and it shows as established on the defined syslog port.

As of now, we have copied the files in the [.../agentdata] directory to a different server as a backup.

The properties for file queueing are defined as:

agents[0].filequeuemaxfilecount=100
agents[0].filequeuemaxfilesize=10000000
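
If I read these settings correctly (assuming filequeuemaxfilesize is in bytes and the cap is simply filequeuemaxfilecount × filequeuemaxfilesize per queue), the file queue should top out around 1 GB, which is nowhere near the ~20,000 files / 250 GB we are seeing. A rough back-of-the-envelope check:

# Rough sanity check of the configured file queue cap against what we observe.
# Assumption: filequeuemaxfilesize is in bytes and the cap is count * size per queue.
max_file_count = 100            # agents[0].filequeuemaxfilecount
max_file_size = 10_000_000      # agents[0].filequeuemaxfilesize (bytes)

print(f"configured cap: ~{max_file_count * max_file_size / 1e9:.1f} GB")    # ~1 GB

observed_files = 20_000
print(f"observed (at ~10 MB per file): ~{observed_files * max_file_size / 1e9:.0f} GB")  # ~200 GB

So the number of queue files on disk is roughly 200 times the configured maximum, which is part of what we don't understand.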

Need help with addressing the issue.

Regards

  • It sounds like both connectors can't keep up: a large number of queue files means Connector1 can't keep up with the incoming events, and cache files mean the destination (Connector2 in this case) can't process them quickly enough.

    How much EPS are you pushing through these, what are the specs, and how much JVM heap have you assigned? Is your first connector UDP or TCP? Are the sockets dropping packets? What is the source device?
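
    On the socket question: if the listener turns out to be UDP, a quick way to see whether the kernel is dropping datagrams is to watch the UDP error counters in /proc/net/snmp. A rough sketch (Linux only; the exact set of counters can differ between kernel versions):

    # Sketch: report UDP receive/drop counters from /proc/net/snmp (Linux).
    # Only meaningful if the listener is UDP; with TCP the kernel retransmits
    # instead of silently dropping, so this check does not apply.
    def udp_counters(path="/proc/net/snmp"):
        with open(path) as f:
            lines = [line.split() for line in f if line.startswith("Udp:")]
        headers, values = lines[0][1:], lines[1][1:]
        return dict(zip(headers, map(int, values)))

    counters = udp_counters()
    for key in ("InDatagrams", "InErrors", "RcvbufErrors"):
        print(key, counters.get(key, "n/a"))
    # InErrors / RcvbufErrors climbing between two samples means the receive
    # buffer is overflowing and packets are being dropped.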

  • Hello Kyle,

    Thanks for the response.

    The information I have immediately available: the logs are sent as raw TCP, the log sources are Linux (>1000 devices), the JVM heap is set to 2 GB, and the server has 23 G RAM (193 M free, 14 G cached), 8 CPUs, x86_64, and around 360 G of total disk space.

    I do have access to Connector1; kindly let me know how I can find the EPS and whether the sockets are dropping packets.

    Regards

  • Since it's TCP the socket will be fine, but your connector is most likely the bottleneck. To find the EPS you can look at ArcMC if it's managing the connector, or you can look for agent:050 events in the destination, which should give you the event volume as well. Depending on how noisy the devices are, I suspect 1000+ devices might be too much for that one connector, but you can do some tuning.

    Mileage varies between connectors in terms of the event throughput they can handle. If the connector is the bottleneck, the solution is to tune until you don't get any more benefit and then add more connectors in parallel, probably behind a load balancer (there is the ArcSight Load Balancer if you don't have one).

    I would check how often GC is happening, as that will indicate whether you have enough memory on the connector.
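
    A rough way to watch that, assuming a JDK with jstat is available on the box and you know the connector's Java PID, is to sample the full-GC counter over a few minutes:

    # Sketch: sample the JVM full-GC count via jstat (standard JDK tool).
    # Assumes jstat is on the PATH and sys.argv[1] is the connector's Java PID.
    import subprocess, sys, time

    def full_gc_count(pid):
        out = subprocess.run(["jstat", "-gcutil", str(pid)],
                             capture_output=True, text=True, check=True).stdout
        header, values = out.strip().splitlines()[:2]
        cols = dict(zip(header.split(), values.split()))
        return int(float(cols["FGC"]))      # cumulative full-GC count

    pid = int(sys.argv[1])
    before = full_gc_count(pid)
    time.sleep(300)                         # watch for 5 minutes
    print("full GCs in the last 5 minutes:", full_gc_count(pid) - before)
    # Frequent full GCs suggest the heap is too small for the event rate.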

  • It's all about incoming EPS, CPU and RAM.
    You can't measure incoming EPS in ArcMC; however, you can find out about queue drops in ArcMC.

    If your connection were UDP, it would be easy to figure out the EPS, as each UDP packet is one log/event: you could just run a tcpdump for 120 seconds and divide the packet count by 120, and the job would be almost done.

    However, you are using TCP, so it is not quite like that anymore, but it is close: you can measure the incoming packets on the destination port and make an educated guess about the EPS.
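
    Something along these lines, as a sketch (interface eth0 and port 514 are assumptions, adjust to your setup; it needs root for tcpdump, and with TCP one packet can carry more than one event, so treat the number as a rough estimate):

    # Sketch: count packets to the syslog port for a fixed window and estimate EPS.
    import subprocess

    WINDOW = 120  # seconds, as suggested above
    proc = subprocess.run(
        ["timeout", str(WINDOW), "tcpdump", "-n", "-l", "-i", "eth0",
         "tcp and dst port 514"],
        capture_output=True, text=True)

    packets = len(proc.stdout.splitlines())
    print(f"~{packets / WINDOW:.0f} packets/sec towards the syslog port")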

    The processed EPS on the connector is in the logs: check for the EPS SLC (since last check) values.
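
    For example, something like this against the connector's agent.log (the exact wording of the status lines differs between connector versions, so the path and search pattern here are assumptions to adjust to what your own log shows):

    # Sketch: pull the "since last check" EPS figures out of the connector log.
    import re

    LOG = "/arcsight/connectors/connector_name/current/logs/agent.log"   # adjust
    PATTERN = re.compile(r"eps.{0,5}slc.{0,5}?([\d.]+)", re.IGNORECASE)   # adjust

    with open(LOG, errors="replace") as f:
        rates = [float(m.group(1)) for line in f
                 for m in [PATTERN.search(line)] if m]

    if rates:
        print(f"samples={len(rates)} last={rates[-1]} max={max(rates)}")
    else:
        print("no EPS/SLC lines matched - adjust PATTERN to your log format")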

    Also check the machine load: if it is a Linux server, you should not have a load above 100%, i.e. load divided by CPU count should be below 1.
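
    Quick way to check, for example:

    # Sketch: compare the 1/5/15-minute load averages against the CPU count.
    import os

    cpus = os.cpu_count()
    for label, load in zip(("1min", "5min", "15min"), os.getloadavg()):
        print(f"{label}: load={load:.2f} load/cpu={load / cpus:.2f}")
    # load/cpu consistently above 1 means the box itself is saturated,
    # not just the connector process.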

    Do you have a lot of parsing errors? Those can also decrease the EPS the connector can handle.

    Last but not least, there are tuning guides, but I would check the above first:


    https://community.microfocus.com/cyberres/arcsight/f/arcsight-discussions/337754/smartconnector-syslog-performance-tuning

    Cheers A