Commodore

"Too many open files" issue causing TCP Syslog connector to stop working

This is a syslog connector for McAfee Email Gateway 7.5. ArcSight doesn't currently support this version, but the connector still pulls in data that we can use. The problem seems to be that after about 2 days of logging, the container itself stops working and its log starts to fill up with the following error messages.

Here are the errors...

Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3MdaxEzcBABDdIzz5Rvwwug==.m1.cache.dflt.0 (Too many open files)
Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3MdaxEzcBABDdIzz5Rvwwug==.m1.tmp.dflt (Too many open files)
java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 10:03:23 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3MdaxEzcBABDdIzz5Rvwwug==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 10:03:23 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3MdaxEzcBABDdIzz5Rvwwug==.m1.tmp.dflt (Too many open files)
INFO   | jvm 1    | 2014/02/17 09:56:31 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:46:22 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:47:22 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:48:22 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:49:22 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:49:27 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.tmp.dflt (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:49:32 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:49:36 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:51:23 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:52:23 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:53:23 | Caused by: java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 09:44:26 | [Mon Feb 17 09:44:26 EST 2014] [ERROR] com.arcsight.agent.transport.q: com.arcsight.common.o.g: java.net.SocketException: Too many open files
INFO   | jvm 1    | 2014/02/17 09:32:26 | [Mon Feb 17 09:32:26 EST 2014] [ERROR] com.arcsight.agent.transport.q: com.arcsight.common.o.g: java.net.SocketException: Too many open files
INFO   | jvm 1    | 2014/02/17 09:21:26 | [Mon Feb 17 09:21:26 EST 2014] [ERROR] com.arcsight.agent.transport.q: com.arcsight.common.o.g: java.net.SocketException: Too many open files
INFO   | jvm 1    | 2014/02/17 09:15:30 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3wwKEKEIBABD+6taYRnGwnQ==.1 (Too many open files)
INFO   | jvm 1    | 2014/02/17 09:15:30 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 09:15:30 | [Mon Feb 17 09:15:30 EST 2014] [ERROR] com.arcsight.agent.transport.q: com.arcsight.common.o.g: java.net.SocketException: Too many open files
INFO   | jvm 1    | 2014/02/17 09:15:40 | [Mon Feb 17 09:15:40 EST 2014] [ERROR] com.arcsight.agent.transport.q: com.arcsight.common.o.g: java.net.SocketException: Too many open files
INFO   | jvm 1    | 2014/02/17 08:59:30 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3wwKEKEIBABD+6taYRnGwnQ==.1 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:59:30 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:59:40 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3wwKEKEIBABD+6taYRnGwnQ==.0 (Too many open files)
INFO   | jvm 1    | 2014/02/17 08:59:40 | java.io.FileNotFoundException: /opt/arcsight/connector_2/current/user/agent/agentdata/ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.1 (Too many open files)

Here's the output of ulimit -a:

core file size          (blocks, -c) unlimited
data seg size          (kbytes, -d) unlimited
scheduling priority            (-e) 0
file size              (blocks, -f) unlimited
pending signals                (-i) 135167
max locked memory      (kbytes, -l) 32
max memory size        (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues    (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time              (seconds, -t) unlimited
max user processes              (-u) 135167
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
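
Note that ulimit -a reports the limits of the shell it is run in, while the limit that matters is the one the running connector process inherited from whatever started it. On Linux both the effective limit and the current descriptor count can be read from /proc. A minimal sketch, assuming a Linux host, sufficient permissions to read the process's /proc entries, and that the connector's JVM PID is known (the function names are made up for illustration):

```python
import os
import re


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because a limit may read 'unlimited'.
    """
    for line in limits_text.splitlines():
        match = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if match:
            return match.group(1), match.group(2)
    return None


def open_fd_count(pid):
    """Count descriptors currently open in the process (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid):
    """Print the current descriptor count next to the process's own limits."""
    with open(f"/proc/{pid}/limits") as handle:
        limits = parse_max_open_files(handle.read())
    print(f"pid {pid}: {open_fd_count(pid)} open, limits (soft, hard) = {limits}")


# Example: report(12345)  # 12345 is a hypothetical PID, use the connector's JVM PID
```

Running report() every few minutes would show whether the count climbs steadily toward the limit over the two days before the failure.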

Here's the output of the agentdata directory...

total 17M
drwxr-xr-x 2 root root 44K Feb 17 10:05 .
drwxr-xr-x 16 root root 4.0K Feb 12 17:15 ..
-rw-r--r-- 1 root root 0 Feb 17 10:05 3CpWMrEABABCDEM4rGBz2Vg==.cache.dflt.0
-rwxr--r-- 1 root root 362K Feb 12 16:54 3CpWMrEABABCDEM4rGBz2Vg==.idmap
-rw-r--r-- 1 root root 0 Feb 17 10:05 3CpWMrEABABCDEM4rGBz2Vg==.m1.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 3CpWMrEABABCDEM4rGBz2Vg==.m1.size.dflt
-rwxr--r-- 1 root root 2 Feb 12 16:54 3CpWMrEABABCDEM4rGBz2Vg==.size.dflt
-rw-r--r-- 1 root root 0 Feb 17 10:05 3MdaxEzcBABDdIzz5Rvwwug==.cache.dflt.0
-rwxr--r-- 1 root root 362K Feb 12 16:54 3MdaxEzcBABDdIzz5Rvwwug==.idmap
-rw-r--r-- 1 root root 0 Feb 17 10:05 3MdaxEzcBABDdIzz5Rvwwug==.m1.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 3MdaxEzcBABDdIzz5Rvwwug==.m1.size.dflt
-rw-r--r-- 1 root root 9.3M Feb 17 10:05 3MdaxEzcBABDdIzz5Rvwwug==_queue.syslogd.5
-rwxr--r-- 1 root root 2 Feb 12 16:54 3MdaxEzcBABDdIzz5Rvwwug==.size.dflt
-rw-r--r-- 1 root root 2.3K Feb 17 08:39 3MdaxEzcBABDdIzz5Rvwwug==.tmp.dflt
-rw-r--r-- 1 root root 0 Feb 17 10:05 3wwKEKEIBABD+6taYRnGwnQ==.cache.dflt.0
-rwxr--r-- 1 root root 362K Feb 12 16:54 3wwKEKEIBABD+6taYRnGwnQ==.idmap
-rw-r--r-- 1 root root 0 Feb 17 10:05 3wwKEKEIBABD+6taYRnGwnQ==.m1.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 3wwKEKEIBABD+6taYRnGwnQ==.m1.size.dflt
-rw-r--r-- 1 root root 5.0M Feb 17 10:05 3wwKEKEIBABD+6taYRnGwnQ==_queue.syslogd.246
-rwxr--r-- 1 root root 2 Feb 12 16:54 3wwKEKEIBABD+6taYRnGwnQ==.size.dflt
-rw-r--r-- 1 root root 0 Feb 17 10:05 3-xBER0IBABCHuVnTP5KJaA==.cache.dflt.0
-rwxr--r-- 1 root root 362K Feb 12 16:54 3-xBER0IBABCHuVnTP5KJaA==.idmap
-rw-r--r-- 1 root root 0 Feb 17 10:05 3-xBER0IBABCHuVnTP5KJaA==.m1.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 3-xBER0IBABCHuVnTP5KJaA==.m1.size.dflt
-rwxr--r-- 1 root root 4 Feb 12 16:54 3-xBER0IBABCHuVnTP5KJaA==.size.dflt
-rwxr--r-- 1 root root 43 Dec 16 15:01 desc.txt
-rw-r--r-- 1 root root 78 Feb 17 10:05 ps.ADNameAbbr.3wwKEKEIBABD+6taYRnGwnQ==.1
-rw-r--r-- 1 root root 78 Feb 17 10:05 ps.ADNameAbbr.3-xBER0IBABCHuVnTP5KJaA==.0
-rw-r--r-- 1 root root 7.0K Feb 12 17:05 supportfiles_1392242718671.zip
-rw-r--r-- 1 root root 186 Feb 12 17:12 supportfiles_1392243133143.zip
-rw-r--r-- 1 root root 234 Feb 12 17:12 supportfiles_1392243151925.zip
-rw-r--r-- 1 root root 972 Feb 12 17:12 supportfiles_1392243160457.zip
-rw-r--r-- 1 root root 842K Feb 12 17:13 supportfiles_1392243190059.zip
-rw-r--r-- 1 root root 7.0K Feb 12 17:16 supportfiles_1392243386057.zip
-rw-r--r-- 1 root root 397 Feb 14 08:16 supportfiles_1392383811048.zip
-rw-r--r-- 1 root root 2.3K Feb 14 08:17 supportfiles_1392383822940.zip
-rw-r----- 1 root root 0 Jan 9 14:02 TC_35PavEzcBABCAApjZsKEXNA==.cache.dflt.0
-rwxr--r-- 1 root root 2 Jan 9 19:47 TC_35PavEzcBABCAApjZsKEXNA==.size.dflt
-rw-r--r-- 1 root root 0 Feb 12 16:57 TC_3CpWMrEABABCDEM4rGBz2Vg==.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 TC_3CpWMrEABABCDEM4rGBz2Vg==.size.dflt
-rw-r--r-- 1 root root 0 Feb 12 16:57 TC_3MdaxEzcBABDdIzz5Rvwwug==.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 TC_3MdaxEzcBABDdIzz5Rvwwug==.size.dflt
-rwxr--r-- 1 root root 0 Dec 16 15:01 TC_3U1GlT0EBABCEzzBp5i6+6A==.cache.dflt.0
-rwxr--r-- 1 root root 2 Dec 16 15:01 TC_3U1GlT0EBABCEzzBp5i6+6A==.size.dflt
-rw-r--r-- 1 root root 0 Feb 12 16:57 TC_3wwKEKEIBABD+6taYRnGwnQ==.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 TC_3wwKEKEIBABD+6taYRnGwnQ==.size.dflt
-rw-r--r-- 1 root root 0 Feb 12 16:57 TC_3-xBER0IBABCHuVnTP5KJaA==.cache.dflt.0
-rwxr--r-- 1 root root 2 Feb 12 16:54 TC_3-xBER0IBABCHuVnTP5KJaA==.size.dflt
-rwxr--r-- 1 root root 0 Dec 16 15:01 TC_3zjOPo0EBABDGo1hNEaOEhg==.cache.dflt.0
-rwxr--r-- 1 root root 2 Dec 16 15:01 TC_3zjOPo0EBABDGo1hNEaOEhg==.size.dflt

Any help would be greatly appreciated, as this has been a regular occurrence, taking place every 2 days. The temporary solution is to completely restart the container using monit (since the container appears "down" on the connector appliance when this takes place).

------
Cadet 2nd Class

Usually, the agentdata directory contains the cached logs, so you need to find out why the connector is caching. Is the EPS high? What other connectors exist in that container? Are they loaded with high EPS? Try to fine-tune the connector with filtering and aggregation, and try to increase memory and threads.

Another important point: sending partially parsed or unparsed events to ESM is not a good idea, because it impacts performance.

Thanks

Anwar

Commodore

Hey Anwar,

Thanks for the response.  There is one other UDP syslog connector in that container but it has a very low EPS.  How would I go about increasing the threads?

Could this have something to do with a high number of queue.syslogd files? I noticed that one of the files was queue.syslogd.246 (assuming the number at the end is the number of the file).

Thanks,

Matt

------
Accepted Solution
Admiral

Hi Matt!

Looks like you didn't set the parameter tcppeerclosedchecktimeout to a value greater than 0. This parameter must be set for TCP syslog connectors.

Just set agents[0].tcppeerclosedchecktimeout=30000 in agent.properties, restart the container and you will be fine.
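
The agent index depends on the connector's position in the container, so it may help to check every agent in agent.properties rather than only agents[0]. A rough sketch, assuming the property names look like the ones quoted in this thread (agents[N].type, agents[N].protocol, agents[N].tcppeerclosedchecktimeout); the helper name is hypothetical and not part of any ArcSight tooling:

```python
import re

# Matches lines such as: agents[1].tcppeerclosedchecktimeout=30000
PROP = re.compile(r"^agents\[(\d+)\]\.([^=\s]+)\s*=\s*(.*)$")


def tcp_agents_without_peer_check(properties_text):
    """Return indices of Raw TCP syslog agents whose
    tcppeerclosedchecktimeout is missing or not greater than 0."""
    agents = {}
    for raw in properties_text.splitlines():
        match = PROP.match(raw.strip())
        if match:
            index, key, value = match.groups()
            agents.setdefault(int(index), {})[key] = value.strip()

    flagged = []
    for index, props in sorted(agents.items()):
        if props.get("type") != "syslog" or props.get("protocol") != "Raw TCP":
            continue
        try:
            # Assumption: a missing value is treated like -1 (disabled).
            timeout = int(props.get("tcppeerclosedchecktimeout", "-1"))
        except ValueError:
            timeout = -1
        if timeout <= 0:
            flagged.append(index)
    return flagged
```

Feeding it the contents of agent.properties returns the agent indices that still need the setting.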

Details:

best regards

Tobias

Commodore

You, Sir, are a genius! I can't tell you how frustrating it was to try to troubleshoot this issue! It appears that it was set to -1 by default. I've made the suggested change and also bumped filequeuemaxfilecount up from 100 to 150. Any idea what the tcpcleanupdelay and tcpsetsocketlinger parameters are used for?

Thanks!

agents[1].destination[1].type=http
agents[1].deviceconnectionalertinterval=60000
agents[1].enabled=true
agents[1].entityid=WQiEKEIBABDurIjUFAmlbA\=\=
agents[1].fcp.version=0
agents[1].filequeuemaxfilecount=150
agents[1].filequeuemaxfilesize=10000000
agents[1].forwardmode=false
agents[1].ipaddress=(ALL)
agents[1].overwriterawevent=false
agents[1].persistenceinterval=0
agents[1].port=7514
agents[1].protocol=Raw TCP
agents[1].rawloginterval=-1
agents[1].rawlogmaxsize=-1
agents[1].tcpbindretrytime=5000
agents[1].tcpbuffersize=10240
agents[1].tcpcleanupdelay=-1
agents[1].tcpencoding=UTF8
agents[1].tcpmaxbuffersize=1048576
agents[1].tcpmaxidletime=-1
agents[1].tcpmaxsockets=1000
agents[1].tcppeerclosedchecktimeout=30000
agents[1].tcpsetsocketlinger=false
agents[1].tcpsleeptime=50
agents[1].tcptimeout=5000
agents[1].type=syslog
agents[1].usecustomsubagentlist=false
agents[1].usefilequeue=true


------
Admiral

No, sorry, I can only guess. Maybe tcpsetsocketlinger=true keeps already-closed TCP sockets around instead of giving them back to the OS, and tcpcleanupdelay > -1 waits the given number of milliseconds before dropping a half-open connection. If you are feeling lucky, you can ask support for an explanation and share it with us.

In my case I left these parameters at their defaults and it's still looking good.
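
For background only: SO_LINGER is a standard socket option that controls what close() does when unsent data remains. Whether tcpsetsocketlinger maps to this option inside the connector is an assumption based on the name and is not confirmed anywhere in this thread. A small sketch of the OS-level option, assuming Linux, where the option value is a pair of C ints:

```python
import socket
import struct


def set_linger(sock, enabled, seconds):
    """Set SO_LINGER. When enabled, close() waits up to `seconds` for
    unsent data to go out; with seconds == 0 the connection is reset."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1 if enabled else 0, seconds))


def get_linger(sock):
    """Return (enabled, seconds) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    onoff, seconds = struct.unpack("ii", raw)
    return bool(onoff), seconds


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    set_linger(sock, True, 5)
    print(get_linger(sock))
```

This only illustrates the generic socket behaviour; support would still be the authoritative source for what the connector properties do.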

best regards

Tobias

Commodore

Sir

This thread was very helpful. But setting that property

agents[0].tcppeerclosedchecktimeout=30000

did not help me; I am still getting the "Too many open files" error and a Java IO exception. The connector seems to run fine for 24 hours, but at some point it errors out and I have to restart it.

I am wondering whether I have to set a parameter at the OS level to resolve this issue. I have also reached out to support, but they have not given a promising response yet.

Here is the link to the solution I came across

https://easyengine.io/tutorials/linux/increase-open-files-limit

Can you please suggest a solution to this problem? This is a Raw TCP syslog connector.
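
Before raising OS limits, it may be worth checking what kind of descriptors are piling up, since a higher limit only postpones a leak. A sketch, assuming Linux, where each entry in /proc/<pid>/fd is a symlink whose target looks like socket:[12345] for a socket or an absolute path for a file (the function names are illustrative):

```python
import os
from collections import Counter


def classify_fd_targets(targets):
    """Bucket /proc/<pid>/fd link targets into socket, pipe, file, other."""
    counts = Counter()
    for target in targets:
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        elif target.startswith("/"):
            counts["file"] += 1
        else:
            counts["other"] += 1
    return counts


def fd_targets(pid):
    """Read the link target of every open descriptor of a process."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            pass  # descriptor was closed while the directory was being read
    return targets


# Example: print(classify_fd_targets(fd_targets(12345)))  # hypothetical PID
```

If the socket count keeps growing while the file count stays flat, that would point at connections not being released rather than at a limit that is too low.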

Regards

Vignesh