Cadet 2nd Class

Threads – Jetty311 and Threads.max

Having worked with the original ArcSight engineers prior to HP, the HPE Field Engineers, my peer group, and the HPE Support Desk, I figured I would summarize the “threads” topic, as there seem to be many discussions on it.  There are many ways to talk about this subject, but in the end the same core items come up.  Keep in mind that every system is different, so the right values will differ for each person’s environment.


The WARNING -

WARNING: '17' agent requests REJECTED because the limit of '512' agent threads was exceeded.
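If you want to track how often this shows up, a small script can pull the two numbers out of each warning line. This is only a sketch: the pattern is built from the single sample line quoted above, so other ESM versions may word the message differently, and the name `parse_rejections` is made up for illustration.

```python
import re

# Pattern derived from the one sample warning in this post (an assumption:
# the exact wording may differ between ESM versions).
WARNING_RE = re.compile(
    r"'(?P<rejected>\d+)' agent requests REJECTED because the limit of "
    r"'(?P<limit>\d+)' agent threads was exceeded"
)


def parse_rejections(lines):
    """Return a (rejected, limit) tuple for every matching warning line."""
    results = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            results.append((int(match.group("rejected")), int(match.group("limit"))))
    return results


sample = [
    "WARNING: '17' agent requests REJECTED because the limit of '512' agent threads was exceeded.",
    "an unrelated log line",
]
print(parse_rejections(sample))
```

Feeding it the lines of your manager log gives a quick view of how many requests were rejected and against which limit.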


Some have seen the message above, followed by the ESM showing signs of duress: connectors die off because their requests are rejected due to thread exhaustion.


We need to understand what a thread is, who uses them, and how many we should set up our ESM to use.  We should not just assume more is better, as this actually works against the ESM and the hardware it is running on.


Thread – a connection from one process to another consuming system resources


Where do I find out how many threads are being used or assigned?

  • agents.threads.max=N represents how many threads the ArcSight Daemon can handle. This can be found by going into your Web UI.

https://<ESM URL>:8443/arcsight/web/login.jsp?origPage=%2Farcsight%2Fweb%2Findex.jsp

From there, go to “System Management” > “Threading” (under java.lang) and look for the value presented for “DaemonThreadCount”.


  • servletcontainer.jetty311.threadpool.maximum=N is the limit for the “ActiveThreadCount” (value found in the ESM System Information Dashboard)

Who uses threads and how many do they use?

Smart Connectors (a.k.a. Smart Agents) – with no other parameters put into agent.properties, a connector uses 3 threads by default.  The 3 threads that go from a connector to the ESM are:


  1. Internal events
  2. System logging
  3. The event flow itself (this is the one most people think of as the ONLY thread)

To add more threads to the connector to feed the ESM more data when needed one would add the following parameters to the agent.properties…


http.transport.multithreaded=true

http.transport.threadcount=N

The “N” should be replaced with the maximum number of threads you want connecting back to your ESM to transmit the event flow.  If we put a 10 in place of “N”, we would have no more than 12 threads from this one connector (1 internal events, 1 system logging, 10 data).  This is a ceiling: the number actually seen will fluctuate, because the connector only opens the extra threads when it needs to and the ESM allows it. There are other threading options that relate to the connector itself rather than the ESM, so if you are interested you may want to look into –
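The per-connector arithmetic above can be sketched in a few lines of Python. The function name is hypothetical and the model is just the breakdown given in this post (2 fixed threads plus the event-flow threads); it is not taken from any ArcSight tooling.

```python
# Fixed threads per connector, per the breakdown above:
# one for internal events and one for system logging.
BASE_THREADS = 2


def max_connector_threads(threadcount=None):
    """Upper bound on the threads one connector opens to the ESM.

    threadcount mirrors http.transport.threadcount; None means
    multithreading is not enabled, so a single event-flow thread is used.
    """
    if threadcount is None:
        return BASE_THREADS + 1
    if threadcount < 1:
        raise ValueError("threadcount must be at least 1")
    return BASE_THREADS + threadcount


print(max_connector_threads())    # default connector
print(max_connector_threads(10))  # http.transport.threadcount=10
```

Remember this is the most a connector will use, not what it uses at any given moment.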


$ArcSightAgentHome/current/config/agent/agent.defaults.properties


ESM & Console - The ESM and the Console connections to the ESM also use threads.  The ESM uses a varying number of threads to connect to itself as well as to fork processes.  The Console (the Java-based application usually installed on your desktop) also uses threads, and these count against the jetty311 setting.


Bringing it all Together -

There are two formulas being handed around and neither one is wrong. Having spoken recently with field engineers (FEs) as well as developers working with the HP Support Desk, their rationale is as follows.


**NOTE** - HPE works on the assumption that the customer has no more resources to bear and so both formulas are conservative.


Working with the HPE Support Desk -

The HPE Support Desk uses the following formula when the situation warrants it.


agents.threads.max=Nt  formula: (64+(N1*2))

servletcontainer.jetty311.threadpool.maximum=Nt2  formula: (Nt*2)

N1 is the number of physical connectors (aka Smart Agents) connecting to your ESM


The HP Support Desk rationale is this: in a general sense, it gives a 30% overhead for thread utilization.  Of course, by this time they would have already asked you to gather your logs via KM1270566 before and after the change, as they are trying to see whether the message subsides.  The “warning” above does not in and of itself tell anyone the cause of the problem; adjusting these values arbitrarily just moves the threshold at which you may see this warning again.


Working with the HPE FE’s -

Many of the HP FE’s use the same formula but just tweaked a little.


agents.threads.max= Nt  formula: (N1*3)

servletcontainer.jetty311.threadpool.maximum=Nt2  formula: (Nt*2)

N1 is the number of physical connectors (aka Smart Agents) connecting to your ESM


The FEs use a factor of 3 because customers who need to adjust these values often have custom environment configurations. The multiplier of 3 adds buffer to the thread pool to take into consideration extra threads coming from connectors, as well as extra console connections that would be covered under the jetty311 value.
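To compare the two formulas side by side, here is a minimal Python sketch. The function names are made up for illustration; the arithmetic is just the Support Desk and FE formulas as written above, before any rounding to increments of 64.

```python
def support_desk_agent_threads(connectors):
    """Support Desk formula for agents.threads.max: 64 + (N1 * 2)."""
    return 64 + connectors * 2


def field_engineer_agent_threads(connectors):
    """FE formula for agents.threads.max: N1 * 3."""
    return connectors * 3


def jetty_threadpool_maximum(agent_threads_max):
    """Both groups double Nt for servletcontainer.jetty311.threadpool.maximum."""
    return agent_threads_max * 2


connectors = 40  # N1: number of connectors reporting to the ESM
for label, formula in (
    ("Support Desk", support_desk_agent_threads),
    ("Field Engineers", field_engineer_agent_threads),
):
    nt = formula(connectors)
    print(label, "Nt =", nt, "Nt2 =", jetty_threadpool_maximum(nt))
```

For small connector counts the Support Desk formula comes out higher because of the fixed 64; the FE formula overtakes it once you pass 64 connectors.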


Some additional notes from FE’s look like this –

  • Keep the number in increments of 64 (e.g. 64…128…192…256…320…etc.)
    • Even if your math comes out to an odd number such as 317, you should round up to 320 (the next number evenly divisible by 64)
  • Keep the configured maximum just above the actual thread count, up to 30% higher
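The two rules of thumb above (increments of 64, no more than roughly 30% headroom) are easy to check with a short script. This is a sketch with hypothetical helper names; the 30% figure is read here as an upper bound on how far the configured maximum should sit above the thread count you actually observe.

```python
def round_up_to_increment(value, increment=64):
    """Round value up to the next multiple of increment (ceiling division)."""
    return -(-value // increment) * increment


def within_headroom(configured_max, observed_threads, headroom=0.30):
    """True if configured_max covers observed_threads without exceeding
    it by more than the given headroom fraction."""
    return observed_threads <= configured_max <= observed_threads * (1 + headroom)


print(round_up_to_increment(317))   # the example from the notes above
print(within_headroom(320, 317))    # just above the observed count
print(within_headroom(512, 317))    # far above the observed count
```

If the second check fails, that is the cue to go looking for the real cause rather than raising the limit again.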

**NEVER** increase max threads far above the actual count. Do the math, and if your numbers are not being exceeded and the manager still reports problems, do not increase threads as a panacea.  Bigger problems will result from the manager using more threads than it is mathematically supposed to use. Further, if you do not know why you are getting the “warning”, increasing the threads will only put off the problem. Perform thread dumps, then find and fix the real problem.  If the real cause of the limit being exceeded is in fact that you have a bunch of connectors hooking into you, then this is probably where the buck stops.


Further, if you are looking for threads connecting to your ESM run this command at the CLI –


watch 'netstat -anp |grep 8443 | grep ESTABLISHED |wc -l'
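If you would rather log the count over time than watch it, the same idea can be sketched in Python. This assumes the usual Linux `netstat -anp` column layout (protocol, queues, local address, foreign address, state), and the sample rows below are made up. Note it is stricter than the grep above: it only counts rows whose local port is 8443, whereas the grep matches 8443 anywhere on the line.

```python
def count_established(netstat_output, port=8443):
    """Count ESTABLISHED TCP rows whose local port matches port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and non-TCP rows.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port) and fields[5] == "ESTABLISHED":
            count += 1
    return count


# Made-up sample output for illustration only.
sample = """\
Proto Recv-Q Send-Q Local Address      Foreign Address    State       PID/Program name
tcp        0      0 0.0.0.0:8443       0.0.0.0:*          LISTEN      1234/java
tcp        0      0 10.0.0.5:8443      10.0.0.21:51234    ESTABLISHED 1234/java
tcp        0      0 10.0.0.5:8443      10.0.0.22:40022    ESTABLISHED 1234/java
tcp        0      0 10.0.0.5:8443      10.0.0.23:40100    TIME_WAIT   -
tcp        0      0 10.0.0.5:22        10.0.0.9:8443      ESTABLISHED 900/sshd
"""
print(count_established(sample))
```

On a live system you could feed it the text returned by `subprocess.run(["netstat", "-anp"], capture_output=True, text=True).stdout`.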


What to look for after the change?

Physical Memory (red zone messages) - As you increase the MAX threads you are using up more physical memory. Be sure not to increase the value past the available memory.


Disk I/O - As you increase the MAX threads to accommodate additional threads, you are allowing more events to be sent to the ESM. The increased load means more events being written to the DB faster, which results in more disk I/O. Disk I/O that is too high will result in thread blocking.


JVM Full GC – A small change to threads can have a profound effect on the JVM GC of your ESM.  Take note of the “Full GC” messages before and after the change, so you can see whether the Full GC is running too frequently and/or for too long.  You may need to adjust additional settings in the server.wrapper.conf –

wrapper.java.initmemory=N

wrapper.java.maxmemory=N

…These settings too have no silver bullet and are another topic of discussion.

Sample Case –

Scenario is that your ESM has 40 connectors

HP Support Desk Solution

agents.threads.max=Nt  formula: (64+(N1*2))
                 (64+(40*2))=144 ... to allow breathing room and to not break the rule for 64 increments, rounded up to nearest 64 -> 192
servletcontainer.jetty311.threadpool.maximum=Nt2  formula: (Nt*2)
                 Take rounded up value of above and double it -> 384


HPE FE's Solution

agents.threads.max=Nt  formula: (N1*3)

(40*3)=120 ... to allow breathing room bring it up another 64 and keep to the increments of 64 -> 192

servletcontainer.jetty311.threadpool.maximum=Nt2  formula: (Nt*2)

Take rounded up value of above and double it -> 384

The Customer's Solution

A customer may (and should) know their environment, and so would take into consideration that multi-threading (a.k.a. MT) is enabled on their connectors and add up those values as well, whereas the FEs and HPE Support would not know these intimate details.  If we know that all our connectors have MT set to 8, the formula result changes significantly.


agents.threads.max=Nt  formula: (N1*3)+(MT*N1)
                (40*3)+(40*8)=440 ... the next number divisible by 64 is 448; to allow breathing room bring it up another 64 -> 512
servletcontainer.jetty311.threadpool.maximum=Nt2  formula: (Nt*2)

                Take rounded up value of above and double it -> 1024
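The customer calculation can be walked through step by step in Python. This is a sketch of the arithmetic in this sample case only; the function name is hypothetical, and adding one extra increment of 64 as breathing room is my reading of how 440 ends up at 512.

```python
def customer_agent_threads(connectors, mt_threadcount):
    """Customer formula from the sample case: (N1 * 3) + (MT * N1)."""
    return connectors * 3 + mt_threadcount * connectors


raw = customer_agent_threads(connectors=40, mt_threadcount=8)
next_multiple = -(-raw // 64) * 64    # round up to the next multiple of 64
with_breathing_room = next_multiple + 64  # assumed: one extra increment
jetty_maximum = with_breathing_room * 2   # Nt2 = Nt * 2

print("raw:", raw)
print("next multiple of 64:", next_multiple)
print("agents.threads.max:", with_breathing_room)
print("servletcontainer.jetty311.threadpool.maximum:", jetty_maximum)
```

Swap in your own connector count and MT value to get a baseline for your environment.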


Summary -

In general, the values are basically the same.  Of course, if the points noted above are in effect (multi-threading, console connections, etc.), check to see where your numbers are today, confirm the math as a baseline, and work from there.  We the customers should be in the know about our own environment, and so what HP Support and the HPE FEs provide is the guidance to help us get our ESMs to an optimum state.


In Closing…


Thank you to the original ArcSight Engineers, HPE FE’s, HPE Support Desk as well as my peers in recent threads as we have been speaking passionately about this topic.

5 Replies
Absent Member.

We have encountered this issue many times in our environment - simply increasing the threads didn't solve the issue.  We have had severe bottlenecks from active lists and rules - not just content we wrote but stock content as well.  We have to be very diligent in watching active list sizes and rule fires, otherwise it all comes crashing down.

Cadet 2nd Class

Ben,

Did you find a certain size where the active list becomes a problem?  Same with rules: did you find a certain threshold that was causing the ESM to crash?

Absent Member.

We didn't find that special number.  We have a daily script on the ESM that checks the table sizes for active lists, session lists, and trends.  We try not to let them grow over 750,000 to 1,000,000 entries.  When we had problems, some of the lists had over 15,000,000 entries.

Because trends and session lists are a little more difficult to manage, we avoid session lists unless we have a specific low-volume purpose.  With trends, we also try to be very specific about what we want - instead of trending all failed logins for a day, we just do a count.

With rules, I watch the top firing rules and partial matches.  I was told to keep the partial matches below 100,000.  A lot of our issues came from poorly written rules performing lookups on massive active lists.  Any rules that perform a lookup I try to make very specific - even the stock content.

0 Likes

Session lists can get large, as they currently do not have a TTL. This can be a pain if there is a lot of activity writing to them and they grow into monster lists. ESM 6.9.1 will have a TTL setting for session lists to manage this.

Cadet 1st Class

Hi, can you share the script?
