Anonymous_User Absent Member.

Couldn't Execute Action when Correlation Fired


Hi,

Recently, my correlation rules have been firing on events but failing to
create correlation events or execute the corresponding actions. I can see
the rules firing on the dashboard, and the "Last fire at" time updates
every time I refresh the dashboard. I searched on Google and found that
the behavior is similar to bug 812522 on Sentinel 7.0.3, while mine is 7.2.

Then I looked at server0.0.log and found a lot of entries like this:


Wed Aug 20 10:33:40 HKT 2014|SEVERE|Thread-221|Unknown.unknown
; Exception Task esecurity.ccs.comp.correlation.EngineResultListenerImpl$RuleAction$1@1f07b957 rejected from java.util.concurrent.ThreadPoolExecutor@1b86b734[Running, pool size = 8, active threads = 8, queued tasks = 10000, completed tasks = 3296728]; java.util.concurrent.RejectedExecutionException;
Wed Aug 20 10:33:40 HKT 2014|SEVERE|Thread-221|Unknown.unknown
java.util.concurrent.RejectedExecutionException: Task esecurity.ccs.comp.correlation.EngineResultListenerImpl$RuleAction$1@1f07b957 rejected from java.util.concurrent.ThreadPoolExecutor@1b86b734[Running, pool size = 8, active threads = 8, queued tasks = 10000, completed tasks = 3296728]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
    at esecurity.ccs.comp.correlation.EngineResultListenerImpl$3.run(EngineResultListenerImpl.java:212)

OK, my questions are:

1. Is bug 812522 fixed in 7.2 / 7.3?
2. The log shows queued tasks = 10000. It seems this phenomenon (rules
fire but nothing executes) occurs when there are a lot of queued tasks
(say 10000). Is there any relationship between these two things? If yes,
is it possible to clear the queued tasks from the web interface, the
control centre, or the command line interface?
3. Is it possible to view the queued tasks?
4. Other than restarting the whole Sentinel service, is there any other
way to resolve this issue? (Mine is running in HA, so I don't want to
restart the server and fail over to the backup machine.)

Thanks and regards, and sorry for the big question.
Jack


--
jackcheng
------------------------------------------------------------------------
jackcheng's Profile: https://forums.netiq.com/member.php?userid=1387
View this thread: https://forums.netiq.com/showthread.php?t=51561


Re: Couldn't Execute Action when Correlation Fired

On 08/19/2014 08:53 PM, jackcheng wrote:
> OK, my questions are:
>
> 1. Is the bug 812522 fixed on 7.2 / 7.3?


Yes, unless something re-introduced it post-fix.

> 2. In the log, queued tasks = 10000, it seems this phenomenon (fire but
> no execution) occur when there's a lot of queued task (say 10000). Is
> there any relationship between these two things? If yes, is it possible
> to clean up the queued task either on the web interface, control centre
> or command line interface?


If you are generating events faster than the system can kick off the
actions, you need to fix something. I would not expect good things if you
are in this state. I do not recall hitting this limit, but correlation
isn't designed to fire thousands of times per second for hours on end, so
if you see that happening then it's probably time to dial things back.
Since actions can be anything, including external processes which may have
a fair bit of overhead getting loaded as a new process every time (vs. a
new thread within Sentinel), those will not scale nearly as well as you
may expect (if you expect them to run as fast as something that's internal).
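For what it's worth, the log line itself points at standard java.util.concurrent behavior: a ThreadPoolExecutor with a fixed number of workers and a bounded queue throws RejectedExecutionException (via AbortPolicy) once every worker is busy and the queue is full. The sketch below reproduces that with plain JDK classes. The pool size of 8 and the queue capacity of 10000 come from the log above; the class name, the method name, and the choice of ArrayBlockingQueue are assumptions for illustration, and this is not Sentinel's actual code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionSketch {

    /**
     * Submits tasks that never finish (until released) to a fixed-size pool
     * with a bounded queue, and returns how many tasks were accepted before
     * the first rejection.
     */
    public static int submitUntilRejected(int poolSize, int queueCapacity) {
        // Hypothetical stand-in for slow actions: every task waits on this latch.
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),   // assumed queue type
                new ThreadPoolExecutor.AbortPolicy());     // policy seen in the stack trace
        int accepted = 0;
        try {
            while (true) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // The executor's toString() yields a status string in the same
            // format as the one in server0.0.log.
            System.out.println("Rejected after " + accepted + " tasks: " + pool);
        } finally {
            release.countDown();  // let the blocked tasks finish
            pool.shutdown();      // drain the queue, then stop the workers
        }
        return accepted;
    }

    public static void main(String[] args) {
        submitUntilRejected(8, 10000);
    }
}
```

With workers that never complete, the pool accepts exactly poolSize + queueCapacity tasks and rejects the next one. If the real workers are merely slow, the queue drains once the rules stop firing; if they are stuck, it never drains, which might explain why disabling rules alone does not help.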

> 3. Is it possible to view the queued tasks?


Possible... perhaps if you get a memory dump from the box. I do not know
of anywhere this would be cached to disk but most of the time I look under
/var/opt/novell/sentinel/data/events for things like that.
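Short of a memory dump, the rejection messages at least expose the queue depth, since the bracketed text is what ThreadPoolExecutor.toString() prints. Here is a minimal sketch that pulls those counters out of a log excerpt so they can be tracked over time. It does not list the queued tasks themselves, the class and method names are hypothetical, and it assumes the status text looks like the excerpt quoted in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PoolStatusParser {

    // Matches the status text produced by ThreadPoolExecutor.toString().
    // \s* also matches newlines, so line-wrapped log text still parses.
    private static final Pattern STATUS = Pattern.compile(
            "pool size\\s*=\\s*(\\d+),\\s*active threads\\s*=\\s*(\\d+),"
            + "\\s*queued tasks\\s*=\\s*(\\d+),\\s*completed tasks\\s*=\\s*(\\d+)");

    /**
     * Returns {poolSize, activeThreads, queuedTasks, completedTasks} from the
     * first status string found in the text, or null if there is none.
     */
    public static long[] parse(String logText) {
        Matcher m = STATUS.matcher(logText);
        if (!m.find()) {
            return null;
        }
        long[] stats = new long[4];
        for (int i = 0; i < 4; i++) {
            stats[i] = Long.parseLong(m.group(i + 1));
        }
        return stats;
    }

    public static void main(String[] args) {
        // Excerpt copied from the server0.0.log lines quoted in this thread.
        String sample = "rejected from java.util.concurrent.ThreadPoolExecutor@1b86b734[Running,\n"
                + "pool size = 8, active threads = 8, queued tasks = 10000, completed tasks\n"
                + "= 3296728]";
        long[] s = parse(sample);
        System.out.println("pool=" + s[0] + " active=" + s[1]
                + " queued=" + s[2] + " completed=" + s[3]);
    }
}
```

If the completed tasks counter stops increasing between successive rejections while active threads stays at 8, that would suggest the workers are blocked rather than just busy.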

> 4. Other than restarting the whole Sentinel service, is there any other
> ways to resolve this issue? (Mine are running HA, therefore I don't want
> to restart the server and flap to the backup machine.)


Pause the correlation rule so that things can catch up, perhaps? Just a
wild guess, though.

--
Good luck.


Re: Couldn't Execute Action when Correlation Fired


ab,

Thanks for your quick response!

>>> 3. Is it possible to view the queued tasks?
>>
>> Possible... perhaps if you get a memory dump from the box. I do not know
>> of anywhere this would be cached to disk but most of the time I look under
>> /var/opt/novell/sentinel/data/events for things like that.

There's nothing inside /var/opt/novell/sentinel/data/events. Could it be
stored somewhere else?
/var/opt/novell/sentinel/data/events # du -h
4.0K ./rawDataDiskBuffer
4.0K ./triggerErrorBuffer
4.0K ./insertDiskBuffer
16K .

>>> 4. Other than restarting the whole Sentinel service, is there any other
>>> ways to resolve this issue? (Mine are running HA, therefore I don't want
>>> to restart the server and flap to the backup machine.)
>>
>> Pause the correlation rule so that things can catch up, perhaps? Just a
>> wild guess, though.

I tried disabling all correlation rules and re-enabling them, with no
effect. Is a restart required?

In addition to the above questions, there is one more:
Does the queue sit inside the Sentinel server or inside its correlation
engine? If it sits inside the correlation engine, could I use a separate
correlation engine as a workaround for this problem?

Regards,
Jack



Re: Couldn't Execute Action when Correlation Fired

On 08/20/2014 12:35 AM, jackcheng wrote:
> I tried to disable all correlation rules and re-enable them. No
> effect... Is restart a must?
>
> In additional to the above questions, there's 1 more questions:
> Is the "queue" "sit" inside the Sentinel server or it's correlation
> engine? I'm thinking if it sits inside correlation engine, can I use a
> separate correlation engine as a work-around for this problem?


I'm not sure where the queue is, to be honest. If you do have 10,000 items
in it, what will that mean if it gets processed? 10,000 new incidents
(fatal to most organizations from the standpoint of processing all of
those), 10,000 e-mails (not fun for the recipients), or just 10,000
commands run that do not impact employees/staff too much?

I'll see if I can find a system where I can tinker in order to identify
what happens where with regard to the actions, but I may need to build one
so we'll see.

--
Good luck.
