Optimization required


Hello,

There is a home-made collector processing files (via a Samba mount).
The total number of lines is about 20 000 000 and the final effective
performance is about 990 events (lines) per second (EPS). The customer
complains about this result and expects it to be improved.

I ran some tests and the results are as follows:
- when rejecting all events, the "performance" is about 7900 EPS
- when accepting all lines without any additional processing, we have
2884 EPS

Well, the above numbers show how fast the hardware used there is, but ...

- when the required processing logic was enabled, performance dropped to
911 EPS, but after simple optimization steps we achieved 991 EPS, which
is still not acceptable.

To be more precise, the main body (parse method) calls 4 map lookups (on
maps of 18, 154, 81 and 151 lines). My guess is that these operations are
the key ones in terms of performance.

MY QUESTIONS:
- IS THE "LOOKUP" FUNCTION PROVIDED WITH THE SDK ALREADY OPTIMIZED (BINARY
SEARCH, BST, ETC.), OR SHOULD IT BE REPLACED BY SOMETHING ELSE, E.G. A
CUSTOM IMPLEMENTATION?

- ANY OTHER IDEAS ON HOW TO INCREASE PERFORMANCE? (PLEASE DON'T SUGGEST A
HARDWARE UPGRADE :-)
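
For reference, the throughput figures quoted above can be turned into a rough per-event time budget. The sketch below is plain Python arithmetic, not collector code; the constant names are made up for illustration, and the split of the extra time across the four lookups is only an upper bound that assumes the lookups account for all of the added cost.

```python
# Throughput figures quoted in the post, in events per second (EPS).
EPS_ACCEPT_ALL = 2884      # every line accepted, no additional processing
EPS_FULL_LOGIC = 991       # full parse logic after the first optimizations
TOTAL_LINES = 20_000_000   # approximate number of lines to process
LOOKUPS_PER_EVENT = 4      # map lookups called from the parse method


def ms_per_event(eps: float) -> float:
    """Average wall-clock time spent on one event, in milliseconds."""
    return 1000.0 / eps


def hours_to_process(lines: int, eps: float) -> float:
    """Time needed to work through `lines` events at a steady `eps`."""
    return lines / eps / 3600.0


baseline_ms = ms_per_event(EPS_ACCEPT_ALL)
full_ms = ms_per_event(EPS_FULL_LOGIC)
logic_ms = full_ms - baseline_ms  # time added by the processing logic

# Upper bound only: assumes the four lookups are the entire added cost.
max_ms_per_lookup = logic_ms / LOOKUPS_PER_EVENT

print(f"baseline:        {baseline_ms:.2f} ms/event")
print(f"full logic:      {full_ms:.2f} ms/event")
print(f"added by logic:  {logic_ms:.2f} ms/event")
print(f"per lookup (max): {max_ms_per_lookup:.3f} ms")
print(f"full run at 991 EPS: {hours_to_process(TOTAL_LINES, EPS_FULL_LOGIC):.1f} h")
```

Read this way, the processing logic adds roughly two thirds of a millisecond per event, which is the budget any optimization has to cut into.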


Regards, Dariusz


--
karakan
------------------------------------------------------------------------
karakan's Profile: https://forums.netiq.com/member.php?userid=10087
View this thread: https://forums.netiq.com/showthread.php?t=54462

  • Care to share what the application is?

    Is there a reason the customer expects the collector to go faster? Based
    on their experience? Based on other collectors for other event sources?

    900 EPS is not bad, especially if that is with a full system (vs. testing
    within the development environment) so going much faster will probably
    require doing other things to share the load. With that said, having 900
    security-related events per second from a single source sounds like you're
    doing a huge number of logins, actual changes of data by users, etc.

    The most-common reason I've heard for "make it go faster" is that the
    event source sends tons of non-security data which still must be processed
    to some degree. Disabling the stuff that has no bearing on security is
    the best way to fix that, as it also frees up your licensed EPS for other
    applications, saves disk space for things stored, speeds up searches due
    to fewer things needing to be checked, etc. Since most application
    vendors do not understand the difference between "logging" and "auditing"
    the generated output may be full of logs (operational stuff) which makes
    getting out the needles (security stuff) from the haystack more
    time-consuming.

    If you end up using your own lookup implementation I'd be interested to
    hear how that performs. If you just commented out the map lookups you may
    have an idea of how much that is impacting things. I have not typically
    noticed huge performance hits using maps, but I may be using them very
    differently from you.

    --
    Good luck.

    If you find this post helpful and are logged into the web interface,
    show your appreciation and click on the star below...
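
The suggestion above, running the parser once with the map lookups and once without them, can be sketched as a small timing harness. This is an illustration in Python rather than collector code: `parse`, `measure_eps`, the `MAPS` tables and the sample line are all hypothetical stand-ins, and only the map sizes (18, 154, 81, 151) come from the thread. Absolute numbers from such a harness say nothing about a real collector; the point is the relative difference between the two runs.

```python
import time

# Hypothetical stand-ins for the four lookup maps (sizes taken from the thread).
MAPS = [{f"key{i}": f"value{i}" for i in range(n)} for n in (18, 154, 81, 151)]


def parse(line: str, use_lookups: bool = True) -> dict:
    """Hypothetical parse step: split the line, then optionally do four lookups."""
    fields = line.split("\t")
    event = {"raw_fields": fields}
    if use_lookups:
        for idx, table in enumerate(MAPS):
            # Field idx is looked up in map idx; unknown keys get a default.
            event[f"lookup{idx}"] = table.get(fields[idx], "unknown")
    return event


def measure_eps(lines, use_lookups: bool) -> float:
    """Return events per second for one pass over `lines`."""
    start = time.perf_counter()
    for line in lines:
        parse(line, use_lookups)
    elapsed = time.perf_counter() - start
    return len(lines) / elapsed if elapsed > 0 else float("inf")


# Made-up sample input: four key fields followed by a request field.
sample = ["key1\tkey100\tkey80\tkey150\tGET /index.html"] * 50_000

eps_with = measure_eps(sample, use_lookups=True)
eps_without = measure_eps(sample, use_lookups=False)
print(f"with lookups:    {eps_with:,.0f} EPS")
print(f"without lookups: {eps_without:,.0f} EPS")
```

If the two figures are close, the lookups are not the bottleneck and the time is going somewhere else in the parse method.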

  • Thanks, AB.
    1) It's the TMG web proxy,
    2) I agree that most of these tons of events might be irrelevant,
    3) Sure, I'll comment lookups out and will let you know (but not
    tonight),
    4) Most important in your response for me was "900 EPS is not bad". This
    gives me a strong argument before tomorrow's meeting.

    Many, many thanks and regards, AB.


    --
    karakan


  • karakan;261627 Wrote:
    > Thanks, AB.
    > 1) It's the TMG web proxy,
    > 2) I agree that most of these tons of events might be irrelevant,
    > 3) Sure, I'll comment lookups out and will let you know (but not
    > tonight),
    > 4) Most important in your response for me was "900 EPS is not bad". This
    > gives me a strong argument before tomorrow's meeting.
    >
    > Many, many thanks and regards, AB.


    1) Map lookups should not be costly - I suspect something else. Are
    you, for example, using safesplit()?
    2) We disallow collectors from shipping if they can't do at least 2000
    EPS, so I would not agree that 900 EPS is good, but it's far from
    'awful' at least.

    Is there any source that you'd be willing to share with us, to see what
    the impacts might be?


    --
    brandon.langley
    ------------------------------------------------------------------------
    brandon.langley's Profile: https://forums.netiq.com/member.php?userid=350
    View this thread: https://forums.netiq.com/showthread.php?t=54462
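
To put the question of lookup strategy in perspective for maps of this size, the sketch below compares a linear scan, a binary search and a hash lookup on tables with 18, 154, 81 and 151 entries. It is written in Python purely for illustration and says nothing about how the SDK's own lookup is implemented, which the thread does not state; the key format and all function names are invented for the example.

```python
import bisect
import timeit

SIZES = (18, 154, 81, 151)  # map sizes quoted in the thread


def build(n):
    """Return n (key, value) pairs; zero-padded keys keep them sorted."""
    return [(f"key{i:04d}", f"value{i}") for i in range(n)]


def linear_lookup(pairs, key):
    """Scan every pair until the key matches: O(n)."""
    for k, v in pairs:
        if k == key:
            return v
    return None


def binary_lookup(keys, values, key):
    """Binary search over a sorted key list: O(log n)."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None


def hash_lookup(table, key):
    """Hash-table lookup: O(1) on average."""
    return table.get(key)


for n in SIZES:
    pairs = build(n)
    keys = [k for k, _ in pairs]
    values = [v for _, v in pairs]
    table = dict(pairs)
    probe = keys[-1]  # last key: the worst case for the linear scan

    t_lin = timeit.timeit(lambda: linear_lookup(pairs, probe), number=20_000)
    t_bin = timeit.timeit(lambda: binary_lookup(keys, values, probe), number=20_000)
    t_hash = timeit.timeit(lambda: hash_lookup(table, probe), number=20_000)
    print(f"n={n:3d}  linear={t_lin:.4f}s  binary={t_bin:.4f}s  hash={t_hash:.4f}s")
```

With at most a couple of hundred entries per map, even the linear scan touches very little data per call, which is consistent with the suspicion above that something other than the lookups is costing the time.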

  • Brandon,

    In what environment (SDK vs. Sentinel, fast hardware vs. slow, dedicated
    CM vs. not) are you getting 2000 EPS through collectors? Maybe I still
    have old numbers from many years ago, but I did not think collectors were
    expected to go in the many-thousands of EPS. You're much closer to the
    source of good numbers than I, so please share, as being wrong is good news
    for me.

    --
    Good luck.
