Established Member.. raymond.doty

ESM6 Performance Tuning?

Hello!!!

Wondering if anyone has started venturing into this arena.  It seems ESM6, while faster in general, has its own set of potential performance bottlenecks...  Has anyone gone through some performance tuning in relation to MySQL, RHEL, and/or the new CORR-Engine for high-throughput single-tier systems (seeking 5k+ EPS, preferably 20k+ EPS to ESM)?

Not looking at connectors and/or loggers at this time.

*****NOTE: THESE SETTINGS ARE NOT SUPPORTED BY ARCSIGHT SUPPORT, please do your own research!!!  I hold no liability for these changes being implemented into an environment and breaking it 😮

I went through Joe's guide (very helpful) and quite a few others, and came up with what seems to have helped a little.  There are some items I wasn't sure even applied, given that we are not using Oracle (such as huge pages)...

Our ESM system has 48 cores (2 GHz AMD processors), 256 GB RAM, and an attached HP SAN.  We are currently running about 5k average EPS at the ESM and are looking to raise that to nearly 30k.

Will gladly admit that I am neither an SQL nor RHEL admin, so please take it easy on me if some of these things make you scratch your head, just tell me what was done wrong.  These configs are running in our environment and it seems to have stabilized our memory usage, but increased the load requirements on our SAN.

Important Edit / Update 4/11/2014:

I have been notified that the innodb_file_per_table = 1 parameter can cause, and has caused, serious issues in some installations.  I have struck it through in this document.  I have not personally run into this issue, so I don't know the scenario or cause, sorry.

Edit / Update 9/10/2013:

We have made a number of configuration changes to our system, so I figured this post was in need of an update.  We also have two systems now; these configurations have been tested on the system above (with a SAN) and on another system with 4x 10-core Intel Xeon processors, 512 GB RAM, and 8.8 TB of usable Fusion-io storage (6x 3 TB ioDrive2 cards in RAID 10).

In conclusion, we have not really found any way to get much more performance out of the software's query engine (CPU, memory, and I/O all seem to be mostly idle).  We have only found a few parameters that truly have an impact on system stability.  The MySQL performance tuning below seems to be 'best practice' but doesn't seem to change performance in our high-EPS environment (running at about 20k EPS).

Also: Big thank you to Joe (jbur) for all his hard work, he has even put out a config guide on his blog, which covers a number of the items listed below as well:

https://protect724.arcsight.com/people/jbur/blog/2013/08/02/tuning-esm-6

A MySQL tuning script was also found (thanks, anonymous) that is intended to help ensure your system is configured properly.  Based on our research, its recommendations seem to be in line with the configurations we had found previously and with what is documented in the MySQL reference manuals.

https://github.com/major/MySQLTuner-perl

Short outline of what seem to be the most impactful configurations:

  1. (OS - /etc/sysctl.conf ) vm.swappiness=0
    1. This is huge. Without this change, our system with 512 GB of RAM would frequently swap; once swapping begins, we take massive performance hits all over (console usage, insertion rates, queries, etc.).
  2. (Mysql) sort_temp_limit = #G
    1. This has a direct impact on your ability to run reports and queries. It needs to be sized by testing queries and watching the size of the temp files in /opt/arcsight/logger/data/MySQL/ (I believe they all start with # and end with myi or myd).  In our environment we sized this to 250G for ~25k EPS with extremely large queries.
  3. (Mysql) innodb_buffer_pool_size = #G
    1. This is how much memory you allocate directly to MySQL/InnoDB.  We have this set to 64 GB, although I haven't seen a huge impact on performance when varying it from 16 GB to 256 GB.
  4. (Mysql) innodb_flush_method=O_DIRECT
    1. This seems to be a solid best practice based on research
  5. (Mysql) innodb_flush_log_at_trx_commit=1
    1. This seems to be a solid best practice based on research
  6. (Mysql) innodb_thread_concurrency = 8
    1. This seems to be a solid best practice based on research
  7. (OS command) tuned-adm profile enterprise-storage
    1. This is pertinent if you have some kind of enterprise storage, SAN, SSD, etc
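
The sizing step in item 2 (watching the temp files while a test query runs) could be scripted rather than eyeballed. Below is a minimal Python sketch, assuming the temp files really do start with # and end with myi or myd as guessed above. The function name temp_sort_bytes is made up for illustration and is not part of ArcSight or MySQL.

```python
import os


def temp_sort_bytes(data_dir):
    """Sum the sizes of temp table files directly inside data_dir.

    Assumption (taken from the post, unverified): temp files start with '#'
    and end with 'myi' or 'myd'; the suffix is matched case-insensitively.
    """
    total = 0
    with os.scandir(data_dir) as entries:
        for entry in entries:
            name = entry.name.lower()
            if entry.is_file() and name.startswith("#") and name.endswith(("myi", "myd")):
                total += entry.stat().st_size
    return total
```

Calling temp_sort_bytes("/opt/arcsight/logger/data/MySQL/") every few seconds while a large query runs, and keeping the maximum, would give a rough lower bound for sort_temp_limit in that environment.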

The full / final configuration we are running with is as follows (the values are based on the references and scripts listed above)...


MySQL Tuning (in my.cnf):

1. thread_cache_size = 5120

a. #see http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html#statvar_Threads_created

2. table_cache = 4096

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

c. #See http://www.mysqlperformanceblog.com/2009/11/26/more-on-table_cache/

3. key_buffer = 64M

a. #See http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_key_buffer_size

5. read_rnd_buffer_size = 256M

a. #See http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_read_rnd_buffer_size

6. innodb_buffer_pool_size = 64G

a. #See http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size

b. #See http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/

7. innodb_flush_method=O_DIRECT

8. innodb_flush_log_at_trx_commit=1

9. innodb_additional_mem_pool_size = 8M

a. #See http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_additional_mem_pool_size

10. innodb_thread_concurrency = 8

a. #See http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-cont-or-triple-your-throughput

b. #See http://dimitrik.free.fr/blog/archives/2010/11/mysql-performance-55-and-innodb-thread-concurrency.html

c. #See *not accurate* http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_thread_concurrency

11. sort_temp_limit = 256G

a. #Note: This is an undocumented ARCSIGHT SPECIFIC parameter (stored but not used directly by MySQL), which I believe relates to the temporary tables used when you're running a trend/query.  If you're getting the error "Encountered persistence problem while fetching data: Unable to execute query:Temporary sort space limit exceeded", this may be a knob you can tweak.  A word of caution, however: the right value is VERY specific to your environment (EPS and trend size).

12. slow_query_log = 1

a. #Note: This enables the slow query log, which is generally useful when troubleshooting query performance and finding the slowest queries. It is also referenced/used by the mysqltuner perl script.

b. #See http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html

c. #See http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_slow_query_log

d. #Note: Modified based on script output

e. #See https://github.com/major/MySQLTuner-perl

13. slow_query_log_file = /opt/arcsight/logger/data/mysql/mysql_server-slow.log

a. #Note: This is the location of the slow query log file

b. #See http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html

c. #See http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_slow_query_log_file

d. #Note: Modified based on script output

e. #See https://github.com/major/MySQLTuner-perl

14. tmp_table_size = 64M

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

15. max_heap_table_size = 64M

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

16. max_connections = 100

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

17. table_definition_cache = 1024

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

18. table_open_cache = 4096

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

19. join_buffer_size = 64M

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

20. open_files_limit = 65535

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

21. read_buffer_size = 64M

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl

22. innodb_file_per_table=1 (struck through; see the 4/11/2014 update above)

a. #Note: Modified based on script output

b. #See https://github.com/major/MySQLTuner-perl
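
With this many settings it is easy for a my.cnf to drift from the intended values. The following is a rough Python sketch of an audit helper, not an official tool. The parser is deliberately simple: it only reads the [mysqld] section, skips !include lines, and does not handle trailing comments or quoting. The function names are illustrative, and the EXPECTED values are the three "best practice" entries from the short outline near the top of the post.

```python
def parse_mysqld_section(cnf_text):
    """Return {option: value} for the [mysqld] section of my.cnf-style text.

    '-' and '_' in option names are treated as equivalent, as MySQL does.
    """
    settings = {}
    in_mysqld = False
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "!")):
            continue  # blank line, comment, or include directive
        if line.startswith("["):
            in_mysqld = line.lower() == "[mysqld]"
            continue
        if in_mysqld:
            key, _, value = line.partition("=")
            settings[key.strip().replace("-", "_")] = value.strip()
    return settings


def audit(cnf_text, expected):
    """Return {option: (expected, actual)} for every option that differs."""
    actual = parse_mysqld_section(cnf_text)
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Values taken from the short outline earlier in this post.
EXPECTED = {
    "innodb_flush_method": "O_DIRECT",
    "innodb_flush_log_at_trx_commit": "1",
    "innodb_thread_concurrency": "8",
}
```

Running audit(open("/etc/my.cnf").read(), EXPECTED) would list any option that is missing or set differently; the path /etc/my.cnf is an assumption and may differ on an ESM install.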

RHEL Tuning:

1. References

a. https://protect724.arcsight.com/docs/DOC-1198

b. https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/tuned-adm.html

c. http://support.sas.com/resources/papers/proceedings11/72480_RHEL6_Tuning_Tips.pdf

d. https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/Performance_Tuning_Guide/Red_Hat_Enterprise_Linux-6-Performance_Tuning_Guide-en-US.pdf

e. https://www.google.com/search?q=tuned+red+hat+utility&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

f. http://www.redhat.com/summit/2011/presentations/summit/decoding_the_code/friday/rao_f_1045_Tuning_RHEL_for_databases.pdf

2. Tuned installed and profile changed - RHEL6 has tuning profiles, enterprise-storage looks most appropriate for large scale ArcSight implementations

a. tuned-adm profile enterprise-storage

3. BlockDev changes (changes default read-ahead for drives associated with our SAN)

a. blockdev --setra 8192 /dev/dm-#

b. ... etc

4. ##/boot/grub/grub.conf - deadline / noop looked to be recommended for connected storage solutions that have their own optimization

a. #Adding elevator=deadline to the end of the kernel line

5. ##/etc/sysctl.conf - changes regarding network read/write buffers and hugepages (unsure on hugepage applicability)

a. net.core.rmem_default=262144

b. net.core.rmem_max=8388608

c. net.core.wmem_default=262144

d. net.core.wmem_max=8388608

e. vm.nr_hugepages=8192

f. vm.hugetlb_shm_group=502

g. vm.swappiness=0

6. ##/etc/crontab - this entry was created specifically to help with the caching and eventual swapping of MySQL. Although the swappiness setting had the largest impact on this, I believe this has also proven helpful.

a. 50 * * * * root sync; echo 3 > /proc/sys/vm/drop_caches
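
Settings in /etc/sysctl.conf only take effect once loaded, so it can be worth confirming the live values. Here is a small Python sketch that compares the values from item 5 against /proc/sys. It assumes the usual mapping of a sysctl key to a file path (dots become directory separators), and the helper names are illustrative only.

```python
import os


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as 'vm.swappiness' to its file under root."""
    return os.path.join(root, *key.split("."))


def check_sysctls(expected, root="/proc/sys"):
    """Return {key: (expected, live)} for each key whose live value differs.

    live is None when the file cannot be read (unknown key, no permission).
    """
    mismatches = {}
    for key, want in expected.items():
        try:
            with open(sysctl_path(key, root)) as handle:
                live = handle.read().strip()
        except OSError:
            live = None
        if live != str(want):
            mismatches[key] = (str(want), live)
    return mismatches


# Values from item 5 above; the hugepage settings are left out because
# their applicability is in doubt.
EXPECTED = {
    "net.core.rmem_default": 262144,
    "net.core.rmem_max": 8388608,
    "net.core.wmem_default": 262144,
    "net.core.wmem_max": 8388608,
    "vm.swappiness": 0,
}
```

An empty result from check_sysctls(EXPECTED) means the running kernel matches the file; anything else shows the expected and live value side by side.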

1. server.properties changes

a. servletcontainer.jetty311.threadpool.maximum=512

b. agents.threads.max=256

c. log.channel.file.property.maxsize=100MB

2. Server.wrapper.properties changes (based on Java full garbage collections)

a. wrapper.java.initmemory=32768

b. wrapper.java.maxmemory=32768

Original tuning parameters below (retained for historical purposes):

RHEL Tuning:

1. References

a. https://protect724.arcsight.com/docs/DOC-1198

b. https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/tuned-adm.html

c. http://support.sas.com/resources/papers/proceedings11/72480_RHEL6_Tuning_Tips.pdf

d. https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/Performance_Tuning_Guide/Red_Hat_Enterprise_Linux-6-Performance_Tuning_Guide-en-US.pdf

e. https://www.google.com/search?q=tuned+red+hat+utility&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

f. http://www.redhat.com/summit/2011/presentations/summit/decoding_the_code/friday/rao_f_1045_Tuning_RHEL_for_databases.pdf

2. Tuned installed and profile changed - RHEL6 has tuning profiles, enterprise-storage looks most appropriate for large scale ArcSight implementations

a. tuned-adm profile enterprise-storage

3. BlockDev changes (changes default read-ahead for drives associated with our SAN)

a. blockdev --setra 8192 /dev/dm-#

b. ... etc

4. ##/boot/grub/grub.conf - deadline / noop looked to be recommended for connected storage solutions that have their own optimization

a. #Adding elevator=deadline to the end of the kernel line

5. ##/etc/sysctl.conf - changes regarding network read/write buffers and hugepages (unsure on hugepage applicability)

a. net.core.rmem_default=262144

b. net.core.rmem_max=8388608

c. net.core.wmem_default=262144

d. net.core.wmem_max=8388608

e. vm.nr_hugepages=8192

f. vm.hugetlb_shm_group=502

1. server.properties changes

a. servletcontainer.jetty311.threadpool.maximum=256

b. agents.threads.max=128

c. log.channel.file.property.maxsize=100MB

2. Server.wrapper.properties changes (based on Java full garbage collections)

a. wrapper.java.initmemory=16384

b. wrapper.java.maxmemory=16384

Established Member.. raymond.doty

Re: ESM6 Performance Tuning?

So we've been running with these configs in our environment for about a week or two now (the MySQL and RHEL changes were made about a week apart).  They seem to be working out fairly well and we haven't had to make many tweaks.  Our memory management is MUCH better; I really think the primary reason is sort_buffer_size = 2M versus 1000M.  From what I have read, this is the amount of RAM allocated for every sort operation, and at 1 GB per sort it can become extremely cumbersome and run your server out of memory very fast.
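
To put rough numbers on the sort buffer point, here is a short Python sketch of the worst case. It assumes every connection runs one sort at the same time and uses max_connections = 100 from the my.cnf listing in the first post. Real usage depends on how many sessions actually sort and how much of the buffer each one needs, so treat this as an illustrative upper bound and not a measurement.

```python
MIB = 1024 * 1024


def worst_case_sort_memory(sort_buffer_bytes, concurrent_sorts):
    """Upper bound on RAM held by sort buffers if every session sorts at once."""
    return sort_buffer_bytes * concurrent_sorts


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / (1024 ** 3)


# max_connections = 100 is the value from the my.cnf listing in the first post.
for label, size in (("1000M", 1000 * MIB), ("2M", 2 * MIB)):
    total = worst_case_sort_memory(size, 100)
    print(f"sort_buffer_size={label}: up to {to_gib(total):.2f} GiB across 100 sessions")
```

Under these assumptions the 1000M setting could in principle claim close to 100 GiB, while 2M tops out at about 200 MiB, which is consistent with the memory relief described above.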

I can't say that it has particularly sped up the response times in the application (channel load times, query / trend completion times, etc), but I believe this may be more of a SAN bottleneck, we are diagnosing at the moment.

Has anyone else done any 6.0c tuning?

chenselein1 Absent Member.

Re: ESM6 Performance Tuning?

Great post, thanks for gathering and sharing this valuable information.

Answer Honored Contributor.

Re: ESM6 Performance Tuning?

Yep, great post!

We just went live last Tuesday with 6.0c. So far we are running at about 3-4K EPS, but we still have some loggers caching quite often and the insert time is higher than I'd like... I suspect there might be a bit too much load during the day when everyone is using the console... As of now, I don't think it would be able to sustain the 10-15K EPS we plan to have soon...

I'll check out the tweaks and maybe try a couple of them, although some are already done.

Established Member.. raymond.doty

Re: ESM6 Performance Tuning?

Thanks for the response, I'd definitely appreciate any feedback.

I have finally been able to gather all the statistics into a single chart, so this can be assessed more programmatically than by constantly running iostat and top and watching the connector screen for EPS...

Here is our current environment, including when the changes were made.

It seems that we definitely shifted the bottleneck from memory to SAN utilization with the MySQL tuning.  I can't say there seems to have been much change with the RHEL tuning steps.

Anyone else's experience with this would be greatly appreciated.  Again, I am neither a MySQL nor a Unix person, so I would love to not trudge through this alone.

Notes regarding the graph:

a) The 256GB memory note is regarding our upgrade, we were originally running 64GB (it allowed us to re-enable a series of trends which were disabled previously)

b) The ESM Temp Space change relates to the variable sort_temp_limit which is an undocumented ArcSight specific setting, stored in MySQL.  Support can help you change this if you have trends which aren't completing.  But basically it seems like a partial bandaid against a broken architecture for queries/trends (in very specific scenarios)

c) MySQL Sort Buffer relates to changing from 256k to 2M; we had some queries against large fields (filename, device event category, etc.) that would fail immediately.  As noted previously, this parameter has a dramatic impact on the server's memory utilization; the default value from ArcSight is 1GB.  I am going to keep an eye on it, but our memory utilization definitely went up after this change.

d) EPS numbers are in grey and values are on right side (0-~8000 EPS)

     d1) EPS_Received represents EPS into the connector infrastructure

     d2) EPS_Manager represents EPS into the ESM manager

e) Utilization numbers are colorful and values are on left (0-100%)

f) All are averages over the hour

[Attached image: 2013Feb_PerfMetrics_Changes.jpg]

w531t41
New Member.

Re: ESM6 Performance Tuning?

Ray,

I was glad to see your post a few days ago, as i'm currently battling this issue myself. What did you use to make the chart?

Established Member.. raymond.doty

Re: ESM6 Performance Tuning?

The event volumes (EPS_Connector / EPS_Manager) were gathered using ArcSight monitor events (specifically monitor:146 and monitor:147).

The disk, memory, and CPU numbers were gathered using sar (sar regularly stores the data; sadf is the tool that extracts it and exports it into a usable format).

It was all munged into two files (one for disk/memory/CPU and the other for the EPS) and then averaged out per day. The two files were then combined, and Excel was used to create the pivot table you see...  I manually inserted the lines and text boxes where we made the changes.

Honestly, it took a lot longer than I'd prefer, but I'm glad I have the capability to see all aspects (I even have insert/retrieval times for the logger, but those numbers are scary).  I am looking at taking the sadf numbers and pulling them into ArcSight using a flex CEF connector, so we can have all the metrics in one place.

I took all the sar files we have, put them in a temp directory, and then ran the following commands to get the general info I needed:

# CPU
for file in /tmp/20130214_sar/sa_*; do sadf -d -t "$file" -- -u >> /tmp/20130214_sar/20130214_sadf_CPU.csv; done

# Memory
for file in /tmp/20130214_sar/sa_*; do sadf -d -t "$file" -- -r >> /tmp/20130214_sar/20130214_sadf_Memory.csv; done

# Disk
for file in /tmp/20130214_sar/sa_*; do sadf -d -t "$file" -- -d >> /tmp/20130214_sar/20130214_sadf_Disk.csv; done
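
The averaging step that was done in Excel could also be scripted. Below is a minimal Python sketch that averages one column per hour from the CSV files produced above. It assumes the usual sadf -d layout: semicolon-separated fields, a header line starting with #, and a timestamp column such as 2013-02-14 00:10:01. Column names like %iowait depend on the sysstat version, and the function name is made up for illustration.

```python
import csv
from collections import defaultdict


def hourly_averages(lines, column):
    """Average one numeric column per hour from `sadf -d` style lines.

    Repeated header lines (one per appended sar file) are tolerated.
    Rows that do not match the current header are skipped.
    """
    header = None
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.reader(lines, delimiter=";"):
        if not row:
            continue
        if row[0].startswith("#"):
            header = [row[0].lstrip("# ")] + row[1:]
            continue
        if header is None or len(row) != len(header):
            continue  # data before any header, or a malformed row
        record = dict(zip(header, row))
        hour = record["timestamp"][:13]  # 'YYYY-MM-DD HH'
        sums[hour] += float(record[column])
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sums}
```

For example, passing an open handle on 20130214_sadf_CPU.csv with column "%iowait" would return a dict keyed by hour. For the disk file, which has one row per device per sample, this averages across all devices, so filtering by device first may be preferable.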

vdor Absent Member.

Re: ESM6 Performance Tuning?

This is great information. Can't wait to actually have a test instance of ESM 6 so I can try it out.

jbur Absent Member.

Re: ESM6 Performance Tuning?

Ray,

Thank you for posting your research on ESM 6.  I'm sure there will be more of us to trudge through it with you over the coming months as ESM 5 nears end of life.

Hopefully HP adopts the performance tweaks (with QA testing) and fixes the issues.

-Joe

Established Member.. raymond.doty

Re: ESM6 Performance Tuning?

Thanks Joe - I started with your guide and tried to adapt from there.  It has proven unbelievably useful over the years, so cheers to you for helping start this whole movement

jbur Absent Member.

Re: ESM6 Performance Tuning?

Thank you for the kind words. 

-J

jbur Absent Member.

Re: ESM6 Performance Tuning?

Ray,

I'm 99% sure hugepages aren't applicable here, but you can do this to confirm whether they're being used on your system:

cat /proc/meminfo

It will show you total versus free huge pages.  If it's the same number, then your implementation isn't using it and I recommend removing it from your config.
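
The same check can be scripted. This is a small Python sketch that reads the HugePages_Total, HugePages_Free, and Hugepagesize fields from the /proc/meminfo text and reports how many pages are in use and how much RAM is set aside. The function name is illustrative. It applies the total-versus-free rule described above and ignores the separate HugePages_Rsvd counter, so it is a first-pass check only.

```python
def hugepage_status(meminfo_text):
    """Summarize huge page usage from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("HugePages_Total", 0)
    free = fields.get("HugePages_Free", 0)
    page_kb = fields.get("Hugepagesize", 0)  # reported in kB
    return {
        "total": total,
        "free": free,
        "in_use": total - free,
        "reserved_gib": total * page_kb / (1024 * 1024),
    }
```

With vm.nr_hugepages=8192 and a 2048 kB page size, reserved_gib comes out at 16, which is memory the rest of the system cannot use if in_use stays at 0. Usage would be hugepage_status(open("/proc/meminfo").read()).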

I'm going to start playing around with your config recommendations and quantifying the impact to query performance.  Tuning on ESM 6 is interesting as some of the settings seem to affect not only speed, but if your query even completes at all.

-Joe

EDIT:  You may actually be able to get MySQL to use huge pages.  Looks like you just need to add a line to your CNF (in addition to proper sysctl and other RHEL configs of course).

[mysqld]

large-pages


Optimizing a Server for MySQL | On Forums

jbur Absent Member.

Re: ESM6 Performance Tuning?

I did some active channel testing to try out some of the tweaks.  I ran two consoles simultaneously, each with a different active channel.  One active channel used annotation fields and had multiple sorted columns, the other was a simple vendor!=arcsight channel.  Total number of events to scan through was about 65 million (intentionally small to make it feasible to do multiple runs).

I tested different schedulers (cfq, deadline, noop), innodb_thread_concurrency=0, and innodb_flush_method=O_DIRECT.

None of the parameters seemed to make any difference.  The active channels completed in about the same amount of time after each config change.

Followup note:  These tweaks may have an impact on reports or ESM running on different hardware.  I'm just saying that I didn't see any noticeable impact in my environment for active channels.

Established Member.. raymond.doty

Re: ESM6 Performance Tuning?

The perf tuning above was more to address the reporting engine query (report, trend, queryviewer) performance as compared to channel performance...  Sorry if that wasn't very clear.

But I will agree that the modifications above seemed to have only a moderate effect on the larger issues at hand.  When tested, I believe the gains were only 10-20% at most (in addition to resolving the run failures on *some* queries).

I will take a look into the hugepages and see if I can find any impact from them.

jbur Absent Member.

Re: ESM6 Performance Tuning?

Given that ESM6 seems to use a massive amount of memory, huge pages seem likely to help performance (assuming they don't break anything).

While we're on the topic, I noticed something interesting about how ESM 6 uses memory.  If I launch a single large report I see my memory utilization rapidly shoot from 6GB to 28GB (always stopping before taking the last 200MB free for some reason).  During this time my active channels are significantly slower (until the report throws an error and releases the memory).  Thus it's logical to infer that the tweaks you listed may be improving performance by simply avoiding buffer/memory starvation.

-J

Here's Ray's "query of death" if anyone else wants to try it.  I've never seen it complete successfully even with only a few million events on ESM 6, and it slows down all the active channels on your system while it's running.

[Attached image: query_of_death.jpg]
