Linux, ESM 6, caching leading to swapping causing performance impacts (EPS slowdown, console responsiveness)


Please excuse my ignorance if any of the points below are invalid or badly worded. I am a *nix newb, just trying to understand something that has been bothering me.

What initially looked like a memory leak in ESM 6 (is anyone seeing this in 5.x or prior as well?) now seems to be an unintended consequence of an intended feature of RHEL/Linux rather than a specific problem with ArcSight.

The question: Why does the system continually cache (as I understand it, files read by the OS) to the point where there is zero 'free' memory (but plenty of cache), and then start swapping? From what I understand, the OS is supposed to (under normal circumstances) release cached memory to applications. Specific scenarios (and I guess there is an algorithm) decide when this does not occur.
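To see the symptom described above (almost no 'free' memory, lots of cache, swap in use), something like the following sketch can help. It reads the standard MemFree, Buffers, Cached, SwapTotal and SwapFree fields from /proc/meminfo. The function name mem_summary is just a name I made up for illustration, not part of any ArcSight or Red Hat tooling.

```shell
# Summarize free memory, cache and swap use from a meminfo-style file.
# mem_summary is a hypothetical helper name; the default input is /proc/meminfo.
mem_summary() {
    file="${1:-/proc/meminfo}"
    awk '
        /^MemFree:/   { free = $2 }
        /^Buffers:/   { buffers = $2 }
        /^Cached:/    { cached = $2 }      # anchored, so SwapCached is not counted
        /^SwapTotal:/ { swap_total = $2 }
        /^SwapFree:/  { swap_free = $2 }
        END {
            printf "free_kb=%d\n", free
            printf "cache_kb=%d\n", buffers + cached
            printf "swap_used_kb=%d\n", swap_total - swap_free
        }
    ' "$file"
}

# Print the summary for the running system, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

A small free_kb together with a large cache_kb and a growing swap_used_kb is the pattern this thread is about.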

Link: Gentoo Forums :: View topic - Linux Memory Management or 'Why is there no free RAM?'

ghoti adds:

When an application needs memory and all the RAM is fully occupied, the kernel has two ways to free some memory at its disposal: it can either reduce the disk cache in the RAM by eliminating the oldest data or it may swap some less used portions (pages) of programs out to the swap partition on disk.

It is not easy to predict which method would be more efficient.

The kernel makes a choice by roughly guessing the effectiveness of the two methods at a given instant, based on the recent history of activity.

Link: Experiments and fun with the Linux disk cache

1. While newly allocated memory will always (though see point #2) be taken from the disk cache instead of swap, Linux can be configured to preemptively swap out other unused applications in the background to free up memory for cache. This is tunable through the 'swappiness' setting, accessible through /proc/sys/vm/swappiness. A server might want to swap out unused apps to speed up disk access of running ones (making the system faster), while a desktop system might want to keep apps in memory to prevent lag when the user finally uses them (making the system more responsive). This is the subject of much debate.

2. Some parts of the cache can't be dropped, not even to accommodate new applications. This includes mmap'd pages that have been mlocked by some application, dirty pages that have not yet been written to storage, and data stored in tmpfs (such as in /dev/shm). The mmap'd, mlocked pages are stuck in the page cache. Dirty pages will for the most part swiftly be written out. Data in tmpfs will be swapped out if possible.
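As a rough way to see how much memory falls into the categories listed in point 2, the sketch below pulls the Dirty, Shmem and Mlocked fields from /proc/meminfo. Treat the numbers as an approximation only: Mlocked also counts locked pages that are not page cache, and I am assuming the kernel exposes all three fields. hard_to_drop_kb is a made-up helper name.

```shell
# Report the meminfo counters that roughly correspond to cache the kernel
# cannot simply drop: dirty pages, tmpfs/shared memory, and mlocked pages.
# hard_to_drop_kb is a hypothetical helper name; missing fields print as 0.
hard_to_drop_kb() {
    file="${1:-/proc/meminfo}"
    awk '
        /^Dirty:/   { dirty = $2 }
        /^Shmem:/   { shmem = $2 }
        /^Mlocked:/ { mlocked = $2 }
        END {
            printf "dirty_kb=%d\n", dirty
            printf "shmem_kb=%d\n", shmem
            printf "mlocked_kb=%d\n", mlocked
        }
    ' "$file"
}

# Show the values for the running system, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    hard_to_drop_kb
fi
```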

Link: Linux cached memory: Over 85% of cached memory and using swap - Server Fault
As for why a server might swap data instead of releasing cache, it may be the case that your cached data was being read much more than your data stored in memory. Programs sometimes have pages that they rarely, if ever, visit. That space is better utilized by caching.

Once the cycle starts, this seems to ruin performance for ESM, and it never seems to clear up until you drop the caches (drop_caches) on the box and restart services...


The references above are the ones I've found helpful regarding cache and swap configuration.

It seems the key flag to relieve this behavior may be the vm.swappiness setting in /etc/sysctl.conf.  We have tested with various values (60, 10, 1, and 0), and the only one that keeps this behavior from recurring is 0.
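When testing different swappiness values it is easy to lose track of whether the running value and the one saved in /etc/sysctl.conf agree. Here is a minimal sketch that compares the two. persisted_swappiness is a hypothetical helper, and it only looks at the one file you pass it (it does not check other sysctl config locations).

```shell
# Print the last uncommented vm.swappiness value found in a sysctl.conf-style
# file. Returns non-zero if the setting is not present.
# persisted_swappiness is a hypothetical helper name.
persisted_swappiness() {
    file="${1:-/etc/sysctl.conf}"
    awk -F= '
        /^[ \t]*vm\.swappiness[ \t]*=/ {
            sub(/#.*/, "", $2)        # drop any trailing comment
            gsub(/[ \t]/, "", $2)     # drop surrounding whitespace
            value = $2                # later lines override earlier ones
        }
        END { if (value == "") exit 1; print value }
    ' "$file"
}

# Compare the live kernel value with the persisted one, where both are readable.
if [ -r /proc/sys/vm/swappiness ] && [ -r /etc/sysctl.conf ]; then
    live=$(cat /proc/sys/vm/swappiness)
    saved=$(persisted_swappiness /etc/sysctl.conf) || saved="(not set)"
    echo "live=$live persisted=$saved"
fi
```

If the two differ, the live value will be replaced by the persisted one (or the distribution default) at the next reboot.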

I was hoping that someone can help me understand why Linux deems cached disk-based files more important than the memory used by what I assume are ESM/MySQL, and decides to swap instead of giving up some of the cached memory?

  • Verified Answer

    I found something!

    It seems that if your ESM has started swapping, you can force the swapped pages back into memory.  The recommendation is to set swappiness to zero (so it doesn't happen again), then drop your caches (so there is room to put the pages back into memory), and then run a small script.  I tested this in our environment and it worked.

    1) Change the active system swappiness setting (will not persist after reboot)

          sudo sysctl vm.swappiness=0

    2) Open sysctl for editing

          vi /etc/sysctl.conf

    3) Add the swappiness setting so it will persist after reboot

          vm.swappiness = 0

    4) Save sysctl.conf file

    5) Drop the in-RAM file system caches, to make room for swapped pages to be moved back into memory

          echo 3 | sudo tee /proc/sys/vm/drop_caches

    6) Create new file to move swap back into RAM

          vi /usr/local/sbin/

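    7) Insert below lines into file

          #!/bin/bash
          # Free RAM and used swap, in kB, as reported by free
          mem=$(free  | awk '/Mem:/ {print $4}')
          swap=$(free | awk '/Swap:/ {print $3}')
          if [ "$mem" -lt "$swap" ]; then
              echo "ERROR: not enough RAM to write swap back, nothing done" >&2
              exit 1
          fi
          # Turning swap off forces swapped pages back into RAM; then re-enable it
          swapoff -a &&
          swapon -a
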

    8) Save file

    9) Make the file executable

          sudo chmod +x /usr/local/sbin/

    10) Execute the script to move swap into ram

          sudo /usr/local/sbin/
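Before running the steps above on a production box, it may be worth doing a dry run of the same "is there enough free RAM?" check without touching swap at all. The sketch below reads MemFree, SwapTotal and SwapFree from /proc/meminfo instead of parsing free, which is my own substitution and not what the script above does. swap_in_fits is a made-up name.

```shell
# Dry-run check: would the currently used swap fit into free RAM?
# Prints both numbers and returns 0 if it fits, 1 if it does not.
# swap_in_fits is a hypothetical helper name; nothing is swapped on or off here.
swap_in_fits() {
    file="${1:-/proc/meminfo}"
    awk '
        /^MemFree:/   { free = $2 }
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { sfree = $2 }
        END {
            used = total - sfree
            printf "free_kb=%d swap_used_kb=%d\n", free, used
            if (free < used) exit 1
        }
    ' "$file"
}

# Report the result for the running system, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    if swap_in_fits; then
        echo "enough free RAM to move swap back in"
    else
        echo "not enough free RAM; drop caches first or leave swap alone"
    fi
fi
```

Running this right after step 5 (dropping caches) should give the most realistic answer, since that is the point at which the script does its own check.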

    Additionally, Red Hat has provided a handy-dandy script that will tell you the top 10 processes using swap!

    Example output (showing Logger's mysqld using about 22 MB of swap)

    Process   : 25683 ?        Sl   10407:39 /opt/arcsight/logger/current/local/mysql/libexec/mysqld
    Swap usage: 22084 kB


    ps ax | sed "s/^ *//" > /tmp/ps_ax.output
    # Keep only non-zero Swap: lines, sort by size, and take the 10 largest entries
    for x in $(grep '^Swap:' /proc/[1-9]*/smaps 2>/dev/null | grep -v ' 0 kB' | tr -s ' ' | cut -d' ' -f-2 | sort -t' ' -k2 -n | tr -d ' ' | tail -10); do
        swapusage=$(echo "$x" | cut -d: -f3)
        pid=$(echo "$x" | cut -d/ -f3)
        # Match the PID as a whole first field, not as a prefix of a longer PID
        procname=$(grep "^$pid " /tmp/ps_ax.output)
        echo "============================"
        echo "Process   : $procname"
        echo "Swap usage: $swapusage kB"; done
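One thing to be aware of with the script above: each Swap: line in smaps is a single memory mapping, so a process with many small swapped mappings may be under-represented. As a variation (my own sketch, not Red Hat's script), the following adds up all Swap: lines per process and then lists the ten largest totals. swap_kb_from_smaps is a hypothetical helper name, and you will likely need root to read other users' smaps files.

```shell
# Sum every "Swap:" line (in kB) of one smaps-style file.
# swap_kb_from_smaps is a hypothetical helper name.
swap_kb_from_smaps() {
    awk '/^Swap:/ { sum += $2 } END { print sum + 0 }' "$1"
}

# Print "total_kB pid cmdline" for the ten processes with the most swap.
for f in /proc/[0-9]*/smaps; do
    [ -r "$f" ] || continue
    # Processes can exit or be unreadable between the glob and the read.
    kb=$(swap_kb_from_smaps "$f" 2>/dev/null) || continue
    if [ "${kb:-0}" -gt 0 ]; then
        pid=${f#/proc/}
        pid=${pid%/smaps}
        echo "$kb $pid $(tr '\0' ' ' 2>/dev/null < "/proc/$pid/cmdline")"
    fi
done | sort -rn | head -10
```

No output simply means no readable process currently has anything in swap.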