Linux swap on DB
We moved our 10G DB from Windows to 64-bit RHEL 5.2 a few weeks ago and noticed a problem with swapping. From time to time we see a lot of swapping activity (up to 1300 writes/s), which causes serious performance issues. Actions taken so far:
* decreased the memory allocated to Oracle (currently 17 GB for SGA + PGA) to increase the memory available to the OS (currently 3 GB)
* stopped unused services to reduce memory usage as much as possible
* changed the swappiness parameter to 0
These changes improved things slightly, but swapping still occurs and we can't figure out which process is causing it. We don't see any reason for this to happen, as nothing other than Oracle and PA is running on this server. A swappiness value of 0 should make the system swap only when there is absolutely no other option, yet swapping still happens.
Any idea or suggestion would be appreciated.
Thanks for your help.
I don't think there's a simple solution, but rather a bunch of areas to check. First I'd look at top and see what's happening with the memory, e.g., are you running out of memory, or is it just swapping because it wants to. Usually if Linux is swapping database stuff, then you're running out of memory. Under normal circumstances, it only swaps out memory that's not being used. Also, vmstat will give you some overall statistics on your memory usage. Finally, use 'ps aux' and look at the RSS and VSZ columns to see which processes are using the most memory. RSS shows the physical (resident) memory used, and VSZ shows the virtual memory size.
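As a rough illustration of the /proc/meminfo check mentioned above, here is a minimal sketch that reports how much swap is in use. The function name swap_used_kb is made up for this example; it only assumes the standard SwapTotal and SwapFree fields.

```shell
#!/bin/sh
# Hypothetical helper: print swap currently in use, in kB.
# Reads a meminfo-style file; defaults to /proc/meminfo.
swap_used_kb() {
    awk '$1 == "SwapTotal:" {t = $2}
         $1 == "SwapFree:"  {f = $2}
         END {printf "%d\n", t - f}' "${1:-/proc/meminfo}"
}
```

Running swap_used_kb a few times while the slowdown is happening gives a quick idea of whether swap usage is actually growing.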
Keep in mind too that if you change the swappiness via /proc, the change is lost at the next reboot (I don't know if this is how you did it, but I'm throwing it out there anyway).
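One way to make the setting survive a reboot is to put it in /etc/sysctl.conf. The sketch below is only an illustration: set_sysctl_conf is a hypothetical helper, not a standard tool, and it replaces any existing entry for the key instead of appending a duplicate.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl-style file (default /etc/sysctl.conf),
# replacing any existing line for that key.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    touch "$file"
    # Keep every line except an existing setting for this key.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
    # Append the new setting, then write the result back in place.
    echo "${key}=${value}" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As root, something like "set_sysctl_conf vm.swappiness 0" followed by "sysctl -p" would store the value and reload the file without a reboot.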
If you want to post your top, vmstat, /proc/meminfo, and 'ps aux' output, I can probably help a little more.
Also, there's a free guide from Red Hat on tuning RHEL for Oracle. I've attached it. Hope some of this helps!
Chris, thanks for the document. Actually I have been working from a very similar one (see attachment), but there seem to be some small differences. I finally found the reason for the intensive swapping: it is apparently related to huge pages, which must be configured properly.
If you experience a similar issue, look at the section "Large Memory Optimization (Huge Pages)" on p14 of the attached document.
Because huge pages were not configured properly, memory allocated to Oracle was sometimes swapped out and back in, which severely impacted performance. My first tests show that the average DB responsiveness (ASM database responsiveness data monitor) has been drastically reduced, even when background processes such as partition unarchiving are running.
This type of issue is quite difficult to solve because Linux doesn't make it easy to determine swapping activity per process. I would be curious to know whether people running their DB on RHEL have already checked the swapping activity on their system via vmstat, sar -p or iostat -x. This kind of issue can go unnoticed if you don't closely monitor your system performance, so it could be worth having a look.
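For anyone who wants a quick system-wide check besides the si/so columns of vmstat, here is a minimal sketch that samples the pswpin and pswpout counters in /proc/vmstat. The function names are made up for this example, and the counters are in pages, not kB.

```shell
#!/bin/sh
# Hypothetical helper: print "pswpin pswpout" (pages swapped in/out since boot).
# Reads a vmstat-style file; defaults to /proc/vmstat.
swap_counters() {
    awk 'BEGIN {i = 0; o = 0}
         $1 == "pswpin"  {i = $2}
         $1 == "pswpout" {o = $2}
         END {print i, o}' "${1:-/proc/vmstat}"
}

# Hypothetical helper: average pages swapped in/out per second over an interval.
# Usage: swap_rate [seconds] [file]; the interval must be greater than 0.
swap_rate() {
    interval=${1:-5}
    file=${2:-/proc/vmstat}
    before=$(swap_counters "$file")
    sleep "$interval"
    after=$(swap_counters "$file")
    # Fields: 1=in before, 2=out before, 3=in after, 4=out after.
    echo "$before $after" | awk -v s="$interval" \
        '{printf "in/s=%.1f out/s=%.1f\n", ($3 - $1) / s, ($4 - $2) / s}'
}
```

Calling swap_rate 10 during a slow period shows whether the box is swapping at that moment; it still cannot tell you which process is responsible.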
I've got some RHEL 4.8 64bit DB servers doing the same thing. Looks like Oracle grabs all available memory on the box plus some swap, and never releases it.
On a side note: The Dell R815 supports two SSD devices in RAID1 for primary storage. That should speed up your swap space! 😉
Joe Burke wrote:
On a side note: The Dell R815 supports two SSD devices in RAID1 for primary storage. That should speed up your swap space! ;-)
Thanks for the tip, Joe, but there's no need for fast swap storage anymore, as the swapping has completely disappeared 😉
I guess I found the fix 😉 As explained above, the issue was related to huge pages, which were not configured properly.
This is the list of changes I made on my 64-bit RHEL 5.2 with a total (SGA + PGA) of 15 GB:
echo "vm.nr_hugepages=7680" >> /etc/sysctl.conf
echo "vm.hugetlb_shm_group=`id -g oracle`" >> /etc/sysctl.conf
echo "oracle soft memlock 15728640" >> /etc/security/limits.conf
echo "oracle hard memlock 15728640" >> /etc/security/limits.conf
(for a huge page size of 2048 kB)
The system needs to be rebooted for the changes to take effect.
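After the reboot it is worth confirming that the pages were really allocated. A minimal sketch, assuming the usual HugePages_Total, HugePages_Free and Hugepagesize fields in /proc/meminfo (the function name is made up for this example):

```shell
#!/bin/sh
# Hypothetical helper: print "total free page_size_kB" for huge pages.
# Reads a meminfo-style file; defaults to /proc/meminfo.
hugepage_summary() {
    awk '$1 == "HugePages_Total:" {t = $2}
         $1 == "HugePages_Free:"  {f = $2}
         $1 == "Hugepagesize:"    {s = $2}
         END {print t, f, s}' "${1:-/proc/meminfo}"
}
```

If the first number is lower than the vm.nr_hugepages value you asked for, the kernel could not reserve all the pages.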
Disclaimer: these values are provided as an example; they will be different on your system. Please refer to the file I attached to my previous post for more details (page 14).
That's why I asked other people to check their systems. If this problem turns out to be quite common, I will ask AS to update their documentation, because this swapping issue can have a very heavy impact on performance, depending on the speed of the disks the swap is located on. That being said, my guess is that the impact is bigger on systems with a decent amount of RAM available for the DB. Or maybe it is simply an awful conspiracy to sell high-end HP SANs.
So, any feedback other than Joe's?
We disabled swap here. We enabled directIO, which yielded a significant improvement in EPS. I have also read the document you posted and I am going to try some of the tweaks to see if performance improves.
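To adapt the example values to another system, the arithmetic behind them can be sketched as below. The function names are made up for this illustration; the inputs are the total memory to cover in MB and the huge page size in kB (2048 in the example above).

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed to cover TOTAL_MB,
# given a huge page size in kB (default 2048). Rounds up.
hugepages_needed() {
    total_mb=$1
    page_kb=${2:-2048}
    echo $(( (total_mb * 1024 + page_kb - 1) / page_kb ))
}

# Hypothetical helper: memlock value for limits.conf, which is expressed in kB.
memlock_kb() {
    echo $(( $1 * 1024 ))
}
```

For the 15 GB (15360 MB) case in this thread, hugepages_needed 15360 2048 gives 7680 and memlock_kb 15360 gives 15728640, matching the values posted above.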