Recently we had an issue with one of our Solaris global zones, which runs Oracle/SAP. The system normally runs with about 40 GB to 50 GB of free physical memory. During a soft hang, however, we found that it was paging heavily to disk and that free physical memory had dropped to between 6 GB and 8 GB.
We ran almost all of our monitoring tools to find out which processes were consuming the memory, but with no luck. Finally we raised a case with Oracle to find the root cause.
System configuration:
Physical memory: 256 GB
Swap: 480 GB
According to top and prstat, used physical memory was around 200 GB.
[root@ ~]# sar -r 5 5
SunOS 5.10 Generic_144500-19 sun4u 09/05/2012
16:40:11 freemem freeswap
16:40:16 885031 399256061
16:40:22 883764 399240091
16:40:27 882266 399212586
16:40:33 882487 399223267
16:40:38 882453 399221275
Average 883193 399230658
[root@ ~]# echo "::memstat" | mdb -k
Page Summary                 Pages                MB  %Tot
-----------       ----------------  ----------------  ----
Kernel                     1696877             13256    5%
Anon                      24796106            193719   75%
Exec and libs               150104              1172    0%
Page cache                 5814314             45424   18%
Free (cachelist)            437928              3421    1%
Free (freelist)              79348               619    0%
Total                     32974677            257614
Physical                  32952777            257443
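Note that the freemem column of sar -r is reported in pages, not bytes. The ::memstat totals above (32974677 pages for 257614 MB) work out to 8 KB per page, so the average freemem of 883193 pages corresponds to roughly 6.7 GB, which matches the 6 GB to 8 GB we saw during the hang. A minimal sketch of that conversion follows. pages_to_gb is a hypothetical helper name, and the 8192-byte default is an assumption derived from this output; on a live system the pagesize command reports the actual value.

```shell
# pages_to_gb PAGES [PAGE_SIZE_BYTES]
# Hypothetical helper: convert a page count (such as sar's freemem) to GB.
# The 8192-byte default is an assumption based on the ::memstat totals above;
# pass the output of `pagesize` as the second argument to be exact.
pages_to_gb() {
    echo "$1 ${2:-8192}" | awk '{ printf "%.2f\n", $1 * $2 / (1024 * 1024 * 1024) }'
}

# Average freemem from the sar output above
pages_to_gb 883193
```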
Finally, an Oracle kernel engineer found that shared memory segments were consuming 103 GB of physical memory. Of that, about 35 GB was held by stale segments that no process was using.
[root@ ~]# ipcs -ZmA |awk '{x=x+$10} END {print x}'
111073374219
Converting to GB:
# bc -l
111073374219/1024/1024/1024
103.44514084886759519577
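The two steps above (summing SEGSZ, then dividing down to GB) can also be done in a single pass. This is only a sketch: sum_segsz_gb is a hypothetical helper name, and it assumes the ipcs -mA column layout shown in this post, where segment rows start with "m" and SEGSZ is the tenth field.

```shell
# sum_segsz_gb: read `ipcs -mA` style output on stdin and print the
# total of the SEGSZ column in GB (1024^3 bytes).
# Assumption: SEGSZ is field 10 and every segment row starts with "m",
# as in the listing in this post; header lines are skipped by that test.
sum_segsz_gb() {
    awk '$1 == "m" { total += $10 }
         END { printf "%.2f\n", total / (1024 * 1024 * 1024) }'
}

# Usage on the affected host:
#   ipcs -mA | sum_segsz_gb
```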
He found a couple of shared memory segments that were not in use but were still holding memory.
[root@ ~]# ipcs -mA |grep myora1
T  ID        KEY         MODE         OWNER   GROUP  CREATOR  CGROUP  NATTCH  SEGSZ        CPID   LPID   ATIME    DTIME    CTIME     ISMATTCH
m  19777443  0x8795c49c  --rw-rw----  myora1  db     myora1   db      1220    24576        23178  21487  9:15:48  9:16:07  14:15:49  1220
m  19777472  0           --rw-rw----  myora1  db     myora1   db      1220    34292629504  23178  21487  9:15:48  9:16:07  14:15:58  1220
m  20331903  0           --rw-rw----  myora1  db     myora1   db      1220    201326592    23178  21487  9:15:48  9:16:07  14:16:47  1220
m  83886120  0           --rw-rw----  myora1  db     myora1   db      0       34292629504  24490  24490  9:56:37  9:56:37  9:56:09   0
m  83886125  0           --rw-rw----  myora1  db     myora1   db      0       201326592    24490  24490  9:56:09  9:56:37  9:56:09   0
Shared memory segment 83886120 was holding about 34 GB and segment 83886125 about 200 MB, even though no process was using either of them. The NATTCH and ISMATTCH columns show why: the first three segments are valid, since many processes (1220) are attached to them, while the last two segments have no process attached at all (0).
ISMATTCH is the number of ISM (Intimate Shared Memory) attaches to the associated shared memory segment.
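The same check can be scripted so that unattached segments stand out without reading the table by eye. The sketch below assumes the ipcs -mA layout shown above (NATTCH in field 9, SEGSZ in field 10, ISMATTCH in field 16); list_unattached is a hypothetical helper name. It only lists candidates. A segment may also show zero attaches for legitimate reasons, so confirm with the database owner before removing anything.

```shell
# list_unattached: read `ipcs -mA` style output on stdin and print the
# ID and SEGSZ (bytes) of every segment with no attaches.
# Assumption: NATTCH is field 9, SEGSZ field 10 and ISMATTCH field 16,
# as in the listing in this post. Nothing is removed here.
list_unattached() {
    awk '$1 == "m" && $9 == 0 && $16 == 0 { print $2, $10 }'
}

# Usage on the affected host:
#   ipcs -mA | list_unattached
```

For the listing in this post, the helper would report segments 83886120 and 83886125, the two segments with NATTCH and ISMATTCH both at 0.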
[root@ ~]# ipcrm -m 83886120
[root@ ~]# ipcrm -m 83886125
After removing them, the system returned to normal: swapping dropped considerably, and almost 50 GB of physical memory was free again.
Thank you for reading this article. Please leave a comment if you have any questions, and I will get back to you as soon as possible.