Unless otherwise noted, articles © 2005-2008 Doug Spencer, SecurityBulletins.com. Linking to articles is welcomed. Articles on this site are general information and are NOT GUARANTEED to work for your specific needs. I offer paid professional consulting services and will be happy to develop custom solutions for your specific needs. View the consulting page for more information.

Solaris Memory Exhaustion

From SecurityBulletins.com


Written by Doug Spencer
(c) 2007 - Written 2/2/2007

On some occasions, I have received inquiries from people in the information technology industry who are concerned that their application running on a Solaris system does not have sufficient memory. More often than not, they see the values shown in prstat or top and panic, believing that all the memory on the system is exhausted. Unfortunately, determining whether a Solaris system is close to exhausting its memory is not as simple as looking at the values presented by prstat. There are a few tips that can reveal more definitively whether memory is an issue, but they require understanding what goes on behind the scenes.

First, I will talk about how memory is used in Solaris. Like many modern UNIX systems, Solaris tries to use all the memory available to it to improve operating system performance so it can run applications faster. This means memory that is otherwise unused is allocated to caches and buffers that the operating system uses to speed up relatively slow operations such as disk I/O. When an application starts and calls malloc() or mmap() to request memory, it is given memory from the free memory and, where needed, from the memory allocated to caches and buffers.

Additionally, Solaris uses shared object files in an intelligent way. For shared object files, multiple applications may have their library calls mapped to the same .so file that is loaded into memory only one time. Monitoring utilities such as prstat will tabulate the memory as if shared object files were allocated once per application. This inflates the amount of memory prstat shows in use.
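The effect of counting a shared object once per process can be illustrated with some simple arithmetic. The following is a minimal sketch in Python; the process names and sizes are hypothetical figures chosen only for illustration, not measurements from a real system.

```python
# Hypothetical figures, in megabytes, chosen only for illustration.
SHARED_LIB_MB = 40  # one shared library mapped by every process

# Private (non-shared) memory of three hypothetical processes.
private_mb = {"app1": 100, "app2": 150, "app3": 60}


def naive_total(private, shared_mb):
    """Sum per-process sizes, counting the shared library once per process."""
    return sum(size + shared_mb for size in private.values())


def actual_total(private, shared_mb):
    """Count the shared library once, since it is loaded into memory one time."""
    return sum(private.values()) + shared_mb


naive = naive_total(private_mb, SHARED_LIB_MB)
actual = actual_total(private_mb, SHARED_LIB_MB)
print(f"per-process sum: {naive} MB, physical use: {actual} MB, "
      f"overcount: {naive - actual} MB")
```

With these made-up numbers, adding up the per-process figures overstates real usage by the size of the library times the number of extra processes that map it.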

Now that we know some of the reasons why prstat shows inflated numbers for memory usage, how can we determine if our available memory is sufficient? This is relatively easy to determine using vmstat. The following is an example of vmstat output.

# vmstat 5 3
procs      memory                page               disk         faults        cpu
 r b w    swap    free  re   mf pi  po  fr de sr s1 s1 s1 sd   in    sy   cs us sy id
31 6 0 4542592 1747840 378 3631 94 262 245  0  0  0 25 25  0 2365 17793 4086 62 18 20
52 2 0 4446880 1556736 325 2474 35 300 294  0  0  0  0  0  0 1681 22277 2279 83 11  6
70 1 0 4451632 1557256 567 5702 35 404 390  0  0  0  2  2  0 1879 24490 2432 79 17  4

Specifically, you want to look at the following vmstat columns:

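de = Anticipated short-term memory shortfall (deficit), in kilobytes
sr = Scan rate: pages scanned per second by the paging routine
pi = Kilobytes paged in per second
po = Kilobytes paged out per second
re = Page reclaims per second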
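If you want to watch these columns over time, they can be pulled out of the vmstat output with a short script. The following is a minimal Python sketch, assuming the header layout shown in the example above; the function name parse_vmstat and the PAGING_COLUMNS tuple are names I made up for this illustration. Column positions are looked up from the header rather than hard-coded, because the number of disk columns varies between systems.

```python
# Sample text copied from the vmstat example in this article.
SAMPLE = """\
procs      memory                page               disk         faults        cpu
 r b w    swap    free  re   mf pi  po  fr de sr s1 s1 s1 sd   in    sy   cs us sy id
31 6 0 4542592 1747840 378 3631 94 262 245  0  0  0 25 25  0 2365 17793 4086 62 18 20
52 2 0 4446880 1556736 325 2474 35 300 294  0  0  0  0  0  0 1681 22277 2279 83 11  6
70 1 0 4451632 1557256 567 5702 35 404 390  0  0  0  2  2  0 1879 24490 2432 79 17  4
"""

# Columns of interest; each name appears only once in the header.
PAGING_COLUMNS = ("free", "re", "pi", "po", "de", "sr")


def parse_vmstat(text):
    """Return one dict per sample line, holding only the paging-related columns."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    # The column-name header is the line containing both "swap" and "sr".
    header_idx = next(i for i, f in enumerate(lines) if "swap" in f and "sr" in f)
    header = lines[header_idx]
    positions = {name: header.index(name) for name in PAGING_COLUMNS}
    samples = []
    for fields in lines[header_idx + 1:]:
        # Skip repeated headers or malformed lines.
        if len(fields) != len(header) or not fields[0].isdigit():
            continue
        samples.append({name: int(fields[pos]) for name, pos in positions.items()})
    return samples


for sample in parse_vmstat(SAMPLE):
    print(sample)
```

Keep in mind that the first line of vmstat output is a summary since boot, so the later lines are the ones that reflect current activity.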

The primary indicator is the scan rate (sr). The scan rate shows how aggressively the paging routine is looking for pages of memory that were not recently used so it can page them out to disk. A small amount of paging is perfectly acceptable and normal. When the scan rate is low or 0, there is no memory exhaustion. The values used by the paging system differ between versions of Solaris, but in general, scan rates that stay above 200 pages per second for extended periods indicate that memory may be running short.

The scanner begins scanning when available memory drops below the value of lotsfree. When the lotsfree threshold is breached, the scanner begins looking for memory at the slowscan rate. Slowscan is normally 1/10th the value of fastscan. As memory drops toward the minfree value, the scan rate increases linearly to the fastscan rate. Below the minfree threshold, the operating system will begin to refuse to allocate memory for new processes. Between lotsfree and minfree is desfree, which is generally lotsfree/2. When you are at the desfree threshold, scan rates will be 100 pages per second.
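The linear ramp and the "extended periods" rule of thumb can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not the kernel's actual algorithm. The function names, the parameter values in the usage lines, and the choice of three consecutive samples as an "extended period" are all assumptions made for illustration; real values of lotsfree, minfree, slowscan and fastscan depend on the Solaris version and the amount of physical memory.

```python
def modeled_scan_rate(free_pages, lotsfree, minfree, slowscan, fastscan):
    """Scan rate in pages/second under the simplified linear model."""
    if free_pages >= lotsfree:
        return 0.0  # scanner is idle while free memory is at or above lotsfree
    if free_pages <= minfree:
        return float(fastscan)  # ramp has reached its ceiling
    # Fraction of the way from lotsfree down to minfree.
    fraction = (lotsfree - free_pages) / (lotsfree - minfree)
    return slowscan + fraction * (fastscan - slowscan)


def sustained_scanning(sr_samples, threshold=200, min_consecutive=3):
    """True if sr exceeds threshold for min_consecutive samples in a row."""
    run = 0
    for sr in sr_samples:
        run = run + 1 if sr > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical tunables, in pages and pages/second.
params = dict(lotsfree=4000, minfree=1000, slowscan=100, fastscan=1000)
for free in (5000, 3999, 2500, 500):
    print(free, round(modeled_scan_rate(free, **params), 1))

print(sustained_scanning([0, 250, 300, 50, 400]))  # brief spikes only
print(sustained_scanning([250, 300, 400, 0]))      # three samples in a row
```

The second function is the more useful one in practice: feed it the sr column from successive vmstat samples, and treat a single spike as normal but a long run above the threshold as a sign that memory may be running short.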

As you can now see, paging activity can occur when free memory is just under lotsfree, even though, depending on your configuration, quite a bit of memory is still unused. As the system runs out of memory, the scanner gets more aggressive about finding the least recently used pages of memory to page out to disk. If that does not free enough memory, the system stops allocating new memory, and if that is still not enough, it will start desperation paging and killing running processes in order to provide memory.

The parameters for the paging system can be modified if there is a compelling reason to do so. If you have fast disks that you are using for your paging space, you might increase maxpgio to increase the rate at which I/O is queued to the swap devices, for instance.
