Unless otherwise noted, articles © 2005-2008 Doug Spencer, SecurityBulletins.com.
AIX Debugging and Profiling -- Memory Exhaustion
Written by Doug Spencer 10/7/2006
Like most UNIX operating systems, AIX has several debugging and profiling tools available to help track down problems and improve performance. This is an overview of debugging and profiling tools and techniques available on AIX.
Commands are available to monitor and tune disk I/O, CPU, kernel, memory, applications, and networking. To get the most benefit from your efforts, you must first determine the source of your bottleneck.
To determine whether memory exhaustion is causing performance issues on an AIX system, a good starting point is the vmstat command. Exhaustion is indicated by large values in the scan rate column, labeled "sr" in vmstat's output. That column indicates how rapidly the page-replacement algorithm is looking for pages that are not pinned in memory and have not recently been used. When such a page is found, it can be written to disk, freeing up RAM for more recently used data.
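As a rough illustration of reading the "sr" column, the Python sketch below parses captured vmstat output and flags the intervals where the scan rate exceeds a threshold. The sample text is invented data laid out like AIX vmstat output, and the function name and the threshold of 1000 are assumptions for the example, not tuning recommendations.

```python
# Illustrative sample only: invented numbers laid out like AIX vmstat output.
SAMPLE_VMSTAT = """\
kthr    memory              page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 522342 12345   0   0   0   0    0   0  12  345  67  1  2 97  0
 3  1 801245   130   0  25  80 900 4200   0  40  900 210 20 15 30 35
"""


def high_scan_intervals(vmstat_text, threshold=1000):
    """Return (row_number, sr) for data rows whose sr exceeds threshold.

    The threshold is an arbitrary example value, not a recommendation.
    """
    sr_index = None
    flagged = []
    row_number = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "sr" in fields and "fre" in fields:
            # Column header row: remember which column holds the scan rate.
            sr_index = fields.index("sr")
            continue
        if sr_index is None or len(fields) <= sr_index:
            continue  # nothing to read before the header or on short lines
        if not all(field.isdigit() for field in fields):
            continue  # skip separators and other non-data lines
        row_number += 1
        sr = int(fields[sr_index])
        if sr > threshold:
            flagged.append((row_number, sr))
    return flagged


print(high_scan_intervals(SAMPLE_VMSTAT))
```

Locating the column by its header name rather than by a fixed position keeps the sketch working if the number of leading columns differs between systems.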
If you determine memory exhaustion is the cause of your issues, there are several things to look at.
First, how much RAM does the system have? The command "svmon -G -i 2 5" will tell you the number of 4K pages in the "size" column. The command will also tell you the number of pages in use and free as well as pages pinned in memory, which cannot be paged to disk.
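Because svmon reports these figures in 4K pages, a quick conversion makes them easier to compare with the installed RAM. The Python sketch below shows the arithmetic; the page counts are hypothetical values chosen for the example, not output from a real system, and the function names are my own.

```python
PAGE_SIZE_BYTES = 4096  # the counts discussed above are in 4K pages


def pages_to_mb(pages):
    """Convert a count of 4K pages to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)


def percent_in_use(size_pages, inuse_pages):
    """Return the share of real memory in use, as a percentage."""
    return 100.0 * inuse_pages / size_pages


# Hypothetical page counts for illustration only.
size_pages = 2097152
inuse_pages = 1900000
pinned_pages = 400000

print("size   MB:", pages_to_mb(size_pages))
print("in use MB:", pages_to_mb(inuse_pages))
print("pinned MB:", pages_to_mb(pinned_pages))
print("in use  %:", round(percent_in_use(size_pages, inuse_pages), 1))
```

With these example numbers, 2097152 pages works out to 8192 MB, since each megabyte holds 256 pages of 4K.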
Next, try to isolate which applications are using the most memory. The command "svmon -Pau" will display memory usage for the processes on the system. You can then use "svmon -P pid" to get detailed memory usage for a specific process ID. (The -D flag reports on a segment ID rather than a process ID.)
If you find an application is using more memory than expected, or its memory usage grows without bound, the application may have a memory leak. You can use trace, truss, or a similar program to look for "malloc" calls without corresponding "free" calls and similar indications of memory leaks. Another fairly common source of memory problems is using strcpy rather than strncpy on a string that does not have a null terminator, or overwriting an existing null terminator with other data; in both cases the copy runs past the end of the intended data. If an application is leaking memory, the best remedy is to fix the application. Other remedies, such as restart scripts, memory limits, or alerts on memory usage, are often reasonable interim solutions until a permanent fix can be implemented, but should not be relied on as a fix.
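The idea of matching "malloc" calls against "free" calls can be sketched in Python as follows. The log format here ("malloc(64) = 0x2000f0a8" and "free(0x2000f0a8)") is a simplified, hypothetical format invented for this example; it is not the actual output of trace or truss, so a real log would need its own parsing. The addresses and sizes are likewise made up.

```python
import re

# Patterns for the hypothetical log format described above.
MALLOC_RE = re.compile(r"malloc\((\d+)\)\s*=\s*(0x[0-9a-fA-F]+)")
FREE_RE = re.compile(r"free\((0x[0-9a-fA-F]+)\)")


def unfreed_allocations(lines):
    """Return {address: size} for allocations never followed by a free."""
    live = {}
    for line in lines:
        alloc = MALLOC_RE.search(line)
        if alloc:
            # Record the allocation under the address malloc returned.
            live[alloc.group(2).lower()] = int(alloc.group(1))
            continue
        release = FREE_RE.search(line)
        if release:
            # Forget the allocation; ignore frees of unknown addresses.
            live.pop(release.group(1).lower(), None)
    return live


# Invented sample log for illustration only.
sample_log = [
    "malloc(64) = 0x2000f0a8",
    "malloc(128) = 0x2000f1b0",
    "free(0x2000f0a8)",
    "malloc(256) = 0x2000f0a8",  # the allocator may reuse a freed address
]

leaked = unfreed_allocations(sample_log)
print(leaked)
print("bytes still allocated:", sum(leaked.values()))
```

Allocations still outstanding at the end of a capture are not necessarily leaks, since a long-running program legitimately holds memory; the signal to look for is a set of outstanding allocations that keeps growing across captures.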
An application might also have a memory-related setting that is overcommitted. Oracle is a common example. Often, DBAs will make the System Global Area (SGA) much larger than it needs to be rather than taking the time to tune the database to the application and the system it runs on.