Monday, October 7, 2013

How to measure memory usage in Linux

http://www.openlogic.com/wazi/bid/315941/how-to-measure-memory-usage-in-linux


Whether you are a system administrator or a developer, sometimes you need to look at how much memory GNU/Linux processes and programs use. Memory is a critical resource, and limited memory plus processes that use a lot of RAM can leave the kernel out of memory (OOM). In this state Linux invokes the OOM killer, a kernel routine that attempts to recover the system by terminating one or more processes. The kernel picks its victims with a heuristic score based largely on how much memory each process uses, so it is hard to predict which processes will die. Though the OOM killer may keep the server from going down, it can cause problems in the delivery of services that should stay running.
In this article we'll look at three utilities that report information about the memory used on a GNU/Linux system. Each has strengths and weaknesses, with accuracy being their Achilles' heel. I'll use CentOS 6.4 as my demo system, but these programs are available on any Linux distribution.

ps

ps displays information about active processes, with a number of custom fields that you can decide to show or not. For the purposes of this article I'll focus on how to display information about memory usage. ps shows the percentage of memory that is used by each process or task running on the system, so you can easily identify memory-hogging processes.
Running ps aux shows every process on the system. Typical output looks something like this:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19228 1488 ? Ss 18:59 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S 18:59 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 18:59 0:00 [migration/0]
...
...
root 742 0.0 0.0 0 0 ? S 19:00 0:00 [ext4-dio-unwrit]
root 776 0.0 0.0 0 0 ? S 19:00 0:00 [kauditd]
root 785 0.0 0.0 0 0 ? S 19:00 0:00 [flush-253:0]
root 939 0.0 0.0 27636 808 ? S
If you are searching for memory hogs, you probably want to sort the output. The --sort argument takes key values that indicate how you want to order the output. For instance, ps aux --sort -rss sorts by resident set size (RSS), largest first. RSS represents the non-swapped physical memory that each task uses. However, RSS can be misleading and may show a higher value than the real one if pages are shared, for example by several threads or by dynamically linked libraries, because shared pages are counted in full for every process that maps them.
You can also sort by -vsz, the virtual memory size, but it does not reflect the actual amount of memory used by applications. It shows the amount of address space reserved for them, which includes the resident pages counted in RSS. You usually won't want to use it when searching for processes that eat memory.
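If you want to post-process the listing yourself, for instance from a monitoring script, the same RSS ordering can be sketched in a few lines of Python. This is only an illustration: the function name top_by_rss and the sample text are my own, and the parser assumes the standard eleven-column ps aux layout shown above.

```python
def top_by_rss(ps_output, count=5):
    """Return (rss_kib, pid, command) tuples for the largest resident sets."""
    rows = []
    for line in ps_output.splitlines()[1:]:      # skip the header line
        fields = line.split(None, 10)            # COMMAND may contain spaces
        if len(fields) < 11:
            continue                             # ignore truncated lines
        rows.append((int(fields[5]), int(fields[1]), fields[10]))
    return sorted(rows, reverse=True)[:count]

# Sample rows copied from the listing above; the last one is cut off
# in the listing, so the parser skips it.
PS_SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19228 1488 ? Ss 18:59 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S 18:59 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 18:59 0:00 [migration/0]
root 939 0.0 0.0 27636 808 ? S
"""

for rss, pid, command in top_by_rss(PS_SAMPLE):
    print(rss, pid, command)
```

On a live system you could feed it the text returned by subprocess.check_output(["ps", "aux"], text=True) instead of the sample string.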
ps aux alone isn't enough to tell you if a process is thrashing, but if your system is thrashing, it will help you identify the processes that are experiencing the biggest hits.

top

The top command displays a dynamic real-time view of system information and the running tasks managed by the Linux kernel. The memory usage stats include live totals of used and free physical memory and swap, along with the sizes of buffers and cached memory. Type top at the command line to see a constantly updated stats page:
top - 19:56:33 up 56 min, 2 users, load average: 0.00, 0.00, 0.00
Tasks: 67 total, 1 running, 66 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.4%us, 1.7%sy, 0.2%ni, 88.7%id, 5.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922680k total, 851808k used, 1070872k free, 19668k buffers
Swap: 4128760k total, 0k used, 4128760k free, 692716k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
1 root 20 0 19228 1488 1212 S 0.0 0.1 0:01.29 init 
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 
4 root 20 0 0 0 0 S 0.0 0.0 0:00.17 ksoftirqd/0 
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 
6 root RT 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/0 
7 root 20 0 0 0 0 S 0.0 0.0 0:01.27 events/0 
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cgroup 
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper 
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns 
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr 
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm 
....
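One thing worth keeping in mind when reading the Mem: and Swap: lines is that "used" includes buffers and cached memory, most of which the kernel can reclaim when applications need it. A rough sketch of the arithmetic follows. The function name app_used_kib is my own, and subtracting buffers and cache from "used" is only an approximation of what applications really hold.

```python
import re

def app_used_kib(mem_line, swap_line):
    """Estimate memory held by applications from top's Mem: and Swap: lines."""
    def field(line, label):
        # Values look like "851808k used"; capture the digits before the label.
        match = re.search(r"(\d+)k " + label, line)
        return int(match.group(1))

    used = field(mem_line, "used")
    buffers = field(mem_line, "buffers")
    cached = field(swap_line, "cached")   # top prints "cached" on the Swap: line
    return used - buffers - cached

mem = "Mem: 1922680k total, 851808k used, 1070872k free, 19668k buffers"
swap = "Swap: 4128760k total, 0k used, 4128760k free, 692716k cached"
print(app_used_kib(mem, swap))  # 851808 - 19668 - 692716 = 139424
```

So of the roughly 850MB reported as used on the demo system, only about 140MB is not buffers or cache.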
top reports per-process memory in three columns, VIRT, RES, and SHR:
  • VIRT is the virtual size of a process, which is the sum of the memory it is actually using, memory it has mapped into itself (for instance a video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the process is able to access at the present moment.
  • RES is the resident size, which is how much physical memory the process's pages currently occupy, including pages it shares with other processes. (This number corresponds directly to top's %MEM column.) This amount will virtually always be less than the VIRT size, since most programs map more than they touch. The C library, for instance, is mapped whole but is only partly resident.
  • SHR indicates how much of the VIRT size is actually sharable, so it includes memory and libraries that could be shared with other processes. In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and counted in VIRT and SHR, but only the parts of the library file that contain the functions being used are actually loaded in and counted under RES.
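To see where these three numbers come from, you can look at /proc/<pid>/statm, whose first three fields are the total, resident, and shared sizes counted in pages. As far as I know this is the file top itself reads, but treat that as an assumption. The sketch below converts a statm line to KiB. The function name parse_statm is mine, and the sample line is a made-up one chosen to match the init row in the top output above, assuming 4KB pages.

```python
def parse_statm(statm_text, page_size=4096):
    """Convert the first three /proc/<pid>/statm fields from pages to KiB."""
    size, resident, shared = (int(v) for v in statm_text.split()[:3])
    kib_per_page = page_size // 1024
    return {
        "VIRT": size * kib_per_page,
        "RES": resident * kib_per_page,
        "SHR": shared * kib_per_page,
    }

# Hypothetical statm content: 4807 pages mapped, 372 resident, 303 shared.
print(parse_statm("4807 372 303 35 0 74 0"))
```

On a real system you would pass in the contents of /proc/self/statm (or another PID's file) and os.sysconf("SC_PAGE_SIZE") as the page size.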
Some of these numbers can be a little misleading. For instance, if you have a website that uses PHP, and in particular php-fpm, you could see something like:
top - 14:15:34 up 2 days, 12:38, 1 user, load average: 0.97, 1.03, 0.93
Tasks: 124 total, 1 running, 123 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.9%us, 0.3%sy, 0.0%ni, 94.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.1%st
Mem: 1029508k total, 992140k used, 37368k free, 150404k buffers
Swap: 262136k total, 2428k used, 259708k free, 551500k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6695 www-data 20 0 548m 307m 292m S 0 30.6 8:06.55 php-fpm
6697 www-data 20 0 547m 306m 292m S 0 30.4 7:59.64 php-fpm
6691 www-data 20 0 547m 305m 291m S 2 30.4 8:04.96 php-fpm
6689 www-data 20 0 547m 305m 291m S 2 30.3 8:07.55 php-fpm
6696 www-data 20 0 540m 298m 292m S 1 29.7 8:13.43 php-fpm
6705 www-data 20 0 540m 298m 292m S 0 29.7 8:17.24 php-fpm
6699 www-data 20 0 540m 298m 291m S 4 29.7 8:07.39 php-fpm
6701 www-data 20 0 541m 297m 289m S 0 29.6 7:59.87 php-fpm
6700 www-data 20 0 540m 297m 290m S 0 29.5 8:09.92 php-fpm
6694 www-data 20 0 541m 296m 288m S 2 29.5 8:05.18 php-fpm
6707 www-data 20 0 541m 296m 288m S 0 29.5 8:09.40 php-fpm
6692 www-data 20 0 541m 296m 289m S 0 29.5 8:14.23 php-fpm
6706 www-data 20 0 541m 296m 289m S 3 29.5 8:07.59 php-fpm
6698 www-data 20 0 541m 295m 288m S 4 29.4 8:04.85 php-fpm
6704 www-data 20 0 539m 295m 289m S 2 29.4 8:13.58 php-fpm
6708 www-data 20 0 540m 295m 288m S 1 29.4 8:14.27 php-fpm
6802 www-data 20 0 540m 295m 288m S 3 29.3 8:11.63 php-fpm
6690 www-data 20 0 541m 294m 287m S 3 29.3 8:14.54 php-fpm
6693 www-data 20 0 539m 293m 287m S 2 29.2 8:16.33 php-fpm
6702 www-data 20 0 540m 293m 286m S 0 29.2 8:12.41 php-fpm
8641 www-data 20 0 540m 292m 285m S 4 29.1 6:45.87 php-fpm
8640 www-data 20 0 539m 291m 285m S 2 29.0 6:47.01 php-fpm
6703 www-data 20 0 539m 291m 285m S 2 29.0 8:17.77 php-fpm
Is it possible that each of these processes uses around 30 percent of the total memory of the system? Yes it is, because they use a lot of shared memory, and top counts those shared pages in full for every process. This is why you cannot simply add the %MEM numbers for all of the processes to see how much of the total memory they use.
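A quick calculation over the php-fpm rows makes the point. The sketch below adds up %MEM and then makes a rough estimate of the real footprint by treating RES minus SHR as each worker's private memory and counting the shared part only once. That estimate rests on my assumption that all the workers share the same region, which top cannot confirm.

```python
ROWS = """\
6695 www-data 20 0 548m 307m 292m S 0 30.6 8:06.55 php-fpm
6697 www-data 20 0 547m 306m 292m S 0 30.4 7:59.64 php-fpm
6691 www-data 20 0 547m 305m 291m S 2 30.4 8:04.96 php-fpm
6689 www-data 20 0 547m 305m 291m S 2 30.3 8:07.55 php-fpm
6696 www-data 20 0 540m 298m 292m S 1 29.7 8:13.43 php-fpm
6705 www-data 20 0 540m 298m 292m S 0 29.7 8:17.24 php-fpm
6699 www-data 20 0 540m 298m 291m S 4 29.7 8:07.39 php-fpm
6701 www-data 20 0 541m 297m 289m S 0 29.6 7:59.87 php-fpm
6700 www-data 20 0 540m 297m 290m S 0 29.5 8:09.92 php-fpm
6694 www-data 20 0 541m 296m 288m S 2 29.5 8:05.18 php-fpm
6707 www-data 20 0 541m 296m 288m S 0 29.5 8:09.40 php-fpm
6692 www-data 20 0 541m 296m 289m S 0 29.5 8:14.23 php-fpm
6706 www-data 20 0 541m 296m 289m S 3 29.5 8:07.59 php-fpm
6698 www-data 20 0 541m 295m 288m S 4 29.4 8:04.85 php-fpm
6704 www-data 20 0 539m 295m 289m S 2 29.4 8:13.58 php-fpm
6708 www-data 20 0 540m 295m 288m S 1 29.4 8:14.27 php-fpm
6802 www-data 20 0 540m 295m 288m S 3 29.3 8:11.63 php-fpm
6690 www-data 20 0 541m 294m 287m S 3 29.3 8:14.54 php-fpm
6693 www-data 20 0 539m 293m 287m S 2 29.2 8:16.33 php-fpm
6702 www-data 20 0 540m 293m 286m S 0 29.2 8:12.41 php-fpm
8641 www-data 20 0 540m 292m 285m S 4 29.1 6:45.87 php-fpm
8640 www-data 20 0 539m 291m 285m S 2 29.0 6:47.01 php-fpm
6703 www-data 20 0 539m 291m 285m S 2 29.0 8:17.77 php-fpm
"""

def summarize(rows):
    """Return (sum of %MEM, rough footprint in MiB) for top process rows."""
    mem_percent = 0.0
    private_mib = 0
    shared_mib = 0
    for line in rows.splitlines():
        f = line.split()
        res, shr = int(f[5].rstrip("m")), int(f[6].rstrip("m"))
        mem_percent += float(f[9])
        private_mib += res - shr            # approximation of unshared memory
        shared_mib = max(shared_mib, shr)   # assume one shared region, counted once
    return mem_percent, private_mib + shared_mib

total_percent, estimate_mib = summarize(ROWS)
print(round(total_percent, 1), estimate_mib)
```

The %MEM values add up to about 680 percent, which is obviously impossible, while the rough estimate comes to about 479MB, which fits comfortably within the 1029508k of RAM on that machine.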

smem

While you'll find ps and top in any distribution, you probably won't find smem until you install it yourself. This command reports physical memory usage, taking shared memory pages into account. In its output, unshared memory is reported as the unique set size (USS). Shared memory is divided evenly among the processes that share that memory. The USS plus a process's proportion of shared memory is reported as the proportional set size (PSS).
USS and PSS include only physical memory usage. They do not include memory that has been swapped out to disk.
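The PSS arithmetic is easy to try by hand. Here is a minimal sketch; the function name pss_kib and the numbers are invented for illustration and are not taken from smem itself.

```python
def pss_kib(uss_kib, shared_regions):
    """USS plus this process's share of each shared region.

    shared_regions is a list of (size_kib, sharer_count) pairs, where
    sharer_count includes the process itself.
    """
    return uss_kib + sum(size / count for size, count in shared_regions)

# Hypothetical process: 400 KiB private, a 1200 KiB library mapped by
# 4 processes, and a 300 KiB segment shared by 2 processes.
print(pss_kib(400, [(1200, 4), (300, 2)]))  # 400 + 300 + 150 = 850.0
```

Note that the RSS of the same hypothetical process would be 400 + 1200 + 300 = 1900 KiB, more than twice its PSS.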
To install smem under Debian/Ubuntu Linux, type the following command:
$ sudo apt-get install smem
There is no smem package in the standard repository for CentOS or other Red Hat-based Linux distributions, but you can get it with the following commands:
# cd /tmp
# wget http://www.selenic.com/smem/download/smem-1.3.tar.gz
# tar xvf smem-1.3.tar.gz
# cp /tmp/smem-1.3/smem /usr/local/bin/
# chmod +x /usr/local/bin/smem
Once it's installed, type smem on the command line to get output like this:
PID User Command Swap USS PSS RSS 
1116 root /sbin/mingetty /dev/tty6 0 76 110 568 
1105 root /sbin/mingetty /dev/tty2 0 80 114 572 
1109 root /sbin/mingetty /dev/tty4 0 80 114 572 
1111 root /sbin/mingetty /dev/tty5 0 80 114 572 
1107 root /sbin/mingetty /dev/tty3 0 84 118 576 
939 root auditd 0 336 388 808 
1205 root dhclient eth0 0 564 571 688 
1103 root login -- root 0 532 749 1680 
1090 root crond 0 704 784 1420 
1 root /sbin/init 0 736 813 1488 
1238 root -bash 0 380 856 1924 
1283 root /usr/sbin/sshd 0 676 867 1152 
1135 root -bash 0 392 868 1932 
426 root /sbin/udevd -d 0 948 973 1268 
955 root /sbin/rsyslogd -i /var/run/ 0 996 1069 1628 
1080 root /usr/libexec/postfix/master 0 984 1602 3272 
1089 postfix qmgr -l -t fifo -u 0 1032 1642 3284 
1234 root sshd: root@pts/0 0 1772 2328 3912 
19319 postfix pickup -l -t fifo -u 0 2376 2738 3276 
19352 root python ./smem 0 5756 6039 6416 
As you can see, for each process smem shows four interesting fields:
  • Swap – The swap space used by that process.
  • USS – The amount of unshared memory unique to that process – think of it as unique memory. It does not include shared memory, so it underreports the amount of memory a process uses, but this column is helpful when you want to ignore shared memory. This column indicates how much RAM would be immediately freed up if this process exited.
  • PSS – This is the most valuable column. It adds together the unique memory (USS) and a proportion of shared memory, derived by dividing each piece of shared memory by the number of processes sharing it, this process included. Thus it will give you an accurate representation of how much actual physical memory is being used per process, with shared memory truly represented as shared. Think of it as physical memory.
  • RSS – Resident Set Size, which is the amount of shared memory plus unshared memory used by each process. If any processes share memory, this will overreport the amount of memory actually used, because the same shared memory will be counted more than once, appearing again in each other process that shares the same memory. Thus it is an unreliable number, especially when high-memory processes have a lot of forks.
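smem gets these figures from the per-mapping counters the kernel exposes in /proc/<pid>/smaps. The sketch below shows how the four columns can be derived from that file. It is not smem's actual code, the function name summarize_smaps is mine, and the sample text is a trimmed, made-up smaps excerpt with only two mappings.

```python
SAMPLE_SMAPS = """\
00400000-0040b000 r-xp 00000000 fd:00 1234 /bin/example
Size:                 44 kB
Rss:                  40 kB
Pss:                  20 kB
Shared_Clean:         40 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Swap:                  0 kB
7f0000000000-7f0000021000 rw-p 00000000 00:00 0 [heap]
Size:                132 kB
Rss:                 100 kB
Pss:                 100 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:        96 kB
Swap:                  8 kB
"""

def summarize_smaps(text):
    """Sum the per-mapping counters into Swap, USS, PSS, and RSS (in kB)."""
    totals = {"Swap": 0, "USS": 0, "PSS": 0, "RSS": 0}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[2] != "kB":
            continue                      # mapping header or non-size line
        key, value = parts[0].rstrip(":"), int(parts[1])
        if key in ("Private_Clean", "Private_Dirty"):
            totals["USS"] += value        # pages no other process maps
        elif key == "Pss":
            totals["PSS"] += value
        elif key == "Rss":
            totals["RSS"] += value
        elif key == "Swap":
            totals["Swap"] += value
    return totals

print(summarize_smaps(SAMPLE_SMAPS))
```

To try it on a real process, pass in the contents of /proc/<pid>/smaps. Reading that file for processes you do not own generally requires root.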

Now what?

Each of these memory utilities has some pros and cons. ps and top can be useful, but you have to understand what the numbers they show mean. smem is the rookie here, but it shows the most interesting information about your programs, and you can use it with the parameter -u to show the total memory used by all your users – an interesting feature on multiuser systems.
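If you already have plain smem output captured somewhere, the per-user idea is easy to reproduce. This sketch totals the PSS column by user. The function name pss_by_user is mine, the sample rows are copied from the listing above, and the parser assumes the default column order ending in Swap, USS, PSS, RSS.

```python
from collections import defaultdict

def pss_by_user(smem_output):
    """Total the PSS column of default smem output for each user."""
    totals = defaultdict(int)
    for line in smem_output.splitlines()[1:]:   # skip the header line
        fields = line.split()
        if len(fields) < 7:
            continue                            # not a full process row
        user = fields[1]
        pss = int(fields[-2])                   # columns end with Swap USS PSS RSS
        totals[user] += pss
    return dict(totals)

SMEM_SAMPLE = """\
PID User Command Swap USS PSS RSS
1080 root /usr/libexec/postfix/master 0 984 1602 3272
1089 postfix qmgr -l -t fifo -u 0 1032 1642 3284
1234 root sshd: root@pts/0 0 1772 2328 3912
19319 postfix pickup -l -t fifo -u 0 2376 2738 3276
"""

print(pss_by_user(SMEM_SAMPLE))
```

Because PSS splits shared memory among the processes that share it, these per-user totals can be added together without counting anything twice, which is not true of RSS.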
Now that you have the tools to discover what's eating up your memory, what should you do about it?
If you are a developer and you have found that your program is at fault, that's good news! You can work on the code and use a debugger to find out which function, call, or procedure is using all that memory.
If the process or program that eats up most of your memory is a daemon, such as Apache, MySQL, or nginx, you can search online for information that explains how to tweak the parameters of that daemon to save RAM.
When your uber-optimized Java web app becomes so popular that your server can't serve all your users, sometimes the only thing to do is add more RAM. This should be your last alternative, after you have checked all the other steps. If this happens, don't be sad – it means that your application is a big success!

Helpful resources

  • Understanding memory usage on Linux
  • OOM Killer
  • Linux memory management
  • Thread about Linux memory
