Detailed explanation of the Linux vmstat command in practice (reprint)

Reprinted from: https://www.cnblogs.com/ggjucheng/archive/2012/01/05/2312625.html

 

 

vmstat is one of the most common Linux/Unix monitoring tools. It reports the state of the server at a given sampling interval, including CPU usage, memory usage, virtual memory swapping, and IO reads and writes. It is my favorite command for checking a Linux/Unix machine, for two reasons. First, it is available on both Linux and Unix. Second, compared with top, it shows the CPU, memory, and IO usage of the whole machine rather than the CPU and memory usage of each process (the two tools suit different scenarios).

vmstat is usually run with two numeric parameters. The first is the sampling interval in seconds, and the second is the number of samples to take, for example:

root@ubuntu:~# vmstat 2 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 3498472 315836 3819540    0    0     0     1    2    0  0  0 100  0

2 indicates that the server status is collected every two seconds, and 1 indicates that it is collected only once.
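
To work with these columns in a script, the output can be split into fields. Below is a minimal sketch in Python, assuming the 16-column layout shown above. The names `parse_vmstat` and `SAMPLE` are made up for illustration and are not part of vmstat.

```python
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 3498472 315836 3819540    0    0     0     1    2    0  0  0 100  0
"""


def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} dicts, one per sample."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Line 0 is the group banner (procs, memory, ...); line 1 holds the column names.
    columns = lines[1].split()
    samples = []
    for line in lines[2:]:
        values = line.split()
        # Skip banners and headers that vmstat repeats during long runs.
        if len(values) != len(columns) or not values[0].isdigit():
            continue
        samples.append(dict(zip(columns, (int(v) for v in values))))
    return samples


if __name__ == "__main__":
    for sample in parse_vmstat(SAMPLE):
        print(sample["free"], sample["id"])
```

In real use the text would come from the command itself, for example via `subprocess.run(["vmstat", "2", "1"], capture_output=True, text=True).stdout`. Newer vmstat versions may print extra columns, which this sketch picks up from the header line automatically.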

In practice, we usually monitor for a period of time and simply stop vmstat once we have seen enough, for example:

root@ubuntu:~# vmstat 2  
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 3499840 315836 3819660    0    0     0     1    2    0  0  0 100  0
 0  0      0 3499584 315836 3819660    0    0     0     0   88  158  0  0 100  0
 0  0      0 3499708 315836 3819660    0    0     0     2   86  162  0  0 100  0
 0  0      0 3499708 315836 3819660    0    0     0    10   81  151  0  0 100  0
 1  0      0 3499732 315836 3819660    0    0     0     2   83  154  0  0 100  0

This means that vmstat collects data every 2 seconds and keeps collecting until I end the program. Here I ended it after 5 samples.

That covers the command itself. Now let's go through what each column means in practice.

r  The run queue, that is, how many processes are runnable (running on a CPU or waiting for CPU time). The server I tested is fairly idle, with no programs running. When this value exceeds the number of CPUs, the CPU becomes a bottleneck. This is related to the load shown by top: generally, a load above 3 is relatively high, above 5 is high, and above 10 is abnormal, meaning the server is in a very dangerous state. The load in top is similar to the run queue length averaged over time. A large run queue means the CPU is very busy, which usually shows up as high CPU usage.

b  The number of blocked processes, for example processes waiting for IO. There is not much more to say about it.
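
The rule of thumb above (r larger than the number of CPUs) is easy to check over a series of samples. This is only a sketch: `run_queue_pressure` is a hypothetical helper, and the r values in the example are invented apart from the r = 80 case mentioned later in the article.

```python
import os


def run_queue_pressure(r_values, cpu_count=None):
    """Return the fraction of samples in which r exceeded the CPU count."""
    if cpu_count is None:
        # Fall back to the CPU count of the machine running this script.
        cpu_count = os.cpu_count() or 1
    if not r_values:
        return 0.0
    over = sum(1 for r in r_values if r > cpu_count)
    return over / len(r_values)


if __name__ == "__main__":
    # Hypothetical r column from five samples on a 4-CPU machine.
    print(run_queue_pressure([1, 0, 6, 8, 2], cpu_count=4))
```

A single sample above the CPU count is not alarming on its own. A fraction that stays high across many samples is what suggests a CPU bottleneck.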

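swpd  The amount of virtual memory (swap) in use. If it is greater than 0, your machine has run short of physical memory. Unless a memory leak in some program is to blame, you should add memory or move the memory-hungry tasks to other machines.

free  The amount of idle physical memory. My machine has 8G of memory in total, with 3415M free.

buff  The memory Linux/Unix uses to cache things such as directory contents and permissions. On my machine it takes up a bit over 300M.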

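cache  The memory used to cache the files we open, buffering file contents. On my machine it takes up roughly 3.6G (3819540 KB in the output above). (This is a clever part of Linux/Unix: some of the idle physical memory is used to cache files and directories in order to improve program performance, and when programs need memory, the buffer/cache memory is quickly handed back to them.)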
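
The memory columns are printed in units of 1024 bytes by default, so turning them into megabytes is a single division. A small sketch using the values from the first sample in this article (`kib_to_mib` is an illustrative name):

```python
def kib_to_mib(kib):
    """Convert a vmstat memory value (units of 1024 bytes) to MiB."""
    return kib / 1024


if __name__ == "__main__":
    # free, buff and cache from the "vmstat 2 1" output in this article
    sample = {"free": 3498472, "buff": 315836, "cache": 3819540}
    for name, kib in sample.items():
        print(f"{name:5s} {kib_to_mib(kib):8.1f} MiB")
```

The result for free comes out within a megabyte or two of the figure quoted above, and cache is by far the largest of the three.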

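si  The amount of memory swapped in from disk per second. If this value is greater than 0, physical memory is insufficient or something is leaking memory, and you should find the memory-hungry process and deal with it. My machine has plenty of memory, so everything is normal.

so  The amount of memory swapped out to disk per second. If this value is greater than 0, the same applies as above.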
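
A quick way to spot swapping in a longer run is to filter for samples where si or so is non-zero. This is a sketch under the assumption that each sample is a dict keyed by column name. The helper name and the numbers are hypothetical.

```python
def swapping_samples(samples):
    """Return the samples in which si or so was greater than 0."""
    return [s for s in samples if s["si"] > 0 or s["so"] > 0]


if __name__ == "__main__":
    # Hypothetical values, not taken from the article's machine.
    samples = [
        {"si": 0, "so": 0},
        {"si": 12, "so": 0},
        {"si": 0, "so": 340},
    ]
    print(len(swapping_samples(samples)), "of", len(samples), "samples show swap activity")
```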

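bi  The number of blocks received from block devices per second (reads). Block devices here means all disks and other block devices on the system, and the default block size is 1024 bytes. There is hardly any IO on my machine, so it stays at 0, but on a machine copying a large amount of data (2-3T) I have seen it reach 140000/s, which is a disk read rate of roughly 140M per second.

bo  The number of blocks sent to block devices per second (writes). For example, when we write a file, bo will be greater than 0. bi and bo should generally be close to 0; otherwise IO is too frequent and needs tuning.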
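
The 140000/s figure can be converted into a throughput using the 1024-byte block size mentioned above. A minimal sketch (the function name is an assumption for illustration):

```python
BLOCK_SIZE = 1024  # bytes per block, as stated in the text


def blocks_to_mib_per_s(blocks_per_s, block_size=BLOCK_SIZE):
    """Convert a bi/bo rate in blocks per second to MiB per second."""
    return blocks_per_s * block_size / (1024 * 1024)


if __name__ == "__main__":
    print(blocks_to_mib_per_s(140000))  # 136.71875
```

That is about 137 MiB/s, or about 143 MB/s when counting in decimal megabytes, which matches the "roughly 140M per second" estimate.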

in  The number of interrupts per second, including clock interrupts.

cs  The number of context switches per second, covering both thread switches and process context switches. The smaller the value, the better. If it is too large, consider reducing the number of threads or processes. For example, with web servers such as apache and nginx, we usually run performance tests with thousands or even tens of thousands of concurrent requests. To choose the number of web server processes or threads, start from the peak value and lower it step by step, repeating the stress test until cs drops to a relatively small value. The number of processes and threads at that point is a reasonable setting. System calls are costly in a similar way: every time we call a system function, our code enters kernel space, which means a switch between user mode and kernel mode. This is expensive, so we should avoid calling system functions too frequently. Too many context switches mean that much of the CPU's time is wasted on switching, leaving less time for real work, so the CPU is not used effectively, which is undesirable.
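
When comparing stress-test runs, it helps to reduce the cs column to one number per run. The sketch below averages cs and drops the first report, because the first line vmstat prints gives averages since the last reboot rather than for the sampling interval. `mean_cs` is a name made up for this example.

```python
def mean_cs(cs_values, skip_first=True):
    """Average the cs column, optionally dropping the since-boot first report."""
    values = cs_values[1:] if skip_first else cs_values
    if not values:
        raise ValueError("no samples to average")
    return sum(values) / len(values)


if __name__ == "__main__":
    # cs column from the five samples of "vmstat 2" shown earlier in the article
    cs_column = [0, 158, 162, 151, 154]
    print(mean_cs(cs_column))  # 156.25
```

Running the same calculation for each candidate process or thread count gives figures that can be compared directly.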

us  User CPU time. I once worked on a server that did frequent encryption and decryption, where us was close to 100 and the r run queue reached 80 (the machine was under a stress test and performed poorly).

sy  System CPU time. If it is too high, system calls are taking a long time, for example because of frequent IO operations.

id  Idle CPU time. Generally speaking, id + us + sy = 100 (plus wa when the CPU is waiting on IO). I think of id as the idle CPU percentage, us as the user CPU percentage, and sy as the system CPU percentage.

wa  CPU time spent waiting for IO.
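
The four CPU columns are percentages and should add up to about 100. A small sketch that splits a sample into busy time and the overall total. It assumes only the four columns shown in this article, so any extra column a newer vmstat prints (such as st) is ignored, and `cpu_breakdown` is a hypothetical helper.

```python
def cpu_breakdown(sample):
    """Return (busy, total), where busy = us + sy and total = us + sy + id + wa."""
    busy = sample["us"] + sample["sy"]
    total = busy + sample["id"] + sample["wa"]
    return busy, total


if __name__ == "__main__":
    idle_box = {"us": 0, "sy": 0, "id": 100, "wa": 0}  # values from the article's output
    print(cpu_breakdown(idle_box))
```

Because vmstat rounds each column, the total on a real machine can come out as 99 or 101 rather than exactly 100.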

 
