dstat, the all-in-one system monitoring tool [repost]

First look here: http://man.linuxde.net/dstat

Reposted from: http://www.cnblogs.com/vincent-hv/p/3358194.html

1. What is dstat?

According to its man page, dstat is a "versatile tool for generating system resource statistics". The information it reports resembles what you would get by combining tools such as top, free, iostat, and vmstat. The official description reads: "Dstat is a versatile replacement for vmstat, iostat and ifstat. Dstat overcomes some of the limitations and adds some extra features." Its results can also be saved to a CSV file for later performance analysis with scripts or third-party tools (for example, feeding a monitoring platform or loading into a database). On CentOS 6.x, dstat is installed by default with the Basic Server install; on other operating systems it may need to be installed manually.
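As a rough illustration of the "analyze the CSV with a script" idea, here is a minimal Python sketch. It assumes the file was written by dstat's CSV output option and that the layout is a few metadata rows, a row of group titles, a row of column names, and then one numeric row per sample. The SAMPLE text below is a shortened, hypothetical excerpt, not real measurements, and the function name is made up for this example.

```python
import csv
import io


def parse_dstat_csv(text):
    """Return one dict per sample, keyed by column name.

    Assumption: the column-name row is the first row that contains "usr",
    which is only true when CPU statistics are enabled. Column names that
    repeat across groups would overwrite each other in this simple version.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # Everything after the column-name row is treated as sample data.
    header_index = next(i for i, row in enumerate(rows) if "usr" in row)
    columns = rows[header_index]
    samples = []
    for row in rows[header_index + 1:]:
        samples.append({name: float(value) for name, value in zip(columns, row)})
    return samples


# Hypothetical excerpt, shortened for illustration.
SAMPLE = '''"Dstat 0.7.2 CSV output"
"Host:","example",,,,"User:","root"

"total cpu usage",,,,,,"dsk/total",
"usr","sys","idl","wai","hiq","siq","read","writ"
1.5,0.5,97.0,1.0,0.0,0.0,1024.0,2048.0
3.0,1.0,95.0,1.0,0.0,0.0,0.0,4096.0
'''

samples = parse_dstat_csv(SAMPLE)
average_idle = sum(s["idl"] for s in samples) / len(samples)
print(average_idle)
```

For a real file, read its contents with open(path).read() and pass the text to parse_dstat_csv. Check the header rows of your own file first, since the layout may differ between dstat versions.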

2. Basic use of dstat

2.1 Default options of dstat

Like many commands, dstat has default options. Run without any arguments, it collects the cpu, disk, net, paging, and system statistics once per second. In other words, plain dstat is equivalent to dstat -cdngy 1 or dstat -a 1.

2.2 Common options of dstat:

The usage of dstat is as follows:

dstat [-afv] [options..] [delay [count]]

 

Run dstat -h to see all options. They are not all listed here; the most common ones are:

N A bare number N (the delay) means data is collected every N seconds; the default is 1 second. For example, dstat 5 updates every 5 seconds

-c, --cpu Show CPU statistics: user, system, idle (percentage of time the CPU is idle), wait (waiting for disk I/O), hardware interrupts, software interrupts

-d, --disk Show disk read and write statistics

-D total,sda Show statistics for the specified disks and/or the total

-l, --load Show system load: 1-minute, 5-minute, and 15-minute averages

-m, --mem Show physical memory usage: used, buffers, cache, free

-s, --swap Show swap usage: used and free

-n, --net Show network usage: data received and sent

-N eth1,total Show statistics for the eth1 interface and the total

-r, --io Show I/O requests: read and write requests

-p, --proc Show process statistics: runnable, uninterruptible, new

-y, --sys Show system statistics: interrupts, context switches

-t Show the time of each sample, which is very useful when analyzing historical data

--fs Show the number of open files and inodes

These are the most commonly used options, and they are usually combined. The combination I use most often is shown below; it takes a sample every 100 seconds, 5 times:

  • dstat -cmsdnl -D sda9 -N lo,eth0 100 5
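The way options, delay, and count fit together can be sketched in Python as follows. This is only an illustration: build_dstat_command and run_if_available are hypothetical helper names, and eth0 is assumed to be the intended interface name in the example above.

```python
import shutil
import subprocess


def build_dstat_command(flags, disks=None, interfaces=None, delay=1, count=None):
    """Assemble an argv list following `dstat [options..] [delay [count]]`."""
    argv = ["dstat", "-" + flags]
    if disks:
        argv += ["-D", ",".join(disks)]
    if interfaces:
        argv += ["-N", ",".join(interfaces)]
    argv.append(str(delay))
    if count is not None:
        argv.append(str(count))
    return argv


def run_if_available(argv):
    """Run the command only when dstat is installed; return True if it ran."""
    if shutil.which(argv[0]) is None:
        return False
    subprocess.run(argv, check=False)
    return True


command = build_dstat_command(
    "cmsdnl", disks=["sda9"], interfaces=["lo", "eth0"], delay=100, count=5
)
print(" ".join(command))
```

Calling run_if_available(command) would then start the actual collection. With a delay of 100 and a count of 5, expect it to block for several minutes.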

3. Meaning of the fields in the monitoring output (partial)

 

Procs

  • r: the number of processes running or waiting for a CPU time slice. If this value stays above 1 for a long time, it can indicate that more CPU capacity is needed
  • b: the number of processes in an uninterruptible state, usually caused by I/O
Memory
  • swpd: the amount of memory swapped out to swap space (in KB by default). If swpd is not 0, or is even fairly large (say more than 100 MB), but si and so stay at 0 for a long time, there is no need to worry; system performance is not affected.
  • free: free physical memory
  • buff: memory used as buffer cache, which buffers reads and writes to block devices
  • cache: memory used as page cache, that is, the file system cache. A large value means many files are cached. If frequently accessed files are cached, the disk read I/O (bi) will be very small.
Swap
  • si: the amount of memory swapped in, from disk into memory
  • so: the amount of memory swapped out, from memory to disk

When memory is sufficient, both values are 0. If they stay above 0 for a long time, system performance suffers, because swapping consumes both disk I/O and CPU resources.

Some people conclude that memory is insufficient as soon as they see that free memory (free) is very small or close to 0. In fact, free alone is not enough; it has to be read together with si and so. If free is very small but si and so are also very small (0 most of the time), there is nothing to worry about, and system performance is not affected.
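The rule of thumb above (judge memory pressure by sustained si/so activity rather than by free alone) can be written as a small Python check. This is a sketch under assumptions: swap_is_a_problem is a made-up name, and the 50% cut-off for "a long time" is a hypothetical default to tune, not a value from dstat or from the text.

```python
def swap_is_a_problem(samples, min_fraction=0.5):
    """Return True when swapping is sustained rather than occasional.

    samples is a list of (si, so) pairs taken at regular intervals.
    min_fraction is a hypothetical threshold: the share of samples that
    must show swap activity before it is treated as a problem.
    """
    if not samples:
        return False
    active = sum(1 for si, so in samples if si > 0 or so > 0)
    return active / len(samples) > min_fraction


# Mostly idle swap with one small blip: not a concern.
quiet = [(0, 0)] * 9 + [(4, 0)]
# Swap activity in every sample: memory is likely insufficient.
thrashing = [(8, 16)] * 10

print(swap_is_a_problem(quiet), swap_is_a_problem(thrashing))
```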
Disk IO
  • bi: the amount of data read from block devices (read from disk) (KB/s)
  • bo: the amount of data written to block devices (written to disk) (KB/s)
Note: with random disk reads and writes, the larger these two values are (for example, above 1 MB), the larger the CPU's I/O wait value will be.
System
  • in: the number of interrupts per second
  • cs: the number of context switches per second
The larger these two values are, the more CPU time is consumed by the kernel.
Cpu
  • usr: percentage of CPU time consumed by user processes

A high usr value means user processes consume a lot of CPU time. If it stays above 50% for a long time, consider optimizing the program's algorithms or speeding up execution (for example, for PHP / Perl).

  • sys: percentage of CPU time consumed by the kernel

A high sys value means the kernel consumes a lot of CPU resources. This is not a healthy sign, and the cause should be investigated.

  • wai: percentage of CPU time spent waiting for I/O

A high wai value means I/O wait is severe. It may be caused by a large number of random disk accesses, or by a bottleneck in disk bandwidth (block operations).

  • idl: percentage of time the CPU is idle
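The reading rules for usr, sys, and wai can be sketched as a simple triage function in Python. Only the 50% limit for usr comes from the text above. The limits for sys and wai are hypothetical placeholders, and triage_cpu is a made-up name for this example.

```python
from statistics import mean


def triage_cpu(samples, usr_limit=50.0, sys_limit=30.0, wai_limit=20.0):
    """Flag CPU fields whose average over the samples exceeds a limit.

    samples is a list of dicts with "usr", "sys" and "wai" percentages.
    sys_limit and wai_limit are hypothetical values to tune per system.
    """
    if not samples:
        return []
    averages = {key: mean(s[key] for s in samples) for key in ("usr", "sys", "wai")}
    findings = []
    if averages["usr"] > usr_limit:
        findings.append("usr high: consider optimizing the application")
    if averages["sys"] > sys_limit:
        findings.append("sys high: investigate kernel-side activity")
    if averages["wai"] > wai_limit:
        findings.append("wai high: check for random disk access or a bandwidth bottleneck")
    return findings


busy = [
    {"usr": 70.0, "sys": 5.0, "wai": 2.0},
    {"usr": 60.0, "sys": 6.0, "wai": 1.0},
]
print(triage_cpu(busy))
```

Averaging over many samples matters here, because the text is about values that stay high for a long time, not short spikes.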

4. Advanced usage of dstat

dstat is very powerful. Beyond the common usage above, it has some advanced features that are less widely known:

4.1 Find the processes and users that consume the most resources

--top-(io|bio|cpu|cputime|cputime-avg|mem) These options show which users and processes are using the corresponding system resources, which is very helpful for system tuning. For example, dstat --top-mem --top-io --top-cpu shows the processes currently using the most memory, I/O, and CPU:

[Screenshot: output of dstat --top-mem --top-io --top-cpu]
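To get a feel for what a "top memory process" view involves, here is a rough Python approximation that reads /proc directly on Linux. It is not dstat's implementation. parse_status and top_memory_processes are made-up names, and it assumes the Name and VmRSS fields found in /proc/<pid>/status.

```python
import os
import re


def parse_status(text):
    """Extract (name, rss_kb) from the text of a /proc/<pid>/status file.

    rss_kb is 0 when no VmRSS line is present (for example, kernel threads).
    """
    name = re.search(r"^Name:\s*(.+)$", text, re.MULTILINE)
    rss = re.search(r"^VmRSS:\s*(\d+)\s*kB", text, re.MULTILINE)
    return (name.group(1) if name else "?", int(rss.group(1)) if rss else 0)


def top_memory_processes(limit=3, proc_root="/proc"):
    """Return up to `limit` tuples (rss_kb, pid, name), largest first."""
    results = []
    if not os.path.isdir(proc_root):
        return results  # not a Linux-style /proc; nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        status_path = os.path.join(proc_root, entry, "status")
        try:
            with open(status_path, errors="replace") as handle:
                name, rss_kb = parse_status(handle.read())
        except OSError:
            continue  # the process exited or is not readable
        results.append((rss_kb, int(entry), name))
    return sorted(results, reverse=True)[:limit]


for rss_kb, pid, name in top_memory_processes():
    print(f"{name} (pid {pid}): {rss_kb} kB resident")
```

dstat refreshes this kind of ranking on every sample, whereas the sketch takes a single snapshot.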

4.2 Obtain information about other applications

Besides key system information, dstat can also report on other applications. The following options cover some common ones:

--postfix Show the postfix queue size

--sendmail Show the sendmail queue size

--ntp Show the NTP server time

--nfs3 Show NFS client information

--nfsd3 Show NFS server information; the NFS server must be running version 3. There are more options of this kind; see the man page for details

--mysql5-(cmds|conn|io|keys) Show MySQL 5 related information
