Week 7 lessons (May 4): using w to view system load, and the vmstat, top, sar and nload commands

10.1 Use w to view system load

1. Use w to view the system load

In the first line of the output, 21:08:54 is the current system time and "up 5 days" is how long the system has been running since it was started. In the TTY column, a network login is displayed as pts/0, pts/1, and so on.

load average: 0.00, 0.01, 0.05. These three values are related to the CPU: they are the average number of processes using or waiting for the CPU over the last 1, 5 and 15 minutes (on Linux, processes in uninterruptible I/O wait are counted as well). The larger the value, the greater the pressure on the server. Under normal circumstances the value is not a concern as long as it does not exceed the number of CPUs on the server. If the server has 8 CPUs, a value below 8 means the server is not under pressure; otherwise you should pay attention. Values that stay at 0 are not ideal either, because they mean the server is idle and its capacity is being wasted. So when are these values optimal? That depends on the number of CPUs, and the number that matters here is the count of logical CPUs, not physical CPUs. Use the command cat /proc/cpuinfo to check it, as shown below:

The last processor entry is 0, which means there is only one logical CPU. In that case a first load average value of 1 is ideal: the CPU is neither idle nor overloaded. The other two values are read the same way, but the first value is usually the one to watch. As long as the value does not exceed the number of CPUs, there is not much of a problem. The file /proc/cpuinfo records detailed information about the CPU. Servers on the market today usually have two 4-core CPUs, which Linux sees as 8 CPUs; viewing this file then shows 8 similar blocks of information, and the last block has processor : 7. So to check how many CPUs the current system has, use the command grep -c 'processor' /proc/cpuinfo. To see how many physical CPUs there are, look for the keyword "physical id". My virtual machine has only one CPU, so no "physical id" information is displayed.
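The two counts described above can also be obtained from a script. The following is a minimal Python sketch, assuming the usual x86 layout of /proc/cpuinfo (one "processor" line per logical CPU and, where the kernel reports sockets, one "physical id" line per entry). The function name count_cpus is made up for this illustration.

```python
def count_cpus(cpuinfo_text):
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text.

    physical is None when the text has no "physical id" lines, as on the
    single-CPU virtual machine described above.
    """
    logical = 0
    physical_ids = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU blocks
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_ids.add(value.strip())
    return logical, (len(physical_ids) or None)
```

On a live system the text would come from open("/proc/cpuinfo").read(). The logical count should match what grep -c 'processor' /proc/cpuinfo prints, provided no other line in the file contains the word "processor".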

There is also the uptime command; see the figure below.

Its output is exactly the same as the first line of the w command, so w is generally the command used to check the load.
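The rule of thumb from this section (a load above the number of logical CPUs deserves attention) can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the parser assumes the English "load average: a, b, c" format with dots as decimal separators.

```python
import re


def parse_load_average(line):
    """Extract the 1, 5 and 15 minute load averages from a w/uptime header line."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line)
    if match is None:
        raise ValueError("no load average found in: %r" % line)
    return tuple(float(value) for value in match.groups())


def needs_attention(load, logical_cpus):
    """True when the load exceeds the number of logical CPUs."""
    return load > logical_cpus
```

For example, parse_load_average applied to the first line of w returns the three values as floats, and needs_attention(values[0], 8) applies the check to an 8-CPU server. In a real script, os.getloadavg() and os.cpu_count() from the standard library give the same numbers without parsing any text.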

10.2 The vmstat command

The w command above shows the overall load of the system. Those values tell you whether the system is under pressure, but not where the pressure is (CPU, memory, disk, and so on). To find that out, use the vmstat command. Its output is divided into 6 parts: procs, memory, swap, io, system and cpu, as shown in the figure below.

1) procs: process-related information
r: the number of processes running or waiting for a CPU time slice. If it stays greater than the number of CPUs on the server for a long time, the CPU is insufficient;
b: the number of processes waiting for resources, such as I/O or memory. If this column stays greater than 1 for a long time, you need to pay attention;
2) memory: memory-related information
swpd: the amount of memory swapped out to the swap partition;
free: the current amount of free memory;
buff: buffer size (data about to be written to disk);
cache: cache size (data read from disk);
3) swap: memory swapping
si: the amount of data read from the swap area into memory;
so: the amount of data written from memory to the swap area;
4) io: disk usage
bi: the amount of data read from block devices (disk reads);
bo: the amount of data written to block devices (disk writes);
5) system: interrupts and context switches during the sampling interval
in: the number of device interrupts per second observed in the interval;
cs: the number of context switches per second;
6) cpu: CPU usage
us: the percentage of CPU time spent in user space;
sy: the percentage of CPU time spent in the kernel (system);
id: the percentage of time the CPU is idle;
wa: the percentage of CPU time spent waiting for I/O;
st: the percentage of CPU time stolen by the hypervisor (usually 0, can be ignored);

Among the columns introduced above, the ones we watch most often are r, b and wa, whose meanings are given above. bi and bo in the io part are also checked frequently: if disk I/O pressure is high, the values in these two columns will be relatively high. In addition, when the values in the si and so columns are relatively high and keep changing, memory is insufficient and data is frequently being exchanged between memory and the swap partition, which often has a great impact on system performance.
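To work with these columns in a script, the vmstat output can be turned into one dictionary per sample. The following is a minimal Python sketch, assuming the default vmstat layout (a section line, then a column-name line starting with r, then numeric rows); the function name parse_vmstat is made up for this illustration.

```python
def parse_vmstat(text):
    """Return one dict per numeric vmstat row, keyed by column name (r, b, swpd, ...)."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name line; vmstat repeats it periodically in long runs.
            columns = fields
        elif columns and fields[0].isdigit() and len(fields) == len(columns):
            rows.append(dict(zip(columns, (int(f) for f in fields))))
    return rows
```

The column names are taken from the header line rather than hard-coded, so the sketch should tolerate vmstat versions that print extra columns. The text itself could be captured with subprocess.run(["vmstat", "1", "5"], capture_output=True, text=True).stdout.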
When we use vmstat to view the system status, we usually run it in the form shown in the following figure:

vmstat 1 prints the status every second until we end it by pressing Ctrl + c. There is another way, see below:

vmstat 1 5 prints the status every second, 5 times in total, and then the command ends automatically. In the results displayed here, we generally only need to focus on these columns: r, b, swpd, si, so, bi, bo, us, wa.
r (run): how many processes are running or waiting for the CPU.
b (block): how many processes are blocked on resources other than the CPU (such as disk or network) and are in a waiting state.
swpd: the amount of swap space in use. When memory runs short, the system moves some data out of memory and puts it temporarily in swap. If this column does not change, there is no problem; if it keeps fluctuating, memory is insufficient.
si, so: related to swpd, in KB per second. si is the amount of data read from swap into memory and so is the amount of data written from memory to swap (i is in, o is out).
bi, bo: related to the disk. bi is the amount of data read from block devices (disk reads) and bo is the amount of data written to block devices (disk writes). Very large values mean the disk is being read and written frequently, which will inevitably push up column b.
us: the percentage of CPU time spent in user space. It never exceeds 100; if it stays above 50 for a long time, CPU resources are insufficient. Together with sy (system), id (idle), wa and st, it adds up to 100.
wa (wait): similar to b, it is the percentage of CPU time spent waiting for I/O. If this number is large, the bottleneck is I/O (usually the disk) rather than the CPU itself.
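The reading rules above can be condensed into a small check. The following Python sketch is only an illustration under stated assumptions: samples is a list of dictionaries keyed by vmstat column name, a condition counts only if it holds in every sample (to mimic "for a long time"), and the wa_limit default of 20 is an arbitrary value chosen for the example, not a figure from this lesson.

```python
def diagnose(samples, logical_cpus, wa_limit=20):
    """Return a list of suspected bottlenecks: "cpu", "memory" and/or "io".

    samples: list of dicts with at least the keys r, b, si, so and wa.
    wa_limit: hypothetical threshold for the wa column, in percent.
    """
    def sustained(condition):
        # A single spike is ignored; the condition must hold in all samples.
        return bool(samples) and all(condition(s) for s in samples)

    hints = []
    if sustained(lambda s: s["r"] > logical_cpus):
        hints.append("cpu")
    if sustained(lambda s: s["si"] > 0 or s["so"] > 0):
        hints.append("memory")
    if sustained(lambda s: s["b"] > 1 or s["wa"] > wa_limit):
        hints.append("io")
    return hints
```

With five samples from vmstat 1 5 on a 2-CPU machine, diagnose(samples, 2) returns an empty list when none of the conditions is sustained. Requiring every sample to match is a deliberately strict choice; a real monitoring script might prefer a majority or an average.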

10.3 top command

top shows the resource usage of processes. It dynamically monitors the system resources that processes occupy and refreshes every 3 seconds by default. Its characteristic is that it puts the processes using the most resources at the top (sorted by CPU usage by default). The top command prints a lot of information, including system load (load average), number of processes (Tasks), CPU usage, memory usage and swap usage. All of that can also be viewed through other commands, so the main point of using top is the per-process detail in the lower part of the screen. That part contains many columns, but only a few need attention: %CPU, %MEM and COMMAND. RES is the amount of physical memory used by the process, and %MEM is the percentage of memory used. While top is running, press shift + m to sort by memory usage, and press the number 1 to list the usage of each CPU.
Enter top and press Enter; see the figure below.

The default is to sort by CPU percentage (%CPU), with the highest %CPU value first. %MEM is the percentage of memory used, and RES is the physical memory size in KB. To sort by %MEM instead, press shift + m (that is, an uppercase M); see the figure below.

Now the list is sorted by %MEM, and the first one is system. To switch back to the default %CPU sorting, press shift + p (that is, an uppercase P).

There is also the number 1 key: press 1 to list the usage of each CPU; see the figure below.

This machine has a single core, so only one line (for CPU 0) is shown. Press 1 again to return to the default state, which shows the average across all CPUs.
Press the letter q to exit top.
Another usage is top -c. Enter top -c and press Enter; see the figure below.

This shows the full command line of each process, including the full path names. Plain top shows only the name of each process.
There is another usage: enter the command top -bn1 and press Enter; see the figure below.

This lists all the processes once and displays them statically, which makes it suitable for use in shell scripts.
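Because top -bn1 prints plain text once and exits, its output is easy to process in a script. The following is a minimal Python sketch that picks out the processes with the highest %MEM, assuming the default column header (a line beginning with PID and ending with COMMAND) and dots as decimal separators. Both function names are hypothetical.

```python
def parse_top_batch(text):
    """Return one dict per process from `top -bn1`-style output, keyed by column name."""
    lines = text.splitlines()
    start = next((i for i, l in enumerate(lines) if l.split()[:1] == ["PID"]), None)
    if start is None:
        raise ValueError("no process table header found")
    header = lines[start].split()
    procs = []
    for line in lines[start + 1:]:
        # Split at most len(header) - 1 times so that a COMMAND containing
        # spaces (as printed by top -c) stays in one piece.
        fields = line.strip().split(None, len(header) - 1)
        if len(fields) == len(header):
            procs.append(dict(zip(header, fields)))
    return procs


def top_by_memory(procs, n=3):
    """Return the n processes with the highest %MEM value."""
    return sorted(procs, key=lambda p: float(p["%MEM"]), reverse=True)[:n]
```

All values are kept as strings except where top_by_memory converts %MEM for sorting, since columns such as RES may carry unit suffixes on some systems.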
To terminate a process shown here, use the command kill followed by its PID and press Enter.

10.4 The sar command

The sar command is very powerful: it can monitor the status of nearly all system resources, such as load average, network card traffic, disk status and memory usage. Unlike other system monitoring tools, it can print historical information, showing the system status from midnight up to the current moment of the day. If this command is not installed on your system, install it with yum install -y sysstat. The first time you use the sar command, an error is reported because the sar tool has not yet generated its database file (this does not happen in real-time monitoring mode, which does not need to read that file). The database files are stored in the /var/log/sa/ directory and are kept for one month by default. Because this command has so many options, only a few are introduced here.

Install sar: yum install -y sysstat

Right after installing sar, running ls /var/log/sa shows nothing, because the data file is only written every ten minutes. Until then, run sar with specific options and parameters, for example sar -n DEV 1 10; see the figure below.

Here 1 means every 1 second and 10 means display 10 times. The first column is the time; the second column (IFACE) is the name of the network card; the third column, rxpck/s, is the number of packets received per second; the fourth column, txpck/s, is the number of packets sent per second; the fifth column, rxkB/s, is the amount of data received per second (in kilobytes); and the sixth column, txkB/s, is the amount of data sent per second. The last 3 columns need no attention here; they are usually 0.00. If one day the server you manage suffers serious packet loss, check whether the network card traffic is abnormal. If rxpck/s is greater than 4000, or the receive rate is greater than about 5,000,000 bytes per second (roughly 5,000 in the rxkB/s column), the server is very likely under attack; normal server traffic does not get that high unless you are copying data yourself.
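The packet-rate check described above can be automated. The following is a minimal Python sketch, assuming the English column header of sar -n DEV (a line containing IFACE followed by rxpck/s, txpck/s and so on). The function names are made up for this illustration, and the default limit of 4000 is the rule of thumb from this lesson rather than a universal threshold.

```python
def parse_sar_dev(text):
    """Return one dict per interface sample from `sar -n DEV`-style output."""
    names, offset, rows = None, 0, []
    for line in text.splitlines():
        fields = line.split()
        if "IFACE" in fields:
            # Header line; the timestamp may occupy one or two fields before IFACE.
            offset = fields.index("IFACE")
            names = fields[offset:]
        elif (names and fields and fields[0] != "Average:"
              and len(fields) == offset + len(names)):
            row = {"IFACE": fields[offset]}
            for name, value in zip(names[1:], fields[offset + 1:]):
                row[name] = float(value)
            rows.append(row)
    return rows


def interfaces_over_limit(rows, rxpck_limit=4000.0):
    """Return the sorted names of interfaces whose rxpck/s exceeded the limit."""
    return sorted({row["IFACE"] for row in rows if row["rxpck/s"] > rxpck_limit})
```

Lines beginning with "Average:" are skipped so that only the per-interval samples are checked.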

Then look at the sar command again; see the figure below.

Checking ls /var/log/sa now shows a file named sa25. The sa files are named after the day of the month, so you can also view the network card traffic history of a particular day by using the -f option followed by the file name:

sar -n DEV -f /var/log/sa/sa24
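Since the file name follows from the day of the month, a script can build the history command for any date. The following is a small Python sketch, assuming the saDD naming in /var/log/sa described above (other sysstat configurations may name the files differently). All function names are hypothetical.

```python
import datetime


def sa_file_for(day, directory="/var/log/sa"):
    """Return the path of the sa data file for the given date (sa01 .. sa31)."""
    return "%s/sa%02d" % (directory, day.day)


def sar_history_command(day):
    """Build the argument list for viewing that day's network card traffic history."""
    return ["sar", "-n", "DEV", "-f", sa_file_for(day)]


def yesterday_command(today=None):
    """Same as sar_history_command, for the day before `today` (default: now)."""
    today = today or datetime.date.today()
    return sar_history_command(today - datetime.timedelta(days=1))
```

The returned list can be passed to subprocess.run. Because the files are kept for about a month and reuse the same names, a given day number refers to its most recent occurrence only.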

View system load

sar -q 1 10

View system disk

sar -b

View disk read and write

sar -b 1 5

10.5 nload command

nload monitors network card traffic. This command is not installed by default, and before installing nload you need to install epel-release:

yum install epel-release

yum install nload

After the installation is complete, enter the command nload and press Enter to get the display shown in the figure below.

The information shown is dynamic, and it covers just one of the network cards, ens33. Press the right arrow key → to view the information for the other network card, lo; see the figure below.

Press the left arrow key ← to go back to the previous network card, ens33. You can switch back and forth this way. Press the letter q to exit this interface.
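For scripts that need the same kind of per-interface figures without an interactive screen, the byte counters in /proc/net/dev can be sampled twice and subtracted. The following is a minimal Python sketch of that idea, assuming the standard Linux layout of that file (interface name, a colon, then 8 receive fields followed by 8 transmit fields). It is not how nload itself is implemented, and the function names are made up for this illustration.

```python
def read_counters(net_dev_text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    counters = {}
    for line in net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # The two header lines contain no colon and are skipped here.
        if sep and len(fields) >= 16:
            counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates(before, after, seconds):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)} between two samples."""
    return {
        iface: ((after[iface][0] - before[iface][0]) / seconds,
                (after[iface][1] - before[iface][1]) / seconds)
        for iface in after if iface in before
    }
```

In practice the two samples would be read from /proc/net/dev a known number of seconds apart, for example with time.sleep(1) in between.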

Summary:
w View the system load
date View the current date and time
vmstat n Print the status every n seconds
vmstat n m Print the status every n seconds, a total of m times
top View the resource usage of processes
uppercase M Sort by %MEM (memory)
uppercase P Sort by %CPU
number 1 View the usage status of each CPU
letter q Exit top
top -c View the full command line of each process
top -bn1 Statically display all processes, suitable for use when writing shell scripts
sar -n DEV View the network card traffic history
sar -q View system load
sar -b View system disk
nload Monitor network card traffic


 

 
