System Performance Monitoring (2)

Memory

View memory usage

[root@xiaoyang ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        240M        2.8G         11M        705M        3.2G
Swap:          2.0G          0B        2.0G
[root@xiaoyang ~]# 

Mem: physical memory

Swap: the swap partition

total = used + free + buff/cache (shared memory is already counted inside buff/cache, so it is not added again)

buffer: memory that temporarily holds data on its way from memory to disk (write operations)

cache: memory that temporarily holds data read from disk into memory (read operations)

available: an estimate of the memory that can be given to a new process without swapping, roughly free plus the reclaimable part of buff/cache
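The relationship between these fields can be checked against /proc/meminfo. The sketch below is an approximation of how free derives its columns, assuming buff/cache = Buffers + Cached + SReclaimable; the function name mem_summary is made up for illustration, and the exact formula differs between procps versions.

```shell
# mem_summary [FILE]: print the main "free" columns in kB.
# FILE defaults to /proc/meminfo; a saved copy can be passed instead.
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        $1 == "SReclaimable:" { srecl = $2 }
        END {
            # Assumed definition: buff/cache = Buffers + Cached + SReclaimable
            bc = buffers + cached + srecl
            # used is whatever is neither free nor buff/cache
            printf "total=%d used=%d free=%d buff_cache=%d available=%d\n",
                   total, total - free - bc, free, bc, avail
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling mem_summary with no argument reads the live /proc/meminfo, and the result can be compared with free -k.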


Swap: when physical memory runs short, the kernel moves inactive pages out to disk

This is the swap partition: a piece of disk space set aside to be used as memory. An old rule of thumb recommended making the swap partition twice the size of physical memory.

[root@xiaoyang ~] cat /proc/sys/vm/swappiness
30
[root@xiaoyang ~]# 

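Virtual memory = physical memory + swap partition

swappiness controls how readily the kernel swaps. It is a relative weight (0 to 100 on this kernel), not a threshold such as "start swapping when 30% of physical memory is left". Higher values make the kernel more willing to swap out process memory; lower values make it prefer reclaiming page cache first.

Temporary modification

[root@xiaoyang ~]# cat  /proc/sys/vm/swappiness
30
[root@xiaoyang ~]# echo 0 >/proc/sys/vm/swappiness  # 0 tells the kernel to avoid swapping as far as possible
[root@xiaoyang ~]# cat  /proc/sys/vm/swappiness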
0
[root@xiaoyang ~]# 

Permanent modification

[root@xiaoyang ~]# vim /etc/sysctl.conf 
[root@xiaoyang ~]# cat /etc/sysctl.conf 
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
vm.swappiness=0
[root@xiaoyang ~]# sysctl -p
vm.swappiness = 0
[root@xiaoyang ~]#
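Editing the file by hand works, but the same change can be scripted so that repeated runs do not pile up duplicate lines. This is a minimal sketch: the helper name set_sysctl_value is hypothetical, and it simply rewrites the given file with any earlier assignment of the key removed.

```shell
# set_sysctl_value KEY VALUE FILE: make FILE contain exactly one "KEY = VALUE" line.
set_sysctl_value() {
    key=$1
    value=$2
    file=$3
    # Escape dots so that a key such as vm.swappiness is matched literally
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp)
    # Keep every line except an existing assignment of this key
    if [ -f "$file" ]; then
        grep -v -E "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp" || true
    fi
    # Append the new assignment and write the result back
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_sysctl_value vm.swappiness 0 /etc/sysctl.conf followed by sysctl -p would reproduce the change shown above; the value 0 is simply the one used in this article.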

View detailed memory information

cat /proc/meminfo

Clearing buffers and cache

Writing 1 to /proc/sys/vm/drop_caches frees the page cache, 2 frees dentries and inodes, and 3 frees both.

Empty the cache:

[root@xiaoyang ~]# sync ; echo 3 >/proc/sys/vm/drop_caches
[root@xiaoyang ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        231M        3.4G         11M         52M        3.3G
Swap:          2.0G          0B        2.0G
[root@xiaoyang ~]# 

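The command above discards cached data, so subsequent reads must go back to disk and will be slower for a while. drop_caches frees only clean pages; running sync first writes dirty data to disk so that more of the cache can be released, and no data is lost.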

dstat shows paging activity (page in/out)

[root@xiaoyang ~]# dstat
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   0 100   0   0   0|  13k 1497B|   0     0 |   0     0 |  59   104 
  0   0 100   0   0   0|   0     0 |  60B  818B|   0     0 |  60    97 
  0   1 100   0   0   0|   0     0 |  60B  338B|   0     0 |  53    83 
  0   0 100   0   0   0|   0     0 | 120B  398B|   0     0 |  54    87 

int under system is the number of interrupts

csw is the number of context switches

dstat -ma adds memory usage to the default CPU, disk, network, paging, and system columns

[root@xiaoyang ~]# dstat -ma
------memory-usage----- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
 used  buff  cach  free|usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
 252M    0  45.9M 3473M|  0   0 100   0   0   0|  13k 1481B|   0     0 |   0     0 |  59   103 
 252M    0  45.9M 3473M|  0   0 100   0   0   0|   0     0 |  60B 1034B|   0     0 |  59    96 
 252M    0  45.9M 3473M|  0   0 100   0   0   0|   0     0 |  60B  418B|   0     0 |  55    91 
 252M    0  45.9M 3473M|  0   0 100   0   0   0|   0     0 |  60B  418B|   0     0 |  72   105 
 252M    0  45.9M 3473M|  0   0 100   0   0   0|   0     0 |  60B  418B|   0     0 |  68    97 

Find the ten processes consuming the most CPU and memory

There are two methods. The first is top: press P to sort by CPU usage and M to sort by memory usage.

The second is ps. Sort its output on column 3 (%CPU) or column 4 (%MEM):

ps aux|more

[root@xiaoyang ~]# ps aux|more
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.1 125384  3896 ?        Ss   16:44   0:01 /usr/lib/systemd/systemd --switched-root --system --deserialize
 22
root          2  0.0  0.0      0     0 ?        S    16:44   0:00 [kthreadd]
root          4  0.0  0.0      0     0 ?        S<   16:44   0:00 [kworker/0:0H]
root          5  0.0  0.0      0     0 ?        S    16:44   0:00 [kworker/u256:0]
root          6  0.0  0.0      0     0 ?        S    16:44   0:00 [ksoftirqd/0]
root          7  0.0  0.0      0     0 ?        S    16:44   0:00 [migration/0]
root          8  0.0  0.0      0     0 ?        S    16:44   0:00 [rcu_bh]
root          9  0.0  0.0      0     0 ?        S    16:44   0:00 [rcu_sched]
root         10  0.0  0.0      0     0 ?        S<   16:44   0:00 [lru-add-drain]
root         11  0.0  0.0      0     0 ?        S    16:44   0:00 [watchdog/0]
root         12  0.0  0.0      0     0 ?        S    16:44   0:00 [watchdog/1]
root         13  0.0  0.0      0     0 ?        S    16:44   0:00 [migration/1]
root         14  0.0  0.0      0     0 ?        S    16:44   0:00 [ksoftirqd/1]
root         16  0.0  0.0      0     0 ?        S<   16:44   0:00 [kworker/1:0H]
root         18  0.0  0.0      0     0 ?        S    16:44   0:00 [kdevtmpfs]
root         19  0.0  0.0      0     0 ?        S<   16:44   0:00 [netns]
root         20  0.0  0.0      0     0 ?        S    16:44   0:00 [khungtaskd]
root         21  0.0  0.0      0     0 ?        S<   16:44   0:00 [writeback]
root         22  0.0  0.0      0     0 ?        S<   16:44   0:00 [kintegrityd]
root         23  0.0  0.0      0     0 ?        S<   16:44   0:00 [bioset]
root         24  0.0  0.0      0     0 ?        S<   16:44   0:00 [bioset]
root         25  0.0  0.0      0     0 ?        S<   16:44   0:00 [bioset]
root         26  0.0  0.0      0     0 ?        S<   16:44   0:00 [kblockd]
root         27  0.0  0.0      0     0 ?        S<   16:44   0:00 [md]
root         28  0.0  0.0      0     0 ?        S<   16:44   0:00 [edac-poller]
[root@xiaoyang ~]# 
[root@xiaoyang ~]# ps aux|tail -n +2|sort -k3 -rn|head
root        675  0.1  0.1 273192  4884 ?        Ssl  16:44   0:16 /usr/bin/vmtoolsd
root        980  0.0  0.1 222740  4644 ?        Ssl  16:44   0:01 /usr/sbin/rsyslogd -n
root        977  0.0  0.5 574284 19468 ?        Ssl  16:44   0:02 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root        976  0.0  0.1 112900  4344 ?        Ss   16:44   0:00 /usr/sbin/sshd -D
root          9  0.0  0.0      0     0 ?        S    16:44   0:00 [rcu_sched]
root          8  0.0  0.0      0     0 ?        S    16:44   0:00 [rcu_bh]
root        783  0.0  0.1 102904  5540 ?        S    16:44   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-ens33.pid -lf /var/lib/NetworkManager/dhclient-9cabfa34-d47b-4024-85ee-2ed52a00621c-ens33.lease -cf /var/lib/NetworkManager/dhclient-ens33.conf ens33
root        704  0.0  0.0  99208  2708 ?        Ss   16:44   0:00 login -- root
root        701  0.0  0.0 126388  1628 ?        Ss   16:44   0:00 /usr/sbin/crond -n
root          7  0.0  0.0      0     0 ?        S    16:44   0:00 [migration/0]
[root@xiaoyang ~]# ps aux|tail -n +2|sort -k3 -rn|head |awk '{print $11,$3}'
/usr/bin/vmtoolsd 0.1
/usr/sbin/rsyslogd 0.0
/usr/bin/python2 0.0
/usr/sbin/sshd 0.0
[rcu_sched] 0.0
[rcu_bh] 0.0
/sbin/dhclient 0.0
login 0.0
/usr/sbin/crond 0.0
[migration/0] 0.0
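The pipeline above can be wrapped in a small function so that the same code ranks by CPU (column 3) or by memory (column 4). This is only a sketch: top_by is a hypothetical name, and it expects ps aux output on standard input.

```shell
# top_by COLUMN [COUNT]: read "ps aux" output on stdin and print the command
# and the chosen column for the COUNT highest rows (default 10).
top_by() {
    col=$1
    count=${2:-10}
    # Drop the header, sort numerically (descending) on the chosen column only
    tail -n +2 | sort -k "${col},${col}" -rn | head -n "$count" |
        awk -v c="$col" '{ print $11, $c }'
}
```

ps aux | top_by 3 lists the top ten by %CPU, and ps aux | top_by 4 lists the top ten by %MEM.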


What are the consequences of 100% CPU usage or 100% memory usage?

The system becomes unresponsive or hangs

Business applications misbehave: existing users freeze, and new users cannot connect

Network

Port scanning tools

Install them first:

[root@xiaoyang ~]# yum install nc nmap fping telnet -y

nc

[root@xiaoyang ~]# nc -z 8.219.110.232 22

-z: zero-I/O mode. No data is sent to the other side; nc only reports whether the connection can be established.

Then run echo $? to check the return value: 0 means the port is reachable, non-zero means it is not.

-w, --wait: connection timeout in seconds

[root@xiaoyang ~]# nc -z -w 1  8.219.110.232 2233
[root@xiaoyang ~]# echo $?
1
[root@xiaoyang ~]# 

The same check works against a public site such as Baidu:

nc -z -w 1 www.baidu.com 80

echo $?
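Because nc -z reports its result through the exit status, the check is easy to wrap for use in scripts. A minimal sketch, assuming nc is installed; check_port is a hypothetical helper name, and a missing nc is reported the same way as a closed port.

```shell
# check_port HOST PORT [TIMEOUT]: report whether a TCP connection can be made.
check_port() {
    # -z: connect only, send no data; -w: timeout in seconds (default 1)
    if nc -z -w "${3:-1}" "$1" "$2" >/dev/null 2>&1; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed or unreachable"
    fi
}
```

For example, check_port www.baidu.com 80 performs the same test as the two commands above and prints the outcome in one line.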

nmap

[root@xiaoyang lianxi]# nmap 192.168.209.143

Starting Nmap 6.40 ( http://nmap.org ) at 2023-05-08 22:41 CST
Nmap scan report for 192.168.209.143
Host is up (0.0000030s latency).
Not shown: 999 closed ports
PORT   STATE SERVICE
22/tcp open  ssh

Nmap done: 1 IP address (1 host up) scanned in 2.03 seconds
[root@xiaoyang lianxi]#

fping

fping detects which IP addresses in a network segment are in use and which are not

[root@xiaoyang lianxi]# fping -g 192.168.209.0/24

curl

[root@xiaoyang lianxi]# curl www.baidu.com

telnet

[root@xiaoyang lianxi]# telnet www.baidu.com 80
Trying 14.119.104.254...
Connected to www.baidu.com.
Escape character is '^]'.
^Cq^C^C
Connection closed by foreign host.
[root@xiaoyang lianxi]#

Checking ports on other hosts

nc
nmap
telnet

Checking ports on your own machine

netstat
lsof
ss

Network traffic

ethtool: view the link speed negotiated between the local network card and the device it is connected to

ethtool ens33

dstat: display CPU usage, memory usage, disk I/O, and network traffic

dstat is a command-line tool for monitoring system performance and resource usage. It can display CPU usage, memory usage, disk I/O, network traffic, and more, which helps system administrators locate performance and resource bottlenecks quickly.

dstat -ma

-N selects which network interface to report (an interface name, not a port number). A trailing number would be read as the refresh interval in seconds.

dstat -N ens33

iftop: monitor network traffic and bandwidth usage

iftop is a network monitoring tool that can be used to monitor network traffic and bandwidth usage. It displays the transmission status of the network interface in real time, which can help users quickly identify network bottlenecks and abnormal traffic. iftop can run on operating systems such as Linux, Unix, and macOS.

yum install iftop -y
[root@xiaoyang lianxi]# iftop
interface: ens33
IP address is: 192.168.209.143
MAC address is: 00:0c:29:9f:59:8a
[root@xiaoyang lianxi]#

glances: monitor CPU, memory, disk, network, and processes

glances is a cross-platform real-time system and resource monitoring tool, which can be used to monitor the usage of system resources such as CPU, memory, disk, network, and process. glances can be used in the terminal or monitored through the web interface. It supports multiple operating systems, including Linux, Windows, macOS, and more.

yum install glances -y

[root@xiaoyang lianxi]# glances

nethogs: monitor network traffic per process

nethogs is a command-line tool on Linux that displays the network bandwidth used by each process in real time. Instead of breaking traffic down by protocol or subnet, it groups it by process, showing which processes are consuming bandwidth and how much.

yum install nethogs -y

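sz, rz: send and receive files

sz sends a file from the server to the local machine (for example a Windows PC whose terminal supports ZMODEM)

rz receives a file uploaded from the local machine to the server

[root@xiaoyang lianxi]# rz

[root@xiaoyang lianxi]# ls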
1.txt           hehaotian.txt         lu           monitor.sh   taohuadao
2.txt           hehaotian.txt,bak     lu2          name.txt     test_big_file2.txt
3-7             hehaotian.txt.bakkup  lu3          nginx.log    test_big_file.txt
big_file.sh     hehaotian.txt=SUFFIX  lu4          nohup.out    user_pwd.txt
bill.txt        hengshan              lu8          passwd       web.txt
create_user.sh  honghuamiji           lu.c         sc.txt       wulin
gaohui.sh       ifcfg-ens33           mail.txt     sshd_config  啦啦啦(1).txt
gaohui.txt      ip.txt                monitor.log  state.txt
[root@xiaoyang lianxi]# sz monitor.sh
[root@xiaoyang lianxi]# 

Some questions

What determines the speed of the network?

1. The speed of the hardware itself: the maximum bandwidth of each link along the path

2. Whether the speed is limited: switches, routers, and firewalls can all apply rate limits

How do you find out which ports are open on this machine?

Use ss directly:

ss -anplut

ss -an

Or use netstat:

netstat -anplut

[root@xiaoyang ~]# netstat -anplut
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      976/sshd            
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1212/master         
tcp        0     36 192.168.209.143:22      192.168.209.1:53099     ESTABLISHED 2004/sshd: root@pts 
tcp        0      0 192.168.209.143:22      192.168.209.1:53097     ESTABLISHED 1985/sshd: root@pts 
tcp6       0      0 :::22                   :::*                    LISTEN      976/sshd            
tcp6       0      0 ::1:25                  :::*                    LISTEN      1212/master         
udp        0      0 0.0.0.0:68              0.0.0.0:*                           783/dhclient        
udp        0      0 127.0.0.1:323           0.0.0.0:*                           686/chronyd         
udp6       0      0 ::1:323                 :::*                                686/chronyd         
[root@xiaoyang ~]# 

If netstat is not available, install the net-tools package

lsof command

[root@xiaoyang ~]# lsof -i:22
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd     976 root    3u  IPv4  20857      0t0  TCP *:ssh (LISTEN)
sshd     976 root    4u  IPv6  20859      0t0  TCP *:ssh (LISTEN)
sshd    1985 root    3u  IPv4  33978      0t0  TCP xiaoyang:ssh->192.168.209.1:53097 (ESTABLISHED)
sshd    2004 root    3u  IPv4  34028      0t0  TCP xiaoyang:ssh->192.168.209.1:53099 (ESTABLISHED)
[root@xiaoyang ~]# lsof -p 884

lsof -p lists the files opened by the given PID. Given a path, lsof shows which processes are accessing that file or directory:

[root@xiaoyang ~]# lsof /root
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
bash    1443 root  cwd    DIR  253,0     4096 33574977 /root
bash    1987 root  cwd    DIR  253,0     4096 33574977 /root
bash    2006 root  cwd    DIR  253,0     4096 33574977 /root
lsof    2387 root  cwd    DIR  253,0     4096 33574977 /root
lsof    2388 root  cwd    DIR  253,0     4096 33574977 /root
[root@xiaoyang ~]# 

Check whether a service is running by its port

View the ports in the LISTEN state:

[root@xiaoyang ~]# netstat -anplut|grep LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      976/sshd            
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1212/master         
tcp6       0      0 :::22                   :::*                    LISTEN      976/sshd            
tcp6       0      0 ::1:25                  :::*                    LISTEN      1212/master         
[root@xiaoyang ~]#
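When only the port numbers and owning programs matter, the netstat output can be reduced further with awk. A sketch under the assumption that the input has the column layout shown above (State in column 6, PID/Program in column 7); listening_ports is a hypothetical name.

```shell
# listening_ports: read "netstat -anplut" output on stdin and print
# protocol, port, and PID/program for every socket in the LISTEN state.
listening_ports() {
    awk '$6 == "LISTEN" {
        # The port is the last colon-separated piece of the local address,
        # which also works for IPv6 forms such as :::22 and ::1:25
        n = split($4, a, ":")
        print $1, a[n], $7
    }'
}
```

Usage: netstat -anplut | listening_ports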

How to see the network traffic of each port on a switch or router

Use monitoring software such as:

zabbix

Prometheus

SNMP (Simple Network Management Protocol) must be enabled on the routers and switches

How do I know if my NIC is 100M or 1000M?

Run ethtool and read the Speed line, for example Speed: 1000Mb/s

[root@xiaoyang ~]# ethtool ens33
Settings for ens33:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Full 
	Supported pause frame use: No
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Full 
	Advertised pause frame use: No
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	Auto-negotiation: on
	MDI-X: off (auto)
	Supports Wake-on: d
	Wake-on: d
	Current message level: 0x00000007 (7)
			       drv probe link
	Link detected: yes
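To pick out just the negotiated speed, the Speed line can be filtered from the ethtool output. A small sketch; link_speed is a hypothetical name and it reads ethtool output from standard input.

```shell
# link_speed: read "ethtool IFACE" output on stdin and print the Speed value.
link_speed() {
    # Split each line on ": " and keep the value of the Speed field
    awk -F': *' '$1 ~ /Speed$/ { print $2 }'
}
```

Usage: ethtool ens33 | link_speed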

Disk

Read and write speed, that is I/O (input/output) speed, is measured with:

tps

iops

sar

The sar command is a system performance analysis tool. It collects performance data such as CPU usage, memory usage, disk I/O, and network traffic, and records and summarizes the data at specified intervals, so that users can understand the performance status of the system and optimize it.

[root@xiaoyang lianxi]# sar -d 1 3
Linux 3.10.0-1160.el7.x86_64 (xiaoyang) 	05/08/2023 	_x86_64_	(2 CPU)

23:05:59       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
23:06:00   dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:00    dev8-0    179.00      0.00 171960.00    960.67      0.08      0.46      0.44      7.90
23:06:00  dev253-0    179.00      0.00 171960.00    960.67      0.08      0.46      0.44      7.90
23:06:00  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

23:06:00       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
23:06:01   dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:01    dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:01  dev253-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:01  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

23:06:01       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
23:06:02   dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:02    dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:02  dev253-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
23:06:02  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:   dev11-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    dev8-0     59.67      0.00  57320.00    960.67      0.03      0.46      0.44      2.63
Average:  dev253-0     59.67      0.00  57320.00    960.67      0.03      0.46      0.44      2.63
Average:  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
[root@xiaoyang lianxi]# 

tps is the number of transfers (I/O requests issued to the device) per second

%util is the percentage of time the device was busy; the higher it is, the busier the disk

When %util stays close to 100%, the disk is saturated and the machine appears to freeze

iostat: view the read/write speed and busyness of disk I/O

iostat -x shows extended statistics, including %util

[root@xiaoyang lianxi]# iostat -x
Linux 3.10.0-1160.el7.x86_64 (xiaoyang) 	05/08/2023 	_x86_64_	(2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.01    0.00    0.07    0.00    0.00   99.92

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
scd0              0.00     0.00    0.00    0.00     0.04     0.00   114.22     0.00    1.39    1.39    0.00   0.94   0.00
sda               0.00     0.02    0.40    0.23    14.80     8.29    72.66     0.00    0.47    0.48    0.46   0.27   0.02
dm-0              0.00     0.00    0.31    0.25    13.41     8.20    77.18     0.00    0.54    0.60    0.48   0.29   0.02
dm-1              0.00     0.00    0.00    0.00     0.09     0.00    50.09     0.00    0.14    0.14    0.00   0.09   0.00

[root@xiaoyang lianxi]# 
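Since %util is the last column of iostat -x, busy devices can be flagged automatically. A minimal sketch, assuming the column layout shown above; busy_disks is a hypothetical name and the default limit of 80 is an arbitrary illustrative threshold.

```shell
# busy_disks [LIMIT]: read "iostat -x" output on stdin and print every
# device whose %util (last column) is at or above LIMIT (default 80).
busy_disks() {
    awk -v limit="${1:-80}" '
        # Device rows start with a name, not a number, and end with a numeric
        # %util; this skips the headers and the avg-cpu values line
        $1 !~ /^[0-9.]+$/ && $NF ~ /^[0-9.]+$/ && $NF + 0 >= limit + 0 {
            print $1, $NF
        }
    '
}
```

Usage: iostat -x | busy_disks 90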

Check disk space usage with df -Th

[root@xiaoyang ~]# dd if=/dev/zero of=/test.dd bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.44042 s, 2.4 GB/s
[root@xiaoyang ~]# df -Th
Filesystem              Type      Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs  1.9G     0  1.9G    0% /dev
tmpfs                   tmpfs     1.9G     0  1.9G    0% /dev/shm
tmpfs                   tmpfs     1.9G   12M  1.9G    1% /run
tmpfs                   tmpfs     1.9G     0  1.9G    0% /sys/fs/cgroup
/dev/mapper/centos-root xfs        17G  4.8G   13G   28% /
/dev/sda1               xfs      1014M  151M  864M   15% /boot
tmpfs                   tmpfs     378M     0  378M    0% /run/user/0
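[root@xiaoyang ~]# 

[root@xiaoyang ~]# rm -rf test.dd   # meant to delete the temporary large file; dd wrote it to /test.dd, so the matching command is rm -f /test.dd
[root@xiaoyang ~]# df -Th
Filesystem              Type      Size  Used Avail Use% Mounted on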
devtmpfs                devtmpfs  1.9G     0  1.9G    0% /dev
tmpfs                   tmpfs     1.9G     0  1.9G    0% /dev/shm
tmpfs                   tmpfs     1.9G   12M  1.9G    1% /run
tmpfs                   tmpfs     1.9G     0  1.9G    0% /sys/fs/cgroup
/dev/mapper/centos-root xfs        17G  4.8G   13G   28% /
/dev/sda1               xfs      1014M  151M  864M   15% /boot
tmpfs                   tmpfs     378M     0  378M    0% /run/user/0
[root@xiaoyang ~]# 
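Checking the Use% column by eye does not scale to many servers, so a threshold filter is handy. A sketch assuming the POSIX layout printed by df -P (Capacity in column 5, mount point in column 6, no spaces in mount points); full_filesystems is a hypothetical name and 80 is an illustrative default.

```shell
# full_filesystems [LIMIT]: read "df -P" output on stdin and print the mount
# point and usage of every filesystem at or above LIMIT percent (default 80).
full_filesystems() {
    awk -v limit="${1:-80}" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)          # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

Usage: df -P | full_filesystems 90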

lsblk: shows how big each disk and partition is

df -Th: shows the capacity and usage of each mounted filesystem

iotop: shows which processes are doing the most disk reads and writes

Summary

CPU

Memory

Network bandwidth

Disk capacity and I/O (tps/iops)

What is tps?

The number of I/O transfers (reads and writes) per second

What is %util?

How busy the device is: the percentage of time it spent servicing I/O

Users report that access to our server is very slow. How do you analyze the cause and troubleshoot?

If the user can reach third-party sites quickly, the problem is on our side

Check the CPU

Check the memory

Check the bandwidth

Check disk I/O and disk capacity

Which command shows all of these at once? glances

If everything on the server is normal, the problem lies with the load balancer or the ISP
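The first step of the checklist, checking the CPU, can be reduced to a quick script test by comparing the 1-minute load average with the number of CPUs. This is a rough heuristic rather than a definitive rule; cpu_pressure is a hypothetical name, and both arguments are optional overrides intended for testing.

```shell
# cpu_pressure [LOADAVG_FILE] [CPUS]: print "overloaded" when the 1-minute
# load average exceeds the number of CPUs, otherwise "ok".
cpu_pressure() {
    # The first field of /proc/loadavg is the 1-minute load average
    load=$(cut -d ' ' -f 1 "${1:-/proc/loadavg}")
    # Default to the number of online processors
    cpus=${2:-$(getconf _NPROCESSORS_ONLN)}
    awk -v l="$load" -v c="$cpus" 'BEGIN {
        if (l + 0 > c + 0) print "overloaded"; else print "ok"
    }'
}
```

Calling cpu_pressure with no arguments checks the live system; memory, bandwidth, and disk still need the tools described earlier.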

Check the port number

Use the command "netstat -tunlp" to view the ports currently in use and the corresponding processes. "-t" selects TCP, "-u" selects UDP, "-n" shows numeric addresses instead of resolving names, "-l" lists only listening ports, and "-p" displays the process name and process ID.

Origin blog.csdn.net/investor_/article/details/130786191