Quick Linux Troubleshooting

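Linux underpins most servers and a great deal of other infrastructure. For anyone who maintains or simply uses a Linux system, being able to find the cause of a failure quickly is essential.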

1. Check the hardware

Start with the hardware. Only once hardware faults have been ruled out does it make sense to look for errors in running programs.

Commands such as lsblk and lscpu print hardware information. lsblk, which lists block devices, serves as the example here:

lmh@ubuntu:~$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
fd0      2:0    1     4K  0 disk 
loop0    7:0    0  44.9M  1 loop /snap/gtk-common-themes/1440
loop1    7:1    0  14.8M  1 loop /snap/gnome-characters/399
loop2    7:2    0  91.3M  1 loop /snap/core/8592
loop3    7:3    0  54.7M  1 loop /snap/core18/1668
loop4    7:4    0   3.7M  1 loop /snap/gnome-system-monitor/127
loop5    7:5    0   4.2M  1 loop /snap/gnome-calculator/544
loop6    7:6    0  91.4M  1 loop /snap/core/8689
loop7    7:7    0  14.8M  1 loop /snap/gnome-characters/296
loop8    7:8    0   3.7M  1 loop /snap/gnome-system-monitor/100
loop9    7:9    0  1008K  1 loop /snap/gnome-logs/61
loop10   7:10   0 160.2M  1 loop /snap/gnome-3-28-1804/116
loop11   7:11   0  42.8M  1 loop /snap/gtk-common-themes/1313
loop12   7:12   0   956K  1 loop /snap/gnome-logs/81
loop13   7:13   0 149.9M  1 loop /snap/gnome-3-28-1804/67
loop14   7:14   0  54.4M  1 loop /snap/core18/1066
loop15   7:15   0     4M  1 loop /snap/gnome-calculator/406
sda      8:0    0    70G  0 disk 
└─sda1   8:1    0    70G  0 part /
sr0     11:0    1     2G  0 rom  /media/lmh/Ubuntu 18.04.3 LTS amd641
sr1     11:1    1     2G  0 rom  /media/lmh/Ubuntu 18.04.3 LTS amd64

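This output shows whether the expected disks and partitions are present, with the expected size and mount point. A disk that is missing from the list, or a partition that is not mounted, is usually the first visible sign of a hardware-related problem.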
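
The loop devices created by snap packages dominate the listing above and make the real disks hard to spot. As a minimal sketch, the helper below reads lsblk output on standard input and drops every row whose TYPE column is loop, keeping the header row. The function name real_block_devices is made up for this example, and the filter assumes the default column layout shown above, where TYPE is the sixth field.

```shell
# Hypothetical helper: filter lsblk's default output, hiding loop devices.
# Assumes the default columns NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT,
# so TYPE is the sixth whitespace-separated field.
real_block_devices() {
    # NR == 1 keeps the header row; every other row is kept unless TYPE is "loop".
    awk 'NR == 1 || $6 != "loop"'
}
```

It would be used as lsblk | real_block_devices. lsblk can also do this on its own: lsblk -e 7 excludes devices by major number, and 7 is the major number of the loop devices in the listing above.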

2. Find errors and warnings in the logs

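A running Linux system keeps logs of its day-to-day operation, and these logs help pin down the cause of an error. dmesg prints the kernel's message buffer, which includes the kernel's errors and warnings; piping it through more (dmesg | more) lets you read it one page at a time: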

lmh@ubuntu:~$ dmesg | more
[    0.000000] Linux version 5.3.0-40-generic (buildd@lcy01-amd64-024) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 (Ubuntu 5.3.0-40.32~18.04.1-generic 5.3.18)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.3.0-40-generic root=UUID=e7ca2622-528b-400f-9b21-ac56ff834cd2 ro find_preseed=/preseed.cfg auto noprompt priority=critical locale=en_US quiet
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Hygon HygonGenuine
[    0.000000]   Centaur CentaurHauls
[    0.000000]   zhaoxin   Shanghai  
[    0.000000] Disabled fast string operations
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] BIOS-provided physical RAM map:
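
The full kernel log is long, and most of it consists of routine boot messages like the ones above. The sketch below narrows it down by keyword. The function name kernel_problems and the keyword list are assumptions chosen for illustration; the list is not exhaustive and will also match harmless lines that happen to contain one of the words.

```shell
# Hypothetical helper: keep only lines of kernel log text that look like problems.
# Reads dmesg-style text on standard input.
# The keyword list is an assumption; extend it to suit the system being checked.
kernel_problems() {
    # -i ignores case, -E enables the alternation syntax used in the pattern.
    grep -i -E 'error|fail|warn'
}
```

It would be used as dmesg | kernel_problems | more. Where the util-linux version of dmesg is installed, dmesg --level=err,warn filters by the kernel's own log level instead, which is generally more reliable than keyword matching. On some systems reading the kernel log requires root privileges.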

3. Check whether the network is working

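Linux is a network-centric system, so checking that its network connections work is another key step. Tools such as ip addr, dig, and ping all help here. The example below uses ping localhost, which confirms that the local TCP/IP stack responds: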

lmh@ubuntu:~$ ping localhost 
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.033 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.029 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.037 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.035 ms
64 bytes from localhost (127.0.0.1): icmp_seq=5 ttl=64 time=0.032 ms
64 bytes from localhost (127.0.0.1): icmp_seq=6 ttl=64 time=0.035 ms
64 bytes from localhost (127.0.0.1): icmp_seq=7 ttl=64 time=0.038 ms
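
A reply from localhost says nothing about the network card, the gateway, or DNS. One way to work outward from the local machine is sketched below. The helper names check and run_network_checks and the target example.com are assumptions made for this example, and the gateway lookup assumes that ip route prints a line of the form "default via <gateway> dev ...". Unlike the transcript above, ping is given -c 3 so that it stops after three packets.

```shell
# Hypothetical helper: run a command quietly and report whether it succeeded.
# Usage: check <label> <command> [arguments...]
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
    fi
}

# Hypothetical wrapper: test the loopback, then the gateway, then DNS.
# example.com is a placeholder target, not something the article prescribes.
run_network_checks() {
    check "loopback" ping -c 3 localhost
    # Assumes a route line like "default via 192.0.2.1 dev eth0";
    # the gateway address is then the third field.
    gateway=$(ip route show default | awk '/^default/ { print $3; exit }')
    check "default gateway" ping -c 3 "$gateway"
    check "DNS query" dig +short example.com
}
```

Calling run_network_checks prints one OK or FAIL line per step. The first FAIL shows roughly where the problem starts: the local stack, the link to the gateway, or name resolution.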

Reposted from blog.csdn.net/Groot_Lee/article/details/104930946