Analyzing and Storing Logs

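Describing the System Log Architecture

System Logging

Processes and the operating system kernel record logs of the events that occur. These logs are used to audit the system and to troubleshoot problems.

By convention, most logs are stored as text files in the /var/log directory.

The systemd-journald service is at the core of the operating system's event logging architecture. It collects event messages from many sources and writes them into a structured, indexed system journal. By default, this journal is stored on a file system that does not persist across reboots.

The rsyslog service reads the syslog messages that systemd-journald receives from the journal. It then processes the syslog events, recording them to its log files or forwarding them to other services according to its own configuration. rsyslog sorts the syslog messages and writes them to log files under /var/log that do persist across reboots. Messages are sorted into specific log files based on the type of program or facility that sent each message and on the priority of each syslog message.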

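Reviewing System Log Files

Many programs use the syslog protocol to log events to the system. Each log message is categorized by a facility (the type of message) and a priority (the severity of the message). The available facilities are documented in the rsyslog.conf(5) man page.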
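As a rough illustration of how facility and priority combine, the syslog protocol encodes both into a single PRI number, computed as facility code x 8 + severity code. The sketch below uses the standard numeric codes. The function name syslog_pri is made up for this example and is not part of rsyslog.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsyslog): compute the syslog PRI value
# for a "facility.priority" pair, e.g. "local7.notice".
syslog_pri() {
    facility=${1%%.*}   # text before the first dot
    priority=${1#*.}    # text after the first dot

    case "$facility" in
        kern) f=0 ;; user) f=1 ;; mail) f=2 ;; daemon) f=3 ;;
        auth) f=4 ;; syslog) f=5 ;; lpr) f=6 ;; news) f=7 ;;
        uucp) f=8 ;; cron) f=9 ;; authpriv) f=10 ;; ftp) f=11 ;;
        local[0-7]) f=$((16 + ${facility#local})) ;;
        *) echo "unknown facility: $facility" >&2; return 1 ;;
    esac

    case "$priority" in
        emerg) p=0 ;; alert) p=1 ;; crit) p=2 ;; err) p=3 ;;
        warning) p=4 ;; notice) p=5 ;; info) p=6 ;; debug) p=7 ;;
        *) echo "unknown priority: $priority" >&2; return 1 ;;
    esac

    echo $((f * 8 + p))
}

syslog_pri local7.notice   # prints 189
syslog_pri user.debug      # prints 15
```

Lower severity codes mean more severe messages, which is why a rule written for one priority also catches everything more severe than it.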

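The rsyslog service uses the facility and priority of a log message to determine how to handle it. The rules are configured in /etc/rsyslog.conf and in any file with a .conf extension in the /etc/rsyslog.d directory.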

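In each rule, the left side indicates the facility and severity of the syslog messages that the rule matches, and the right side indicates the file to which matching log messages are saved. For example, the stock CentOS 7 configuration routes authpriv.* to /var/log/secure and local7.* to /var/log/boot.log, which is why the examples that follow look in those files.

Monitoring one or more log files for events helps when reproducing a problem. The tail -f /path/to/file command prints the last 10 lines of the specified file and then keeps printing new lines as they are written to it.

To monitor failed login attempts, run tail in one terminal and, in another terminal, run ssh as root with a wrong password.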


[root@workstation ~]# ssh root@localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:iVGX9mvT0cLZlAtI8EK2LJGJR8NpIvjkJYCdg//tZB4.
ECDSA key fingerprint is MD5:81:21:db:98:dd:a2:c4:d5:39:72:a2:6e:57:2c:16:6b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
Permission denied, please try again.


[root@workstation ~]# tail -f /var/log/secure
Jul 24 02:01:31 workstation sshd[7120]: Server listening on 0.0.0.0 port 22.
Jul 24 02:01:31 workstation sshd[7120]: Server listening on :: port 22.
Jul 24 02:04:35 workstation sshd[7389]: Accepted password for root from 192.168.182.1 port 55567 ssh2
Jul 24 02:04:35 workstation sshd[7389]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 02:04:35 workstation sshd[7393]: Accepted password for root from 192.168.182.1 port 55569 ssh2
Jul 24 02:04:36 workstation sshd[7393]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 02:12:36 workstation sshd[7417]: Accepted password for root from 192.168.182.1 port 55669 ssh2
Jul 24 02:12:36 workstation sshd[7417]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 02:12:36 workstation sshd[7421]: Accepted password for root from 192.168.182.1 port 55670 ssh2
Jul 24 02:12:36 workstation sshd[7421]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 02:13:36 workstation unix_chkpwd[7451]: password check failed for user (root)
Jul 24 02:13:36 workstation sshd[7449]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=root
Jul 24 02:13:36 workstation sshd[7449]: pam_succeed_if(sshd:auth): requirement "uid >= 1000" not met by user "root"
Jul 24 02:13:38 workstation sshd[7449]: Failed password for root from ::1 port 39398 ssh2
Jul 24 02:14:24 workstation sshd[7449]: Connection closed by ::1 port 39398 [preauth]
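
Once the relevant lines have been spotted with tail -f, the same file can be summarized after the fact. The following is a minimal sketch, assuming the sshd message format shown above ("Failed password for USER from ADDRESS port ..."). The function name failed_logins is hypothetical, and other sshd versions or PAM setups may log differently.

```shell
# Hypothetical helper: count failed SSH password attempts per user and
# source address in a log file that uses the format shown above.
failed_logins() {
    # $1 = path to the log file (e.g. /var/log/secure, readable by root)
    awk '/sshd\[[0-9]+\]: Failed password for / {
             # Walk the fields to find the words after "for" and "from".
             for (i = 1; i <= NF; i++) {
                 if ($i == "for") {
                     # "for invalid user NAME" is logged for unknown accounts.
                     user = ($(i + 1) == "invalid") ? $(i + 3) : $(i + 1)
                 }
                 if ($i == "from") addr = $(i + 1)
             }
             count[user " " addr]++
         }
         END {
             for (key in count) print count[key], key
         }' "$1"
}
```

A typical call would be failed_logins /var/log/secure | sort -rn, which lists the busiest user and address pairs first.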

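Sending Syslog Messages Manually

The logger command sends messages to the rsyslog service. By default, it sends the message to the user facility with priority notice (user.notice), unless specified otherwise with the -p option.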


[root@workstation ~]# logger -p local7.notice "Log entry created on host"

[root@workstation ~]# tail -f -n 5 /var/log/boot.log
         Starting Wait for Plymouth Boot Screen to Quit...
[  OK  ] Started Command Scheduler.
         Starting Terminate Plymouth Boot Screen...
[  OK  ] Started NTP client/server.
[  OK  ] Started Load/Save RF Kill Switch Status of rfkill0.
Jul 24 02:20:14 workstation root: Log entry created on host

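Exercise

Configure rsyslog by adding the configuration file /etc/rsyslog.d/debug.conf, so that all messages from any service with priority debug or higher are logged to the new /var/log/messages-debug log file.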

[root@workstation rsyslog.d]# pwd
/etc/rsyslog.d
[root@workstation rsyslog.d]# vi debug.conf
[root@workstation rsyslog.d]# cat debug.conf
*.debug /var/log/messages-debug

[root@workstation ~]# touch /var/log/messages-debug

[root@workstation ~]# systemctl restart rsyslog

Verify that log messages with priority debug appear in the /var/log/messages-debug file.


[root@workstation ~]# logger -p user.debug "Debug Message Test"


[root@workstation ~]# logger -p local7.notice "Log entry created on host"
[root@workstation ~]# tail -f -n 5 /var/log/messages-debug
Jul 24 02:29:26 workstation systemd: Stopped System Logging Service.
Jul 24 02:29:26 workstation systemd: Starting System Logging Service...
Jul 24 02:29:31 workstation rsyslogd: [origin software="rsyslogd" swVersion="8.24.0-34.el7" x-pid="7470" x-info="http://www.rsyslog.com"] start
Jul 24 02:29:31 workstation polkitd[6137]: Unregistered Authentication Agent for unix-process:7463:169671 (system bus name :1.24, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Jul 24 02:29:31 workstation systemd: Started System Logging Service.
Jul 24 02:31:44 workstation root: Debug Message Test
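
The *.debug rule catches everything because a selector of the form facility.priority matches messages at that priority or any more severe one, and debug is the least severe level. Below is a small sketch of that comparison using the standard severity codes (0 = emerg through 7 = debug). The names severity_code and selector_matches are invented here, and real rsyslog selectors support more syntax than this models.

```shell
# Hypothetical helper: map a severity name to its numeric code.
severity_code() {
    case "$1" in
        emerg) echo 0 ;; alert) echo 1 ;; crit) echo 2 ;; err) echo 3 ;;
        warning) echo 4 ;; notice) echo 5 ;; info) echo 6 ;; debug) echo 7 ;;
        *) return 1 ;;
    esac
}

# selector_matches RULE_PRIORITY MESSAGE_PRIORITY
# Succeeds when the message is at least as severe as the rule's priority,
# i.e. its numeric code is less than or equal to the rule's code.
selector_matches() {
    rule=$(severity_code "$1") || return 2
    msg=$(severity_code "$2") || return 2
    [ "$msg" -le "$rule" ]
}

selector_matches debug notice && echo "notice is logged by *.debug"
selector_matches err notice || echo "notice is not logged by *.err"
```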

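Reviewing System Journal Entries

Finding Events

The systemd-journald service stores logging data in a structured, indexed binary file called the journal.

To retrieve log messages from the journal, use the journalctl command. You can use it to view all messages in the journal, or to search for specific events based on a wide range of options and criteria.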


[root@workstation ~]# journalctl
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 02:31:44 EDT. --
Jul 24 02:01:10 servera systemd-journal[86]: Runtime journal is using 6.0M (max allowed 48.6M, trying to leave 72.9M free of
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuset
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpu
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuacct
Jul 24 02:01:10 servera kernel: Linux version 3.10.0-957.el7.x86_64 ([email protected]) (gcc version 4.8.5
Jul 24 02:01:10 servera kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-957.el7.x86_64 root=/dev/mapper/centos_servera-root
Jul 24 02:01:10 servera kernel: Disabled fast string operations
Jul 24 02:01:10 servera kernel: e820: BIOS-provided physical RAM map:
Jul 24 02:01:10 servera kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ebff] usable

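The journalctl command highlights important log messages: messages at notice or warning priority are shown in bold text, while messages at error priority or higher are shown in red text.

Show the last 5 log messages: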

[root@workstation ~]# journalctl -n 5
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 02:31:44 EDT. --
Jul 24 02:29:26 workstation systemd[1]: Starting System Logging Service...
Jul 24 02:29:31 workstation rsyslogd[7470]:  [origin software="rsyslogd" swVersion="8.24.0-34.el7" x-pid="7470" x-info="http
Jul 24 02:29:31 workstation polkitd[6137]: Unregistered Authentication Agent for unix-process:7463:169671 (system bus name :
Jul 24 02:29:31 workstation systemd[1]: Started System Logging Service.
Jul 24 02:31:44 workstation root[7475]: Debug Message Test

Show journal entries with priority err or higher:

[root@workstation ~]# journalctl -p err
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 02:31:44 EDT. --
Jul 24 02:01:15 servera kernel: sd 2:0:0:0: [sda] Assuming drive cache: write through
Jul 24 02:01:15 servera kernel: sd 2:0:1:0: [sdb] Assuming drive cache: write through
Jul 24 02:01:18 workstation kernel: piix4_smbus 0000:00:07.3: SMBus Host Controller not enabled!
Jul 24 02:01:31 workstation systemd[1]: Failed to start Crash recovery kernel arming.

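Limit the output to 5 journal entries from today. In the output below, from this CentOS 7 system, combining --since with -n returns the first 5 entries of the day rather than the most recent 5: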

[root@workstation ~]# journalctl --since today -n 5
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 02:31:44 EDT. --
Jul 24 02:01:10 servera systemd-journal[86]: Runtime journal is using 6.0M (max allowed 48.6M, trying to leave 72.9M free of
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuset
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpu
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuacct
Jul 24 02:01:10 servera kernel: Linux version 3.10.0-957.el7.x86_64 ([email protected]) (gcc version 4.8.5

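Show 10 journal entries from the window between 2023-07-23 12:00:00 and 2023-07-24 12:00:00. The output here likewise starts at the earliest entries in the range: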


[root@workstation ~]# journalctl --since "2023-7-23 12:00:00" --until "2023-7-24 12:00:00" -n 10
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 02:31:44 EDT. --
Jul 24 02:01:10 servera systemd-journal[86]: Runtime journal is using 6.0M (max allowed 48.6M, trying to leave 72.9M free of
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuset
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpu
Jul 24 02:01:10 servera kernel: Initializing cgroup subsys cpuacct
Jul 24 02:01:10 servera kernel: Linux version 3.10.0-957.el7.x86_64 ([email protected]) (gcc version 4.8.5
Jul 24 02:01:10 servera kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-957.el7.x86_64 root=/dev/mapper/centos_servera-root
Jul 24 02:01:10 servera kernel: Disabled fast string operations
Jul 24 02:01:10 servera kernel: e820: BIOS-provided physical RAM map:
Jul 24 02:01:10 servera kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ebff] usable
Jul 24 02:01:10 servera kernel: BIOS-e820: [mem 0x000000000009ec00-0x000000000009ffff] reserved

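Preserving the System Journal

Storing the System Journal Permanently

By default, the system journal is stored in the /run/log/journal directory, which means the journal is cleared when the system reboots. You can change the configuration in /etc/systemd/journald.conf so that the journal persists across reboots.

The Storage parameter in /etc/systemd/journald.conf determines whether the system journal is stored in a volatile manner or persistently across reboots.

  • volatile: stores the journal in the volatile /run/log/journal directory.
  • persistent: stores the journal in /var/log/journal, which persists across reboots.
  • auto: systemd-journald decides whether to use persistent or volatile storage. If the /var/log/journal directory exists, systemd-journald uses persistent storage; otherwise it uses volatile storage.

The advantage of a persistent system journal is that historical data is available immediately at boot. However, even with a persistent journal, not all data is kept forever. By default, the journal may not grow beyond 10% of the file system it is on, nor leave less than 15% of that file system free. These values can be tuned for both the runtime and the persistent journal in /etc/systemd/journald.conf.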


[root@workstation ~]# journalctl | grep -E 'Runtime|System journal'
Jul 24 02:01:10 servera systemd-journal[86]: Runtime journal is using 6.0M (max allowed 48.6M, trying to leave 72.9M free of 480.1M available → current limit 48.6M).
Jul 24 02:01:17 workstation systemd-journal[3147]: Runtime journal is using 6.0M (max allowed 48.6M, trying to leave 72.9M free of 480.1M available → current limit 48.6M).
Jul 24 02:01:23 workstation systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...
Jul 24 02:01:23 workstation systemd[1]: Started Tell Plymouth To Write Out Runtime Data.
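
The figures in this output line up with the 10% and 15% defaults. As a worked check, assuming the file system that holds /run/log/journal is about 486M in total (an inferred figure, not one printed above), the sketch below reproduces the 48.6M cap and the 72.9M reserve. The function name journal_limits is hypothetical.

```shell
# Hypothetical helper: given a file system size in megabytes, print the
# default journal size cap (10%) and the free-space reserve (15%).
journal_limits() {
    awk -v size="$1" 'BEGIN {
        printf "max_use=%.1fM keep_free=%.1fM\n", size * 0.10, size * 0.15
    }'
}

journal_limits 486   # prints max_use=48.6M keep_free=72.9M
```

The journald.conf settings that override these defaults are SystemMaxUse= and SystemKeepFree= for the persistent journal, and RuntimeMaxUse= and RuntimeKeepFree= for the runtime journal.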

Configuring a Persistent System Journal

Set the storage policy in /etc/systemd/journald.conf:


[root@workstation ~]# vi /etc/systemd/journald.conf
[root@workstation ~]# cat  /etc/systemd/journald.conf | grep 'Storage'
Storage=persistent
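
Instead of editing the file by hand in vi, the change can be scripted. This is a sketch only: it assumes the file already contains a Storage= line, commented out or not, as the stock journald.conf does. The name set_journal_storage is made up, and the function takes the file path as an argument so that it can be tried on a copy first.

```shell
# Hypothetical helper: set Storage= in a journald.conf-style file.
# $1 = config file path, $2 = volatile | persistent | auto | none
set_journal_storage() {
    conf=$1
    mode=$2
    case "$mode" in
        volatile|persistent|auto|none) ;;
        *) echo "invalid Storage value: $mode" >&2; return 1 ;;
    esac

    tmp=$(mktemp) || return 1
    # Rewrite an existing "Storage=..." or "#Storage=..." line.
    sed "s/^#\{0,1\}Storage=.*/Storage=$mode/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"    # cat keeps the original file's owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}

# Try it on a copy before touching the real file:
#   cp /etc/systemd/journald.conf /tmp/journald.conf
#   set_journal_storage /tmp/journald.conf persistent
#   grep '^Storage=' /tmp/journald.conf
```

If the file has no Storage= line at all, this sketch leaves it unchanged, so checking the result with grep is worthwhile.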

Restart the service:


[root@workstation ~]# systemctl restart systemd-journald

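After the restart, the /var/log/journal directory exists and contains one or more subdirectories. Each subdirectory has a long hexadecimal name and holds *.journal files, which are the binary files that store the structured, indexed journal entries.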

[root@workstation ~]# cd /var/log/journal/
[root@workstation journal]# ls
70088e734c8348c2b09d5c6c14125c21
[root@workstation journal]# ls /var/log/journal/70088e734c8348c2b09d5c6c14125c21/
system.journal

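Because the journal now persists across reboots, journalctl output can contain a large number of entries from several boots. Use the -b option of journalctl to limit the output to a specific boot. A positive number counts from the oldest boot recorded in the journal, while a negative number such as -b -1 counts back from the current boot. The following command retrieves the entries from the first boot: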


[root@workstation ~]# journalctl -b 1 -n 5
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 03:12:43 EDT. --
Jul 24 03:11:17 workstation systemd[1]: Shutting down.
Jul 24 03:11:17 workstation systemd-shutdown[1]: Syncing filesystems and block devices.
Jul 24 03:11:17 workstation lvm[7775]: 2 logical volume(s) in volume group "centos_servera" unmonitored
Jul 24 03:11:17 workstation systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Jul 24 03:11:17 workstation systemd-journal[7539]: Journal stopped

# Retrieve the entries from the second boot
[root@workstation ~]# journalctl -b 2 -n 5
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 03:12:43 EDT. --
Jul 24 03:12:42 workstation sshd[7386]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 03:12:42 workstation sshd[7390]: Accepted password for root from 192.168.182.1 port 56904 ssh2
Jul 24 03:12:43 workstation systemd[1]: Started Session 2 of user root.
Jul 24 03:12:43 workstation sshd[7390]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 03:12:43 workstation systemd-logind[6231]: New session 2 of user root.

Retrieve the entries from the current boot:


[root@workstation ~]# journalctl -b -n 5
-- Logs begin at Mon 2023-07-24 02:01:10 EDT, end at Mon 2023-07-24 03:12:43 EDT. --
Jul 24 03:12:42 workstation sshd[7386]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 03:12:42 workstation sshd[7390]: Accepted password for root from 192.168.182.1 port 56904 ssh2
Jul 24 03:12:43 workstation systemd[1]: Started Session 2 of user root.
Jul 24 03:12:43 workstation sshd[7390]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 03:12:43 workstation systemd-logind[6231]: New session 2 of user root.

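Maintaining Accurate Time

Setting the Local Clock and Time Zone

Correctly synchronized system time is critical when analyzing log files across multiple systems.

The timedatectl command shows an overview of the current time-related system settings, including the current time, the time zone, and the NTP synchronization settings.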

[root@workstation ~]# timedatectl
      Local time: Mon 2023-07-24 03:20:52 EDT
  Universal time: Mon 2023-07-24 07:20:52 UTC
        RTC time: Mon 2023-07-24 07:20:52
       Time zone: America/New_York (EDT, -0400)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2023-03-12 01:59:59 EST
                  Sun 2023-03-12 03:00:00 EDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2023-11-05 01:59:59 EDT
                  Sun 2023-11-05 01:00:00 EST

The system provides a database of time zones, which can be listed as follows:

[root@workstation ~]# timedatectl list-timezones
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmara
Africa/Bamako
Africa/Bangui
Africa/Banjul
Africa/Bissau
...

Set the time zone to Asia/Shanghai:


[root@workstation ~]# timedatectl set-timezone Asia/Shanghai

[root@workstation ~]# timedatectl
      Local time: Mon 2023-07-24 15:26:28 CST
  Universal time: Mon 2023-07-24 07:26:28 UTC
        RTC time: Mon 2023-07-24 07:26:28
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

Set the time manually:


[root@workstation ~]# timedatectl set-time 9:00:00
[root@workstation ~]# timedatectl
      Local time: Mon 2023-07-24 09:00:05 CST
  Universal time: Mon 2023-07-24 01:00:05 UTC
        RTC time: Mon 2023-07-24 01:00:06
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: no
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

Enable NTP synchronization so that the time is adjusted automatically:

[root@workstation ~]# timedatectl set-ntp true
[root@workstation ~]# timedatectl
      Local time: Mon 2023-07-24 09:02:11 CST
  Universal time: Mon 2023-07-24 01:02:11 UTC
        RTC time: Mon 2023-07-24 01:02:11
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

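Configuring and Monitoring chronyd

The chronyd service keeps the usually inaccurate local hardware clock (RTC) on track by synchronizing it to the configured NTP servers.

Point chronyd at the local time source servera: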


[root@workstation ~]# vi /etc/chrony.conf
[root@workstation ~]# cat /etc/chrony.conf | grep 'server'
# Use public servers from the pool.ntp.org project.
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
server servera iburst

Restart the service:


[root@workstation ~]# systemctl restart chronyd



[root@workstation ~]# chronyc sources -v
210 Number of sources = 5

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^+ time.cloudflare.com           3   6    77    20    +19ms[  +19ms] +/-  118ms
^* stratum2-1.ntp.mow01.ru.>     2   6   337    24    -26ms[  -56ms] +/-  109ms
^- a.chl.la                      2   6     7    25    +40ms[  +10ms] +/-  109ms
^- tick.ntp.infomaniak.ch        1   6   357    20  -4898us[-4898us] +/-  159ms
^? servera                       0   8     0     -     +0ns[   +0ns] +/-    0ns
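
The two-character column at the start of each row is the part worth checking in scripts. As a minimal sketch, assuming the tabular layout shown above, the following picks out the source that chronyd is currently synchronized to (the row starting with ^*) and lists any sources marked unreachable (state ?). The function names are invented for this example.

```shell
# Hypothetical helpers that read `chronyc sources` output on stdin.

# Print the name of the currently selected source (state '*'), if any.
selected_source() {
    awk '$1 ~ /^[=#^][*]$/ { print $2 }'
}

# Print the names of sources marked unreachable (state '?').
unreachable_sources() {
    awk '$1 ~ /^[=#^][?]$/ { print $2 }'
}

# Usage on a live system:
#   chronyc sources | selected_source
#   chronyc sources | unreachable_sources
```

Long host names are truncated by chronyc in this view (note the trailing > in the table above), so the printed name may not be the full host name.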

Open two terminals and compare the time on servera and workstation:

[root@workstation ~]# date
Mon Jul 24 15:47:50 CST 2023
[root@servera ~]# date
Mon Jul 24 03:47:50 EDT 2023

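The two clocks agree: 15:47:50 CST (UTC+8) on workstation and 03:47:50 EDT (UTC-4) on servera are the same instant. Note, however, that the chronyc output above marks servera as ^? (unreachable, with a Reach value of 0) and marks a public pool server as ^* (the currently synchronized source), so that output does not by itself confirm that servera is acting as the NTP time source for workstation.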
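
Because the two hosts are set to different time zones, comparing wall-clock strings by eye is error-prone. The small sketch below, which assumes GNU date (as shipped on CentOS), converts both readings to seconds since the epoch using their numeric UTC offsets: +0800 for CST and -0400 for EDT, as reported by timedatectl earlier.

```shell
# Convert each reading to epoch seconds; equal values mean the same instant.
workstation=$(date -u -d '2023-07-24 15:47:50 +0800' +%s)
servera=$(date -u -d '2023-07-24 03:47:50 -0400' +%s)

if [ "$workstation" -eq "$servera" ]; then
    echo "same instant"
else
    echo "clocks differ by $((workstation - servera)) seconds"
fi
```

On a live pair of hosts, running date +%s on each one gives directly comparable numbers regardless of the configured time zone.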

Reposted from blog.csdn.net/weixin_51882166/article/details/131898175