Zabbix: resolving a host's sudden, frequent "Zabbix agent on xxxxxx is unreachable for x minutes" alarms

First, one day a host suddenly began raising frequent "Zabbix agent unreachable" alarms

The Zabbix agent log on the host showed no abnormalities

Second, the Zabbix server log contained a large number of "first network error" and "another network error" messages for that host

[root@zabbix_server etc]# cat /tmp/zabbix_server.log|grep 172.28.5.63|more

 27849:20191218:094413.077 Zabbix agent item "perf_counter[\2\250]" on host "172.28.5.63" failed: another network error, wait for 15 seconds
 27848:20191218:094428.098 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27837:20191218:094446.128 Zabbix agent item "net.if.in[Microsoft ISATAP Adapter #2]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27849:20191218:094504.088 Zabbix agent item "net.if.out[WAN Miniport (Network Monitor)-QoS Packet Scheduler-0000]" on host "172.28.5.63" failed: another network error, wait for 15 seconds
 27845:20191218:094519.094 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27836:20191218:094536.258 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #4]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27846:20191218:094551.117 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27843:20191218:094600.102 Zabbix agent item "net.if.out[Broadcom NetXtreme Gigabit Ethernet-WFP LightWeight Filter-0000]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27843:20191218:094615.127 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27837:20191218:094623.818 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #4-QoS Packet Scheduler-0000]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27847:20191218:094641.112 Zabbix agent item "net.if.in[WAN Miniport (SSTP)]" on host "172.28.5.63" failed: another network error, wait for 15 seconds
 27845:20191218:094657.134 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27834:20191218:094702.464 Zabbix agent item "vfs.fs.size[D:,free]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27852:20191218:094720.139 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27840:20191218:094723.709 Zabbix agent item "vm.memory.size[pavailable]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27847:20191218:094738.149 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27836:20191218:094802.499 Zabbix agent item "net.if.out[Broadcom NetXtreme Gigabit Ethernet #3]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27843:20191218:094818.149 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27832:20191218:094825.129 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #3-QoS Packet Scheduler-0000]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
 27851:20191218:094859.175 resuming Zabbix agent checks on host "172.28.5.63": connection restored
 27832:20191218:094903.413 Zabbix agent item "vfs.fs.size[E:,free]" on host "172.28.5.63" failed: first network error, wait for 15 seconds
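
To see how often the host flaps, the server log entries above can be tallied by type. The following is only a rough sketch: the regular expressions are derived from the log lines quoted in this post, and the names summarize and SAMPLE are made up for illustration. Other Zabbix versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this post (an assumption:
# other Zabbix versions may format these messages differently).
FAIL_RE = re.compile(
    r'Zabbix agent item "(?P<item>.+)" on host "(?P<host>[^"]+)" '
    r'failed: (?P<kind>first|another) network error'
)
RESUME_RE = re.compile(
    r'resuming Zabbix agent checks on host "(?P<host>[^"]+)"'
)


def summarize(lines, host):
    """Count 'first', 'another' and 'restored' events for one host."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.search(line)
        if match and match.group("host") == host:
            counts[match.group("kind")] += 1
            continue
        match = RESUME_RE.search(line)
        if match and match.group("host") == host:
            counts["restored"] += 1
    return counts


# Three lines copied from the log excerpt in this post.
SAMPLE = [
    ' 27849:20191218:094413.077 Zabbix agent item "perf_counter[\\2\\250]" '
    'on host "172.28.5.63" failed: another network error, wait for 15 seconds',
    ' 27848:20191218:094428.098 resuming Zabbix agent checks on host '
    '"172.28.5.63": connection restored',
    ' 27837:20191218:094446.128 Zabbix agent item '
    '"net.if.in[Microsoft ISATAP Adapter #2]" on host "172.28.5.63" '
    'failed: first network error, wait for 15 seconds',
]

if __name__ == "__main__":
    print(summarize(SAMPLE, "172.28.5.63"))
```

For a real log, an open file object can be passed in place of SAMPLE, for example summarize(open("/tmp/zabbix_server.log"), "172.28.5.63").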

Third, checking the TCP connections on the host revealed a large number of connections in the TIME_WAIT state
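
A quick way to put a number on this is to count connections per TCP state. The sketch below assumes the Windows `netstat -an` column layout (Proto, Local Address, Foreign Address, State). The function name count_tcp_states and the sample text, including the peer address 172.28.5.1, are hypothetical and not taken from the affected host.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in Windows-style `netstat -an` text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: Proto, Local Address, Foreign Address, State[, PID]
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Hypothetical sample output, for illustration only.
SAMPLE_NETSTAT = """\
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING
  TCP    172.28.5.63:10050      172.28.5.1:51234       TIME_WAIT
  TCP    172.28.5.63:10050      172.28.5.1:51240       TIME_WAIT
  UDP    0.0.0.0:123            *:*
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE_NETSTAT).most_common():
        print(state, count)
```

On the host itself, the text could come from subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout. A TIME_WAIT count that keeps growing between runs is the symptom described here.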

 

Fourth, a Baidu search turned up the following cause

On Windows Vista, Windows 7, Windows Server 2008 and Windows Server 2008 R2, TCP/IP ports in the TIME_WAIT state are no longer closed once the system has been running for 497 days

This means that, 497 days after the system starts, TCP connections in the TIME_WAIT state are never closed. The available TCP ports are gradually used up until no new TCP/IP connections can be created
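
The announcement quoted here describes the symptom only. As a plausibility check, 497 days is about how long a 32-bit counter lasts when it advances once every 10 ms. That reading is an assumption on my part, not something the quoted text states. The arithmetic is:

```python
# Assumption: a 32-bit counter that advances once every 10 ms.
COUNTER_BITS = 32
TICKS_PER_SECOND = 100          # one tick per 10 ms
SECONDS_PER_DAY = 86400

wrap_seconds = 2 ** COUNTER_BITS / TICKS_PER_SECOND
wrap_days = wrap_seconds / SECONDS_PER_DAY

print(round(wrap_days, 1))      # 497.1
```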

Fifth, log in to the host and check how long the system has been running

 

 The frequent alarms had started just the morning of the day before yesterday, which fits this explanation
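
The same check can be scripted for other hosts once their boot time is known. This is a minimal sketch: the helper names and the dates in the example are hypothetical, and the only figure taken from this post is the 497-day threshold.

```python
from datetime import datetime, timedelta

# Threshold quoted in this post.
THRESHOLD = timedelta(days=497)


def uptime_days(boot_time, now):
    """Return the uptime in days between two datetimes."""
    return (now - boot_time) / timedelta(days=1)


def past_threshold(boot_time, now):
    """True once the host has been up for 497 days or longer."""
    return now - boot_time >= THRESHOLD


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    boot = datetime(2018, 8, 1)
    now = datetime(2019, 12, 18)
    print(uptime_days(boot, now), past_threshold(boot, now))
```

On Windows, the boot time is shown in the "System Boot Time" line of the systeminfo output.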

Sixth, the solution

1. Restart the server. This is only a temporary fix: the problem returns after another 497 days of uptime

2. Install the Microsoft patch

Microsoft's official announcement:

https://support.microsoft.com/zh-cn/help/2553549/all-the-tcp-ip-ports-that-are-in-a-time-wait-status-are-not-closed-aft

The patch can no longer be downloaded directly, but it can be installed through Windows Update


Origin www.cnblogs.com/sky-cheng/p/12066143.html