Background:
Today a colleague who looks after the mail service told me that the mail server had suddenly become unavailable, so I investigated.
Process:
1. Manually telnet to port 465
The port turned out to be unavailable.
2. Check the nginx logs
The logs showed Connection reset by peer.
3. Capture packets and investigate
# Capture traffic on port 465
tcpdump -nn -i eth0 port 465
# Connect through the nginx proxy
telnet 127.0.0.1 465
# Connect to the backend directly
telnet smtp.test.cn 465
The capture showed that the address nginx connected to for the proxied request did not match the address the domain name currently resolved to.
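The comparison in step 3 can be scripted. The following is only a minimal sketch: is_current_address is a hypothetical helper name, 203.0.113.10 is a placeholder for the address seen in the tcpdump output, and the commented usage line assumes dig is installed.

```shell
# Hypothetical helper: succeed if the address nginx connected to (first
# argument) is among the addresses the name currently resolves to
# (remaining arguments).
is_current_address() {
    connected="$1"
    shift
    for addr in "$@"; do
        [ "$addr" = "$connected" ] && return 0
    done
    return 1
}

# Example usage (placeholder address, requires dig and network access):
# is_current_address 203.0.113.10 $(dig +short smtp.test.cn A) \
#     || echo "nginx is connecting to an address that DNS no longer returns"
```

If the helper fails for the address nginx is using, the proxy is probably holding a stale resolution.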
4. Root cause
nginx had cached the DNS resolution of the backend. The node behind the proxied backend address smtp.test.cn had changed and the original node had been retired, but nginx kept using the cached address of the original node, so connections failed.
Solution:
Configure the nginx resolver so that the DNS cache is refreshed periodically.
The configuration is as follows:
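server {
    listen 465;
    # DNS servers to query; cache each answer for 4800 seconds
    resolver 111.111.111.111 8.8.8.8 valid=4800s;
    # Give up on a lookup after 3 seconds
    resolver_timeout 3s;
    # A variable in proxy_pass makes nginx resolve the name at run time
    set $smtp "smtp.test.cn:465";
    proxy_pass $smtp;
    proxy_connect_timeout 60s;
}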
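Parameter description:
1. resolver
Set in the server block. It is followed by the DNS server addresses; separate multiple servers with spaces.
2. valid
How long a resolved result stays in the cache. Once it expires, nginx resolves the name again. Without valid, nginx caches each answer for the TTL given in the DNS response.
3. resolver_timeout
The timeout for a name resolution.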
Note: If you configure multiple DNS servers, make sure all of them are available, because nginx queries them in round-robin fashion. When the valid time expires, the next lookup goes to the second DNS address. If one of the addresses is unavailable, the lookup times out and nginx reports a could not be resolved (110: Operation timed out) error.
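One way to act on this note is to probe every DNS server before listing it in resolver. This is a rough sketch, not part of the original fix: probe_dns, check_resolvers and the PROBE override are hypothetical names, dig is assumed to be installed, and the 3-second limit simply mirrors resolver_timeout.

```shell
# Probe one DNS server ($1) for a name ($2). Fails if dig reports an
# error or the server returns no answer within 3 seconds.
probe_dns() {
    answer=$(dig "@$1" "$2" A +short +time=3 +tries=1 2>/dev/null) || return 1
    [ -n "$answer" ]
}

# Print every server that fails the probe; succeed only if none fail.
# PROBE may name a different probe function (useful for dry runs).
check_resolvers() {
    name="$1"
    shift
    failed=0
    for server in "$@"; do
        if ! "${PROBE:-probe_dns}" "$server" "$name"; then
            echo "unavailable: $server"
            failed=1
        fi
    done
    return "$failed"
}

# Example usage (requires network access):
# check_resolvers smtp.test.cn 111.111.111.111 8.8.8.8
```

A server reported as unavailable here would be a candidate for removal from the resolver line, since a lookup routed to it can time out.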