Nginx domain name resolution timeout

I use Aliyun DDNS (dynamic domain name resolution) to bind a domain name to the company's IP, with nginx in front as a reverse proxy. After the company's IP changed, the service behind the nginx proxy reported timeout errors and could never reach the back-end address. Accessing the service directly through the domain name, without going through nginx, worked fine. The nginx log showed that the domain name was still being resolved to the originally bound IP address, so I guessed that nginx caches domain name resolution results. Sure enough, others on the Internet have reported similar problems with nginx domain name caching.

The nginx error message is as follows:

[error] 1828#0: *146807 upstream timed out (110: Connection timed out) while connecting to upstream, client: 106.122.174.199, server: xxx.zhaodao.info, request: "GET / HTTP/1.1", upstream: "http://106.122.xxx.xxx:xxx/", host: "xxx.zhaodao.info"

I found an article on Zhihu whose analysis is quite good, so I am reprinting it here.

-------------------------------------------------------------------

 

Background:

This only covers the case where nginx acts as a proxy to a backend and the backend in proxy_pass is given as a domain name.

1. Normally, when nginx is started (or on nginx -t / reload), it resolves the IP of each domain name through the DNS server configured in the operating system

2. nginx starts (or passes the check/reload) only when all the domain names in the nginx configuration file can be resolved

3. A reminder: in my observation, ../sbin/nginx -t and ../sbin/nginx -s reload only check whether the domain name can be resolved, and the IP of the domain name is not cached at that point. Only when nginx actually forwards proxy data to the domain name in proxy_pass does it resolve the domain name through the DNS server configured in the operating system and cache the resulting IP. The IP is then cached for a long time, even a month (the whole process was confirmed by a production case and verified by packet capture)
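
The setup described in the three points above can be sketched with a minimal configuration. This is only an illustration: the host name backend.example.com is a placeholder and is not taken from the original production setup.

```nginx
server {
    listen 8080;
    server_name localhost;

    location / {
        # Literal host name (placeholder): nginx resolves it through the
        # operating system's DNS settings and keeps using the resulting
        # IP addresses until nginx is reloaded or restarted.
        proxy_pass http://backend.example.com;
    }
}
```

With a configuration of this shape, a change in the DNS record of the backend is not picked up by the running nginx.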

 

Problems I have encountered:

Production example

1. Data from our intranet is forwarded through nginx to a domain name of a third-party partner company, referred to here as domain name A

2. Domain name A of the third-party company is served by a CDN and maps to multiple IPs (IP1, IP2, IP3...), any of which may be retired at any time for some reason

3. One day the third-party company retired IP3, and domain name A no longer resolved to IP3

4. Because our nginx had cached IP3 when requesting domain name A, much of the subsequent transaction data was still sent to IP3, causing those transactions to fail. This went on for about 2 weeks, during which we did not reload nginx, which shows that nginx cached IP3 for a long time. That was the cause of our failed transactions (it took several days to track down at the time). Later, after checking with several parties, we learned that the third-party company had retired IP3 as early as 3 weeks before. (Probably, once DNS had refreshed across the network, domain name A no longer resolved to IP3, but the server behind IP3 stayed in service for a while, so we only started to see errors in the second week after the retirement)

 

Analysis and solution:

1. Since the problem is caused by nginx caching the IP from the DNS record of the domain name, how can it be solved? There are two methods:

(1) Manually reload nginx so that nginx resolves the domain name again. The IP it gets is then the latest one and no longer includes the retired IP3

(2) Set a DNS cache time in nginx, for example expire after 600s, and then resolve again

 

2. Method (2) is of course the better one, but where is the nginx DNS cache time set? I could not find it!

3. But I did find another way: nginx's resolver

 

Nginx's resolver solution

1. By default, nginx will resolve the domain name through the DNS server (/etc/resolv.conf) set by the operating system

2. In fact, nginx can also be given its own DNS servers instead of using the operating system's DNS

3. This is done with the resolver directive

The sample configuration is as follows:

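server {
    listen 8080;
    server_name localhost;
    # DNS servers that nginx queries itself; cache each answer for 3600s
    resolver 114.114.114.114 223.5.5.5 valid=3600s;
    # Give up on a DNS lookup after 3 seconds
    resolver_timeout 3s;
    # Keep the backend domain name in a variable so it is resolved at request time
    set $qq "www.qq.com";
    location / {
        proxy_pass http://$qq;
    }
}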

 

Parameter Description:

# resolver can be set globally in http, or in server or location

# resolver is followed by the DNS servers to use; several can be given, separated by spaces

# valid sets the DNS cache expiration time and overrides the TTL of the DNS response; choose it according to your situation, 600s or more is recommended

# resolver_timeout sets the timeout for resolving a domain name (the default is 30s); about 3 seconds is recommended

# Note: when resolver is followed by several DNS servers, make sure every one of them works, because nginx queries them in round-robin fashion. When a DNS record expires (the valid time is exceeded), the first DNS server (114.114.114.114) is asked; the next time the record expires, the second DNS server (223.5.5.5) is asked. In my own tests, if any one of the DNS servers is down, a lookup sent to it waits until resolver_timeout and then fails; the log reports that the domain name cannot be resolved, and the page returns a 502 error.

# Key point: as in the example above, when proxying to the back-end domain name http://www.qq.com, do not write the domain name directly in proxy_pass. nginx uses the resolver at request time only when proxy_pass contains a variable, so the domain name must first be defined in a variable and then used as proxy_pass http://$variable_name. A domain name written literally in proxy_pass is resolved when the configuration is loaded, and the nginx syntax check reports an error if it cannot be resolved at that moment
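
As a hedged variation on the sample above, the sketch below sets the resolver once at the http level, lists a single DNS server, and proxies to a backend on a non-default port. The host name backend.example.com, the port 8080 on the backend, and the variable name $backend_host are assumptions made for illustration. When proxy_pass contains variables and also specifies a URI, nginx passes that URI to the backend as is, so $request_uri is used here to forward the original path and query string.

```nginx
http {
    # List only DNS servers known to answer reliably; nginx queries
    # the listed servers in round-robin fashion.
    resolver 223.5.5.5 valid=600s;
    # Give up on a lookup after 3 seconds (the default is 30s).
    resolver_timeout 3s;

    server {
        listen 8080;
        server_name localhost;

        # Hypothetical backend host name, kept in a variable so that
        # nginx resolves it at request time through the resolver above.
        set $backend_host "backend.example.com";

        location / {
            # $request_uri forwards the original path and query string.
            proxy_pass http://$backend_host:8080$request_uri;
        }
    }
}
```

This is a fragment of nginx.conf rather than a complete file; a full configuration also needs an events block.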

 

Postscript

I tested the whole process myself and it works without problems

If you know other or better approaches, please reply so we can discuss them.

Reprinted from

Author: eraser
Link: https://www.zhihu.com/question/61786355/answer/268735267
Source: Zhihu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization. For non-commercial reprints, please indicate the source.

Origin blog.csdn.net/Lixuanshengchao/article/details/105491426