How the DNS Protocol Works

0. Preface

DNS, the Domain Name System, does a great deal of work behind the scenes to keep websites reachable. This article explains how the DNS protocol works, so that we can understand what happens behind the websites we visit every day.

1. What is the DNS protocol

Before studying the DNS protocol, let's distinguish two concepts, the IP address and the domain name:

  • IP address: a string of numbers that identifies a computer on a network.
  • Domain name: also known as a network domain, the name of a computer or group of computers on the Internet, made up of labels separated by dots, such as www.baidu.com. It identifies the computer's location during data transmission (and sometimes also hints at its geographic location).

Some readers may confuse a domain name with a website address (URL). A URL contains a domain name. For example, www.gitee.com/veal98 is a URL, and www.gitee.com is the domain name inside it.
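
As a small illustration of the difference, Python's standard `urllib.parse` module can pull the domain name out of a URL. The `https://` scheme is an assumption added here, since the example above omits it and `urlsplit` only recognizes the host when a scheme (or a leading `//`) is present. The helper name `domain_of` is made up for this sketch.

```python
from urllib.parse import urlsplit

def domain_of(url: str) -> str:
    """Return the host (domain name) part of a URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host found in {url!r}")
    return host

url = "https://www.gitee.com/veal98"
print(domain_of(url))        # www.gitee.com
print(urlsplit(url).path)    # /veal98
```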

IP addresses are hard to remember and say nothing about the name or nature of the organization behind them. Domain names were designed to solve this, and the Domain Name System (DNS) maps domain names and IP addresses to each other, so people can access the Internet without memorizing the numeric addresses that machines read directly. Mapping a domain name to an IP address is called forward resolution, and mapping an IP address to a domain name is called reverse resolution.
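
Both directions can be tried from Python's standard library. The sketch below is only an illustration: the two lookup functions need network access and are defined but not called here, and the function names are made up for this example. The part that runs offline shows the special name that a reverse lookup actually asks about, which for IPv4 is the address with its octets reversed under `in-addr.arpa`. The address 192.0.2.1 is a reserved documentation address, not a real host.

```python
import ipaddress
import socket

def reverse_name(ip: str) -> str:
    """Name that a reverse (PTR) lookup asks about for this address."""
    return ipaddress.ip_address(ip).reverse_pointer

def forward_lookup(domain: str) -> str:
    """Forward resolution via the OS resolver (needs network access)."""
    return socket.gethostbyname(domain)

def reverse_lookup(ip: str) -> str:
    """Reverse resolution via the OS resolver (needs network access)."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname

print(reverse_name("192.0.2.1"))   # 1.2.0.192.in-addr.arpa
```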

DNS can be carried over either UDP or TCP, and uses port 53 in both cases. Most queries are sent over UDP; TCP is typically used for zone transfers and for responses too large to fit in a UDP message.
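
To make the transport concrete, here is a minimal sketch of a DNS query built by hand and sent over UDP port 53. It assumes the standard wire format (a 12-byte header followed by one question) and asks only for an A record. The function names and the fixed query ID are illustrative choices, and `send_query` needs network access and a resolver address supplied by the caller, so it is defined but not called.

```python
import socket
import struct

def build_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Encode a single A-record question in DNS wire format."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A (IPv4 address), QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)

def send_query(domain: str, server: str, timeout: float = 3.0) -> bytes:
    """Send the query over UDP port 53 and return the raw reply (needs network)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(domain), (server, 53))
        reply, _addr = sock.recvfrom(512)
        return reply

packet = build_query("www.baidu.com")
print(len(packet))        # 31
print(packet[12:].hex())  # the encoded question section
```

The whole query for www.baidu.com is 31 bytes, which shows why a single small UDP datagram is enough for an ordinary lookup.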

2. Domain Name Details

❓ So who is responsible for defining and managing domain names? Surely they can't be made up at will?

The highest management body of domain names in the world is an organization called ICANN (Internet Corporation for Assigned Names and Numbers), headquartered in California, USA. ICANN manages the operation of the domain name system around the world.

Domain names have a hierarchical structure. From top to bottom: root domain, top-level domain (TLD), second-level domain, and (optionally) third-level domain.

① Top-level domain name

Let's start with the top-level domain (TLD), the highest level below the root. Simply put, it is the last part of the domain name. For example, the top-level domain of www.baidu.com is .com. A major part of ICANN's job is deciding which strings may be top-level domains. As of July 2015 there were 1,058 top-level domains, which fall roughly into two categories:

  • Generic top-level domains (gTLDs), such as .com, .net, .edu, .org and .xxx, more than 700 in total.
  • Country-code top-level domains (ccTLDs), which represent countries and regions, such as .cn (China), .io (British Indian Ocean Territory), .cc (Cocos Islands) and .tv (Tuvalu), more than 300 in total.
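
The two categories can be told apart with a rough heuristic: country-code TLDs are two ASCII letters, while generic ones are longer. The sketch below relies on that rule only. It is an assumption of this example, it ignores internationalized ccTLDs and special-purpose TLDs, and the function names are made up for illustration.

```python
def tld_of(domain: str) -> str:
    """Return the last label of a domain name, without the dot."""
    return domain.rstrip(".").rsplit(".", 1)[-1].lower()

def tld_category(domain: str) -> str:
    """Heuristic: two ASCII letters means ccTLD, anything else gTLD."""
    tld = tld_of(domain)
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "ccTLD"
    return "gTLD"

for name in ("www.baidu.com", "example.cn", "example.io", "example.org"):
    print(name, tld_of(name), tld_category(name))
```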

Of course, ICANN does not manage these top-level domains itself; that would be impractical. There are more than 1,000 top-level domains, each with many resellers beneath it, and managing every one directly would be far too much work. ICANN's policy is that each top-level domain has a custodian responsible for all aspects of that domain, and ICANN deals only with the custodian. For example, the custodian of the .cn ccTLD is the China Internet Network Information Center (CNNIC), which sets the policies for .cn domain names.

② Second-level domain name

The second-level domain (SLD) means different things under a generic top-level domain and under a country-code top-level domain:

  • Under a gTLD: it is generally the online name chosen by the registrant, such as yahoo in yahoo.com. Commercial organizations usually use their trademark, trade name or other commercial mark, as in baidu.com.
  • Under a ccTLD: it is often a label that resembles a generic top-level domain and indicates the category or function of the registrant. For example, in .com.cn, com is a second-level domain under the ccTLD .cn and denotes a commercial organization in China.

A third-level domain is a name like www.baidu.com, a subdomain of the second-level domain; it is recognizable by the two dots in the name. For the domain owner, third-level domains belong to the second-level domain and incur no separate fee. They are often not counted as domain names in their own right and are usually just called subdomains.

③ Root domain name

❓ So where is the root domain name? Isn't the root domain name the topmost in the hierarchy? Why don't I see it in the domain name?

Above all top-level domains sits a single highest node in the hierarchy, called the root domain, which is managed by ICANN. A name such as www.xxx.com is sometimes written as www.xxx.com., with an extra dot at the end. That trailing dot stands for the root domain; it is normally omitted, which is why you rarely see it.
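
The hierarchy can be made visible by walking a name from the root downwards. This is a minimal sketch with an illustrative function name; it writes every level in fully qualified form, with the trailing dot that stands for the root.

```python
def hierarchy(domain: str) -> list[str]:
    """List the levels of a name from the root down to the full name."""
    labels = domain.rstrip(".").split(".")
    levels = ["."]  # the root domain
    # Add one label at a time, starting from the rightmost (the TLD).
    for i in range(len(labels) - 1, -1, -1):
        levels.append(".".join(labels[i:]) + ".")
    return levels

print(hierarchy("www.baidu.com"))
# ['.', 'com.', 'baidu.com.', 'www.baidu.com.']
```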

In theory, every domain name query starts at the root domain, because only the root can tell you which servers manage a given top-level domain. Accordingly, ICANN maintains a list (the root domain list) that records each top-level domain and its custodian.

For example, to access abc.xyz, I must first consult the root domain list, which tells me that the .xyz domain is hosted by CentralNic. The list also records that .google is hosted by Google, .apple by Apple, and so on.

Since the root domain list rarely changes, most DNS service providers cache it, so queries to the root are actually not that frequent.

3. Domain Name Server Details

A domain name server is a host, together with its software, that manages domain names and holds information about the domains it is responsible for. The part of the name space that a server is responsible for is called a zone. Each level of the hierarchy has its own kind of name server:

  • Root name servers
  • Top-level domain (TLD) name servers
  • Authoritative name servers

Besides these three kinds of DNS server, there is a fourth that sits outside the DNS hierarchy but is very important: the local domain name server. Let's look at what each of the four does.

① Root domain name server

As mentioned above, ICANN maintains a root domain list that records the top-level domains and their custodians. The official name of this list is the DNS root zone, and a server that holds the DNS root zone file is called a DNS root name server. The root name servers hold the addresses of all top-level domain name servers.

Early DNS responses were limited to a 512-byte UDP packet, which can hold the addresses of at most 13 servers. It was therefore stipulated that there are 13 root name servers, named a.root-servers.net through m.root-servers.net. Ten of them were originally based in the US, with one each in the Netherlands, Sweden and Japan. Today each of the 13 names is served by many machines around the world using anycast, so the 13 are logical servers rather than 13 physical hosts.

As noted, in theory every query starts at the root, so name servers generally keep a stored copy of the root name servers' IP addresses, ready for when they need to send a request.
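
The naming scheme is regular enough to generate. The short sketch below builds the 13 names from a to m; the constant name `ROOT_SERVERS` is an illustrative choice, and the addresses themselves are deliberately left out because a real resolver reads them from a stored hints file rather than from code like this.

```python
import string

# The 13 root server names, a.root-servers.net through m.root-servers.net.
ROOT_SERVERS = [
    f"{letter}.root-servers.net" for letter in string.ascii_lowercase[:13]
]

print(len(ROOT_SERVERS))                   # 13
print(ROOT_SERVERS[0], ROOT_SERVERS[-1])   # a.root-servers.net m.root-servers.net
```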

② Top-level domain name server

Just as the root name servers manage the top-level domains, a TLD name server manages all second-level domains registered under its top-level domain. For each of them it records the addresses of the authoritative name servers responsible for that second-level domain.

③ Authoritative domain name server

Following the same logic, one might expect an authoritative name server to manage all third- and fourth-level domains registered under a second-level domain, but that is not quite how it works. If every second-level, third-level or fourth-level domain had its own name server, the number of servers would be enormous. Zones solve this problem: an authoritative name server is the name server responsible for one zone.

❓ What is a zone? How are zones divided?

Zones and domains are not the same thing, and a domain can be divided into zones in many ways. Take Baidu as an example and assume three third-level domains: fanyi.baidu.com, ai.baidu.com, and tieba.baidu.com. One possible division puts fanyi.baidu.com and tieba.baidu.com on the baidu.com authoritative name server and ai.baidu.com on its own ai.baidu.com authoritative name server. The two authoritative name servers have equal status, and Baidu decides the actual division based on the number of domain names and the volume of traffic.
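
Given such a division, finding the zone responsible for a name amounts to picking the most specific zone whose name is a suffix of it. The sketch below assumes the hypothetical two-zone split from the Baidu example; `ZONES` and `zone_for` are names made up for illustration.

```python
from typing import Optional

# Hypothetical zone division taken from the example above.
ZONES = {"baidu.com", "ai.baidu.com"}

def zone_for(name: str, zones=ZONES) -> Optional[str]:
    """Return the most specific zone that contains the given name."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then drop the leftmost label each time,
    # so the longest matching zone wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None

for host in ("fanyi.baidu.com", "tieba.baidu.com", "ai.baidu.com"):
    print(host, "->", zone_for(host))
```

Here fanyi.baidu.com and tieba.baidu.com land in the baidu.com zone, while ai.baidu.com lands in its own zone.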

④ Local domain name server

As mentioned, the local domain name server sits outside the DNS hierarchy but is very important. Often also called a recursive resolver, it is the default server a computer uses for resolution, that is, the preferred and alternate DNS servers configured on the computer. Common choices are the DNS services of ISPs such as China Telecom and China Unicom, or public resolvers such as those of Google and Alibaba.

Every Internet service provider, every university, and even every department within a university can run its own local domain name server. When a host issues a DNS query, the request goes to the host's local domain name server, which resolves the names it already knows and queries upper-level name servers for the rest.

So how exactly does the local domain name server pass a query on to the upper-level name servers?

4. DNS query method

There are two specific DNS query methods:

  • recursive query
  • iterative query

Recursion means that if the receiver of a request does not know the answer, it takes over the role of requester: it issues the necessary requests itself until it has the answer, then returns it to the original requester.

In plain terms: in a recursive query, if A asks B, then B must hand A the final answer. In an iterative query, if B does not have the exact answer, B tells A where to ask next, but does not send that request itself.
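
The contrast can be sketched with three toy servers, each of which either knows the answer or refers to the next server. All server names and the address are made up for illustration, and the log simply records who sends each query.

```python
# Toy name servers: each either knows the answer or refers to another server.
SERVERS = {
    "root": {"referral": "tld-com"},
    "tld-com": {"referral": "auth-example"},
    "auth-example": {"answer": "192.0.2.10"},
}

def recursive_query(asker: str, server: str, log: list) -> str:
    """The server that receives the query does the follow-up work itself."""
    log.append(f"{asker} -> {server}")
    record = SERVERS[server]
    if "answer" in record:
        return record["answer"]
    # The server, not the original client, asks the next server.
    return recursive_query(server, record["referral"], log)

def iterative_query(server: str, log: list) -> str:
    """The client follows each referral and sends every query itself."""
    while True:
        log.append(f"client -> {server}")
        record = SERVERS[server]
        if "answer" in record:
            return record["answer"]
        server = record["referral"]  # referral handed back to the client

rec_log, it_log = [], []
print(recursive_query("client", "root", rec_log), rec_log)
print(iterative_query("root", it_log), it_log)
```

Both calls return the same address. The difference is in the logs: in the recursive case each server forwards the question onward, while in the iterative case every query originates from the client.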

Queries between name servers are generally iterative, which keeps excessive load off the root name servers.

5. Domain caching

The above describes queries between name servers. In practice, countless users are online at any moment, and contacting the local domain name server for every single lookup would be impractical. The solution is caching: saving the mapping from domain names to IP addresses.

There are two local cache methods for DNS records in the computer: browser cache and operating system cache.

1) Browser cache: after the browser obtains the IP address for a domain name, it caches it to cut the cost of repeated network requests. Each browser keeps DNS entries for a set time; Chrome, for example, expires them after 1 minute and does not query DNS again within that period.

2) Operating system cache: the operating system consults the hosts file configured by the user, and also keeps its own cache of recent lookups. On Windows 10, for example, the hosts file is stored at C:\Windows\System32\drivers\etc\hosts
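
A hosts file is plain text: each line holds an IP address followed by one or more host names, and `#` starts a comment. The parser below is a minimal sketch of that format. The sample entries are made up, and keeping the first mapping seen for a name is an assumption of this example rather than documented behaviour of any particular system.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption of this sketch: the first entry for a name wins.
            mapping.setdefault(name.lower(), ip)
    return mapping

SAMPLE = """\
# sample entries, made up for illustration
127.0.0.1   localhost
192.0.2.20  intranet.example  intranet   # two names for one address
"""
print(parse_hosts(SAMPLE))
```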

Windows enables its DNS caching service by default. It appears as DNS Client in the service list (service name Dnscache) and caches recently resolved domain names.

Use the command ipconfig /displaydns to view the domain names cached on the computer.

⭐ When a page is opened in the browser, the browser cache is checked first. On a miss, the operating system cache is checked, and finally the local domain name server is queried. The local domain name server then looks up the record on the host's behalf and returns the result. The query between the host and the local domain name server is recursive: the host asks the local domain name server, and the local domain name server, as the receiver, must give the host the final answer.
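
The expiry behaviour of such a cache can be sketched in a few lines. The class below is an illustration, not how any browser or operating system implements it: it uses one fixed lifetime for every entry (60 seconds, echoing the Chrome figure above), whereas real DNS caches normally honour a lifetime carried by each record. The clock is injectable so the example can fake the passage of time.

```python
import time

class TTLCache:
    """Tiny DNS-style cache: entries expire after ttl seconds."""

    def __init__(self, ttl: float = 60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock     # injectable so the example can fake time
        self._store = {}       # domain -> (ip, expiry time)

    def get(self, domain: str):
        entry = self._store.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[domain]   # expired: force a fresh lookup
            return None
        return ip

    def put(self, domain: str, ip: str) -> None:
        self._store[domain] = (ip, self.clock() + self.ttl)

now = [0.0]                            # fake clock, in seconds
cache = TTLCache(ttl=60.0, clock=lambda: now[0])
cache.put("www.example.com", "192.0.2.10")
print(cache.get("www.example.com"))    # 192.0.2.10
now[0] = 61.0                          # more than a minute later
print(cache.get("www.example.com"))    # None
```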

6. Complete domain name resolution process

Combining the query process between name servers with the caches described above gives the complete DNS resolution process. We take forward resolution (domain name to IP address) as the example:

1) The browser first searches its own DNS cache, which holds a table mapping domain names to IP addresses;

2) On a miss, the operating system's DNS cache is searched;

3) On another miss, the operating system sends the domain name to the local domain name server, which checks its own DNS cache and returns the result if found (note: the query between the host and the local domain name server is recursive);

4) If the local domain name server's cache also misses, it queries the upper-level name servers iteratively (note: queries between the local domain name server and other name servers are iterative, to keep load off the root name servers):

  • First, the local domain name server sends a request to a root name server. The root name server, at the top of the hierarchy, does not return the IP address for the domain name; it returns the address of the relevant top-level domain name server, pointing the way to where the answer can be found.
  • With that address, the local domain name server sends a request to the top-level domain name server and obtains the address of the authoritative name server.
  • The local domain name server then sends a request to the authoritative name server and obtains the IP address for the domain name.

5) The local domain name server returns the IP address to the operating system and caches it;

6) The operating system returns the IP address to the browser and caches it;

7) The browser now has the IP address for the domain name, and caches it as well.
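
The whole chain can be simulated with plain dictionaries. The sketch below is a toy model under stated assumptions: the server names, the example domain and its address are made up, cache expiry is left out, and the three lookup tables stand in for the root, top-level and authoritative name servers.

```python
# Made-up data: the referral chain a local name server would follow.
ROOT = {"com": "tld-com-server"}
TLD = {"tld-com-server": {"example.com": "auth-example-server"}}
AUTH = {"auth-example-server": {"www.example.com": "192.0.2.10"}}

def iterative_lookup(domain: str, trace: list) -> str:
    """What the local name server does when its own cache misses."""
    domain = domain.rstrip(".")
    labels = domain.split(".")
    tld_server = ROOT[labels[-1]]                          # ask the root
    trace.append("root")
    auth_server = TLD[tld_server][".".join(labels[-2:])]   # ask the TLD server
    trace.append(tld_server)
    ip = AUTH[auth_server][domain]                         # ask the authoritative server
    trace.append(auth_server)
    return ip

def resolve(domain: str, caches: list, trace: list) -> str:
    """Check each cache in order, then fall back to the iterative lookup."""
    missed = []
    for name, cache in caches:
        if domain in cache:
            trace.append(f"{name} cache hit")
            ip = cache[domain]
            break
        missed.append(cache)
    else:
        ip = iterative_lookup(domain, trace)
    for cache in missed:       # every layer that missed stores the answer
        cache[domain] = ip
    return ip

caches = [("browser", {}), ("os", {}), ("local-dns", {})]
trace = []
print(resolve("www.example.com", caches, trace), trace)
# 192.0.2.10 ['root', 'tld-com-server', 'auth-example-server']
trace = []
print(resolve("www.example.com", caches, trace), trace)
# 192.0.2.10 ['browser cache hit']
```

The first call walks the full chain and fills every cache on the way back; the second call is answered by the browser cache without contacting any server.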

7. Qualification for public domain name resolution

Resolving a domain name on the public network requires authorization from the domain name registrar. For the relevant laws and regulations, see the Internet Domain Name Management Measures.

Origin blog.csdn.net/qq_32907491/article/details/131715986