Foreword
With the advent of Web 2.0, Internet application architecture has shifted from the traditional C/S (client/server) model to the more convenient and efficient B/S (browser/server) model. The B/S architecture makes network applications much easier to use and improves the user experience.
The B/S architecture brings the following two benefits:
- The client is a unified browser (Browser). Because every user has the same kind of client, no special configuration or dedicated network connection is required. Browser interaction is also easy to learn, and the habits carry over: once users know how to browse the web, they can use any other Internet service in the same way.
- The server (Server) speaks a unified protocol, HTTP, unlike a traditional C/S application, which usually defines its own application-layer protocol. A unified protocol simplifies development: mature HTTP servers such as Apache, Nginx and Tomcat can be used directly, and so can general-purpose development frameworks such as Spring, Spring MVC and MyBatis. We can concentrate on the business logic instead of building that infrastructure ourselves.
Overview of B/S network architecture
B/S applications exchange data over a unified application-layer protocol, HTTP. Unlike most traditional C/S Internet applications, which keep a long-lived connection open, HTTP usually works as a stateless, short-lived connection: one request normally completes one data exchange, and the connection is then closed. Because connections are not held open, the server can respond to more user requests.
When you type the URL antoniopeng.com into the browser and press Enter, many things happen:
- First, the browser asks DNS to resolve the domain name to the corresponding IP address.
- Then it uses this IP address to locate the server on the Internet and sends it a request (GET, POST, ...). The server returns the default resource to the user, and producing that response may involve complicated business logic on the server side:
  - There may be many servers, with a load balancing device (such as Nginx) distributing user requests evenly among them.
  - The requested data may come from a cache, from a static file, or from a database.
- Finally, when the data reaches the browser, the browser parses it, and for every static resource it finds (such as CSS, JS and images) it issues another HTTP request. These requests are often directed at a CDN, in which case a CDN server handles them.
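The first step of the list can be sketched with the JDK alone. The class below is a hypothetical helper, not part of the original article: it splits what the user typed into host, port and path, and assumes plain HTTP when no scheme is given. The `main` method then asks DNS for the IP address, which needs network access.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

/** Hypothetical helper: what a client must know before it can connect. */
final class RequestTarget {
    final String host;
    final int port;
    final String path;

    RequestTarget(String host, int port, String path) {
        this.host = host;
        this.port = port;
        this.path = path;
    }

    /** Splits a URL into host, port and path, filling in the usual defaults. */
    static RequestTarget parse(String url) {
        // Assumption of this sketch: a bare "antoniopeng.com" means plain HTTP.
        URI uri = URI.create(url.contains("://") ? url : "http://" + url);
        boolean https = "https".equalsIgnoreCase(uri.getScheme());
        int port = uri.getPort() != -1 ? uri.getPort() : (https ? 443 : 80);
        String rawPath = uri.getPath();
        String path = (rawPath == null || rawPath.isEmpty()) ? "/" : rawPath;
        return new RequestTarget(uri.getHost(), port, path);
    }

    public static void main(String[] args) throws UnknownHostException {
        RequestTarget target = parse("antoniopeng.com");
        // DNS resolution of the host name (requires network access).
        InetAddress address = InetAddress.getByName(target.host);
        System.out.println(target.host + ":" + target.port + target.path
                + " -> " + address.getHostAddress());
    }
}
```

Only after these values are known can the client open a connection and send the actual GET or POST request.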
How to make a request
This question is both simple and complex. It is simple because all we have to do is type a URL into the browser and press Enter: the HTTP request is sent, and the result comes back almost immediately. It is complex because a request can also be initiated without the help of a browser.
An HTTP connection is essentially a Socket connection, so a program can fully simulate the request a browser would send. Apache HttpClient is a toolkit for issuing HTTP requests from a program.
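To illustrate that an HTTP request is just text written to a socket, here is a minimal sketch that uses only the JDK. The class name `RawHttpGet` and its methods are made up for this example, and it deliberately ignores redirects, HTTPS and chunked decoding; it only shows the idea.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class RawHttpGet {

    /** Builds the text of a minimal HTTP/1.1 GET request. */
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n" // short connection: close after the response
                + "\r\n";                 // an empty line ends the header section
    }

    /** Sends the request over a plain TCP socket and returns the raw response text. */
    static String send(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Because of "Connection: close", end of stream marks the end of the response.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            // Decoding as UTF-8 is an assumption; the real charset is in Content-Type.
            return new String(response.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires network access.
        System.out.println(send("www.baidu.com", 80, "/"));
    }
}
```

A toolkit such as HttpClient does the same work underneath and adds connection pooling, header handling and body decoding on top.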
The following example is based on HttpClient:
Introduce dependencies
Add the org.apache.httpcomponents:httpclient dependency to pom.xml:
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.5</version>
</dependency>
Create an HTTP GET request
The implementation is as follows:
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

public class MyTest {

    public static void main(String[] args) {
        get();
    }

    private static void get() {
        // Create the HttpGet request
        HttpGet httpGet = new HttpGet("http://www.baidu.com");
        // Ask for a persistent (keep-alive) connection
        httpGet.setHeader("Connection", "keep-alive");
        // Set the User-Agent (simulate a specific browser version)
        httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36");
        // Set the Cookie
        httpGet.setHeader("Cookie", "UM_distinctid=34342706a09352-0376059833914f-3c604504-1fa400-16442706a0b345; CNZZDATA1262458286=1603637673-1530123020-%7C1530123020; JSESSIONID=805587506F1594AE02DC45845A7216A4");

        // Create the HttpClient client and execute the request.
        // try-with-resources closes the response and the client in every case.
        try (CloseableHttpClient httpClient = HttpClients.createDefault();
             CloseableHttpResponse httpResponse = httpClient.execute(httpGet)) {
            HttpEntity httpEntity = httpResponse.getEntity();
            // Print the response body
            System.out.println(EntityUtils.toString(httpEntity));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Besides HttpClient, which is very common in Java, the command line offers the curl command: curl followed by a URL is enough to send an HTTP request.
- Enter the command
curl https://www.baidu.com
- The HTML data returned
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>百度一下，你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus=autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn" autofocus></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新闻</a> <a href=https://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地图</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>视频</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>贴吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登录</a> </noscript> <script>document.write('<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u='+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ '" name="tj_login" class="lb">登录</a>');
</script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多产品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>关于百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>©2017 Baidu <a href=http://www.baidu.com/duty/>使用百度前必读</a> <a href=http://jianyi.baidu.com/ class=cp-feedback>意见反馈</a> 京ICP证030173号&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>
HTTP parsing
The most important part of understanding HTTP is getting familiar with the HTTP Header, which controls how data is transmitted. Above all, headers and status codes steer the rendering behavior of the browser and the execution logic of the server. For example, when the server does not have the data the user requested, it returns a 404 status code to tell the browser that the resource does not exist, and the browser usually shows the unwelcome "This page does not exist" error message.
Common HTTP request headers
Request header | Explanation |
---|---|
Accept-Charset | Character sets the client can accept |
Accept-Encoding | Content encodings the client can accept (e.g. Accept-Encoding: gzip, deflate) |
Accept-Language | Natural languages the client prefers (e.g. Accept-Language: zh-cn) |
Host | Host and port number of the requested resource (e.g. Host: www.baidu.com) |
User-Agent | Tells the server about the client's operating system, browser and other attributes |
Connection | Whether the current connection should be kept open (e.g. Connection: Keep-Alive) |
Common HTTP response headers
Response header | Explanation |
---|---|
Server | Server name (e.g. Server: nginx/1.17.6) |
Content-Type | Media type of the entity sent to the recipient (e.g. Content-Type: text/html; charset=GBK) |
Content-Encoding | Counterpart of Accept-Encoding: the encoding the server applied |
Content-Language | Counterpart of Accept-Language: the natural language of the resource |
Content-Length | Length of the body in bytes |
Keep-Alive | How long the connection is kept open (e.g. Keep-Alive: timeout=5) |
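Headers like the ones in the two tables travel as plain "Name: value" lines. The following is a minimal sketch of turning such a block into a map; `HeaderParser` is a hypothetical name, header names are treated as case-insensitive, and a repeated header simply overwrites the earlier value, which a real client would handle more carefully.

```java
import java.util.Map;
import java.util.TreeMap;

final class HeaderParser {

    /** Parses "Name: value" lines into a map whose keys ignore case. */
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : headerBlock.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skips the status line, blank lines and malformed lines
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\n"
                + "Server: nginx/1.17.6\r\n"
                + "Content-Type: text/html; charset=GBK\r\n"
                + "Keep-Alive: timeout=5\r\n";
        Map<String, String> headers = parse(raw);
        System.out.println(headers.get("content-type")); // prints text/html; charset=GBK
    }
}
```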
Common HTTP status codes
Status code | Explanation |
---|---|
200 | The request succeeded |
302 | Temporary redirect |
400 | The client request has a syntax error and the server cannot understand it |
403 | The server received the request but refuses to serve it, i.e. the client has no permission |
404 | The requested resource does not exist |
500 | An unexpected error occurred on the server |
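The first digit of a status code tells which class it belongs to, which is often all a client needs to decide how to react. A small sketch, with the hypothetical class name `StatusCodes`:

```java
final class StatusCodes {

    /** Maps a status code to its class; the first digit decides. */
    static String category(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    /** Extracts the numeric code from a status line such as "HTTP/1.1 404 Not Found". */
    static int parseStatusLine(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        int code = parseStatusLine("HTTP/1.1 404 Not Found");
        System.out.println(code + " is a " + category(code)); // prints 404 is a client error
    }
}
```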
View HTTP information
To view the request headers and response headers of an HTTP request, open the browser's debugging tools with the F12 shortcut key. For example, while visiting www.baidu.com, press F12 and open the Network panel to see the HTTP Header content.
Browser caching mechanism
When a web page does not show the latest content, the usual suspicion is that the browser has cached it. The common fix is to press the Ctrl + F5 key combination to request the page again, which makes sure the latest page is fetched: Ctrl + F5 sends the request straight to the target URL instead of using the browser's cached data.
As shown in the figure, this request did not reach the server; the browser's cached data was used.
After pressing Ctrl + F5 to refresh the page, you will usually find two extra fields in the HTTP request headers, Cache-Control: no-cache and Pragma: no-cache. They tell the browser and any intermediate caches not to answer the request from a cached copy without first checking with the server.
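The effect of these two headers can be sketched as a single decision a cache makes before it answers. This is a deliberately simplified model with a hypothetical class name, `CachePolicy`; a real cache also looks at freshness lifetimes and validators such as ETag.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class CachePolicy {

    /**
     * Returns true when the request headers allow a cache to answer
     * from a stored copy without contacting the server first.
     */
    static boolean mayServeFromCache(Map<String, String> requestHeaders) {
        for (Map.Entry<String, String> header : requestHeaders.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            String value = header.getValue().toLowerCase(Locale.ROOT);
            boolean cacheHeader = name.equals("cache-control") || name.equals("pragma");
            if (cacheHeader && value.contains("no-cache")) {
                return false; // the client insists on going back to the server
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> normalRefresh = new HashMap<>();
        normalRefresh.put("Host", "www.baidu.com");

        Map<String, String> forcedRefresh = new HashMap<>(normalRefresh);
        forcedRefresh.put("Cache-Control", "no-cache");
        forcedRefresh.put("Pragma", "no-cache");

        System.out.println(mayServeFromCache(normalRefresh)); // prints true
        System.out.println(mayServeFromCache(forcedRefresh)); // prints false
    }
}
```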
DNS domain name resolution
Resources on the Internet are published and requested by URL (Uniform Resource Locator), and the domain name in a URL has to be resolved into an IP address before a connection to the remote host can be established. Resolving a domain name into an IP address is the job of DNS resolution.
When the user enters www.baidu.com in the browser, DNS resolution works as follows:
- First, the browser checks whether its cache already holds a resolved IP address for this domain name. If it does, the resolution process ends here. How long a domain name stays cached is controlled by its TTL property.
- If the browser cache has no entry, the browser checks whether the operating system has a DNS resolution result for the domain name. On Windows this can be configured in the file C:\Windows\System32\drivers\etc\hosts, and on Linux in /etc/hosts; editing this file lets you set the IP address a name resolves to.
- If the steps above cannot resolve the domain name, a real request is sent to a domain name server. The operating system first sends the domain name to the Local DNS Server, the domain name server for your area. For example, if you go online through a campus network, the local domain name server is most likely inside your school; if you go online from a residential community, the Local DNS Server is run by the provider that connects you to the Internet (China Telecom, China Mobile or China Unicom), usually somewhere in your city and not very far away.
- If the Local DNS Server still has no hit, it sends the request directly to the Root DNS Server (root domain name server).
- The root domain name server returns to the Local DNS Server the address of the gTLD Server responsible for the queried domain name. A gTLD Server is a top-level domain name server, responsible for a top-level domain such as .com or .cn.
- The Local DNS Server (local domain name server) then sends the request to the gTLD Server whose address it has just received.
- The gTLD Server that accepts the request looks up and returns the address of the Name Server for this domain name. This Name Server is usually operated by your domain name registration service provider (for example Alibaba Cloud's Wanwang).
- The Name Server looks up its table of domain name to IP mappings and, under normal circumstances, returns the IP record for the domain name, together with a TTL value, to the Local DNS Server (local domain name server).
- The Local DNS Server caches the mapping between the domain name and the IP address for the time given by the TTL value, and the final resolution result is returned to the user.
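The caching behavior that runs through these steps (answer from the cache while the TTL has not expired, otherwise ask the next server in the chain) can be sketched as follows. `CachingResolver` is a hypothetical class; the upstream resolver is passed in as a function, the TTL is a fixed constructor argument rather than a value returned with each record, and the IP address is a documentation address, not a real one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class CachingResolver {

    private static final class Entry {
        final String ip;
        final long expiresAtMillis;

        Entry(String ip, long expiresAtMillis) {
            this.ip = ip;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Function<String, String> upstream; // next resolver in the chain
    private final long ttlMillis;
    private final LongSupplier clock;
    private int upstreamQueries = 0;

    CachingResolver(Function<String, String> upstream, long ttlMillis, LongSupplier clock) {
        this.upstream = upstream;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    String resolve(String domain) {
        long now = clock.getAsLong();
        Entry cached = cache.get(domain);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.ip; // cache hit: resolution ends here
        }
        upstreamQueries++;
        String ip = upstream.apply(domain);
        cache.put(domain, new Entry(ip, now + ttlMillis));
        return ip;
    }

    int upstreamQueries() {
        return upstreamQueries;
    }

    public static void main(String[] args) {
        // 192.0.2.10 is a documentation address standing in for the real answer.
        CachingResolver local = new CachingResolver(
                domain -> "192.0.2.10", 60_000L, System::currentTimeMillis);
        local.resolve("www.baidu.com");
        local.resolve("www.baidu.com"); // answered from the cache
        System.out.println(local.upstreamQueries()); // prints 1
    }
}
```

Chaining several such resolvers (browser, operating system, Local DNS Server) gives the layered lookup described in the list.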
Domain name resolution
Domain name resolution records are mainly divided into A records, MX records, CNAME records, NS records, and TXT records.
- A record: maps a domain name to an IP address. Several domain names can resolve to the same IP address, and one domain name can also carry more than one A record.
- MX record: specifies the mail server that receives mail for the domain name.
- CNAME record: points one domain name to another domain name (an alias).
- NS record: specifies the DNS server responsible for resolving the domain name.
- TXT record: attaches a text description to a host name or domain name.
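How A and CNAME records work together can be shown with a toy record table. Everything here is hypothetical (class name, domain names, documentation IP address); the sketch only models the lookup rule: follow CNAME aliases until a name with an A record is reached.

```java
import java.util.HashMap;
import java.util.Map;

final class ZoneTable {

    private final Map<String, String> aRecords = new HashMap<>();
    private final Map<String, String> cnameRecords = new HashMap<>();

    void addA(String domain, String ip) {
        aRecords.put(domain, ip);
    }

    void addCname(String alias, String target) {
        cnameRecords.put(alias, target);
    }

    /** Follows CNAME records until an A record is found; returns null if there is none. */
    String resolve(String domain) {
        String current = domain;
        // Bound the number of hops so that a CNAME loop cannot spin forever.
        for (int hops = 0; hops < 10; hops++) {
            String ip = aRecords.get(current);
            if (ip != null) {
                return ip;
            }
            String target = cnameRecords.get(current);
            if (target == null) {
                return null;
            }
            current = target;
        }
        return null;
    }

    public static void main(String[] args) {
        ZoneTable zone = new ZoneTable();
        zone.addA("cdn.example.net", "192.0.2.20");
        zone.addCname("static.example.com", "cdn.example.net");
        System.out.println(zone.resolve("static.example.com")); // prints 192.0.2.20
    }
}
```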
CDN working mechanism
A CDN (content delivery network) mainly caches a website's static data, such as CSS, JS and image files. The user requests the dynamic content from the origin server first and then downloads the static data from the CDN, which speeds up the download of the page's content.
In general, a CDN aims at scalability, security and reliability. The working steps are as follows:
- First, the user sends a request to the Local DNS Server, which usually resolves the name iteratively until it reaches the DNS server of the domain name registration service provider.
- That DNS server usually resolves the domain name, through a CNAME record, to another domain name, which finally points to the global DNS load balancing server (GTM) of the CDN. Based on the user's address, the GTM returns the CDN node closest to the user.
- Once it has the resolved address of the CDN node, the user requests the static file from that node directly. If the node does not have the requested file, it goes back to the origin site to fetch the file and then returns it to the user.
Load balancing
Load balancing (Load Balance) distributes work evenly across multiple operating units so that they complete the tasks together.
It improves server response time and utilization, avoids single points of failure, and relieves network congestion.
There are usually three load balancing architectures:
- Link load balancing: requests are distributed through DNS resolution. The advantage is that users reach the target server directly without going through a proxy server, so access is usually fast. The disadvantage is that DNS results are cached, so it is difficult to update the domain name resolution result in time.
- Cluster load balancing
  - Hardware load balancing: the advantage is very good performance; the disadvantage is that it is very expensive and cannot be expanded dynamically.
  - Software load balancing: the advantage is very low cost; the disadvantage is that a single request generally passes through several proxy servers, which increases network latency.
- Operating system load balancing: uses operating-system-level soft or hard interrupts to balance the load, for example by setting up multiple network cards.
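One common way for a software load balancer to spread requests evenly is round-robin selection. The sketch below shows only that selection rule; the class name and server names are hypothetical, and health checks and weights are omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinBalancer {

    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    /** Returns the servers in turn, wrapping around at the end of the list. */
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBalancer balancer =
                new RoundRobinBalancer(Arrays.asList("app-1", "app-2", "app-3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.pick()); // app-1, app-2, app-3, app-1
        }
    }
}
```

The atomic counter makes `pick()` safe to call from several request threads at once.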
CDN dynamic acceleration
Technical principle: the CDN's DNS resolution probes the links back to the origin to find the best path, and DNS scheduling then directs all back-to-origin requests onto the selected path, which speeds up user access.
Link detection: every CDN node downloads a file of a fixed size from the origin site to see which link takes the shortest total time. The results form a link list, which is bound to DNS resolution and used to update the Local DNS Server.
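The selection step of link detection boils down to picking the link with the smallest measured download time. A minimal sketch, assuming the measurements have already been collected; the class name, link names and timings are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LinkProbe {

    /** Returns the link with the shortest download time, or null if nothing was measured. */
    static String fastest(Map<String, Long> downloadMillisByLink) {
        String best = null;
        long bestMillis = Long.MAX_VALUE;
        for (Map.Entry<String, Long> measurement : downloadMillisByLink.entrySet()) {
            if (measurement.getValue() < bestMillis) {
                bestMillis = measurement.getValue();
                best = measurement.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical measurements: time to download the same test file over each link.
        Map<String, Long> measurements = new LinkedHashMap<>();
        measurements.put("link-a", 420L);
        measurements.put("link-b", 180L);
        measurements.put("link-c", 260L);
        System.out.println(fastest(measurements)); // prints link-b
    }
}
```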
- Author: Chao Peng
- This article first appeared on a personal blog: antoniopeng.com/2020/04/07/…
- Copyright notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated. When reposting, please credit Chao Peng | Blog.