Computer Network Protocols in Detail: HTTP

I. Introduction

  A while ago, in order to study computer networking, I read the book "Computer Networking: A Top-Down Approach". I have to say it is a really good book: detailed and easy to understand, it explains things with plenty of analogies rather than dry theory, and every chapter ends with many exercises and some very interesting programming assignments. I recommend it to anyone getting started. I have only reached the second chapter so far and have just finished the part on HTTP, so I am writing this HTTP blog post as a set of notes.


II. Details

 2.1 HTTP Overview

  HTTP is an application-layer protocol. Its full name is the Hypertext Transfer Protocol, and it is the core of the Web. HTTP is implemented by two programs, a client program and a server program. Put simply, the client sends a request to the server and the server responds to that request. HTTP defines how a Web client requests resources from a Web server and how the server returns those resources to the client; this is HTTP's request + response model. The client sends a request message asking for a resource; the server receives it and sends back a response message containing that resource.

  HTTP runs on top of TCP. TCP provides reliable, connection-oriented data transfer, so HTTP messages are carried over a reliable connection. When a client requests a resource, it first establishes a TCP connection with the server. Once the connection is established, the client and the server each access TCP through their socket interfaces: the client sends request messages over the TCP connection, and the server sends response messages and resources back over the same connection. Because TCP provides reliable delivery, every HTTP request arrives at the server intact, and every response arrives back at the client intact.
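
  To make the socket part concrete, here is a minimal Python sketch of one request/response over a raw TCP connection. So that it runs without touching the real network, it talks to a throwaway local server on a random free port instead of port 80; HelloHandler, fetch_over_tcp and demo are names I made up for this illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical demo server: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_over_tcp(host, port, path):
    """Open a TCP connection, send one HTTP request, return the raw response."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port)) as sock:  # TCP connection setup
        sock.sendall(request)                              # the request message
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Start the local demo server, fetch /index.html once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_over_tcp("127.0.0.1", server.server_port, "/index.html")
    finally:
        server.shutdown()
        server.server_close()
```

  Calling demo() returns the raw bytes of the response: a status line, the header lines, a blank line, and then the HTML body.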

  The resource requested over HTTP is typically a Web page. A Web page consists of one or more objects; an object may be an HTML file, a picture, a video, or even a small program. As far as HTTP is concerned, the objects that make up a Web page are not one resource: each object is a separate resource and is requested separately. Suppose we request a Web page that consists of one HTML file and 5 pictures (the HTML references the pictures by path). This page then has 6 objects. When the server receives the request for the page, it returns the HTML file in a response message. When the client receives the HTML file and finds that it references 5 pictures, it sends 5 more HTTP requests, one for each picture.
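
  The counting in this example can be sketched in a few lines of Python. This is only an illustration: the sample HTML and the picture paths are invented, and a real browser also follows other kinds of references (stylesheets, scripts and so on), not just img tags.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def objects_to_request(page_path, html_text):
    """Return the path of every object the client must request for this page."""
    collector = ImageCollector()
    collector.feed(html_text)
    return [page_path] + collector.sources


# Hypothetical page with five pictures, mirroring the example in the text.
sample_html = "<html><body>" + "".join(
    f'<img src="/img/pic{i}.png">' for i in range(1, 6)
) + "</body></html>"

paths = objects_to_request("/index.html", sample_html)
print(len(paths))  # prints 6: one HTML file plus five pictures
```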

  The server sends the requested file to the client but does not record any information about the client. So if you request the same resource twice in a row, the server responds twice; it will not refuse the second request just because it has already served you. Because an HTTP server keeps no information about its clients, HTTP is called a stateless protocol.


 2.2 Non-persistent and persistent connections

  In most cases, requesting something from a Web server takes more than one request, as with the page containing 5 pictures mentioned above. This raises a question: for multiple request/response exchanges with the same destination, does HTTP send every request/response over the same TCP connection, or does it create a separate TCP connection for each one? There are two cases. When multiple request/response exchanges share the same TCP connection, it is called a persistent connection; when each request/response uses its own TCP connection, it is called a non-persistent connection. HTTP uses persistent connections by default, but it can be configured to use non-persistent connections instead. Here is a brief look at the difference between the two.

  (1) Non-persistent connections

  With non-persistent connections, a separate TCP connection is established for each request/response. Let's go back to the earlier example: we request a page containing 5 pictures, and assume the page's URL is http://www.tewuyiang.cn/index.html (this is my personal server, which currently hosts a simple little game). When we request this URL, the following happens:

  1. The HTTP client process initiates a TCP connection to the server www.tewuyiang.cn on port 80, the default port for HTTP;
  2. The HTTP client process sends a request message to the server through its socket; the path of the requested resource is /index.html;
  3. The HTTP server process receives the request through its socket, retrieves the resource http://www.tewuyiang.cn/index.html from its storage (e.g. RAM), encapsulates the HTML page in a response message, and sends that message back to the client through its socket;
  4. The HTTP server process tells TCP to close the connection (TCP does not actually close it until it has confirmed that the client received the whole message);
  5. The HTTP client process receives the complete response message and the TCP connection closes. The client parses the response, finds that the encapsulated object is an HTML file, and finds that the HTML file references 5 pictures;
  6. Steps 1-4 are repeated for each of the 5 pictures referenced by the page;

  The disadvantage of non-persistent connections is clear: a TCP connection must be established for every request/response, which greatly increases the number of connections the server has to set up and maintain. For example, a page containing 10 pictures requires 11 connections in total, which puts a lot of pressure on the server. On the upside, several connections can be open at the same time (a typical browser can open 5-10 connections in parallel). Each connection is a separate channel, data moves over the channels in parallel, and several request/response exchanges can proceed at once, so requests do not queue up behind one another and efficiency is higher.

  (2) Persistent connections

  With a persistent connection, after a client establishes a connection with a server, the requests the client sends and the responses the server returns over a period of time can all travel over that one connection. This should be easy to understand. Persistent connections come in two types:

  1. Persistent connection without pipelining: only one request/response is in progress at a time, and the next request must wait until the previous one has completed;
  2. Persistent connection with pipelining: requests for objects can be sent one after another without waiting for outstanding requests to complete (although this is still not fully parallel);

  If a connection goes unused for a long time, the HTTP server closes it; this timeout is configurable.

  The benefit of persistent connections is also clear: they save resources, because multiple requests share one connection. The drawback is that efficiency may be somewhat lower. According to the book, HTTP's default mode is persistent connections with pipelining.
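
  To see the difference in numbers, here is a rough Python sketch that counts how many TCP connections a local test server accepts in each mode. It is an illustration under my own assumptions: CountingHandler and the helper functions are invented names, the server is Python's built-in http.server rather than a production Web server, and the persistent variant does not pipeline (it waits for each response before sending the next request).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = 0                # TCP connections accepted so far
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        with CountingHandler.lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_non_persistent(host, port, paths):
    """One TCP connection per object, closed after each response."""
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path, headers={"Connection": "close"})
        conn.getresponse().read()
        conn.close()


def fetch_persistent(host, port, paths):
    """All objects requested one after another over a single TCP connection."""
    conn = http.client.HTTPConnection(host, port)
    for path in paths:
        conn.request("GET", path)
        conn.getresponse().read()  # finish this response before the next request
    conn.close()


def count_connections(fetch, paths):
    """Run `fetch` against a fresh local server; return connections accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch("127.0.0.1", server.server_port, paths)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

  For a page with one HTML file and five pictures, count_connections(fetch_non_persistent, paths) should report 6 connections and count_connections(fetch_persistent, paths) should report 1.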


 2.3 HTTP message format

  Next, let's talk about the message format of the HTTP protocol. HTTP messages are divided into request messages and response messages.

  (1) Request message

  Here is an HTTP request message that I captured from my browser; the requested resource is a picture:

  [Figure: screenshot of the captured HTTP request message]

  The first line of an HTTP request message is called the request line, and the remaining lines are called header lines. Let me explain the message above line by line:

  First comes the request line, which contains three parts: the request method, the resource path, and the HTTP protocol version, separated by spaces. The first part, the request method, indicates the type of request the client is sending to the server. The ones we use most often are GET and POST:

  • GET: requests a resource from the server, and the server returns the requested resource;
  • POST: submits data to the server for processing (such as submitting a form); the data is carried in the request body. A POST request may result in the creation of new resources and/or the modification of existing ones;

  Both of the methods above are defined in HTTP/1.0. Besides these two, HTTP/1.0 also defines the HEAD method:

  • HEAD: similar to a GET request, except that the response carries no body; it is used to obtain the headers;

  HTTP/1.1 added more methods: OPTIONS, PUT, DELETE, TRACE, and CONNECT, plus PATCH, which was standardized later in RFC 5789. I will not define them one by one here; see a reference on HTTP request methods for the details.

  Right after the request method comes the path of the resource on the HTTP server. In the message above the path is /img/prop3.png, which means we are requesting the picture prop3.png in the img folder under the HTTP server's root path. After the path comes HTTP/1.1, which indicates the HTTP version the request uses.
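
  As a small illustration, the three parts of the request line can be pulled apart like this. The path and version come from the message discussed above; the GET method is my assumption, since this is a browser fetching a picture, and parse_request_line is just a name I picked for the sketch.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    path: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'method SP path SP version' into its three fields."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, path, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected version field: {version!r}")
    return RequestLine(method, path, version)


# GET is assumed here; the path and version are the ones from the text.
request_line = parse_request_line("GET /img/prop3.png HTTP/1.1\r\n")
print(request_line.method, request_line.path, request_line.version)
```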

  Below the request line are the header lines. Each header line has the form name: value, where name is the name of the header and value is its value. The header lines in the message above are:

  • Host: the host name of the HTTP server the request is addressed to; here it is www.tewuyiang.cn;
  • Connection: the type of HTTP connection discussed above. The value keep-alive asks the server to use a persistent connection; the value close asks for a non-persistent connection;
  • User-Agent: identifies the user agent, i.e. it tells the server which type of browser sent the HTTP request;
  • Accept: tells the server which resource types the client would like to receive. As the message above shows, this request expects a picture;
  • Referer: the address of the page the request came from. Servers can check it to reject malicious requests (such as hotlinking) and so protect access to their resources;
  • Accept-Encoding: tells the server which content encodings the current browser supports;
  • Accept-Language: tells the HTTP server which language version of the resource the client prefers; if the server does not have that language, it sends back the default version.

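  The figure below shows the general format of an HTTP request message:

  [Figure: general format of an HTTP request message: a request line, header lines, a blank line, and the entity body]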
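
  The layout is a request line, then the header lines, then a blank line, then an optional entity body, with every line ending in CRLF. A minimal Python sketch of that layout follows; build_request is a hypothetical helper, and I only fill in the Host and Connection values that the text spells out (the other headers of the captured message are left out because their exact values are not given here).

```python
def build_request(method, path, headers, body=b""):
    """Serialize a request: request line, header lines, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if body and "Content-Length" not in headers:
        # A request with a body needs to say how long that body is.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body


message = build_request(
    "GET",
    "/img/prop3.png",
    {"Host": "www.tewuyiang.cn", "Connection": "keep-alive"},
)
print(message.decode("ascii"))
```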


  (2) Response message

  Similarly, let's look at a response message:

  [Figure: screenshot of the captured HTTP response message]

  The first line of a response message, the status line, contains the HTTP version and the status code, followed by a short status phrase. In the message above the HTTP version is 1.1, the same as in the request, and it is followed by the status code 200, the most common status code, which indicates that the request succeeded. If you want to know about the other status codes, you can look them up in the Runoob tutorial (菜鸟教程); here are four common ones:

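  • 200 - The request succeeded;
  • 301 - The resource (Web page, etc.) has been permanently moved to another URL;
  • 404 - The requested resource (Web page, etc.) does not exist;
  • 500 - Internal server error;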
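
  Python's standard library happens to ship these codes in http.HTTPStatus, which makes for a quick sketch of reading a status line. parse_status_line is a hypothetical helper written for this post, and the sample line is a generic one rather than a line copied from the captured message.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Split 'version SP code SP phrase' into (version, code, phrase)."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


version, code, phrase = parse_status_line("HTTP/1.1 200 OK\r\n")
print(version, code, phrase)

# The four common codes from the list above, with their standard phrases.
for common in (200, 301, 404, 500):
    print(common, HTTPStatus(common).phrase)
```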

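  The lines after the first line are called header lines. Like the header lines in a request message, they have the form name: value. The header lines in the message above are:

  • Accept-Ranges: tells the client whether this resource supports range requests, which make resumable downloads and multi-threaded chunked downloads possible. The value bytes means range requests are supported; none means they are not;
  • Last-Modified: I will come back to this one separately when discussing caching;
  • Content-Type: identifies the type of the resource; here image/png means the resource is a picture;
  • Content-Length: the size of the resource in bytes; in the screenshot the value is 8729, so this picture is 8729 bytes long;
  • Date: the date and time at which the server sent this response message.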
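
  To show what a chunked download built on range requests might look like, here is a small sketch that splits the 8729-byte picture into Range header values. The function name and the choice of 4 chunks are my own; a real downloader would also check that the server answered Accept-Ranges: bytes before doing this.

```python
def byte_ranges(total_size, parts):
    """Split total_size bytes into up to `parts` Range header values."""
    base, extra = divmod(total_size, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more parts than bytes: skip the empty chunks
        end = start + length - 1  # Range end positions are inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


for value in byte_ranges(8729, 4):
    print("Range:", value)
```

  Each value would go into the Range header of a separate GET request, and the responses would be stitched back together in order.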

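  The figure below shows the general format of an HTTP response message. As you can see, there is a part at the very end called the entity body; this is where the resource returned by the server, such as the requested picture, is placed.

  [Figure: general format of an HTTP response message: a status line, header lines, a blank line, and the entity body]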
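
  Here is a minimal sketch of taking a response apart along those lines. The sample bytes are made up for the illustration (a 4-byte placeholder body rather than the real 8729-byte picture), and split_response is a hypothetical helper, not a full HTTP parser.

```python
def split_response(raw: bytes):
    """Split a raw response into (status line, headers dict, entity body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return status_line, headers, body


# Hypothetical response with a 4-byte placeholder body.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Length: 4\r\n"
    b"\r\n"
    b"\x89PNG"
)

status_line, headers, body = split_response(sample)
print(status_line, headers["content-type"], len(body))
```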


 2.4 Web caches

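  A Web cache, also called a proxy server, can in some situations satisfy a client's request on behalf of the HTTP server. A Web cache has its own storage and keeps copies of recently requested resources. As the name suggests, its job is to provide a caching mechanism. If a Web cache is deployed, browsers can be configured to send their HTTP requests to the Web cache first. Let's walk through an example to explain how a Web cache works.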

  Suppose I now request the picture prop3.png on the server www.tewuyiang.cn. The following happens:

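  1. The HTTP client creates a TCP connection to the Web cache and sends it a request message;
  2. The Web cache receives the request message and checks whether it holds a local copy of the requested resource. If it does, the Web cache creates a response message and returns the copy to the HTTP client in it;
  3. If the Web cache does not hold a copy of the resource, it opens a TCP connection to the HTTP server (here, www.tewuyiang.cn) and requests the resource the client needs;
  4. The server creates a response message and returns the requested resource to the cache. The cache receives the response, extracts the resource it carries, stores a copy locally, then creates a new response message containing the copy and sends it to the client that originally requested the resource;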

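  As the steps above show, the Web cache acts as both a server and a client in this process. Once a Web cache is deployed, the time it takes to get a response for a cached resource drops considerably.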
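
  Stripped of all networking, the cache's decision in the steps above can be sketched as follows. WebCache and fake_origin are invented for the illustration: the "origin server" is just a Python function, and the sketch ignores freshness, eviction and everything else a real proxy has to handle.

```python
class WebCache:
    """Toy cache: serve a local copy if present, else ask the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # stands in for a request to the server
        self._store = {}                 # url -> copy of the resource

    def get(self, url):
        if url in self._store:
            return self._store[url], "served from cache"
        resource = self._fetch(url)      # cache miss: act as a client
        self._store[url] = resource      # keep a copy for later requests
        return resource, "fetched from origin"


origin_hits = []


def fake_origin(url):
    """Hypothetical origin server that records every request it receives."""
    origin_hits.append(url)
    return b"picture bytes for " + url.encode("ascii")


cache = WebCache(fake_origin)
first = cache.get("http://www.tewuyiang.cn/img/prop3.png")
second = cache.get("http://www.tewuyiang.cn/img/prop3.png")
print(first[1], "/", second[1], "/ origin requests:", len(origin_hits))
```

  The second request never reaches the origin, which is where the saving in response time comes from.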


 2.5 Conditional GET

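  After the introduction to Web caches above, many people may wonder: how can we make sure the resources on the Web cache are up to date? What if a resource is updated on the server, but what we get back is the old, unchanged copy on the cache? HTTP naturally has a way to solve this problem. This is where the Last-Modified header line that we skipped when discussing the response message comes in, and the mechanism is called the conditional GET.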

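  The Last-Modified header line records the time at which the requested resource was last modified on the server. When we request a resource that the Web cache does not have, the Web cache forwards the request to the HTTP server, and the server's response to the cache includes a Last-Modified header line. When the Web cache stores its copy of the resource, it stores the Last-Modified value along with it. The next time a client requests this resource, the Web cache sends a conditional GET request to the server. The request carries this time value, this time in a header named If-Modified-Since. When the server receives the value, it compares it with the last modification time it has recorded for the resource. If the two are equal, the resource has not been updated between the previous request and this one, and the server tells the Web cache (with a 304 Not Modified response) that it can use its stored copy directly. If they differ, the server sends the latest version of the resource together with the new Last-Modified value to the Web cache, which updates its local copy and responds to the client.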
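
  The exchange can be sketched without any networking. Everything below is a simplified model under my own assumptions: Origin and ValidatingCache are invented classes, the server answers 304 when the resource has not changed since the given time, and the request header is given its real HTTP name, If-Modified-Since.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


class Origin:
    """Hypothetical origin server holding the current version of each resource."""

    def __init__(self):
        self.resources = {}  # path -> (body, last-modified datetime)

    def put(self, path, body, modified):
        self.resources[path] = (body, modified)

    def get(self, path, if_modified_since=None):
        body, modified = self.resources[path]
        if if_modified_since is not None:
            since = parsedate_to_datetime(if_modified_since)
            if modified <= since:  # unchanged since the cache stored its copy
                return 304, {}, b""
        headers = {"Last-Modified": format_datetime(modified, usegmt=True)}
        return 200, headers, body


class ValidatingCache:
    """Toy Web cache that revalidates its copy with a conditional GET."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}  # path -> (body, Last-Modified header value)

    def get(self, path):
        if path in self.store:
            body, last_modified = self.store[path]
            status, headers, new_body = self.origin.get(
                path, if_modified_since=last_modified
            )
            if status == 304:
                return body, "cached copy still valid"
        else:
            status, headers, new_body = self.origin.get(path)
        self.store[path] = (new_body, headers["Last-Modified"])
        return new_body, "new copy from origin"


origin = Origin()
origin.put("/img/prop3.png", b"v1", datetime(2020, 2, 28, 12, 0, tzinfo=timezone.utc))
cache = ValidatingCache(origin)
results = [cache.get("/img/prop3.png"), cache.get("/img/prop3.png")]
origin.put("/img/prop3.png", b"v2", datetime(2020, 2, 29, 12, 0, tzinfo=timezone.utc))
results.append(cache.get("/img/prop3.png"))
for body, note in results:
    print(body, note)
```

  The three lookups show a first fetch, a revalidation that reuses the stored copy, and a refresh after the resource changes on the server. The dates are arbitrary sample values.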


III. Summary

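  The content above gives a rough introduction to the HTTP protocol and some of its mechanisms; I believe that after reading it you will have a general understanding of HTTP. Of course there is much more to HTTP than this, but given the limits of space and of my own knowledge, this post stops here for now. When I have time later, I will write about other parts of HTTP, such as cookies and sessions.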


IV. References

"Computer Networking: A Top-Down Approach" (7th edition)

Origin www.cnblogs.com/tuyang1129/p/12381258.html