HTTP and Telnet Basics

HTTP Overview

In the early 1990s, a major new application arrived on the scene: the World Wide Web. The Web is the Internet application that caught the general public's attention. The Web's application-layer protocol is the Hypertext Transfer Protocol (HTTP), which is at the heart of the Web. HTTP is implemented in two programs: a client program and a server program. The client and server programs run on different end systems and talk to each other by exchanging HTTP messages. HTTP defines the structure of these messages and how the client and server exchange them.

A Web page (also called a document) consists of objects. An object is simply a file, such as an HTML file, a JPEG image, a Java applet, or a video clip, that is addressable by a single URL. Most Web pages consist of a base HTML file and several referenced objects. For example, if a Web page contains a base HTML file and five JPEG images, the page has six objects: the base HTML file plus the five images. The base HTML file references the other objects in the page by their URLs. Each URL has two parts: the host name of the server that stores the object and the object's path name. Web browsers implement the client side of HTTP; Web servers implement the server side and store Web objects, each addressable by a URL.
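As a small illustration of the two parts of a URL, the sketch below splits one with Python's standard urllib.parse module. The URL is the one used in the non-persistent connection example of this article; the variable names are only illustrative.

```python
from urllib.parse import urlsplit

# URL from the article's non-persistent connection example.
url = "http://www.someSchool.edu/someDepartment/home.index"

parts = urlsplit(url)
host = parts.hostname   # host name of the server (urlsplit lowercases it)
path = parts.path       # path name of the object on that server

print(host)
print(path)
```

The host name tells the client which server to connect to; the path name is what the client later places in the request line.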

HTTP defines how Web clients request Web pages from Web servers and how servers transfer Web pages to clients. The basic idea is this: when a user requests a Web page (for example, by clicking a hyperlink), the browser sends the server HTTP request messages for the objects in the page; the server receives the requests and responds with HTTP response messages that contain the objects.

HTTP uses TCP as its underlying transport protocol (rather than running on top of UDP). The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and server processes access TCP through their socket interfaces. The client sends HTTP request messages into its socket interface and receives HTTP response messages from its socket interface. Similarly, the server receives HTTP request messages from its socket interface and sends HTTP response messages into its socket interface. Once the client sends a message into its socket interface, the message is out of the client's hands and in the hands of TCP. TCP provides a reliable data transfer service to HTTP. This means that each HTTP request message sent by a client process eventually arrives intact at the server; similarly, each HTTP response message sent by the server process eventually arrives intact at the client.

It is important to note that the server sends requested files to clients without storing any state information about the client. If a particular client asks for the same object twice within a few seconds, the server does not respond by saying that it has just served the object; instead, it resends the object, as if it had completely forgotten what it did earlier. Because an HTTP server maintains no information about its clients, HTTP is said to be a stateless protocol.

Non-persistent connections and persistent connections

In many Internet applications, the client and server communicate for an extended period, with the client making a series of requests and the server responding to each of them. Depending on the application and how it is used, the requests may be made back to back, periodically at regular intervals, or intermittently. When this client-server interaction takes place over TCP, the application developer has to make an important decision: should each request/response pair be sent over a separate TCP connection, or should all of the requests and their corresponding responses be sent over the same TCP connection? In the former approach, the application is said to use non-persistent connections; in the latter, persistent connections. HTTP can use both non-persistent and persistent connections. Although HTTP uses persistent connections in its default mode, HTTP clients and servers can be configured to use non-persistent connections instead.

Using non-persistent HTTP connections

Let's walk through the steps of transferring a Web page from server to client over non-persistent connections. Suppose the page consists of a base HTML file and 10 JPEG images, and that all 11 of these objects reside on the same server. The URL of the base HTML file is: http://www.someSchool.edu/someDepartment/home.index .

Here is what happens:

  • The HTTP client process initiates a TCP connection to the server www.someSchool.edu on port 80, which is the default port number for HTTP. A socket is associated with the TCP connection at the client, and another at the server.
  • The HTTP client sends an HTTP request message to the server via its socket. The request message includes the path name /someDepartment/home.index.
  • The HTTP server process receives the request message via its socket, retrieves the object http://www.someSchool.edu/someDepartment/home.index from its storage (RAM or disk), encapsulates the object in an HTTP response message, and sends the response message to the client via its socket.
  • The HTTP server process tells TCP to close the TCP connection. (TCP does not actually terminate the connection until it knows for sure that the client has received the response message intact.)
  • The HTTP client receives the response message, and the TCP connection terminates. The message indicates that the encapsulated object is an HTML file. The client extracts the file from the response message, examines the HTML file, and finds references to the 10 JPEG images.
  • The first four steps are then repeated for each of the referenced JPEG objects.
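The steps above can be sketched in Python on the loopback interface. This is a minimal illustration, not a real browser: the object names and contents are hypothetical stand-ins for the base HTML file and its 10 images, and the server is the standard library's http.server, which speaks HTTP/1.0 by default and therefore closes the TCP connection after each response.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical objects standing in for the base HTML file and its 10 images.
OBJECTS = {"/home.index": b"<html>base page</html>"}
OBJECTS.update({f"/img{i}.jpg": b"JPEG-%d" % i for i in range(1, 11)})


class OneShotHandler(BaseHTTPRequestHandler):
    # protocol_version defaults to "HTTP/1.0", so the server closes the
    # TCP connection after every response (non-persistent behaviour).
    def do_GET(self):
        body = OBJECTS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep the demo output quiet
        pass


def fetch_non_persistent(host, port, path):
    """Connect, send one request, read until the server closes the connection."""
    with socket.create_connection((host, port)) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:            # empty read: the server closed the connection
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return body


server = ThreadingHTTPServer(("127.0.0.1", 0), OneShotHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

connections = 0
fetched = {}
for path in OBJECTS:                # base file first, then the 10 images
    fetched[path] = fetch_non_persistent(host, port, path)
    connections += 1                # one new TCP connection per object

server.shutdown()
server.server_close()
print(connections)
```

Each call to fetch_non_persistent opens and tears down its own TCP connection, so fetching the whole page costs one connection per object.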

The steps above illustrate the use of non-persistent connections, where each TCP connection is closed after the server sends the object; the connection does not persist for other objects. Note that each TCP connection carries exactly one request message and one response message. In this example, therefore, requesting the Web page generates 11 TCP connections.

In the steps described above, we were intentionally vague about whether the client obtains the 10 JPEG images over 10 serial TCP connections or whether some of the images are obtained over parallel TCP connections. In fact, users can configure modern browsers to control the degree of parallelism. In their default modes, most browsers open 5 to 10 parallel TCP connections, and each of these connections handles one request-response transaction. If the user prefers, the maximum number of parallel connections can be set to one, in which case the 10 connections are established serially.

Let's roughly estimate the time that elapses from when a client requests the base HTML file until the client has received the entire file. To this end, we define the round-trip time (RTT), which is the time it takes for a small packet to travel from the client to the server and then back to the client. The RTT includes packet-propagation delays, packet-queuing delays in intermediate routers and switches, and packet-processing delays. Now consider what happens when a user clicks a hyperlink. As shown in Figure 2-7, this causes the browser to initiate a TCP connection between the browser and the Web server, which involves a "three-way handshake": the client sends a small TCP segment to the server, the server acknowledges and responds with a small TCP segment, and, finally, the client acknowledges back to the server. The first two parts of the three-way handshake take one RTT. After completing the first two parts of the handshake, the client sends the HTTP request message into the TCP connection together with the third part of the handshake (the acknowledgment). Once the request message arrives at the server, the server sends the HTML file into the TCP connection. This HTTP request/response takes another RTT. Thus, roughly, the total response time is two RTTs plus the transmission time of the HTML file at the server.
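The estimate can be written out as a few lines of arithmetic. The figures below (a 100 ms RTT and 20 ms to transmit the file) are assumptions chosen only to make the sum concrete; they do not come from the article.

```python
def response_time_ms(rtt_ms, transmit_ms):
    """Rough time to fetch one object over a fresh TCP connection."""
    handshake = rtt_ms          # first two parts of the three-way handshake
    request_response = rtt_ms   # request travels out, response starts coming back
    return handshake + request_response + transmit_ms


RTT_MS = 100       # assumed round-trip time
TRANSMIT_MS = 20   # assumed transmission time of the HTML file

base_file_ms = response_time_ms(RTT_MS, TRANSMIT_MS)
print(base_file_ms)   # 220
```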

Using HTTP persistent connections

Non-persistent connections have some shortcomings. First, a brand-new connection must be established and maintained for each requested object. For each of these connections, TCP buffers must be allocated and TCP variables must be kept in both the client and the server. This can place a significant burden on the Web server, which may be serving requests from hundreds of different clients simultaneously. Second, as we just described, each object suffers a delivery delay of two RTTs: one RTT to establish the TCP connection and one RTT to request and receive the object.

With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over the same connection. In particular, an entire Web page (in the example above, the base HTML file and the 10 images) can be sent over a single persistent TCP connection. Moreover, multiple Web pages residing on the same server can be sent from the server to the same client over a single persistent TCP connection. Requests for objects can be made back to back, without waiting for replies to pending requests (pipelining). Typically, the HTTP server closes a connection when it has not been used for a certain time (a configurable timeout interval). The default mode of HTTP uses persistent connections with pipelining.
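A persistent connection can be sketched with Python's standard library on the loopback interface. The object names are hypothetical, and the server counts how many TCP connections it accepts. Note one limitation of this sketch: http.client sends requests one at a time over the connection, so it shows connection reuse but not pipelining.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

accepted_connections = []   # one entry per TCP connection the server accepts


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets the server keep the connection open

    def setup(self):
        # setup() runs once per accepted TCP connection.
        super().setup()
        accepted_connections.append(self.client_address)

    def do_GET(self):
        body = f"object at {self.path}".encode("ascii")
        self.send_response(200)
        # Content-Length tells the client where this response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep the demo output quiet
        pass


server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# Hypothetical names for the base HTML file and its 10 images.
paths = ["/home.index"] + [f"/img{i}.jpg" for i in range(1, 11)]

conn = http.client.HTTPConnection(host, port)   # a single TCP connection
bodies = []
for path in paths:
    conn.request("GET", path)
    response = conn.getresponse()
    bodies.append(response.read())   # drain the body before the next request
conn.close()

server.shutdown()
server.server_close()
print(len(bodies), len(accepted_connections))
```

All 11 objects travel over the one connection that the server accepted, in contrast to the one-connection-per-object pattern of non-persistent HTTP.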

HTTP message format

There are two types of HTTP messages: request messages and response messages.

HTTP request message

Below is a typical HTTP request message:

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr

We can learn a lot by taking a close look at this simple request message. First of all, we see that the message is written in ordinary ASCII text. Second, we see that the message consists of five lines, each followed by a carriage return and a line feed. The last line is followed by an additional carriage return and line feed. A request message can have many more lines or as few as one line. The first line of an HTTP request message is called the request line; the subsequent lines are called header lines. The request line has three fields: the method field, the URL field, and the HTTP version field. The method field can take on several different values, including GET, POST, HEAD, PUT, and DELETE. When a browser requests an object, it uses the GET method, and the URL field identifies the requested object. In this example, the browser is requesting the object /somedir/page.html. The version field is self-explanatory; in this example, the browser implements version HTTP/1.1.

Now let's look at the header lines in the example. The header line Host: www.someschool.edu specifies the host on which the object resides. You might think that this header line is unnecessary, since a TCP connection to the host is already in place. However, the information in this header line is required by Web proxy caches. By including the Connection: close header line, the browser tells the server that it does not want to bother with persistent connections; it wants the server to close the connection after sending the requested object. The User-agent: header line specifies the user agent, that is, the type of browser making the request to the server. Here the user agent is Mozilla/5.0, a Firefox browser. This header line is useful because the server can send different versions of the same object to different types of user agents. (Each version is addressed by the same URL.) Finally, the Accept-language: header line indicates that the user prefers to receive a French version of the object. If no such version exists on the server, the server should send its default version.
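To make the line structure concrete, the sketch below assembles the example request as the bytes that would go on the wire. The request line and header lines are the ones shown above; the variable names are only illustrative.

```python
CRLF = "\r\n"   # carriage return + line feed

request_line = "GET /somedir/page.html HTTP/1.1"
header_lines = [
    "Host: www.someschool.edu",
    "Connection: close",
    "User-agent: Mozilla/5.0",
    "Accept-language: fr",
]

# Every line ends with CRLF; one extra CRLF (an empty line) ends the header section.
message = CRLF.join([request_line, *header_lines]) + CRLF + CRLF
raw = message.encode("ascii")

# The request line carries three space-separated fields.
method, url, version = request_line.split(" ")
print(method, url, version)
print(raw)
```

The trailing empty line is how the server knows the header section is complete; with GET there is no entity body after it.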

Next, let's look at the general format of a request message, shown in Figure 2-8. You may have noticed that after the header lines (and the additional carriage return and line feed) there is an "entity body." The entity body is empty with the GET method but is used with the POST method. An HTTP client often uses the POST method when the user fills out a form, for example, when a user provides search words to a search engine. With a POST message, the user is still requesting a Web page from the server, but the specific contents of the Web page depend on what the user entered into the form fields. If the value of the method field is POST, the entity body contains what the user entered into the form fields.

We would be remiss if we did not mention that a request generated with a form does not necessarily use the POST method. HTML forms often use the GET method instead and include the entered data (from the form fields) in the requested URL. For example, if a form uses the GET method and has two fields whose inputs are "monkeys" and "bananas", the URL will have the structure www.somesite.com/animalsearch?monkeys&bananas.
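The two ways of carrying form data can be compared side by side. In this sketch the field names "animal" and "food" are hypothetical (the article gives only the two values), and the encoding uses urllib.parse.urlencode, which produces the name=value pairs that HTML forms normally send.

```python
from urllib.parse import urlencode

# Hypothetical field names; only the two values come from the article.
form = {"animal": "monkeys", "food": "bananas"}
encoded = urlencode(form)

# GET: the form data travels in the URL, after a "?"; the entity body is empty.
get_request = (
    f"GET /animalsearch?{encoded} HTTP/1.1\r\n"
    "Host: www.somesite.com\r\n"
    "\r\n"
).encode("ascii")

# POST: the same data travels in the entity body instead.
body = encoded.encode("ascii")
post_request = (
    "POST /animalsearch HTTP/1.1\r\n"
    "Host: www.somesite.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
).encode("ascii") + body

print(get_request.decode("ascii"))
print(post_request.decode("ascii"))
```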

The HEAD method is similar to the GET method. When a server receives a request with the HEAD method, it responds with an HTTP message but leaves out the requested object. Application developers often use the HEAD method for debugging. The PUT method is often used in conjunction with Web publishing tools. It allows a user to upload an object to a specific path (directory) on a specific Web server. The PUT method is also used by applications that need to upload objects to Web servers. The DELETE method allows a user, or an application, to delete an object on a Web server.
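The difference between GET and HEAD can be observed against a small loopback server. This is a sketch under assumptions: the page content and the path /page.html are hypothetical, and the server is a hand-written handler on top of Python's http.server rather than a production Web server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PAGE = b"<html>hello</html>"   # hypothetical object


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)      # header lines plus the object

    def do_HEAD(self):
        self._send_headers()        # same header lines, no object

    def log_message(self, *args):   # keep the demo output quiet
        pass


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

results = {}
for method in ("GET", "HEAD"):
    conn = http.client.HTTPConnection(host, port)
    conn.request(method, "/page.html")
    response = conn.getresponse()
    results[method] = (
        response.status,
        response.getheader("Content-Length"),
        response.read(),
    )
    conn.close()

server.shutdown()
server.server_close()
print(results)
```

Both responses carry the same status and header lines, which is why HEAD is handy for checking what a server would return without transferring the object.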

HTTP response message

Below is a typical HTTP response message. It could be the response to the example request message just discussed.

HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html

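(data data data data data ...)

Let's take a careful look at this response message. It has three parts: an initial status line, six header lines, and the entity body. The entity body is the main part of the message: it contains the requested object itself (represented by data data data data data ...). Now let's look at the header lines. The server uses the Connection: close header line to tell the client that it will close the TCP connection after sending the message. The Date: header line indicates the date and time at which the server created and sent the response message. Note that this is not the time at which the object was created or last modified; it is the time at which the server retrieved the object from its file system, inserted it into the response message, and sent the response message. The Server: header line indicates that the message was generated by an Apache Web server; it is analogous to the User-agent: header line in the HTTP request message. The Last-Modified: header line indicates the date and time at which the object was created or last modified. The Last-Modified: header line is critical for object caching, both in the local client and in network cache servers (proxy servers). The Content-Length: header line indicates the number of bytes in the object being sent. The Content-Type: header line indicates that the object in the entity body is HTML text. (The object type is officially indicated by the Content-Type: header line, not by the file extension.)

Having looked at an example, let's now examine the general format of a response message (shown in Figure 2-9). A few more words about status codes and their phrases: the status code and its associated phrase indicate the result of the request. Some common status codes and associated phrases include: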

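  • 200 OK: The request succeeded, and the information is returned in the response message.
  • 301 Moved Permanently: The requested object has been permanently moved; the new URL is specified in the Location: header line of the response message. The client software will automatically retrieve the new URL.
  • 400 Bad Request: A generic error code indicating that the server could not understand the request.
  • 404 Not Found: The requested document does not exist on the server.
  • 505 HTTP Version Not Supported: The server does not support the HTTP protocol version used in the request message.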
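The structure of a response (status line, header lines, empty line, entity body) can be pulled apart with a few lines of Python. This is a minimal sketch using the example response from this article, not a complete HTTP parser; the function name parse_response is hypothetical.

```python
# The example response from the article. The entity body is the article's
# placeholder text, so its length does not match Content-Length.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Date: Tue, 09 Aug 2011 15:44:04 GMT\r\n"
    b"Server: Apache/2.2.3 (CentOS)\r\n"
    b"Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT\r\n"
    b"Content-Length: 6821\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"(data data data data data ...)"
)


def parse_response(raw):
    """Split a raw HTTP response into version, status code, phrase, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")        # empty line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)  # phrase may contain spaces
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")           # split at the first colon only
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), phrase, headers, body


version, code, phrase, headers, body = parse_response(raw_response)
print(code, phrase)
print(headers["content-type"], headers["content-length"])
```

Splitting the status line with a limit of two keeps multi-word phrases such as "Moved Permanently" or "Not Found" in one piece.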

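Would you like to see a real HTTP response message? It is easy to do. First Telnet into your favorite Web server, then type in a one-line request message for some object that is housed on the server.

In a Linux terminal, enter telnet www.baidu.com 80 to open a TCP connection to port 80 of that host.

Then press Ctrl + ] to bring up the telnet command prompt.

Press Enter first, then type the HTTP request; the server sends back its HTTP response.

Finally, type quit at the telnet command prompt to exit Telnet.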
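The same experiment can be approximated without Telnet by using a plain TCP socket in Python. This is a sketch under assumptions: the helper names raw_http_exchange and build_get are hypothetical, and the request includes a Connection: close header line so that the server closes the connection and the read loop terminates.

```python
import socket


def raw_http_exchange(host, port, request, timeout=5.0):
    """Open a TCP connection, send the request bytes, and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:            # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def build_get(host, path="/"):
    """Build a minimal GET request that asks the server to close afterwards."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


# Usage, mirroring the Telnet session (requires network access):
#   reply = raw_http_exchange("www.baidu.com", 80, build_get("www.baidu.com"))
#   print(reply.split(b"\r\n", 1)[0].decode("ascii"))   # the status line
```

Like Telnet, the socket simply carries whatever bytes are typed into it, which makes it clear that an HTTP message is nothing more than structured text sent over a TCP connection.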

Origin www.cnblogs.com/Gland/p/11944159.html