Computer Networks, Chapter Two: Application Layer

1. Overview of application layer protocols

The application layer provides services for communication between applications.

What an application layer protocol defines

The types of messages exchanged by application processes, for example request messages and response messages.

The syntax of various message types, such as each field in the message and its detailed description.

The semantics of the field, that is, the meaning of the information included in the field.

When and how the process sends messages, and the rules for responding to messages.

Main functions and corresponding protocols of the application layer

File transfer, access, and management (FTP)
Email (SMTP, POP3)
Virtual terminal and remote login (Telnet)
Web page access (HTTP)
Query service, such as domain name lookup (DNS)

Client-server (C/S) model

Server: a device that provides computing services. It provides service permanently (it is always on) and has a permanent access address/domain name.

Client: a host that requests computing services. It communicates with the server and uses the services the server provides, may access the network intermittently, may use a dynamic IP address, and does not communicate directly with other clients.

P2P model


P2P can be understood as a peer-to-peer network. In the P2P model, network participants share some of the resources they own. These resources provide services and content through the network and can be accessed directly by other peer nodes. Nodes may change their IP addresses. Each network participant is both a provider of resources (a server) and an acquirer of resources (a client).

Process communication

What actually communicates is a process, not a program. Processes on two different end systems communicate with each other by exchanging messages across the computer network. The sending process creates messages and sends them into the network; the receiving process receives these messages and may respond by sending messages back.

Client and server processes

Network applications are composed of pairs of processes that send messages to each other over the network.

For each pair of communication processes, we usually identify one of these two processes as a client, and the other process as a server. In a given communication session scenario between a pair of processes, the process that initiates the communication (that is, initiates contact with other processes at the beginning of the session) is identified as the client, and the process waiting to be contacted at the beginning of the session is the server.

For the Web, the browser is a client process and the Web server is a server process. For P2P file sharing, the peer that downloads the file is identified as the client, and the peer that uploads the file is identified as the server.

Process and computer network interface

Messages sent from one process to another must pass through the underlying network. A process sends messages into the network, and receives messages from it, through a software interface called a socket.

Since the socket is the programming interface with which network applications are built, it is also called the Application Programming Interface (API) between the application and the network. Application developers control everything on the application layer side of the socket, but have almost no control over the transport layer side. Their control over the transport layer is limited to:

① Selecting the transport layer protocol

② Possibly setting a few transport layer parameters, such as the maximum buffer size and the maximum segment size

Once the application developer selects a transport layer protocol, the application is built on the transport layer services provided by the protocol.
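The two controls described above can be sketched with Python's standard socket module. This is only an illustration: the buffer size is an arbitrary value chosen for the example, and the operating system is free to adjust whatever size is requested.

```python
import socket

# Control 1: choose the transport protocol when the socket is created.
# SOCK_STREAM selects TCP, SOCK_DGRAM selects UDP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Control 2: tune a few transport-layer parameters, here the receive buffer.
REQUESTED_RCVBUF = 64 * 1024  # illustrative value, not a recommendation
tcp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED_RCVBUF)
actual_rcvbuf = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

tcp_type = tcp_sock.type
udp_type = udp_sock.type
print("TCP socket type:", tcp_type)
print("UDP socket type:", udp_type)
print("receive buffer granted by the OS:", actual_rcvbuf)

tcp_sock.close()
udp_sock.close()
```

Everything else about how TCP or UDP moves the bytes stays outside the application's control.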

Process addressing

In order for a process running on one host to send packets to a process running on another host, the receiving process needs to have an address. In order to identify the receiving process, two kinds of information need to be defined:

① The address of the host (in the Internet, its IP address)

② An identifier that specifies the receiving process on the destination host (a port number)
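As a small sketch of process addressing, the example below runs a sender and a receiver on the same machine over UDP. The receiving process is identified by the pair (host address, port number); the loopback address 127.0.0.1 and the OS-assigned port are assumptions made so the example needs no real network.

```python
import socket

# Receiving process: bind to the loopback address. Port 0 asks the OS
# to pick any free port, which then identifies this process on the host.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
host, port = receiver.getsockname()

# Sending process: it must know both pieces of the address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", (host, port))

data, sender_addr = receiver.recvfrom(1024)
print(f"receiver at {host}:{port} got {data!r} from {sender_addr[0]}:{sender_addr[1]}")

sender.close()
receiver.close()
```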


domain name

Domain name server


Domain name resolution process


2. HTTP protocol

Overview

The application layer protocol of the Web is the Hypertext Transfer Protocol (HTTP). HTTP defines the way that a web client requests a web page from a web server, and the way the server transmits a web page to the client.

Web pages are composed of objects, and an object is simply a file. A web server implements the server side of HTTP and stores web objects, each of which is addressed by a URL.

HTTP uses TCP as its underlying transport protocol. The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and server processes access TCP through their socket interfaces.

Note: The server sends the requested files to the client without storing any state information about the client, so HTTP is a stateless protocol. HTTP itself is also connectionless: the communicating parties do not need to establish an HTTP-level connection before exchanging HTTP messages. The web server is always on and has a fixed IP address.


Persistent and non-persistent connections

In many Internet applications, the client and server communicate over a fairly long period of time, with the client issuing a series of requests and the server responding to each one. The requests may be sent back to back, periodically at regular intervals, or intermittently. When this client-server interaction takes place over TCP, the application developer must decide whether each request/response pair is sent over a separate TCP connection or all requests and responses are sent over the same TCP connection.

The former method is called non-persistent connection, and the latter method is called persistent connection.

Disadvantages of non-persistent connections: a brand new connection must be established and maintained for each requested object. For each such connection, the client and server must allocate TCP buffers and maintain TCP variables, which brings a serious burden to the Web server.


Message structure

HTTP messages are text-oriented, so each field is a string of ASCII codes.


The following is a typical request message:
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr

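The first line is called the request line, and the remaining lines are called header lines. Host specifies the host on which the object resides. Connection: close tells the server not to keep the connection open, but to close it after sending the requested object. User-agent specifies the user agent, that is, the type of browser making the request. Accept-language: fr indicates that the user prefers a French version of the object.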
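Because the request is plain ASCII text, it can be assembled and taken apart with ordinary string operations. The sketch below is a minimal illustration: the helper names build_request and parse_request are made up for this example, and it handles only a request line plus header lines with no entity body.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Serialize a request line and header lines; each line ends with CRLF."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line marks the end of the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw):
    """Split a raw request into its request-line fields and a header dict."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, path, version, headers


raw_request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "Connection": "close",
    "User-agent": "Mozilla/5.0",
    "Accept-language": "fr",
})
method, path, version, headers = parse_request(raw_request)
print(method, path, version)
print(headers)
```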

The following is a typical response message:
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (centos)
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html

The response has an initial status line, six header lines, and then the entity body.
Connection: close tells the client that the server will close the TCP connection after sending the message.
Date: indicates the date and time when the server generated and sent the response message.
Server: indicates that the message was generated by an Apache Web server. It is analogous to the User-agent header line in the request message.
Last-Modified: indicates the date and time when the object was created or last modified.
Content-Length: indicates the number of bytes in the object being sent.
Content-Type: indicates that the object in the entity body is HTML text.
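The following sketch shows how a client might read these fields out of a response. It is an illustration under stated assumptions: the entity body is a 13-byte stand-in (so Content-Length is 13 rather than 6821), and the parsing covers only this simple, non-chunked case.

```python
from email.utils import parsedate_to_datetime

# Stand-in response: same header lines as the example, but with a tiny body.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Date: Tue, 09 Aug 2011 15:44:04 GMT\r\n"
    b"Server: Apache/2.2.3 (centos)\r\n"
    b"Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT\r\n"
    b"Content-Length: 13\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)

# The blank line separates the status line and header lines from the entity body.
head, body = RAW_RESPONSE.split(b"\r\n\r\n", 1)
status_line, *header_lines = head.decode("ascii").split("\r\n")

# Status line: protocol version, status code, status phrase.
version, code_text, phrase = status_line.split(" ", 2)
status_code = int(code_text)

headers = {}
for line in header_lines:
    name, _, value = line.partition(":")
    headers[name.strip().lower()] = value.strip()

# Content-Length tells the client how many body bytes belong to the object.
body = body[: int(headers["content-length"])]
last_modified = parsedate_to_datetime(headers["last-modified"])

print(version, status_code, phrase)
print("header lines:", len(headers))
print("last modified:", last_modified.isoformat())
print("body bytes:", len(body))
```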

cookie

Cookie technology has four components:

A Set-cookie header line in the HTTP response message
A Cookie header line in the HTTP request message
A cookie file kept on the client system and managed by the user's browser
A back-end database at the Web site

Cookies can be used to identify a user. The first time a user visits a site, the user may need to provide some identification. In subsequent sessions, the browser passes a cookie header to the server, thereby identifying the user to the server. Cookies can therefore create a user session layer on top of stateless HTTP.
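The interplay of the four components can be sketched as a small in-memory simulation. Nothing here is a real browser or server: the classes CookieSite and Browser, the site name, and the starting identifier are all hypothetical, and headers are modelled as plain dictionaries.

```python
import itertools


class CookieSite:
    """Toy Web site: hands out identifiers and keeps a back-end database."""

    def __init__(self):
        self._ids = itertools.count(1678)  # arbitrary starting identifier
        self.database = {}                 # identifier -> list of requested paths

    def handle(self, path, request_headers):
        response_headers = {}
        user_id = request_headers.get("Cookie")
        if user_id not in self.database:
            # First visit: create an entry and send the identifier back.
            user_id = str(next(self._ids))
            self.database[user_id] = []
            response_headers["Set-cookie"] = user_id
        self.database[user_id].append(path)
        return response_headers


class Browser:
    """Toy browser: manages the cookie file on the client system."""

    def __init__(self):
        self.cookie_file = {}  # site name -> identifier

    def visit(self, site_name, site, path):
        headers = {}
        if site_name in self.cookie_file:
            headers["Cookie"] = self.cookie_file[site_name]
        response = site.handle(path, headers)
        if "Set-cookie" in response:
            self.cookie_file[site_name] = response["Set-cookie"]
        return response


site = CookieSite()
browser = Browser()
first = browser.visit("shop.example", site, "/index.html")
second = browser.visit("shop.example", site, "/cart.html")
print(first, second)
print(site.database)
```

On the second visit the server sends no new identifier, yet it can still tie both requests to the same user through its database.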

web cache

A web cache, also called a proxy server, is a network entity that satisfies HTTP requests on behalf of an origin web server.

When a browser requests an object, the following happens:

The browser establishes a TCP connection to the web cache and sends the web cache an HTTP request for the object.
The web cache checks whether a copy of the object is stored locally. If so, it returns the object to the client browser in an HTTP response message.
If the web cache does not have the object, it opens a TCP connection to the origin server of the object and sends an HTTP request for the object over that connection. After receiving the request, the origin server sends the object to the web cache in an HTTP response.
When the web cache receives the object, it stores a copy in its local storage and sends a copy to the client browser in an HTTP response message (over the existing TCP connection between the client browser and the web cache).
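The steps above can be sketched as a lookup with a fallback to the origin server. This is a simplified model: the WebCache class and the dictionary standing in for the origin server are hypothetical, TCP connections are replaced by a function call, and questions such as how long a copy stays valid are ignored.

```python
class WebCache:
    """Toy web cache: answers from local storage, otherwise asks the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin
        self._store = {}          # local copies: URL -> object
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, url):
        # Hit: a local copy exists, so return it directly.
        if url in self._store:
            return self._store[url]
        # Miss: request the object from the origin server, keep a copy,
        # then return it to the client.
        self.origin_requests += 1
        obj = self._fetch_from_origin(url)
        self._store[url] = obj
        return obj


# Stand-in for the origin server (hypothetical content).
ORIGIN_OBJECTS = {"http://www.someschool.edu/somedir/page.html": b"<html>page</html>"}

cache = WebCache(lambda url: ORIGIN_OBJECTS[url])
url = "http://www.someschool.edu/somedir/page.html"
first_copy = cache.get(url)   # miss: fetched from the origin
second_copy = cache.get(url)  # hit: served from the cache
print(first_copy == second_copy, cache.origin_requests)
```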

3. File transfer protocol (FTP)

In an FTP session, the user sits at one host (the local host) and transfers files to or from a remote host. To access the remote account, the user must provide a user ID and password. After providing this authorization information, the user can transfer files from the local file system to the remote file system and vice versa.

Both HTTP and FTP are file transfer protocols, and have many common features, for example, they both run on TCP. However, these two application layer protocols also have some important differences. One of the most notable is that FTP uses two parallel TCP connections to transfer files, one is the control connection and the other is the data connection.

The control connection is used to transfer control information between two hosts, such as user IDs, passwords, commands to change remote directories, and commands to "put" and "get" files. The data connection is used to actually send a file.


The FTP server must retain the user's state during the entire session, and the server must associate a specific user account with the control connection.

A large share of the FTP servers on the Internet are "anonymous" FTP servers. The purpose of this type of server is to provide file copying services to the public. It does not require users to register with the server in advance or to obtain authorization from the FTP server.

Anonymous file transfer enables users to connect to a remote host and copy files from it under an anonymous identity, without being registered users of that host. Users log in to the FTP service with the special user name "anonymous" and can then access the public files on the remote host.

4. E-mail system

The e-mail system has three main components: user agents, mail servers, and the Simple Mail Transfer Protocol (SMTP).

Composition structure


A typical sending process is:

The message starts at the sender's user agent, travels to the sender's mail server, and then travels to the recipient's mail server, where it is deposited in the recipient's mailbox.
When the recipient wants to read the messages in the mailbox, the mail server that holds the mailbox authenticates the recipient (using a user name and password).
If the sender's server cannot deliver the mail to the recipient's server, it keeps the message in a message queue and tries again later. If delivery still fails after several days, the server removes the message and notifies the sender with an e-mail message.

SMTP protocol

SMTP is the core of Internet e-mail applications.

SMTP specifies how information should be exchanged between two SMTP processes that communicate with each other. The SMTP process responsible for sending mail is the SMTP client, and the process responsible for receiving mail is the SMTP server.

SMTP defines 14 commands (each a few letters long) and 21 kinds of response messages (a three-digit code followed by a short text description).

SMTP is restricted to 7-bit ASCII. Content that is not 7-bit ASCII, such as non-English text or a binary file like a picture, must be encoded into 7-bit ASCII before it is sent. SMTP also places all of a message's objects into one message, whereas HTTP encapsulates each object in its own HTTP response message.

SMTP uses TCP connections on port 25 and follows the client-server (C/S) model.

SMTP communication has three stages: connection establishment, mail transfer, and connection release.
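As a rough sketch, the client side of a simple SMTP exchange can be grouped by these three stages. The function name and the host and mailbox names are hypothetical, server replies are omitted, and the TCP connection to port 25 is assumed to be open already.

```python
def smtp_client_commands(client_host, sender, recipient, message_lines):
    """Return the commands an SMTP client sends, grouped by stage."""
    return {
        # After the TCP connection opens, the client introduces itself.
        "connection establishment": [f"HELO {client_host}"],
        # Envelope first, then the message; a line with a single dot ends it.
        "mail transfer": [
            f"MAIL FROM:<{sender}>",
            f"RCPT TO:<{recipient}>",
            "DATA",
            *message_lines,
            ".",
        ],
        # QUIT asks the server to close the connection.
        "connection release": ["QUIT"],
    }


commands = smtp_client_commands(
    "client.example.org",
    "alice@example.org",
    "bob@example.net",
    ["Subject: greeting", "", "Hello Bob."],
)
for stage, lines in commands.items():
    print(stage)
    for line in lines:
        print("  C:", line)
```

The commands are short ASCII words, which matches the description of SMTP commands as a few letters each.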


MIME

Disadvantages of SMTP: SMTP cannot transmit executable files or other binary objects directly (they must first be encoded). SMTP is limited to 7-bit ASCII, so it cannot transmit text in many non-English languages. SMTP servers reject messages that exceed a certain length.

MIME (Multipurpose Internet Mail Extensions) was therefore introduced as an extension to Internet mail.
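A short sketch with Python's standard email package shows the idea: non-English text and binary data are wrapped in a MIME message whose transmitted form contains only 7-bit ASCII. The addresses, subject, and attachment bytes are invented for the example, and base64 is requested explicitly for the text part.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alice@example.org"  # hypothetical addresses
message["To"] = "bob@example.net"
message["Subject"] = "Greetings"

# Non-English text: request base64 so the encoded part is pure ASCII.
message.set_content("Grüße aus Zürich", cte="base64")

# A binary object (stand-in bytes, not a real picture). Attachments given
# as bytes are base64-encoded by default.
binary_object = bytes(range(256))
message.add_attachment(
    binary_object,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

wire_bytes = message.as_bytes()
is_seven_bit = all(byte < 128 for byte in wire_bytes)

# The receiver can decode the attachment back to the original bytes.
attachment = next(message.iter_attachments())
decoded = attachment.get_content()

print(message.get_content_type())
print("7-bit clean:", is_seven_bit)
print("attachment restored:", decoded == binary_object)
```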


POP3

POP3 is an extremely simple mail access protocol, defined by RFC 1939. Because the protocol is very simple, its function is quite limited. When the user agent (client) opens a TCP connection to the mail server (server) port 110, POP3 starts to work. With the establishment of a TCP connection, POP3 works in three stages.

The three stages are authorization, transaction, and update.

In the first stage, the authorization stage, the user agent sends (in clear text) the username and password to authenticate the user.

In the second stage, the transaction stage, the user agent retrieves messages. In this stage the user agent can also mark messages for deletion, remove deletion marks, and obtain mail statistics.

The third stage, the update stage, begins after the client issues the quit command to end the POP3 session. At this point the mail server deletes the messages that were marked for deletion.

During POP3 transaction processing, the user agent issues some commands, and the server responds to each command. There may be two answers: +OK (sometimes followed by server-to-client data), which is used by the server to indicate that the previous command is normal. -ERR, used by the server to indicate that some errors occurred in the previous command.

During a POP3 session between a user agent and the mail server, the POP3 server maintains some state information; in particular, it keeps track of which user messages have been marked for deletion. However, the POP3 server does not carry state information across POP3 sessions.
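The three stages and the +OK / -ERR replies can be sketched as a toy in-memory server. It is a simplification, not a real POP3 implementation: the Pop3Server class, the account name, and the reply texts are assumptions, and only the user, pass, stat, retr, dele, rset, and quit commands are modelled.

```python
class Pop3Server:
    """Toy POP3 server for a single session; no networking involved."""

    def __init__(self, mailbox, username, password):
        self.mailbox = mailbox   # list of message texts; outlives the session
        self._credentials = (username, password)
        self._claimed_user = None
        self._marked = set()     # per-session state: numbers marked for deletion
        self.stage = "authorization"

    def command(self, line):
        verb, _, arg = line.partition(" ")
        verb = verb.lower()
        if self.stage == "authorization":
            return self._authorize(verb, arg)
        if self.stage == "transaction":
            return self._transact(verb, arg)
        return "-ERR session has ended"

    def _authorize(self, verb, arg):
        if verb == "user":
            self._claimed_user = arg
            return "+OK"
        if verb == "pass":
            if (self._claimed_user, arg) == self._credentials:
                self.stage = "transaction"
                return "+OK user successfully logged on"
            return "-ERR invalid user name or password"
        return "-ERR command not valid in this stage"

    def _transact(self, verb, arg):
        if verb == "quit":
            # Update stage: only now are the marked messages removed.
            self.stage = "update"
            for number in sorted(self._marked, reverse=True):
                del self.mailbox[number - 1]
            return "+OK"
        if verb == "rset":
            self._marked.clear()  # cancel all deletion marks
            return "+OK"
        if verb == "stat":
            kept = [m for n, m in enumerate(self.mailbox, 1) if n not in self._marked]
            return f"+OK {len(kept)} {sum(len(m) for m in kept)}"
        if verb in ("retr", "dele"):
            valid = arg.isdigit() and 1 <= int(arg) <= len(self.mailbox)
            if not valid or int(arg) in self._marked:
                return "-ERR no such message"
            if verb == "dele":
                self._marked.add(int(arg))
                return "+OK message marked for deletion"
            return "+OK\n" + self.mailbox[int(arg) - 1]
        return "-ERR unknown command"


mailbox = ["first message", "second message"]
session = Pop3Server(mailbox, "bob", "hungry")  # hypothetical account
replies = [session.command(c) for c in
           ["user bob", "pass hungry", "stat", "dele 1", "retr 1", "quit"]]
for reply in replies:
    print(reply)
print(mailbox)
```

The deletion marks live only inside the session object; a new session over the same mailbox would start with none, which mirrors the point that no state is carried from one session to the next.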

IMAP

The IMAP protocol is more complex than POP3. When the IMAP client program on the user's PC opens a mailbox on the IMAP server, the user sees the message headers. Only when the user opens a message is that message downloaded to the user's computer.

The IMAP server associates each message with a folder. When the message arrives at the server for the first time, it is associated with the recipient's INBOX folder. The recipient can move the message to a new, user-created folder, read the message, delete the message, etc. The IMAP protocol provides users with commands to create folders and move mail from one folder to another. IMAP also provides users with commands to query mail in remote folders.

IMAP allows users to use different computers in different places to read and process e-mails at any time, and it also allows only a certain part of the e-mail to be read.