Computer Network--Application Layer

Application layer overview

  • Features
    • No application-layer software runs in the network core; the core provides no application-layer functionality
    • Network applications exist only on end systems, which makes rapid application development and deployment possible

1. Process communication

In a computer network, it is processes, not programs, that communicate. Processes on different end systems communicate with each other by exchanging messages across the computer network.

  • Process and computer network interface
    • A process sends messages into, and receives messages from, the network through a software interface called a socket. The socket is like the "door" of the process's "house". A socket is also referred to as the Application Programming Interface (API) between the application and the network.

    • The application developer controls everything on the application-layer side of the socket, but control over the transport-layer side is limited to:

      • Choosing the transport-layer protocol
      • Possibly setting a few transport-layer parameters, such as the maximum buffer size and the maximum segment size
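
As a minimal sketch of the two controls listed above, the following Python snippet uses the standard socket module: the socket type selects TCP or UDP, and setsockopt adjusts a buffer parameter. The buffer size of 65536 bytes is an arbitrary illustrative value, and the operating system may round or adjust it.

```python
import socket

# Choosing the transport protocol: SOCK_STREAM selects TCP, SOCK_DGRAM selects UDP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Setting one of the few tunable transport-layer parameters: the receive buffer.
# 65536 is an illustrative value only; the OS may report a different size.
tcp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
rcvbuf = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("TCP receive buffer:", rcvbuf, "bytes")

tcp_sock.close()
udp_sock.close()
```

Everything else about how the bytes are carried (segmentation, retransmission, congestion control) stays on the transport-layer side of the socket and is outside the application's control.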

  • Process addressing
    • To identify the receiving process, two pieces of information are needed: ① the address of the host; ② an identifier that specifies the receiving process on the destination host
      • A host is uniquely identified by its IP address (a 32-bit quantity in IPv4)
      • A port number specifies the receiving process (more precisely, the receiving socket) on the destination host

      A list of well-known port numbers for all Internet standard protocols can be found at: http://www.iana.org.
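
The (IP address, port number) pair can be seen directly in a small loopback experiment. The sketch below assumes the loopback address 127.0.0.1 is available and lets the operating system choose a free port; the message content is made up for illustration.

```python
import socket

# The receiver binds to a host address and a port; port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
host, port = receiver.getsockname()

# The sender names the destination process with the (IP address, port) pair.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", (host, port))

receiver.settimeout(2.0)
data, source = receiver.recvfrom(1024)
print(f"received {data!r} from {source}, addressed to {host}:{port}")

sender.close()
receiver.close()
```

The sender's own address (the `source` tuple) is also an (IP address, port) pair, which is what would let the receiver reply.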

2. Transport services available to applications

  • Reliable data transfer
    Packets may be lost in a computer network, for example when a router buffer overflows or when a packet is discarded because its bits were corrupted.
    • Some applications are loss-tolerant: losing a small amount of data has little effect on the result. Multimedia applications (audio, video, etc.) are typical, since a small glitch does not matter much
    • For other applications, such as email, file transfer, remote host access, web document transfer, and financial applications, data loss can have catastrophic consequences, so reliable data transfer is required to guarantee data integrity
  • Throughput
    • Bandwidth-sensitive applications
      have specific throughput requirements. If the available throughput falls below the requirement, the application must either encode at a lower rate or give up, because too little throughput is of little or no use to it. Example: Internet telephony.
    • Elastic applications
      make use of whatever throughput happens to be available (more is always better, of course). Examples: email, file transfer, web transfer.
  • Security
    • A transport protocol can encrypt all data passed down by the sending process and decrypt it at the receiving host before delivering it to the receiving process, which provides confidentiality between the two processes
    • A transport protocol can also provide data integrity and endpoint authentication

3. Transport services provided by the Internet

  • TCP service
    • Features

      • Connection-oriented service (a TCP connection is established by a three-way handshake before data is transmitted)

      • Reliable data transfer service

        • Data is delivered without errors and in the proper order
        • No bytes are lost or duplicated
      • Full-duplex connection

      • The connection must be torn down once the application has finished sending messages

    • TCP congestion control mechanism

      • When the network is congested, TCP throttles the sending process
      • Congestion control also tries to limit each TCP connection to its fair share of the network bandwidth

Congestion control does not necessarily benefit the communicating processes directly, but it benefits the Internet as a whole.

  • UDP service
    A lightweight transport protocol that provides only minimal services and no unnecessary ones.

    • Features:
      • Connectionless (so there is no handshake)
      • Unreliable data transfer
        • Data may be lost
        • Data may arrive out of order
      • No congestion control mechanism
        • The sender can inject data into the network layer at any rate it chooses (the actual throughput may be lower than this rate because of limited bandwidth or congestion on intermediate links)

1. Network application model

1.1 C/S model (client/server model)

Definition

The C/S model is the client/server model, a common network application architecture. It consists of a client program and a server program that communicate with each other using a specific protocol.

Working principle

The client program sends a request to the server, the server receives the request and returns a response, and the client acts on the response. Because the two communicate over the network, the client and the server can be located on different computers, which enables distributed computing and data storage.
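
The request/response cycle can be sketched with Python's standard socket module. This is an illustrative toy, not a production server: it runs both sides in one program over the loopback interface, serves a single client, and the message contents are made up.

```python
import socket
import threading

def serve_once(listener):
    """Accept one client, read its request, and send back a response."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"response to " + request)

# Server side: listen on the loopback address; port 0 lets the OS pick a free port.
listener = socket.create_server(("127.0.0.1", 0))
server_port = listener.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# Client side: initiate the connection, send a request, then act on the response.
with socket.create_connection(("127.0.0.1", server_port), timeout=2.0) as client:
    client.sendall(b"ping")
    reply = client.recv(1024)

worker.join()
listener.close()
print(reply)
```

Note the asymmetry: the server must already be listening at a known address before the client can contact it, while the client can appear at any time from any address.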

Features

  • Client:

    • Initiates communication with the server
    • May be intermittently connected to the Internet
    • May have a dynamic IP address
    • Does not communicate directly with other clients
  • Server:

    • Always running in order to provide service
    • Has a fixed IP address and, by convention, a well-known port number
    • Scales out by adding capacity in data centers

Advantages

  • Interactivity: The client can interact with the server in real time and exchange data on demand.
  • Scalability: Clients or servers can be added as needed to scale the system.
  • Security: When the communication between the client and the server is encrypted, data transmission is relatively secure.

Application

The C/S model is widely used, for example in online banking, online shopping (e-commerce), social networks, and online games.
In the C/S model, the client is usually a separate application installed by the user, while the server provides a shared service that many clients can access at the same time.

1.2 P2P (peer to peer) model and P2P file distribution

2.1 P2P model

Definition

The P2P model is the peer-to-peer model. In this architecture every node acts as both a client and a server: it can request services from other nodes and provide services to them. Nodes in the network can therefore communicate with each other directly, without relaying through a central server.

Features

  • Changeable IP addresses: Participating hosts are intermittently connected and may have dynamic IP addresses
  • Good scalability: Capacity is not limited by the number of servers, which makes large-scale distributed computing and file sharing possible
  • Strong robustness: The network is not easily paralyzed. Every node in a P2P network can provide data, so when some hosts fail, or when a large number of hosts join at once, the remaining hosts can still request and provide services normally and the network as a whole keeps working
  • (Almost) no always-on server
  • Any two end systems can communicate directly

  • Self-scalability: New peers bring new service capacity as well as new service requests

  • The P2P model also has disadvantages:

    • 1. Search efficiency: Without a central index, finding a specific resource may require querying a large part of the network, which is inefficient
    • 2. Resource sharing: Sharing consumes the network bandwidth and computing resources of the participating nodes, which may affect network performance
    • 3. Difficult to manage
      • The main reasons are as follows:
        • Decentralization: There is no central server, and every node acts as both client and server, so management is distributed and complex.
        • Node anonymity: Communication between nodes is often anonymous, which makes it hard for administrators to identify and locate a specific node.
        • Dynamics: Nodes may join or leave the network at any time, which makes the state of the network hard to track and monitor.
        • Content issues: Shared resources may involve copyright infringement or illegal content.

Therefore, managing a P2P network requires special measures, such as reputation-based systems, filters, and content review.

application

The P2P model is suited to large-scale distributed scenarios such as file sharing, distributed computing, and network storage.

  • Examples: BitTorrent, eMule, Gnutella, Thunder

2.2 P2P file distribution

  • Distribution time:
    In the C/S (client/server) model, distribution time usually refers to the time from a client's request for data or a service until the server has responded and returned it.
    In the P2P model, it is the time required for all N peers to obtain a copy of the file.
  • F: file length
  • d_n: download rate of the n-th node
  • u_n: upload rate of the n-th node
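
With these symbols, the commonly used textbook lower bounds on distribution time are D_cs >= max(N*F/u_s, F/d_min) for C/S and D_p2p >= max(F/u_s, F/d_min, N*F/(u_s + sum of all u_n)) for P2P. The sketch below evaluates both; the symbol u_s (the server's upload rate) and all numeric values are assumptions introduced here for illustration, not figures from the original text.

```python
def cs_distribution_time(F, u_s, d):
    """Lower bound for C/S distribution: the server uploads N copies by itself."""
    N = len(d)
    return max(N * F / u_s, F / min(d))

def p2p_distribution_time(F, u_s, d, u):
    """Lower bound for P2P distribution: peers' upload capacity adds to the server's."""
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))

# Illustrative values (assumptions): 8e9-bit file, 100 peers.
F = 8e9            # file length in bits
u_s = 100e6        # server upload rate in bits per second
N = 100
d = [20e6] * N     # download rate d_n of each peer
u = [5e6] * N      # upload rate u_n of each peer

t_cs = cs_distribution_time(F, u_s, d)
t_p2p = p2p_distribution_time(F, u_s, d, u)
print(f"C/S lower bound: {t_cs:.0f} s, P2P lower bound: {t_p2p:.0f} s")
```

In the C/S bound the term N*F/u_s grows linearly with N, while in the P2P bound the denominator also grows with the number of peers, which is why the P2P architecture is described as self-scaling.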

1. Scalability of the P2P architecture

File distribution: C/S vs P2P

C/S mode

P2P model

2. BitTorrent

2.1 Overview
  • Definition: A popular P2P protocol for file distribution.
  • Torrent: In BitTorrent terminology, the collection of all peers participating in the distribution of a particular file.
    A peer may leave the torrent at any time, even if it holds only a subset of the chunks.

Peers in a torrent download equal-size chunks of the file from one another; a typical chunk size is 256 KB.

2.2 Requesting and sending file chunks
  • Rarest first: A peer first requests the chunks that have the fewest copies among its neighbors, which evens out the number of copies of each chunk in the torrent.
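
The selection rule can be sketched in a few lines of Python. This is a simplified illustration, not BitTorrent's actual implementation: the function name, the peer names, and the tie-breaking rule (lowest chunk index) are assumptions made here.

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Return the missing chunk index held by the fewest neighbors, or None."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    candidates = [c for c in counts if c not in my_chunks]
    if not candidates:
        return None
    # Fewest copies first; ties are broken by the lower chunk index (an assumption).
    return min(candidates, key=lambda c: (counts[c], c))

# Hypothetical neighbors and the sets of chunk indices each one holds.
neighbors = {"peerA": {0, 1, 2}, "peerB": {1, 2}, "peerC": {2, 3}}
next_chunk = rarest_first({2}, neighbors)
print(next_chunk)
```

Here chunks 0 and 3 each have a single copy among the neighbors, so one of them is requested before chunk 1, which has two copies.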

2.3 Others

*2.4 DHT

DHT (Distributed Hash Table) is a structured P2P model.

The nodes maintain an ordered topology among themselves, such as a tree or a ring. Each node hashes its own IP address into a 16-byte ID value.

Taking the ring topology as an example, the nodes are arranged in a ring in order of their ID values, which gives an ordered network topology.
The content of each file is also hashed into a 16-byte value that serves as its ID, and the record of where a file is stored is kept at the node responsible for that ID. For example, with node IDs 5, 88, 199, and 1011 on the ring, file IDs 6 to 88 are maintained at node 88, so a peer searching for such a file asks node 88 directly for the file's location.
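
The ring lookup can be sketched as follows, assuming the rule suggested by the example above: a key is handled by the first node whose ID is greater than or equal to the key's ID, wrapping around to the smallest node ID. The hash function (BLAKE2b truncated to 16 bytes) and the function names are assumptions chosen for illustration.

```python
import bisect
import hashlib

def to_id(text):
    """Hash a string to a 16-byte (128-bit) integer ID; BLAKE2b is an assumed choice."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=16).digest()
    return int.from_bytes(digest, "big")

def responsible_node(node_ids, key_id):
    """Return the first node ID >= key_id on the ring, wrapping to the smallest ID."""
    ring = sorted(node_ids)
    index = bisect.bisect_left(ring, key_id)
    return ring[index % len(ring)]

# Small node IDs from the example; real IDs would come from to_id(ip_address).
nodes = [5, 88, 199, 1011]
owner_small = responsible_node(nodes, 42)     # falls in the range 6..88
owner_wrap = responsible_node(nodes, 2000)    # past the largest ID, wraps around
print(owner_small, owner_wrap)
```

Because every peer applies the same rule, any peer can compute which node to ask without consulting a central directory.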

1.3 Hybrids: Client-Server and Peer-to-Peer Architectures

definition

A hybrid architecture combines the client-server model with the peer-to-peer model. Client-server communication and communication between peers can take place at the same time and cooperate to complete a task.

  • Hybrid application examples of C/S and P2P architecture
    • Napster (MP3 download software)
      • File Search: Centralized
        • The host registers its resources on the central server
        • The host queries the central server for resource location
      • File transfer: P2P
        • Between any Peer nodes
    • Instant messaging
      • Presence detection: centralized
        • When a user comes online, the user's IP address is registered with the central server
        • Users contact the central server to find the addresses of their online friends
      • Chat between two users: P2P

Advantages

The hybrid structure has the following advantages:

Scalability: New nodes or components can be added at any time to increase the performance and capacity of the system.

Flexibility: Once peers have located each other, they can exchange data directly without going through the central server.

Reliability: Every node carries part of the load, so data transfer does not depend on a single machine.

Disadvantages

The hybrid structure also has some disadvantages:

High complexity: The system must handle both client-server communication and communication between peers, so its design and implementation are relatively complex.
Resource sharing: Sharing consumes the network bandwidth and computing resources of the participating nodes, which may affect network performance.

Application

The hybrid structure suits scenarios that need both client-server communication and communication between peers, such as distributed computing, file sharing, and social networking applications.

2. Application related

2.1 Web and HTTP

1.1 Some terms

Basic concepts

  • Web page: (also called a document) consists of objects.
  • Object: Can be an HTML file, a JPEG image, a Java applet, a sound clip, etc.
  • A web page contains a base HTML file, which in turn contains references (links) to several other objects. The page embeds links to these objects rather than the objects themselves.
    Number of objects in a web page = 1 base HTML file + number of referenced objects (images, videos, etc.)
  • Each object is referenced (accessed) by a URL
    • A URL can carry the access protocol, user name, password, port, etc.

WWW

URL

  • Definition:
    A uniform resource locator (URL) is the notation used on the Internet to specify the location of a piece of information on a World Wide Web (WWW) server.

  • Format:
    The general syntax of a URL is:
    protocol://hostname[:port]/path/[;parameters][?query][#fragment]
    Parts in square brackets are optional.

    • protocol: Specifies the protocol to use. Common protocol names include HTTP (Hypertext Transfer Protocol), HTTPS (Hypertext Transfer Protocol Secure), FTP (File Transfer Protocol), FTPS (FTP over SSL/TLS), and SFTP (SSH File Transfer Protocol).

    HTTPS is the secure version of HTTP: HTTP carried over a channel protected by SSL/TLS (Secure Sockets Layer and its successor, Transport Layer Security).
    URLs that begin with https:// use SSL/TLS to encrypt requests and responses.
    Its main function is to protect HTTP traffic by establishing a secure channel for data transmission.
    During HTTPS communication, digital certificates are used for authentication and for setting up the encrypted channel.
    Digital certificates are issued by a trusted third party (a certificate authority) and are used to verify the identity of the server and, optionally, of the client.
    By default HTTPS uses port 443, whereas HTTP uses port 80.

    • hostname: The Domain Name System (DNS) host name or IP address of the server that hosts the resource. Some sites allow anonymous access, while others require a user name and password, which may be placed before the host name (format: username:password@hostname).

    • port (port number): Specifies the port on which the resource is served. By default, HTTP uses port 80 and HTTPS uses port 443.

    In standard URL syntax, the port number follows the host name; it does not go at the end of the URL. For example:
    http://www.example.com:8080/index.html
    Here the host name is www.example.com, the port number is 8080, and the path is /index.html.
    A URL that places the port number anywhere else does not follow the standard syntax and should not be expected to work.

    • path: Specifies where the resource is located on the server, that is, the path of the file or directory.

    • parameters: Optional; specifies special parameters attached to the path.

    • query:
      Optional; passes parameters to dynamic web pages (for example, pages produced with CGI, ISAPI, PHP/JSP/ASP/ASP.NET, and similar technologies). There can be several parameters, separated by the "&" symbol; the name and value of each parameter are separated by the "=" symbol.

    • fragment (fragment identifier): Specifies a part of the resource, such as a time offset in a video or a page in a document.
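
The fields described above can be pulled apart with Python's standard urllib.parse module. The URL below is a made-up example that extends the www.example.com:8080 address used earlier with a hypothetical query string and fragment.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL with a port, a query string, and a fragment.
url = "http://www.example.com:8080/index.html?name=monkey&food=bananas#section2"
parts = urlsplit(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # host name
print(parts.port)      # port number, as an integer
print(parts.path)      # path
print(parts.fragment)  # fragment identifier

# The query splits on "&" into parameters and on "=" into names and values.
query = parse_qs(parts.query)
print(query)
```

When the URL omits the port, `parts.port` is None and the client falls back to the protocol's default (80 for HTTP, 443 for HTTPS).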

1.2 HTTP Overview

Hypertext (HT, "Hyper Text")

  • Hypertext is networked text that uses hyperlinks to organize pieces of textual information stored in different places, so that a text can be displayed together with related content.
  • The text contains links to other locations or documents. Linking different objects together forms a web-like structure and lets the reader jump directly from the current reading position to the position a hyperlink points to.

Hypertext Transfer Protocol (HTTP)

  • A simple request-response protocol
  • It usually runs on top of TCP.
  • It specifies what kinds of messages the client may send to the server and what kinds of responses it gets back.
  • The headers of request and response messages are given in ASCII.
  • The message content has a MIME-like format.

  • Socket: an abstraction of an endpoint for two-way communication between application processes on different hosts in the network.
  • TCP (Transmission Control Protocol):
    a reliable, connection-oriented transport protocol that provides functions such as reliable data delivery and flow control
  • UDP (User Datagram Protocol):
    an unreliable, connectionless transport protocol that allows fast data transfer, but data may be lost
  • Connectionless service
    The communicating parties do not need to establish a connection in advance. Each packet carries its destination address, is sent into the network, and is routed independently.
  • Connection-oriented service
    The communicating parties must establish a connection before communicating. The process has three phases: establishing the connection, using the connection, and releasing the connection.

In computer networks, "stateful" and "stateless" describe whether a protocol keeps information about the state of an interaction between requests.

1. A stateless protocol keeps no record of previous requests or responses, so every request must carry complete information. HTTP is an example: the server does not remember earlier requests from a client, and each request is handled independently.

2. A stateful protocol keeps a record of previous requests or responses for use in later processing. FTP is an example: the server maintains state about the client's session (such as the logged-in user and the current directory) for subsequent commands.

Each kind has advantages and disadvantages. A stateless protocol supports concurrent processing and load balancing more easily: because no per-connection state is stored, any request can be handled on its own, and the same server resources can support more users.
A stateful protocol is better suited to scenarios in which state must be preserved, such as applications that need to maintain sessions.

Note that a stateful protocol holds per-client state on the server and can be abused if it is not handled carefully. For example, an FTP server that allows anonymous access is an easy target for attack. In practice, the appropriate protocol should be chosen according to the specific situation.

1.3 HTTP connection

HTTP connections are either persistent or non-persistent.

To see which kind a particular HTTP connection uses, capture the packets and look at the value of the Connection header in the HTTP request or response message.

  • Non-persistent connection

    Connection: close

  • Persistent connection

    Connection: Keep-Alive

3.1 HTTP with non-persistent connections

With a non-persistent connection, the client and the server exchange only one request and one response after the connection is established (that is, the TCP connection carries a single transfer), and then the connection is closed.

  • At most one object is sent over each TCP connection
  • Downloading multiple objects requires multiple TCP connections
  • HTTP/1.0 uses non-persistent connections

Response time model and RTT

  • RTT (round-trip time) is the time it takes a small packet to travel from the client to the server and back. It includes packet propagation delay, queuing delay at intermediate routers and switches, and packet processing delay

3.2 HTTP with persistent connections

  • With a persistent connection, the server leaves the TCP connection open after sending a response, so multiple objects are transferred over a single TCP connection between the client and the server. All requests and responses travel over that connection, and requests can be sent back to back without waiting for the responses to earlier ones.
  • HTTP/1.1 uses persistent connections by default
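
A rough response-time comparison makes the difference concrete. The sketch below rests on simplifying assumptions that are not in the original text: objects are fetched one after another (no parallel connections and no pipelining), every object has the same transmission time, a non-persistent fetch costs one RTT to open the TCP connection plus one RTT for the request and response, and a persistent connection pays the connection-setup RTT only once. All numbers are illustrative.

```python
def non_persistent_time(rtt, transmit, objects):
    """Each object needs its own TCP connection: 2 RTTs plus transmission time."""
    return objects * (2 * rtt + transmit)

def persistent_time(rtt, transmit, objects):
    """One connection setup (1 RTT), then 1 RTT plus transmission time per object."""
    return rtt + objects * (rtt + transmit)

rtt = 0.1        # seconds, illustrative
transmit = 0.01  # seconds to transmit one object, illustrative
objects = 11     # one base HTML file plus 10 referenced objects

t_non = non_persistent_time(rtt, transmit, objects)
t_per = persistent_time(rtt, transmit, objects)
print(f"non-persistent: {t_non:.2f} s, persistent: {t_per:.2f} s")
```

Under these assumptions the persistent connection saves one RTT for every object after the first, and it also spares the server the cost of setting up and tearing down a TCP connection per object.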

1.4 HTTP message format

The HTTP specifications [RFC 1945; RFC 2616; RFC 7540] define the HTTP message formats.

HTTP message type:

  • request message
  • response message

4.1 HTTP request message

1. Common format of HTTP request messages

The entity body is used with the POST method; with the GET method the entity body is empty.

  • HTTP request line: method field, URL field, HTTP version field
    • Method field values: GET, POST, HEAD, PUT, DELETE
      • GET: Often used with HTML forms; the data entered in the form fields is included in the requested URL. The input data is separated from the rest of the URL by a question mark (?), and multiple parameters are joined with &, together forming the extended URL. For example: http://www.somesite.com/animalsearch?monkey&bananas.

      • POST: HTTP clients often use the POST method when a user submits a form. The entity body of the request message contains the values the user entered in the form fields. (Some websites encrypt sensitive values, such as passwords, before sending them.)

      • HEAD: Similar to GET. When a server receives a request with the HEAD method, it responds with an HTTP message but leaves out the requested object. Commonly used for debugging web applications.

      • PUT: Often used together with web publishing tools; allows a user to upload an object to a specified path (directory) on a specified web server.

      • DELETE: Allows a user or an application to delete an object on a web server.

  • Header lines:
    • Host: The host on which the object resides
    • Connection: Whether to use a persistent connection
    • User-agent: The user agent, that is, the type of browser sending the request to the server
    • Accept-language: The language version of the object that the user prefers (if such a version exists); fr means the French version is preferred
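
To show how the request line, the header lines, and the entity body fit together, here is a small sketch that assembles a request message by hand. The helper name build_request, the host www.someschool.edu, and the path are hypothetical values used only for illustration.

```python
def build_request(method, path, headers, body=b""):
    """Assemble a request line, header lines, a blank line, and the entity body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

# Hypothetical GET request; the entity body stays empty.
request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "Connection": "close",
    "User-agent": "Mozilla/5.0",
    "Accept-language": "fr",
})
print(request.decode("ascii"))
```

A POST request would be built the same way, with the form values passed as `body` and usually a Content-Length header giving the body's size.
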
2. Submit form input

3. HTTP version type

4.2 HTTP response message

1. Response message format
  • HTTP status line

    • Version: HTTP/1.1
    • Status code: 200, phrase: OK
  • Header lines:

    • Connection: Connection mode (persistent or non-persistent).
    • Date: The date and time when the server generated and sent the response message (that is, when the server retrieved the object from its file system, inserted it into the response message, and sent the response)
    • Server: The server software that generated the response message, analogous to User-agent in the request message.
    • Last-Modified: The date and time the object was created or last modified
    • Content-Length: The number of bytes in the object being sent
    • Content-Type: The type of the object in the entity body (HTML text in this example)
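
The structure of a response message (status line, header lines, blank line, entity body) can be illustrated with a small hand-written parser. This is a simplified sketch: the function name parse_response and the sample message are made up, and a real client would rely on an HTTP library instead.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only, since values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body

# Made-up response message for illustration.
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Connection: close\r\n"
       b"Content-Length: 13\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"<p>hello</p>\n")

version, code, phrase, headers, body = parse_response(raw)
print(version, code, phrase, headers["content-type"], len(body))
```

Header names are stored in lowercase because HTTP header names are case-insensitive.
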
2. Response status code

1.5 User-server interaction: Cookies

5.1 User-Server Status: Cookies

  • When a user first visits an e-commerce site that uses cookies, the server assigns the user an ID. The ID ties the user's later activity and stored user information together, which lets the site recognize the user and provide personalized services.

  • Each subsequent request sent to that server carries the header line Cookie: xxx, where xxx is the numeric or alphanumeric ID.

  • Because cookies let the server associate stored information with a user, a stateful session layer can be built on top of stateless HTTP, which supports more applications and services.
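
The exchange can be modeled with a toy server that hands out an ID on the first visit and recognizes it afterwards. The class name, the starting ID 1678, and the use of plain dictionaries for headers are assumptions made for this sketch; real servers send the ID in a Set-Cookie response header with additional attributes.

```python
import itertools

class CookieServer:
    """Toy server that keys per-user state by the ID it handed out earlier."""

    def __init__(self):
        self._next_id = itertools.count(1678)  # illustrative starting ID
        self.visits = {}                       # back-end database: ID -> visit count

    def handle(self, request_headers):
        cookie = request_headers.get("Cookie")
        response_headers = {}
        if cookie is None or cookie not in self.visits:
            # First visit: create an ID and tell the browser to store it.
            cookie = str(next(self._next_id))
            self.visits[cookie] = 0
            response_headers["Set-Cookie"] = cookie
        self.visits[cookie] += 1
        return response_headers, self.visits[cookie]

server = CookieServer()
first_headers, first_count = server.handle({})
user_id = first_headers["Set-Cookie"]
second_headers, second_count = server.handle({"Cookie": user_id})
print(user_id, first_count, second_count)
```

The HTTP messages themselves remain stateless; the state lives in the server's database and is looked up through the ID that the browser sends back.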

5.2 Cookies: maintaining status

5.3 Cookies convenience and privacy issues

1.6 Web caching

6.1 Definition and explanation

  • Web cache: Also called a proxy server; a network entity that satisfies HTTP requests on behalf of the origin server.
  • The web cache has its own disk storage and keeps copies of recently requested objects.
  • Work process:
    • When the requested object is in the cache, the web cache returns it directly to the client browser in an HTTP response message
    • When the object is not in the cache, the web cache opens a TCP connection to the object's origin server and sends an HTTP request for the object over that connection
      • When the cache receives the response, it stores a copy in its local storage and sends a copy to the client browser over the existing TCP connection between the browser and the cache
      • In this process, the web cache acts as both a server (to the browser) and a client (to the origin server)
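
The work process can be sketched with an in-memory cache in front of a stand-in origin server. Everything here is illustrative: the class name WebCache, the fake_origin function, and the URL are assumptions, and a real proxy would also handle expiry and validation of cached copies.

```python
class WebCache:
    """Toy web cache: serve from local storage, otherwise fetch from the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # plays the client role toward the origin
        self._store = {}                 # local copies of requested objects
        self.origin_requests = 0

    def get(self, url):
        if url not in self._store:
            # Cache miss: request the object from the origin and keep a copy.
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        # Cache hit (or freshly stored copy): answer the browser directly.
        return self._store[url]

def fake_origin(url):
    """Stand-in for the origin server; returns made-up content."""
    return f"<html>content of {url}</html>".encode("ascii")

cache = WebCache(fake_origin)
first = cache.get("http://www.example.com/index.html")
second = cache.get("http://www.example.com/index.html")
print(cache.origin_requests)
```

Only the first request reaches the origin server; the second is answered from the cache's own storage.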

Push vs. pull cache

Push caches and pull caches are both kinds of network cache; their main purpose is to reduce network latency and speed up data transfer.

With a push cache, data is pushed from the server to the client side ahead of time. When the client needs the data, the browser reads it from the cache instead of downloading it from the server again. Browsers that store frequently visited pages locally reduce latency in a similar way.

With a pull cache, data is cached in the local network when a client browser first requests it. When clients need the data again, they obtain it from the local network instead of downloading it from the server. This approach is often used for large content such as videos and images.

6.2 Caching example

1. Faster access links

2. Install local cache

1.7 Conditional GET method

  • Motivation:
    Although caching reduces response time, the copy of an object held in the cache may be stale. To let a cache verify that its copy is up to date, HTTP provides the conditional GET.
  • Conditions:
    • The request message uses the GET method
    • The request message contains an If-Modified-Since header line

The value of the If-Modified-Since header line is the value of the Last-Modified header line from the response message in which the server previously sent the object.

  • This request message tells the server to send the object only if it has been modified since the specified date
    • If the object has been modified, the server sends the modified object and the cache updates its copy
    • Otherwise, the server does not send the object; it returns a 304 Not Modified response with an empty entity body
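
The server-side decision can be sketched as follows. The function name conditional_get, the object content, and the dates are made up for illustration; the date strings are produced and parsed with Python's standard email.utils helpers, which handle the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(last_modified, if_modified_since=None):
    """Return (status code, body) for a GET that may carry If-Modified-Since."""
    body = b"<html>object</html>"
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            # Not modified since the cached copy was fetched: no entity body.
            return 304, b""
    return 200, body

# Illustrative modification time of the object on the server.
last_modified = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)

# The cache echoes the Last-Modified value it received earlier.
header_value = format_datetime(last_modified, usegmt=True)
status_cached, body_cached = conditional_get(last_modified, header_value)

# A cache holding an older copy gets the full object again.
older = format_datetime(datetime(2015, 9, 1, tzinfo=timezone.utc), usegmt=True)
status_stale, body_stale = conditional_get(last_modified, older)
print(status_cached, status_stale)
```

The 304 response still costs a round trip, but it avoids retransmitting an object that the cache already holds.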

2.2 File Transfer Protocol FTP

2.2.1 Definition

  • Function:
    An early protocol for sharing files (both uploading and downloading). An FTP server played a role similar to that of today's download and cloud-disk services such as Thunder and Baidu cloud disk.

Network administrators maintain FTP servers. A sharer (a user) uploads files to share through an FTP client, and other users access that content through FTP clients.

  • Components:
    A user interface and a local file system interface on the client side. The FTP server has its own storage disk, and files are transferred between the local file system and the server's file directories.
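  • TFTP:
    TFTP (Trivial File Transfer Protocol) is a protocol in the TCP/IP suite for simple file transfer between a client and a server. It provides an uncomplicated, low-overhead file transfer service.
    TFTP is typically used to download files from, or upload files to, a TFTP server, for example operating system boot loaders and configuration files. TFTP clients and servers communicate over UDP; the server listens on port 69.
    Compared with FTP, TFTP is simpler and more lightweight but has fewer functions. It has no user authentication, so transfers are effectively anonymous, and it has no directory operations, only file transfer.
    A TFTP transfer is either a read or a write, and a server can be set up to permit both or reads only:
      Read-write: clients can both download files from the server and upload files to it.
      Read-only: clients can only download files from the server.

    Note that because TFTP offers so little security, FTP or a more secure protocol is usually preferred outside of restricted uses such as network booting.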
Windows comes with a built-in ftp command

Users can use the ftp command as follows:

Open the Command Prompt.
Type "ftp" and press Enter to enter FTP mode.
Enter the address of the FTP server and follow the prompts.

Using the ftp command requires some basic computer skills, and appropriate security measures and operating procedures should be followed to keep data safe and reliable.

2.2.2 Work stages

1. Establish a connection

When FTP客户端向FTP服务器发送连接请求时,FTP服务器会向客户端发送连接应答,建立起FTP控制连接.
Then, the FTP client needs to send the username and password to the FTP server for authentication. If the authentication is successful, operations such as file transfer can be started.

2. Identity verification

User authentication belongs to control connectionThe content is performed after the connection is established. See the following for controlling the connection.

  • Enter username and password to log in (FTP server performsUsername and password are transmitted in clear text, which poses a security risk and can be easily intercepted by hackers using packet capture tools)
  • Log in anonymously

Insert image description here

3. File operation and transmission

To transfer a file, FTP uses two separate TCP connections, called the control connection and the data connection.
Commands are transmitted on the control connection, and file data is transmitted on the data connection.

2.2.3 FTP control connection and data connection

FTP keeps the control connection separate from the data connection

Control connection

On the control connection:

  • User authentication
    The username and password are transmitted in clear text

  • The client sends commands to the server, such as commands to change directory, delete, upload, and download files.
    Because commands travel on a different connection from the file data, FTP is said to send its control information "out-of-band"

FTP command and response status code explanation example:
Insert image description here
In the command examples in the figure above, lowercase words are variable parameters that take specific values. Uppercase words are fixed command keywords.

The RETR (retrieve) command downloads the named file from the server.
The STOR (store) command uploads a file to the server.

(Diagram: upload and download between client and server)

When the client sends a file download command to the server, the server (in active mode) uses its own port 20 to open the data connection to the client.

Data connection

On the data connection:

  • File downloads, uploads, and other data stream transfers are carried out.

Data is transferred over the data connection, so the data is said to travel "in-band".

How the data connection is opened depends on the mode. In active mode, the client uses the PORT command to tell the server which client port to connect to, and the server then opens the data connection from its own port 20 to that client port. In passive mode, the client sends the PASV command, the server replies with an address and a port on which it is listening, and the client opens the data connection to that port.

Note that the server's data port is therefore not always port 20. Port 20 is the default data port used in active mode; in passive mode the server's data port is allocated dynamically.
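As a small illustration of a dynamically allocated data port: in passive mode (the FTP mode in which the client opens the data connection), the server announces its listening address in a 227 reply of the form (h1,h2,h3,h4,p1,p2), where the port is p1 × 256 + p2. The sketch below only parses such a reply; the function name and the sample reply are made up for illustration, and no network connection is involved.

```python
import re

def parse_pasv_reply(reply):
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    host = f"{h1}.{h2}.{h3}.{h4}"
    port = p1 * 256 + p2  # the port arrives as two bytes, high byte first
    return host, port

# Sample reply text (hypothetical server address from a documentation range).
host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)")
print(host, port)
```

The client would then open its data connection to that host and port instead of waiting for the server to connect from port 20.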

FTP is a stateful protocol: the server must maintain state about each client.

HTTP transfers take place over a single TCP connection. HTTP was originally designed to be stateless, but it can be made to behave like a stateful protocol through cookies.

Illustration

The specific working process can be seen in the figure below:
Insert image description here

Insert image description here

In computer networks, "stateful" and "stateless" describe whether a protocol keeps information about the state of a connection.
A stateless protocol does not keep such information. It keeps no record of previous requests or responses, so every request must carry complete information. HTTP, for example, is stateless: the server does not remember the requests a client has sent, and each request is independent.
A stateful protocol keeps connection state. It records previous requests or responses for use in later processing. FTP, for example, is stateful: the server stores each client's connection state for use by later commands.
Each kind has advantages and disadvantages. Stateless protocols are better suited to concurrent processing and load balancing, because with no connection state to store it is easier to handle many requests. Stateful protocols are better suited to scenarios that need connection state, such as web applications that must maintain sessions.
Note that stateful protocols are vulnerable to attack if not handled carefully. For example, allowing anonymous access to an FTP server can easily expose the server to attack. In practice, choose a protocol that suits the specific situation.

2.3 Email

2.3.1 Email Overview

Email is a form of communication using computer networks that can be used to send and receive digital information over the Internet.
The basic components of an email include an email header and an email body. The email header includes the sender, recipient, subject, date and other information. The email body is the main part of the email content and can contain text, pictures, links and other elements.

Sending and receiving an email generally takes three steps: the user writes the email and sends it to the sending server; the sending server forwards the email to a relay server based on the recipient's address; and the relay server then forwards the email to the final recipient's mail server.

The recipient can connect to the receiving server through a client program (such as Outlook, Thunderbird, etc.) and download the email, and then read the email content.
Using email requires setting up protocols such as SMTP and POP3, which stipulate the transmission and reception standards for email.

2.3.2 Email system structure

An email system includes: user agents, mail servers, and protocols

Email protocols include a sending protocol (SMTP) and retrieval protocols (POP3, IMAP, HTTP). Since sending and receiving email are push and pull operations, they can also be called "push" protocols and "pull" protocols.

Sender (user agent) → sending protocol (SMTP) → sender's mail server → SMTP → receiver's mail server → pull protocol → receiver (user agent).

SMTP is a "push" protocol: the sender pushes email data to the outgoing message queue of the SMTP server.

HTTP is mainly a "pull" protocol: the browser client pulls files from the server whenever it wants them. In web browser-based email, HTTP can also function as the sending protocol.

Insert image description hereInsert image description here

  • Overall workflow

Insert image description here
Insert image description here

2.3.3 Email message format

Insert image description here

2.3.4 Simple Mail Transfer Protocol SMTP

SMTP (Simple Mail Transfer Protocol) is a protocol used for email transmission.

The SMTP protocol defines the communication protocol between the sender and the receiver of the email during the email transmission process, including regulations on the email format, transmission method, authentication mechanism, etc.

1. How SMTP works

The SMTP workflow is as follows:

Complete the TCP three-way handshake.
Greet the server with the SMTP HELO command.
Send the sender's address with SMTP MAIL FROM.
Send the recipient's address with SMTP RCPT TO.
Announce the message content with the SMTP DATA command.
Send the message content.
End the session with SMTP QUIT once the mail has been transferred.

Some important SMTP commands are:

HELO: Identifies the client to the receiving server by hostname.
MAIL FROM: Specifies the sender's address.
RCPT TO: Specifies the recipient's address.
DATA: Marks the beginning of the message and tells the SMTP server to start receiving its content. DATA is followed by the message text, which ends with a line containing only a single period (".").
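The command sequence above can be sketched as a function that produces the lines a client would send for one message. This is only an illustration: the function name, host names, and addresses are made up, server replies are not modelled, and nothing is sent over the network. One detail not listed above is included: a message line that itself starts with a period gets an extra leading period, so it cannot be mistaken for the end-of-message line.

```python
def smtp_client_commands(client_host, sender, recipient, message_lines):
    """Return, in order, the lines an SMTP client sends for one message."""
    commands = [
        f"HELO {client_host}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    for line in message_lines:
        # Escape lines that begin with "." so they do not end the message early.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")     # a line with a single period ends the message
    commands.append("QUIT")
    return commands

dialogue = smtp_client_commands(
    "client.example.org", "alice@example.org", "bob@example.net",
    ["Subject: hello", "", "Do you like ketchup?", ".a line starting with a dot"],
)
print("\r\n".join(dialogue))
```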

Key points:

  • C/S: client/server mode
  • TCP connection
  • direct transfer
    Direct transmission from the sending server to the receiving server, without transferring to other servers for storage.
  • Port number
    Port 25 is used by default to transmit messages between the client and the server.
  • communication stage
    • Connection established (handshake)
    • Sending emails (transmitting messages)
    • Connection released (closed)
  • command/response interaction
    • Commands: ASCII text (14 commands, several letters)
    • Response: status code and status information (21 types of response information: three-digit code + simple text description)
  • Limitation
    • The message must be 7-bit ASCII
  • persistent connection
    • A TCP connection can send multiple emails in sequence, and the connection is closed after all are sent.

Insert image description here

2. Limitations

Insert image description here

  • Limitation of SMTP: it can only transmit 7-bit ASCII data, and this applies to both the message headers and the body. Any data in the message that is not ASCII must be encoded into ASCII before transmission. This design was reasonable in the early days of the Internet, when nobody sent large data such as video or pictures by email.

3. Example

Insert image description here

  • Simple SMTP interaction

Insert image description here

Insert image description here

4. SMTP format

Insert image description here

5. Summary

HTTP is mainly "pull", but it can also be "push".

Insert image description here

2.3.5 Post Office Protocol version 3 (POP3)

POP3 is a very simple mail access protocol. SMTP is a push protocol and cannot pull messages, so a recipient cannot use it to read mail. POP3, the third version of the Post Office Protocol, fills this gap: it transfers mail from the recipient's mail server to the recipient's user agent.

1. Illustration of working position

Insert image description here

2. Working method

Insert image description here
POP3 has two working modes. In download-and-keep mode, messages stay on the receiving server after they are downloaded, so multiple clients (such as a phone, a Mac, or a PC) can each download and read them.
In download-and-delete mode, a message can be downloaded only once and is then deleted from the server. For example, once you have received an email on your phone, other devices can no longer retrieve it from the mail server.

3. POP3 working phase

Insert image description here

  • Authorization phase
    The user agent sends the username and password (in plaintext) to authenticate the user.
  • Transaction phase
    • The user agent retrieves messages
    • Mark messages for deletion, or remove deletion marks
    • Get mail statistics
  • Update phase: occurs after the user issues the quit command
    • End the POP3 session
    • Delete the messages marked for deletion
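The key point of the phases above is that marking a message for deletion does not remove it; removal happens only in the update phase after quit. The toy model below illustrates this. The class is hypothetical and in-memory only; its method names mirror the POP3 commands STAT, RETR, DELE, RSET, and QUIT, and the authorization phase is left out.

```python
class Pop3Maildrop:
    """Toy model of the transaction and update phases for one mailbox."""

    def __init__(self, messages):
        self.messages = dict(enumerate(messages, start=1))  # message number -> text
        self.marked = set()                                  # numbers marked for deletion

    def stat(self):
        """Mail statistics: (count, total bytes), ignoring marked messages."""
        live = [text for number, text in self.messages.items()
                if number not in self.marked]
        return len(live), sum(len(text.encode("utf-8")) for text in live)

    def retr(self, number):
        if number in self.marked or number not in self.messages:
            raise KeyError(f"no such message: {number}")
        return self.messages[number]

    def dele(self, number):
        if number in self.marked or number not in self.messages:
            raise KeyError(f"no such message: {number}")
        self.marked.add(number)      # only a mark; nothing is removed yet

    def rset(self):
        self.marked.clear()          # remove all deletion marks

    def quit(self):
        """Update phase: marked messages are actually deleted now."""
        for number in self.marked:
            del self.messages[number]
        self.marked.clear()

box = Pop3Maildrop(["first mail", "second mail"])
box.dele(1)
box.rset()      # message 1 is unmarked again
box.dele(2)
box.quit()      # only message 2 is removed
print(box.messages, box.stat())
```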

2.3.6 Multimedia extension MIME

Insert image description here
MIME encodes non-ASCII bytes as ASCII so that they can be handed to SMTP for transmission. A common case is Base64-encoding Chinese text.
Insert image description here
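As a minimal sketch of the Base64 step, the following uses Python's standard base64 module to turn a short Chinese string into 7-bit ASCII and back. It assumes the text is first encoded as UTF-8 and shows only the content encoding, not a complete MIME message with headers.

```python
import base64

text = "你好"                                              # non-ASCII message text
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(encoded)                                             # ASCII-only form that SMTP can carry

decoded = base64.b64decode(encoded).decode("utf-8")        # what the receiver recovers
print(decoded == text)
```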

2.3.7 Internet Mail Access Protocol IMAP

The POP3 protocol does not provide users with any method to create remote folders and assign folders to messages. In order to solve this problem, the Internet Mail Access Protocol IMAP was created.
Insert image description here

POP3 and IMAP

Insert image description hereInsert image description here

2.3.8 Web-based email

Insert image description here
Between the user and the mail server, HTTP is used, for both the sender and the receiver. However, transfers between mail servers still use SMTP.

2.4 DNS (Domain Name System)

Origin:
There are two ways to identify a computer: by IP address and by host name.
People prefer host names because they are easy to remember, but a host name provides little information about the computer's location on the Internet.
Routers prefer fixed-length, hierarchical IP addresses.
To reconcile these two needs, a directory service is needed that can translate host names into IP addresses. This is the main task of the Domain Name System (DNS).

Features:

  • A hierarchical, domain-based naming scheme
  • A distributed database, implemented by many name servers, that translates names into IP addresses
  • An application-layer service that runs over UDP on port 53
  • A core Internet function, yet implemented as an application-layer protocol
    • Complexity is handled at the network edge (end systems, the application layer of hosts)

Function:

Insert image description here

2.4.1 Domain name

  • Definition:
    A domain name is the name of a computer or group of computers on the Internet. It consists of a string of names separated by dots and has a hierarchical structure. Resolving domain names is a core service of the Internet.

Domain names can be written in two character sets: ASCII and Unicode. The ASCII character set contains 128 characters, including digits, letters, and symbols. The Unicode character set covers the scripts and symbols of almost all countries and regions; domain names written in Unicode are converted to an ASCII form (Punycode) before they are used in DNS.

Each level of a domain name (each label) may be at most 63 characters long. This limit comes from the DNS specification and applies to the ASCII form of the label, so Unicode domain names are subject to it as well.

Note that the length of the whole domain name is also limited: in its textual form it may be at most 253 characters long.
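The length rules can be checked mechanically. The sketch below assumes the limits from the DNS specification (at most 63 characters per label and at most 253 characters for the whole name in text form) and only handles names already in ASCII form; the function name is made up for illustration.

```python
def check_domain_name(name):
    """Split an ASCII domain name into labels and enforce the length limits."""
    name = name.rstrip(".")                  # ignore a trailing root dot
    if len(name) > 253:
        raise ValueError("domain name longer than 253 characters")
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError(f"label {label!r} must be 1 to 63 characters long")
    return labels

print(check_domain_name("www.ustc.edu.cn"))
```

Reading the returned list from right to left walks the hierarchy from the top-level domain down to the host.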

A domain name identifies the electronic location, and sometimes the geographic location, of a computer during data transmission.

For example, commonly used portal websites such as Sohu and Sina use names written in Latin letters as their domain names.

  • Features
    • 1. Within a single level, many devices may share the same name, but each host on the Internet can be uniquely identified by combining its host name with the domain in which it resides
    • 2. DNS names hosts using a hierarchical tree structure

Insert image description here

  • Problems with a single centralized DNS server

Single point of failure: If there is only one DNS server and it fails, the impact is enormous.
Traffic volume: A single DNS server would have to handle all DNS queries, which is far too heavy a load.
Maintenance: A single DNS server would have to keep records for all Internet hosts, so the central database would be huge and would need updating every time a host is added.
Distant centralized database: A single DNS server cannot be "near" all users, so many queries would cross slow, congested links and suffer severe delays.

Insert image description here

Hierarchical classification of domain names

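In the figure below, "leaf" is only a generic term; it does not mean that every domain name belongs to a single host or device. The same goes for "root" (there are 13 DNS root servers in total).
From the root of the tree down to the leaves, each upper-level domain holds pointers to the servers of its subdomains.

(Diagram: domain name tree. The root divides into top-level domains 1, 2, 3, ...; each top-level domain divides into second-level domains; each second-level domain divides into third-level domains, and so on.)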

Insert image description here

Top-level domains are divided into two categories:

  • Generic
  • Countries

There are duplicates in the following pictures, just take a look.
Insert image description hereInsert image description here

The figure below includes arpa, the reverse-lookup domain, which is used to resolve IP addresses back into domain names.
Insert image description here
The following extensions are taken from Baidu AI:

arpa (Address and Routing Parameter Area) is a top-level domain reserved for Internet infrastructure. Its subdomains in-addr.arpa (IPv4) and ip6.arpa (IPv6) are used by reverse DNS, which resolves IP addresses into domain names.
On the Internet, an IP address uniquely identifies a computer or device, while a domain name is a string of characters that makes such addresses easier for people to remember and use. Reverse DNS goes the other way: it finds the domain name that corresponds to a given IP address.
A reverse-lookup name is built from the address itself. For IPv4, the four decimal numbers of the address are written in reverse order and followed by in-addr.arpa.
Reverse lookups are typically used in the following situations:

Reverse domain name resolution: Finding the domain name for an IP address, which is useful for network management and security monitoring.
DNS blacklists: DNS-based blacklists of malicious IP addresses use the same reversed-address naming convention, although the entries are published under the blacklist operator's own domain rather than under arpa.
Mail servers: Mail servers commonly perform a reverse lookup on the IP address of a connecting server as an anti-spam check.

Note that arpa names must follow the relevant DNS standards for reverse resolution.
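To make reverse-lookup names concrete, the sketch below builds the name that a reverse DNS query for an IPv4 address would ask about: the four numbers of the address in reverse order, followed by in-addr.arpa. It uses Python's standard ipaddress module, and a hand-written helper (a name chosen only for illustration) shows the same construction step by step. The address is from a range reserved for documentation.

```python
import ipaddress

def reverse_name(ipv4_text):
    """Reverse the four numbers of an IPv4 address and append in-addr.arpa."""
    return ".".join(reversed(ipv4_text.split("."))) + ".in-addr.arpa"

address = ipaddress.ip_address("192.0.2.53")
print(address.reverse_pointer)        # name built by the standard library
print(reverse_name("192.0.2.53"))     # same name built by hand
```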

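Structure of a domain name

  • Domain name of a device
    hostname.Nth-level-domain.(…).second-level-domain.top-level-domain
    Start from the leaf; each level passed is separated by a period

  • Domain name of a domain (identifying a domain itself)
    From a branch up to the top-level domain.
    For example: ustc.edu.cn (the domain of the University of Science and Technology of China)

Note: (in a minority of cases) a device can also hang directly under a top-level or second-level domain; it does not have to be named along every level of the hierarchy.

For example:

  • mit.edu
  • xxx.gov

Domain name management

.cn: a top-level domain for China
.jp: a top-level domain for Japan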

Insert image description here

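2.4.2 Name servers

Insert image description here

DNS: root name servers

The Internet has 13 root name servers (located in Europe, North America (most of them), and Japan). Different countries do not necessarily divide their domains in the same way.

Authoritative servers

  • Background
    To handle the maintenance of domain names (the mapping from domain names to IP addresses) and their resolution, the name space is divided into zones:
    Insert image description here
    Each circle in the figure below is a zone.
    Insert image description here

  • Definition
    Insert image description here
    Whether a server is the authoritative DNS server for a domain depends on whether it maintains the name-to-IP-address mappings for that zone
    Authoritative server: knows the mappings between domain names and IP addresses inside its own zone

TLD servers

Insert image description here

Local name server

Its address is configured manually or dynamically
Insert image description here

Name server

Insert image description here

2.4.3 How DNS works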

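1. DNS caching

To improve delay performance and reduce the number of DNS messages crossing the Internet, DNS makes extensive use of caching.

  • Principle
    In a query chain, when a DNS server receives a DNS reply (for example, one containing a mapping from a host name to an IP address), it can cache the mapping in its local memory

Because hosts, and the mappings between host names and IP addresses, are not permanent, a DNS server discards cached information after a period of time (often set to two days).

  • Effect
    A local DNS server can also cache the IP addresses of TLD servers, which lets it bypass the root DNS servers in a query chain. In fact, because of caching, the root servers are bypassed for all but a small fraction of DNS queries.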
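The caching behaviour described above (store a mapping, then discard it after a set time) can be sketched as a small class. This is an illustrative in-memory model only; the class name, the injectable clock, and the sample name and address are assumptions made for the example, not part of any DNS software.

```python
import time

class DnsCache:
    """Cache name -> IP mappings and drop each one when its lifetime runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can fake the passage of time
        self._entries = {}       # name -> (ip, expiry time)

    def put(self, name, ip, ttl_seconds):
        self._entries[name] = (ip, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None          # never cached: the caller must query
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: forget it so the caller queries again
            return None
        return ip

now = [0.0]                                   # fake clock, in seconds
cache = DnsCache(clock=lambda: now[0])
cache.put("www.example.com", "192.0.2.7", ttl_seconds=2 * 24 * 3600)  # two days
first = cache.get("www.example.com")          # still fresh
now[0] = 2 * 24 * 3600                        # two days later
second = cache.get("www.example.com")         # expired
print(first, second)
```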

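2. Resource records

The DNS servers that together implement the DNS distributed database store resource records (RRs), which provide the host-name-to-IP-address mappings.
A resource record is a four-tuple with the following fields:
(Name, Value, Type, TTL), as shown in the figure below:
Insert image description here

A resource record can be compared to a record in a database
  • TTL (time to live): the lifetime of a record; it determines when the resource record is deleted
    • 1. Infinite TTL: an authoritative value
    • 2. Finite TTL: a cached value

The examples below ignore the TTL field. The meaning of Name and Value depends on Type:

(Structure diagram)
Insert image description here
(From the original book)
Insert image description here

  • An example of resource records:
    Insert image description here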

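3. How DNS resolution proceeds:

Insert image description here

Insert image description here

Queries

If the local server has the answer cached, it returns the cached information (the host-to-IP mapping) directly. If not, it must query for the mapping.
There are two query methods:

  • Recursive query
  • Iterative query

Recursive query

Put simply, the host picks any root server. Since each level generally knows about the level below it, the query is passed down level by level from the root server until the result is finally found
Insert image description here

Iterative query

The upper level does not give the answer for the level below it; instead it points to the next server to ask, so the requester is passed from server to server, like "kicking a ball around".
Insert image description here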
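The difference between the two query methods can be shown with a toy hierarchy. Everything below is hypothetical: the three "servers" are plain dictionaries, the names and the address are made up, and no real DNS traffic is involved. In the iterative version the requester follows each referral itself; in the recursive version each server asks the next one on the requester's behalf.

```python
# Hypothetical servers: each one either knows the answer or refers to a lower level.
SERVERS = {
    "root":         {"referrals": {"edu": "tld-edu"}},
    "tld-edu":      {"referrals": {"example.edu": "auth-example"}},
    "auth-example": {"answers": {"www.example.edu": "192.0.2.80"}},
}

def ask(server, name):
    """One query to one server: returns ('answer', ip) or ('referral', next_server)."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return "answer", data["answers"][name]
    for suffix, next_server in data.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} cannot help with {name}")

def resolve_iteratively(name):
    """The requester contacts each server in turn, following the referrals itself."""
    server, contacted = "root", []
    while True:
        contacted.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, contacted
        server = value

def resolve_recursively(name, server="root"):
    """Each server queries the next level itself and hands the result back."""
    kind, value = ask(server, name)
    return value if kind == "answer" else resolve_recursively(name, value)

print(resolve_iteratively("www.example.edu"))
print(resolve_recursively("www.example.edu"))
```

Both functions reach the same answer; what differs is who does the work of contacting the next server.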

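4. DNS protocol and messages

DNS has two kinds of messages, query messages and reply messages, and both have the same format. The structure is shown in the figure below:

"identification" is the ID number
Insert image description here

(Reference from the original book)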
Insert image description hereInsert image description here
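As a concrete view of the message format, the sketch below assembles a minimal DNS query: a 12-byte header (identification, flags, and four counts) followed by one question (the name as length-prefixed labels, then type and class). It is a simplified illustration using Python's struct module; the function name is made up, it sets only the "recursion desired" flag, and it builds the bytes without sending them.

```python
import struct

def build_dns_query(hostname, query_id, qtype=1, qclass=1):
    """Build a DNS query message: 12-byte header plus one question."""
    flags = 0x0100                        # standard query with "recursion desired" set
    # Header fields: ID, flags, number of questions, answers, authority and additional RRs.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")   # each label is prefixed by its length
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"                           # a zero-length label ends the name
    question = qname + struct.pack("!HH", qtype, qclass)  # defaults: type A, class IN
    return header + question

message = build_dns_query("www.example.com", query_id=0x1234)
print(len(message), message[:12].hex())
```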

5. Maintenance: adding a new domain

Insert image description here

2.4.4 DNS security

Insert image description here

Overall, DNS is fairly robust

2.5 CDN

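1. Internet video

Video

  • Definition: a sequence of images displayed at a constant rate (for example, 24 or 30 images per second).
  • Characteristic: it can be compressed (so bit rate, see footnote 1, can be traded off against video quality)

An uncompressed, digitally encoded image consists of an array of pixels, where each pixel is encoded as a number of bits that represent its color and luminance.

  • Relationship between bit rate and audio/video compression:
    The higher the bit rate, the more data is sent per second. Put simply, a higher bit rate means better audio/video quality but a larger encoded file; a lower bit rate means the opposite
    • For example: low-quality video: 100 kbps; streaming high-definition movies: over 3 Mbps; over 10 Mbps envisioned for 4K streaming.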
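To see what these bit rates mean for file size, the data volume of a stream at a constant bit rate is simply bit rate × duration, divided by 8 to convert bits to bytes. The short calculation below applies this to the three example rates for one hour of video; the function name is chosen only for illustration, and the rates are the lower bounds quoted above.

```python
def stream_size_bytes(bit_rate_bps, duration_seconds):
    """Data volume of a stream encoded at a constant bit rate."""
    return bit_rate_bps * duration_seconds / 8    # 8 bits per byte

one_hour = 60 * 60
for label, rate in [("low quality", 100_000), ("HD movie", 3_000_000), ("4K", 10_000_000)]:
    gigabytes = stream_size_bytes(rate, one_hour) / 1e9
    print(f"{label}: {gigabytes:.3f} GB per hour")
```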

Insert image description here
Insert image description here

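Video streams

A video stream transmits video data as a stream. This lets the viewer watch the content in real time without waiting for the whole video file to download. Video streaming uses streaming protocols such as HTTP Live Streaming (HLS) and Real-Time Messaging Protocol (RTMP) so that video plays smoothly on many kinds of devices. It is widely used in online video playback, live broadcasting, video conferencing, and similar fields.

Streaming video

Insert image description here

  • Definition:
    Streaming video is network video that is played on the display while its data is still arriving at the computer over the Internet. Unlike downloadable Internet video, streaming starts playing as soon as compressed data is received, which removes the worry about viruses that may accompany a download. However, because the video plays while data is being received, it can be interrupted by a slow connection and may pause and restart automatically while it tries to "buffer" data. Also unlike downloadable video, streaming video is not "kept" on the computer; it can be watched only for as long as the hosting website chooses to keep publishing it.

Thus a video stream is a way of transmitting, while streaming video is a way of playing. A video stream can be played as streaming video, and streaming video must be transmitted using video stream technology

  • Performance measure: average end-to-end throughput

Average end-to-end throughput is the amount of data successfully delivered per unit time, that is, the achieved transfer rate. Throughput and bandwidth are different concepts: bandwidth is the capacity of a link, measured in bits per second (bps), and is a design value; throughput is the transfer rate actually measured, also in bps. Average end-to-end throughput is commonly used to measure network transmission performance and can be assessed alongside indicators such as network delay and transfer speed.

  • Examples: companies that use streaming video include Netflix, YouTube (Google), Amazon, and Youku.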

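2. HTTP streaming and DASH

2.1 HTTP streaming

  • Definition:
    HTTP streaming is a technique for transferring data in real time over HTTP. It lets the server send data to the client as a stream, without waiting for the whole response to be generated. This suits applications with real-time requirements, such as live video and audio streaming. HTTP streaming can be implemented in several ways, including long polling, Server-Sent Events, and WebSocket, which let the server deliver data to the client as it becomes available. HTTP streaming can provide a better user experience and more efficient network communication.

  • Characteristic:
    In HTTP streaming, the video is simply stored on an HTTP server as an ordinary file, and each file has its own URL

  • How HTTP streaming plays a video:

    1. The client sends a request

The client uses HTTP (over a TCP connection) to send the server a request to play the video. The request usually contains the video's URL or other necessary parameters.

    2. The server responds

After receiving the client's request, the server starts preparing the video data. The server uses a video format that supports streaming, such as MPEG-DASH (Dynamic Adaptive Streaming over HTTP) or HLS (HTTP Live Streaming).

    3. The client receives and plays

The video bytes the client receives are collected in an application buffer. Once the number of bytes in the buffer exceeds a predetermined threshold, the client application begins playback. Specifically, the streaming video application periodically grabs frames from the client application buffer, decompresses them, and displays them on the user's screen. The application thus plays video as it receives it, while buffering the frames for later parts of the video

Through these steps, HTTP streaming delivers and plays video in real time. It can adapt to network conditions and device performance to provide a better viewing experience. Transferring the video in segments also improves transmission efficiency while keeping playback continuous.

  • Drawback: although the available bandwidth differs between clients, and for the same client over time, all clients receive the same encoding of the video. (This led to the development of a new kind of HTTP-based streaming, DASH.)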

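2.2 DASH

DASH (Dynamic Adaptive Streaming over HTTP)
With DASH, each version of the video is stored on the HTTP server, and each version has a different URL

  • Features:
    • In DASH, the video is encoded into several versions, each with a different bit rate and a corresponding quality level
    • Client requests: the client dynamically requests chunks of video a few seconds long from the different versions, selecting one chunk at a time with an HTTP GET request message
    • Bandwidth and version
      • When available bandwidth is high: chunks from a high-rate version
      • When available bandwidth is low: chunks from a low-rate version
    • Adapting to bandwidth changes: if the end-to-end bandwidth changes during a session, DASH lets the client adapt to the available bandwidth. (For example, a mobile user moving relative to a base station experiences bandwidth fluctuations.)
  • Manifest file
    Insert image description here
  • Summary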

Insert image description here
Insert image description here
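The client-side choice DASH relies on (request a high-rate chunk when bandwidth is plentiful and a low-rate chunk when it is not) can be sketched as follows. This is one simple selection rule under stated assumptions, not the algorithm of any particular player: the manifest contents and URLs are made up, and the safety factor of 0.8 is an arbitrary illustrative margin.

```python
# Hypothetical manifest: bit rate in bps -> base URL of that version.
MANIFEST = {
    300_000:   "https://video.example.com/movie/300k/",
    1_000_000: "https://video.example.com/movie/1m/",
    3_000_000: "https://video.example.com/movie/3m/",
}

def measure_throughput(chunk_bytes, download_seconds):
    """Estimate throughput in bps from the last downloaded chunk."""
    return chunk_bytes * 8 / download_seconds

def choose_version(measured_throughput_bps, manifest=MANIFEST, safety_factor=0.8):
    """Pick the highest bit rate that fits within a margin of the measured throughput."""
    budget = measured_throughput_bps * safety_factor
    affordable = [rate for rate in manifest if rate <= budget]
    # If even the lowest version does not fit, fall back to it anyway.
    return max(affordable) if affordable else min(manifest)

throughput = measure_throughput(chunk_bytes=1_000_000, download_seconds=2.0)  # 4 Mbps
rate = choose_version(throughput)
print(rate, MANIFEST[rate])
```

The client would repeat this choice before each chunk, so the selected version follows the bandwidth as it changes during the session.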

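3. Content distribution networks (CDN)

For an Internet video company, perhaps the most straightforward way to provide streaming video service is to build a single massive data center, store all the videos there, and stream them to clients worldwide.
But this approach has three problems:
1. The likelihood of freezing delays grows with the number of communication links between the client and the data center
2. A popular video may be sent many times over the same communication links, wasting network bandwidth and increasing what the company pays its ISP
3. Single point of failure: if the data center crashes, no video can be delivered.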
Insert image description here
Insert image description here

3.1 CDN

Insert image description here

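  • Definition:
    A content distribution network (CDN, also called a content delivery network) is a distributed network system that caches content on servers in multiple geographic locations to improve the performance and reliability of websites and other Internet applications.

  • Categories:
    Private CDN: owned by the content provider itself. For example, Google's CDN distributes YouTube videos and other types of content.
    Third-party CDN: distributes content on behalf of multiple content providers.

  • How a CDN works: it distributes a site's content to locations close to users so that users can fetch it quickly. This is done by storing the content on nodes spread around the world. When a user requests content, the CDN chooses the most suitable node, based on the user's geographic location and network conditions, to deliver it

  • The main advantages of a CDN include:

    Better performance: A CDN stores content close to users, which reduces transmission delay and network congestion and speeds up access to the site.
    Higher reliability: A CDN serves content from multiple nodes, so if one node or its network fails, other nodes take over automatically and users can keep accessing the site.
    Better security: A CDN can provide security services such as protection against DDoS attacks and CC attacks, shielding the site from network attacks.
    Lower cost: Using a CDN reduces server load and bandwidth costs, making the site more cost-effective.

CDNs are widely used by Internet applications such as websites, video streaming, and games, where they improve user experience and site performance.

  • Server placement philosophies
    • Enter deep
      Insert image description here

    • Bring home
      Insert image description here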

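3.2 CDN operation

When a client's browser is instructed to retrieve a specific video (identified by a URL),
the CDN must intercept the request so that it can:
1. Determine a suitable CDN server cluster.
2. Redirect the client's request to a server in that cluster.
Most CDNs use DNS to intercept and redirect requests, as illustrated below:

Insert image description here

  • 1. The user visits the web page at the target URL
  • 2. The user's host sends a DNS query
  • 3. The local DNS server (LDNS) relays the DNS query to the authoritative server and receives a host name in the KingCDN domain
  • 4. The LDNS sends a second query, this time for the host name obtained in step 3, and receives an IP address
  • 5. The LDNS forwards the IP address of the CDN content server to the user's host
  • 6. Once the client receives the IP address, it creates a TCP connection and sends an HTTP GET request.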

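3.3 Cluster selection strategies

The cluster selection strategy is at the core of any CDN deployment: it is the mechanism that directs a client to a server cluster or data center within the CDN. Common strategies include:

Geographically closest: assign the client to the cluster that is geographically nearest. This strategy ignores the way delay and available bandwidth on Internet paths vary over time, and always assigns a given client to the same cluster.
Real-time measurement: periodically check delay and loss performance between clusters and clients. This measures the network paths between clusters and clients in real time and selects the best cluster based on the results.

There are other cluster selection strategies as well, such as strategies based on load balancing or on DNS resolution. In short, a CDN's cluster selection strategy aims to improve site performance and reliability by choosing the best cluster according to various factors, giving users a better experience.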
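The two strategies can give different answers for the same client, which the sketch below illustrates. All data is hypothetical: the cluster names, coordinates, and RTT values are invented for the example, and the RTTs stand in for the periodic measurements a real CDN would collect. Geographic distance is computed with the haversine formula.

```python
import math

# Hypothetical clusters: name -> (latitude, longitude, latest measured RTT in ms).
CLUSTERS = {
    "tokyo":     (35.7, 139.7, 180.0),
    "frankfurt": (50.1, 8.7, 45.0),
    "london":    (51.5, -0.1, 60.0),
}

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points on the Earth's surface."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def geographically_closest(client_lat, client_lon, clusters=CLUSTERS):
    """Strategy 1: pick the cluster with the smallest geographic distance."""
    return min(clusters, key=lambda name: great_circle_km(
        client_lat, client_lon, clusters[name][0], clusters[name][1]))

def lowest_rtt(clusters=CLUSTERS):
    """Strategy 2: pick the cluster with the lowest measured RTT."""
    return min(clusters, key=lambda name: clusters[name][2])

# A client near Paris: the nearest cluster is not the one with the lowest measured RTT.
print(geographically_closest(48.9, 2.4), lowest_rtt())
```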

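4. Case studies

4.1 Case study: Netflix video distribution

Insert image description here

  • Main components: the Amazon cloud and Netflix's own private CDN infrastructure
  • Key functions handled in the Amazon cloud:
    • 1. Content ingestion: before Netflix can distribute a movie to users, it must first obtain and process the movie. It receives the master version of the movie and uploads it to hosts in the Amazon cloud
    • 2. Content processing: for each movie, versions in different formats and bit rates are produced, allowing adaptive streaming over HTTP using DASH.
    • 3. Uploading versions to its CDN: once all versions of a movie have been produced, hosts in the Amazon cloud upload them to the CDN

Insert image description here

  • Note: Netflix built its own private CDN and now delivers all of its video from it. (Netflix still uses Akamai to distribute its web pages.) Netflix has therefore been able to simplify and tailor its CDN design.
  • Features:
    • 1. Combines adaptive streaming with CDN distribution
    • 2. Does not need DNS redirection to connect a particular user to a CDN server; instead, Netflix software (running in the Amazon cloud) tells the client directly which CDN server to use
    • 3. The Netflix CDN uses push caching rather than pull caching: content is pushed into the servers at scheduled times during off-peak hours, rather than being fetched dynamically when a cache miss occurs at peak times.

Push caching and pull caching are two ways of filling a cache. Both aim to reduce network delay and speed up data transfer.

With push caching, content is copied to the cache servers in advance, before any client has asked for it. When a client later requests the content, the cache server can serve it directly without fetching it from the origin.

With pull caching, a cache server fetches content only when a client requests something it does not yet hold. The server retrieves the content from the origin (or from another cluster), keeps a copy, and serves later requests for the same content from that copy. This approach is common for large objects such as videos and images.

Overall, both push caching and pull caching exist to improve network performance, reduce delay, and speed up data transfer.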
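A minimal sketch of the two ways a CDN server's cache can be filled, under these definitions: with pull caching the server fetches an object from the origin the first time a client asks for it, and with push caching the provider places the object on the server ahead of time. The class, the dictionary standing in for the origin, and the URLs are all hypothetical.

```python
class CdnServer:
    """Toy cache server supporting both pull (on a miss) and push (in advance)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> content
        self._store = {}                  # cached copies
        self.origin_fetches = 0           # how often the origin had to be contacted

    def push(self, url, content):
        """Push caching: content is placed here before any client requests it."""
        self._store[url] = content

    def get(self, url):
        """Pull caching: on a miss, fetch from the origin and keep a copy."""
        if url not in self._store:
            self._store[url] = self._fetch(url)
            self.origin_fetches += 1
        return self._store[url]

origin = {"/movie/1": b"frames-1", "/movie/2": b"frames-2"}   # stand-in for the origin server
server = CdnServer(fetch_from_origin=origin.__getitem__)

server.push("/movie/1", origin["/movie/1"])   # pushed during off-peak hours
server.get("/movie/1")                        # hit: no origin traffic at request time
server.get("/movie/2")                        # miss: pulled from the origin
server.get("/movie/2")                        # hit: served from the stored copy
print(server.origin_fetches)
```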

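4.2 Case study: YouTube

The service launched in April 2005 and was acquired by Google in November 2006. The Google/YouTube design and protocols are proprietary.

  • Features:
    • Google uses its own private CDN to distribute YouTube video, with server clusters installed in hundreds of ISP and IXP locations as well as in its own data centers.
    • Uses pull caching and DNS redirection
    • Selection strategy: most of the time, Google's cluster selection strategy directs the client to the cluster with the lowest RTT between client and cluster. Sometimes, however, to balance load across clusters, the client is directed via DNS to a more distant cluster
    • Does not use adaptive streaming (such as DASH); the user must select a version manually
    • To save the bandwidth and server resources that would be wasted by repositioning or early termination, YouTube uses HTTP byte-range requests to limit the flow of transmitted data once a target amount of video has been prefetched.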
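An HTTP byte-range request asks for only part of a file by adding a Range header such as "Range: bytes=0-1048575", where both byte positions are inclusive. The sketch below only builds such headers for successive chunks; the helper names and the chunk size are chosen for illustration, and how YouTube actually sizes its requests is not specified here.

```python
def range_header(first_byte, chunk_size):
    """Build the Range header for one chunk; byte positions are inclusive."""
    last_byte = first_byte + chunk_size - 1
    return {"Range": f"bytes={first_byte}-{last_byte}"}

def chunk_headers(total_bytes, chunk_size):
    """Yield one Range header per chunk until the whole file is covered."""
    for start in range(0, total_bytes, chunk_size):
        size = min(chunk_size, total_bytes - start)   # the last chunk may be shorter
        yield range_header(start, size)

headers = list(chunk_headers(total_bytes=2_500_000, chunk_size=1_048_576))
for header in headers:
    print(header)
```

A client that stops watching simply stops issuing further range requests, so the remaining chunks are never transferred.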

4.3 Case study: Kankan

Kankan (owned and operated by Xunlei) uses P2P delivery instead of client-server delivery.

  • P2P streaming video features:
    • When a peer wants to watch a video, it contacts a tracker to find other peers that have a copy of the video and requests chunks of the video from them.
    • Requests for chunks that will be played soon are given priority to ensure continuous playback

Insert image description here


  1. Bit rate is the number of bits transmitted per second, measured in bps (bits per second). It indicates how many bits are needed per second to represent the encoded (compressed) audio or video data. A bit is the smallest unit in binary and is either 0 or 1.


Origin blog.csdn.net/qq_74259765/article/details/131573331