Network programming: an introduction to RPC and HTTP, their history, mainstream application scenarios, a comparison, and why RPC is still needed

1. Basic introduction to HTTP and RPC

  • HTTP (HyperText Transfer Protocol):
    • A standard protocol for exchanging information on the web. It defines how clients (such as browsers) and servers communicate. For example, when you type a URL into a browser to open a web page, the browser fetches that page over HTTP.
    • HTTP was first proposed in 1989. HTTP/1.0 was published in 1996, HTTP/1.1 followed in 1997 and is still widely used today, and HTTP/2 was standardized in 2015.
  • RPC (Remote Procedure Call):
    • The ability to call a function located in another address space (usually on another computer on the network) in a way that is transparent to the programmer. A plain remote call is not automatically an RPC: RPC emphasizes that it is a procedure call, so the caller should not have to care about the details of the call and should be able to invoke a remote service as if it were a local one. An RPC implementation must therefore encapsulate the calling process.
    • RPC is not a specific protocol but a concept, mechanism, or idea, and by itself it has no concrete implementation. The concrete implementations are RPC frameworks such as Dubbo and gRPC. A framework typically includes an interface specification, a serialization and deserialization specification, a communication protocol, and so on.
    • The concept of RPC can be traced back to the 1970s, but it became popular in the mid-to-late 1980s with the rise of distributed computing. Sun's ONC RPC, which dates from the mid-1980s, and DCE RPC are among the earliest widely used RPC implementations.
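The transparency described above can be illustrated with a minimal sketch. It assumes a hypothetical service with two procedures (`add`, `greet`), JSON as the wire format, and an in-memory function standing in for the network; none of these names come from a real framework.

```python
import json

# Server side: ordinary local functions exposed by name (hypothetical service).
def add(a, b):
    return a + b

def greet(name):
    return f"hello, {name}"

PROCEDURES = {"add": add, "greet": greet}

def server_dispatch(request_bytes: bytes) -> bytes:
    """Decode a request, run the named procedure, and encode the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    func = PROCEDURES[request["method"]]
    result = func(*request["params"])
    return json.dumps({"result": result}).encode("utf-8")

class ClientStub:
    """Turns attribute access into remote calls, so callers see plain methods."""

    def __init__(self, transport):
        # transport: any callable that takes request bytes and returns response bytes.
        self._transport = transport

    def __getattr__(self, method_name):
        def remote_call(*params):
            request = json.dumps({"method": method_name, "params": list(params)})
            response = self._transport(request.encode("utf-8"))
            return json.loads(response.decode("utf-8"))["result"]
        return remote_call

# A real framework would send the bytes over a socket; here the "network" is a direct call.
calculator = ClientStub(transport=server_dispatch)
print(calculator.add(2, 3))       # looks like a local call
print(calculator.greet("rpc"))
```

The caller only sees `calculator.add(2, 3)`. Encoding, transport, and decoding are hidden inside the stub, which is the encapsulation that distinguishes RPC from an ad hoc remote call.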

Interview questions often ask why RPC is needed when HTTP already exists. As the introduction above shows, however, RPC was proposed and used earlier than HTTP, so the first question to consider is the reverse: why is HTTP needed when RPC already exists?

2. Why do you need HTTP when you have RPC?

2.1 Historical Tracing - B/S and C/S Architecture

First of all, we need to understand that B/S and C/S refer to two common software architecture models:

  1. C/S: Client/Server, that is, client/server architecture.
    • In this architecture, the client and the server are two independent programs that communicate through a certain network protocol.
    • The client needs to install dedicated software to access the services provided by the server, and the server exposes functions for the client to call. Typical C/S applications include QQ, Thunder, game clients, and other services that require installing client software.
    • The concept of C/S architecture can be traced back to the 1960s with the rise of distributed computing, but it really became popular in the mid-to-late 1980s.
  2. B/S: Browser/Server, that is, browser/server architecture.
    • In this architecture, the client only needs a web browser to access the services provided by the web server. The web server serves pages built from HTML, JavaScript, and CSS, and the browser is responsible for rendering them.
    • A typical B/S architecture application is a web application. Users only need a browser to access the service without installing any client software.
    • The B/S architecture appeared at the same time as the World Wide Web and the HTTP protocol, around 1989.

As the history above shows, when the C/S architecture first became popular, software such as chat clients and office applications only needed to talk to its own company's servers, so each vendor could use its own customized RPC protocol for remote calls. The World Wide Web and the B/S architecture changed this: a browser has to access websites run by many different companies, which is impossible if every site speaks its own private RPC protocol, so a unified standard was needed for communicating with all of these servers. This is where HTTP comes in. It gives the B/S architecture a common standard that lets any website's server interact with any browser.

2.2 RPC and HTTP mainstream application scenarios

From the above analysis, we can see that many years ago RPC and HTTP each had its own mainstream application scenario:

  • RPC was mostly used in the C/S architecture, for communication between the client and the server. Because a C/S system usually lives within a single organization, it can use its own customized RPC protocol between client and server.
  • HTTP was mainly used in the B/S architecture, for communication between the browser and the server. Because a browser needs a unified standard to access the servers of many different websites, HTTP became the standard protocol of the B/S architecture. HTTP was invented for the web, not for communication between distributed systems. For a long time it was used only between browsers and back-end web systems, and the documents it carried were in the relatively cumbersome HTML format, so few people used HTTP as a protocol for distributed system communication.

However, the situation has changed. To meet user needs, many applications now support several kinds of terminals at once, and the B/S and C/S architectures are gradually merging: more and more applications support web, mobile, and PC at the same time.

  • With the development of front-end technology, AJAX and JSON gradually became mainstream. HTTP calls moved away from HTML and started carrying JSON, a relatively simple document format. Later, with the rise of the REST style, more and more systems exposed their services over HTTP. To simplify their architecture, many applications now use HTTP as a single communication protocol for all terminals, so the server can support every client by implementing its HTTP interface once.
  • RPC protocols are mainly used within an organization, for example for communication between microservices inside a company.
  • The general trend is therefore that HTTP is popular as a common standard protocol, while customized RPC protocols are mainly used inside internal clusters.

So HTTP can be used for both B/S and C/S. Why not use HTTP everywhere, then? This leads to the next classic interview question: why is RPC needed when HTTP already exists?

3. Why do you need RPC when you have HTTP?

3.1 Comparative analysis of HTTP and RPC

HTTP here usually refers to HTTP/1.1. Comparisons of HTTP/1.1 and RPC generally cover the following aspects:

| Feature | HTTP/1.1 | RPC |
| --- | --- | --- |
| Purpose | External, heterogeneous environments: browser interface calls, app interface calls, third-party interface calls, etc. | Service calls within a company; low performance overhead, high transmission efficiency, service governance |
| Communication model | Request-response model, independent connections | Call-return model, long-lived connections |
| Transport protocol | Transmitted over HTTP/1.1; messages are large | Custom protocol over TCP; can also be based on HTTP |
| Serialization | Text or binary; mostly JSON, though Protobuf binary encoding can also be used | Text or binary |
| Communication efficiency | Can reuse connections through a connection pool | Multiple calls on the same connection; higher communication efficiency |
| Data format | Common data formats such as JSON and XML | Interfaces can be defined with a custom IDL |
| Interface definition and invocation | URLs identify resources; requests and responses act on them | Interfaces and methods are defined in an interface description language |
| Security and authentication | TLS/SSL available | Authentication, encryption, and authorization |

As shown above, HTTP/1.1 and RPC differ in transport protocol and serialization. These two differences are worth a closer look:

  • Transport protocol: HTTP/1.1 is a text protocol, and its message headers carry a large amount of metadata, which raises the overhead of every exchange. An RPC framework can instead use a custom transport protocol whose header is small and contains only the necessary metadata, reducing communication overhead.
  • Serialization: HTTP/1.1 applications mostly use text formats such as JSON and XML. A binary encoding such as Protobuf can be used for the body, but it is rarely chosen because binary data is hard to read, and the time spent serializing text such as JSON when opening a web page is negligible anyway. RPC, by contrast, is usually used for inter-service communication (between game servers, for example), where performance and efficiency matter more, so binary encodings are more common.
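The overhead difference can be made concrete with a rough sketch. It encodes one hypothetical call, `get_user(user_id=42)`, first as an HTTP/1.1 request with a JSON body and then as a small binary frame. The host name, header set, method ID table, and frame layout (a 2-byte magic number, a 2-byte method ID, a 4-byte body length) are all invented for illustration and do not match any particular framework.

```python
import json
import struct

# Hypothetical call to encode two ways.
payload = {"method": "get_user", "user_id": 42}

# 1) HTTP/1.1 style: text headers followed by a JSON body.
json_body = json.dumps(payload).encode("utf-8")
header_lines = [
    b"POST /api/get_user HTTP/1.1",
    b"Host: api.example.com",
    b"Content-Type: application/json",
    b"Accept: application/json",
    b"User-Agent: demo-client/1.0",
    b"Content-Length: " + str(len(json_body)).encode("ascii"),
]
http_request = b"\r\n".join(header_lines) + b"\r\n\r\n" + json_body

# 2) Custom binary frame: fixed 8-byte header plus a 4-byte unsigned integer body.
METHOD_IDS = {"get_user": 1}  # assumed to be agreed on by both sides in advance
body = struct.pack("!I", payload["user_id"])
frame = struct.pack("!HHI", 0xCAFE, METHOD_IDS["get_user"], len(body)) + body

# The receiver can recover the call from the frame alone.
magic, method_id, body_length = struct.unpack("!HHI", frame[:8])
(user_id,) = struct.unpack("!I", frame[8:8 + body_length])

print("HTTP/1.1 + JSON bytes:", len(http_request))
print("binary frame bytes:   ", len(frame))
```

The binary frame is 12 bytes, while the text request is many times larger, mostly because of headers. The trade-off is readability: the HTTP request can be inspected by eye, whereas the frame is only meaningful to peers that share the layout.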

It should also be noted that HTTP has kept evolving, and HTTP/2 and HTTP/3 have since appeared. Both improve substantially on HTTP/1.1 in transport and performance. Both use binary framing, which transfers data more efficiently, and both support multiplexing, so multiple requests and responses can be in flight on one connection at the same time, which reduces latency and wasted resources. Both also compress headers (HPACK in HTTP/2, QPACK in HTTP/3), which shrinks message headers and improves transmission efficiency. HTTP/3 runs over QUIC, which is built on UDP; this allows faster connection establishment and avoids the head-of-line blocking that a lost TCP packet causes for every stream in HTTP/2.
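Multiplexing can be pictured with a simplified sketch. Each message is cut into frames tagged with a stream ID, frames from several streams are interleaved on one connection, and the receiver reassembles each stream. This is only a model of the idea: real HTTP/2 frames also carry a type, flags, and a length, and are subject to flow control. The function names and the 8-byte frame size are assumptions.

```python
from collections import defaultdict
from itertools import zip_longest

def split_into_frames(stream_id, message, frame_size):
    """Cut one message into (stream_id, chunk) frames."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]

def interleave(*frame_lists):
    """Take frames from each stream in turn and put them on one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire

def demultiplex(wire):
    """Reassemble each stream's message from the interleaved frames."""
    streams = defaultdict(bytes)
    for stream_id, chunk in wire:
        streams[stream_id] += chunk
    return dict(streams)

# Three responses of different sizes share one connection.
responses = {
    1: b"<html>large page body</html>",
    3: b'{"ok": true}',
    5: b"tiny",
}
wire = interleave(*(split_into_frames(sid, msg, 8) for sid, msg in responses.items()))
print([stream_id for stream_id, _ in wire])
print(demultiplex(wire) == responses)
```

Because frames from the short responses are interleaved with the long one, the small messages complete without waiting for the large page to finish, which a single HTTP/1.1 connection cannot do without pipelining.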

In summary, HTTP/1.1 falls short of RPC in transport and performance, but HTTP/2 and HTTP/3 have greatly improved on it, so that in some scenarios HTTP can now carry data as efficiently as a custom RPC transport. For example, the popular gRPC framework uses HTTP/2 as its underlying transport.

Of course, HTTP/2 only appeared in 2015, so choosing RPC before 2015 could be justified on efficiency grounds. Now that HTTP/2 offers comparable efficiency, why should we continue to use RPC?

3.2 Why RPC is still needed

The comparison above shows that HTTP/2 has improved encoding efficiency, and RPC libraries such as gRPC themselves run on HTTP/2. So why is RPC still needed for remote calls?

As noted earlier, RPC is not a specific protocol but a concept, mechanism, or idea; the concrete implementations are RPC frameworks, which include an interface specification, a serialization and deserialization specification, a communication protocol, and so on. RPC in the narrow sense now generally refers to frameworks that use an IDL (Interface Description Language) to describe interfaces and then generate stubs, such as gRPC, Thrift, and Dubbo. Among them, gRPC and Dubbo 3.0 use HTTP/2 for transport, so they already represent a fusion of RPC and HTTP. This design lets developers benefit from the broad support for HTTP while also getting better performance and functionality.
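The "describe the interface, then generate stubs" workflow can be sketched as follows. Real frameworks read `.proto` or `.thrift` files and emit source code; this sketch instead assumes a made-up dictionary as the interface description and builds the stub class at runtime. `USER_SERVICE_IDL`, `generate_stub`, and `fake_transport` are hypothetical names.

```python
# A hypothetical interface description standing in for an IDL file.
USER_SERVICE_IDL = {
    "service": "UserService",
    "methods": {
        "get_user": {"params": ["user_id"]},
        "rename_user": {"params": ["user_id", "new_name"]},
    },
}

def generate_stub(idl, transport):
    """Build a client class whose methods mirror the interface description."""

    def make_method(method_name, param_names):
        def method(self, *args):
            # The generated stub enforces the declared signature before sending.
            if len(args) != len(param_names):
                raise TypeError(f"{method_name} expects {param_names}")
            arguments = dict(zip(param_names, args))
            return transport(idl["service"], method_name, arguments)
        method.__name__ = method_name
        return method

    namespace = {
        name: make_method(name, spec["params"])
        for name, spec in idl["methods"].items()
    }
    return type(idl["service"] + "Stub", (object,), namespace)

def fake_transport(service, method, arguments):
    # Stand-in for serialization plus a network round trip: it just echoes the call.
    return {"service": service, "method": method, "arguments": arguments}

UserServiceStub = generate_stub(USER_SERVICE_IDL, fake_transport)
client = UserServiceStub()
print(client.get_user(42))
```

Since both the client stub and the server skeleton are derived from the same description, the two sides cannot silently disagree about method names or parameters, which is harder to guarantee with hand-written HTTP calls.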

A mature RPC library usually also provides higher-level service governance features, such as service discovery, load balancing, and circuit breaking with degradation.

  1. Service discovery: The RPC library can provide a mechanism for service registration and discovery, making communication between services more flexible and reliable. Services automatically register their own address and status, and clients dynamically discover the available service instances. When the set of instances changes, the client adapts automatically and calls an instance that is available.
  2. Load balancing: The RPC library can support load-balancing algorithms that dynamically assign requests to different service instances. The balancer decides which instance receives a request based on actual conditions, such as instance load and network latency, so that requests are distributed evenly and the system's performance and scalability improve.
  3. Circuit breaking and degradation: The RPC library can implement a circuit-breaking mechanism for situations where a service is unavailable or responds too slowly. When a service fails or its performance degrades, the circuit breaker can automatically switch to a standby service or return a default value to keep the system stable and available. Circuit breaking also prevents faults from propagating and bringing down the entire system.
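The three features above can be combined in one small sketch. It assumes an in-memory registry, round-robin balancing, and a very simple breaker that opens after a fixed number of consecutive failures and then returns a fallback value. Production breakers also have a timed half-open state, which is omitted here. All class names, addresses, and the `send(address, request)` transport are hypothetical.

```python
import itertools

class Registry:
    """Service discovery: instances register under a service name."""

    def __init__(self):
        self._instances = {}

    def register(self, service, address):
        self._instances.setdefault(service, []).append(address)

    def lookup(self, service):
        return list(self._instances.get(service, []))

class CircuitBreaker:
    """Opens after `threshold` consecutive failures, then serves the fallback."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.threshold

    def call(self, func, fallback):
        if self.is_open:
            return fallback  # degrade without touching the failing instance
        try:
            result = func()
        except ConnectionError:
            self.failures += 1
            return fallback
        self.failures = 0
        return result

class BalancedClient:
    def __init__(self, registry, service, send, threshold=2):
        self._registry = registry
        self._service = service
        self._send = send  # hypothetical transport: send(address, request)
        self._threshold = threshold
        self._turn = itertools.count()
        self._breakers = {}

    def call(self, request, fallback=None):
        instances = self._registry.lookup(self._service)  # discover on every call
        address = instances[next(self._turn) % len(instances)]  # round robin
        breaker = self._breakers.setdefault(address, CircuitBreaker(self._threshold))
        return breaker.call(lambda: self._send(address, request), fallback)

def fake_send(address, request):
    # Simulate one healthy instance and one that is down.
    if address == "10.0.0.2:9000":
        raise ConnectionError("instance down")
    return f"{address} handled {request}"

registry = Registry()
registry.register("orders", "10.0.0.1:9000")
registry.register("orders", "10.0.0.2:9000")
client = BalancedClient(registry, "orders", fake_send)
results = [client.call("ping", fallback="default") for _ in range(4)]
print(results)
```

Requests alternate between the two registered instances. Calls routed to the failing instance return the fallback, and after two consecutive failures its breaker opens, so later calls to it are answered with the fallback immediately instead of waiting on a dead connection.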

In general, an RPC framework is a higher-level encapsulation on top of a transport such as HTTP/2 or a custom TCP protocol. It adds service governance features such as service discovery, load balancing, and circuit breaking with degradation. These features make an RPC framework more convenient and reliable for building distributed microservice systems, and they can provide better performance, scalability, and fault tolerance.

Origin blog.csdn.net/qq_45808700/article/details/131664118