In-depth thinking about RPC frameworks: interview notes, part four

7 Some understanding of the Netty mechanism

Recommended reading: Deep thinking about the Netty network programming framework

7.1 Port numbers supported by Netty:

Netty can bind to any legal port number, like most networking libraries. Valid ports range from 0 to 65535; binding to port 0 asks the OS to assign a free ephemeral port. It is generally recommended to use ports above 1024, because ports 0-1023 are reserved for well-known services.
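
A quick way to see the port-0 behavior, sketched with a plain java.net.ServerSocket for brevity (the same applies when a Netty bootstrap binds to port 0):

import java.io.IOException;
import java.net.ServerSocket;

public class PortDemo {
    public static void main(String[] args) throws IOException {
        // binding to port 0 asks the OS to pick a free ephemeral port
        try (ServerSocket s = new ServerSocket(0)) {
            System.out.println("OS assigned port: " + s.getLocalPort());
        }
    }
}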

7.2 Netty uses the NIO mechanism:

  • NIO and multiplexing Channels:
    Netty is built on Java NIO (non-blocking I/O), and its core is an event-driven callback mechanism. When we talk about "multiplexing Channels", we usually mean I/O multiplexing techniques, such as the select, poll, and epoll mechanisms on Unix systems. These allow a single thread to monitor I/O events (readable, writable, etc.) on many connections, improving concurrent processing capability.

  • No manual setup by the programmer:
    Netty provides a high-level abstraction, so developers don't need to interact directly with the low-level multiplexing API. I/O multiplexing is set up automatically when developers create and configure Channel, EventLoop, and other Netty components. Of course, Netty also provides enough flexibility for advanced users to adjust and optimize this behavior.
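
As a minimal sketch of this (assuming Netty 4.x on the classpath), the bootstrap below is all the setup a developer writes; the Selector-based multiplexing happens inside the EventLoops:

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class SketchServer {
    public static void main(String[] args) throws InterruptedException {
        // bossGroup accepts connections; workerGroup handles I/O on accepted channels.
        // Each EventLoop in these groups wraps one thread and one Selector.
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class) // NIO transport; no select/epoll code needed
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     // application handlers would be added to ch.pipeline() here
                 }
             });
            ChannelFuture f = b.bind(8080).sync();
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}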

7.3 Netty's default transport-layer protocol:

When you create a Netty server or client, the transport is determined by the Channel type you configure; in typical usage this is TCP. It is always good practice to explicitly specify the transport type you want to use. For example, in Netty you can choose NioServerSocketChannel or NioSocketChannel for NIO-based TCP transport, or other Channel types such as NioDatagramChannel for UDP transport.
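
For contrast with the TCP server sketched in 7.2, here is a minimal UDP sketch (again assuming Netty 4.x): swapping the Channel class is enough to change the transport:

import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.DatagramPacket;
import io.netty.channel.socket.nio.NioDatagramChannel;

public class SketchUdpServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup group = new NioEventLoopGroup();
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
             .channel(NioDatagramChannel.class) // UDP transport instead of a TCP channel type
             .handler(new SimpleChannelInboundHandler<DatagramPacket>() {
                 @Override
                 protected void channelRead0(ChannelHandlerContext ctx, DatagramPacket packet) {
                     System.out.println("received " + packet.content().readableBytes() + " bytes");
                 }
             });
            b.bind(9999).sync().channel().closeFuture().sync();
        } finally {
            group.shutdownGracefully();
        }
    }
}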

7.4 Is a Channel here equivalent to a connection with a particular client, and is an EventLoop equivalent to calling select to monitor these connections and then handing them to the corresponding handlers according to the thread model?

Yes, your understanding is mostly accurate. Netty's concepts have many similarities with the traditional network programming model, but Netty gives developers higher-level abstractions and more flexible configuration options. Here is an overview of, and supplement to, the key points you mentioned:

  1. Channel:

    In Netty, a Channel is the core abstraction of network communication, representing a connection or a data transmission channel. For TCP, one Channel corresponds to one connection between a client and a server.

  2. EventLoop:

    An EventLoop is responsible for handling network events and performing I/O operations. In practice, one EventLoop usually corresponds to one thread, and that thread internally uses a multiplexing technique such as select, poll, or epoll to listen for events on multiple Channels. This way, one thread (one EventLoop) can efficiently manage and handle many connections.

  3. Thread model and Handler:

    Netty has a concept called EventLoopGroup, which is a collection of EventLoops and usually represents a group of worker threads. When a new connection is accepted, it is assigned to one EventLoop in the group, and for the connection's entire lifetime all of its I/O operations are handled by that same EventLoop (that is, by a fixed thread). This guarantees that events related to a particular connection are always processed sequentially by the same thread.

    When an EventLoop detects an event on a Channel (for example, data is readable), it passes the event to the corresponding Handler. A Handler is the component Netty uses to handle network events such as reading, writing, connecting, and disconnecting.

  4. Pipeline:

    Netty also has the concept of a ChannelPipeline: a linked list of ChannelHandlers that processes or intercepts a Channel's inbound and outbound events. This gives developers a powerful tool for combining and reusing processing logic.
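
A small sketch of pipeline assembly (Netty 4.x; the echo handler is only illustrative):

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;

// Inbound events flow through the handlers in the order they were added;
// outbound events flow through outbound handlers in the reverse order.
public class EchoInitializer extends ChannelInitializer<SocketChannel> {
    @Override
    protected void initChannel(SocketChannel ch) {
        ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
            @Override
            public void channelRead(ChannelHandlerContext ctx, Object msg) {
                // echo the received bytes back; the write travels outbound through the pipeline
                ctx.writeAndFlush((ByteBuf) msg);
            }
        });
    }
}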

In short, Netty provides a set of powerful and flexible tools and abstractions, making network programming easier and more efficient.

8 Service registration/discovery/offline and load balancing (Alibaba interview)

8.1 When does an RPC framework interact with the registration center? Can you introduce it?

The RPC framework and the registration center mainly interact at the following moments (a lifecycle sketch follows the list):

  • Service registration: When the service provider starts, it will register the service information it provides to the registration center. Usually, this process occurs during the initialization phase of the service provider startup.

  • Service discovery: After the service consumer starts, it will obtain the required service information from the registration center. This process is called service discovery. Service discovery may occur when the service consumer is started, or it may be performed periodically during operation to obtain the latest service information.

  • Service offline: When a service provider shuts down or fails to provide services for some reason, it will remove itself from the registration center. This process usually occurs when the service provider shuts down.

  • Service change: when a provider's own operating status changes, it can report the change to the registration center.
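
To make the registration and offline moments concrete, here is a toy, self-contained sketch; the in-memory registry merely stands in for a real one (ZooKeeper, Nacos, etcd, ...), and all names are illustrative:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ProviderLifecycle {

    // toy stand-in for a remote registry
    static class InMemoryRegistry {
        private final Set<String> entries = ConcurrentHashMap.newKeySet();

        void register(String service, String address) {
            entries.add(service + "@" + address);
            System.out.println("registered " + service + " at " + address);
        }

        void unregister(String service, String address) {
            entries.remove(service + "@" + address);
            System.out.println("unregistered " + service + " at " + address);
        }
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        String service = "com.example.HelloService";
        String address = "10.0.0.5:8080";

        // service registration: done during provider startup
        registry.register(service, address);

        // service offline: deregister when the provider shuts down
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> registry.unregister(service, address)));
    }
}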

8.2 What information will be registered when the service is registered?

Correct answer:
Service registration is a very important part of using an RPC (Remote Procedure Call) framework. During service registration, the following information is typically registered:

  • Service identifier (distinct from the instance id): a globally unique identifier used to distinguish services. It usually includes the service name and the version number, so that consumers can select the matching version of the service.
  • Service address: a registered service must provide the network address where it can be found and accessed, usually an IP address and port number.
  • Service interface metadata: additional information such as a description of the service, its state (for example, health status), and method or interface details.
  • Service provider information: may include the provider's machine information, such as CPU and memory, so that service-discovery and load-balancing systems can make better decisions.

This information will be registered in the service registration center for consumers to query and use. Note that what information to register may vary with different RPC frameworks and usage scenarios.
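
As a sketch, the registered record might look like the following plain data class; the field names are illustrative and not tied to any particular framework:

import java.util.Map;

public class ServiceInstanceInfo {
    String serviceName;           // e.g. "com.example.HelloService"
    String version;               // lets consumers pick a compatible version
    String host;                  // provider IP address
    int port;                     // provider port
    Map<String, String> metadata; // health status, weight, CPU/memory info, method list, ...
}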

8.3 How does the RPC client do routing (that is, load balancing)? Does it support custom routing (a custom load-balancing strategy)?

How custom extension is actually implemented:

An RPC framework usually provides some built-in load-balancing strategies, such as round robin, random, and least connections. In some cases, however, these built-in policies cannot meet specific business needs, so many RPC frameworks also support user-defined load-balancing strategies.
Take some common RPC frameworks as examples:

  • In Dubbo, users can define their own load-balancing strategy by implementing the LoadBalance interface (as sketched below). On the service consumer side, a custom strategy can then be selected with @Reference(loadbalance = "myLoadBalance").
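
A sketch of such a strategy, assuming Dubbo 2.7+/3.x where AbstractLoadBalance is the usual extension point; the random selection and the SPI key myLoadBalance are illustrative:

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.dubbo.common.URL;
import org.apache.dubbo.rpc.Invocation;
import org.apache.dubbo.rpc.Invoker;
import org.apache.dubbo.rpc.cluster.loadbalance.AbstractLoadBalance;

public class MyLoadBalance extends AbstractLoadBalance {
    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        // a real strategy would also weigh health, latency, and server load here
        return invokers.get(ThreadLocalRandom.current().nextInt(invokers.size()));
    }
}

The class is then registered through Dubbo's SPI mechanism: a file named META-INF/dubbo/org.apache.dubbo.rpc.cluster.LoadBalance containing a line such as myLoadBalance=com.example.MyLoadBalance.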

When implementing a custom load balancing strategy, factors such as service health, network delay, and server load need to be considered. In addition, you also need to pay attention to thread safety and performance issues.
Note that the specific way to customize a load-balancing strategy varies between RPC frameworks; for a concrete implementation, consult the official documentation of the framework you use.

9 Generalized calls

9.1 What is a generic call?

Answer: In an ordinary call, the method signature used is explicit and known in advance. But there is another situation, where the server cannot expose such an explicit API signature to the client. In that case, the service and method you pass may be plain strings, and the parameters may be a big map; the request merely specifies which handler on the server it should hit, and the server performs the corresponding reflection and lookup. When the client gets the response data, it does not use known SDK classes to receive it, but a fixed generic type or plain Object.

9.1.1 GPT-4: What is a generalized call in an RPC framework?

Generalized calls allow clients to call methods on the server without knowing the service's interface definition at compile time. In short, they allow calling remote services without the service interface or entity classes on the classpath. This is especially useful for dynamic invocation scenarios such as API gateways or service testing tools.

With generalized calls, the service consumer typically passes the service name, method name, and method parameters, and the RPC framework converts this information into an actual service call.

9.1.2 Which RPC frameworks support this kind of call?

Most mature RPC frameworks support generalized calls. Common examples:

  • Dubbo: Alibaba's open-source RPC framework Dubbo supports generalized calls, allowing users to make remote method calls without knowing the service definition.
  • gRPC: Google's gRPC framework also supports similar functionality, although it may be implemented in a slightly different way than frameworks such as Dubbo.
  • Apache Thrift: another RPC framework that supports generalized calls.

9.1.3 With generalized calls, does the service provider still need to register all interface services with the registration center?

Yes, service providers still need to register their interface services with the registry. Generic calls primarily affect clients, as it allows clients to make calls without knowing the interface definition. The service provider still needs to register its service and interface information to the registration center so that consumers can discover and call it. The main purpose of the registry is service discovery. Whether it is a traditional call or a generalized call, consumers need to know the provider's address and other metadata.

9.1.4 For example, the following client calling code of mine is a non-generalized call:

    public static void main2(String[] args) {
        // create the client and wrap it in a dynamic proxy
        RpcClient client = new NettyClient(CommonSerializer.PROTOBUF_SERIALIZER);
        RpcClientProxy rpcClientProxy = new RpcClientProxy(client);

        // the interfaces and parameter types are known at compile time
        HelloService helloService = rpcClientProxy.getProxy(HelloService.class);
        HelloObject object = new HelloObject(12, "This is a message");
        String res = helloService.hello(object);
        System.out.println(res);

        ByeService byeService = rpcClientProxy.getProxy(ByeService.class);
        System.out.println(byeService.bye("Netty"));
    }

Answer:
Based on the code you provided, you are using a non-generalized call. Here's why:

  1. Well-defined interfaces: you have two well-defined interfaces, HelloService and ByeService, with corresponding methods (hello() and bye()). In a non-generalized call, both the service consumer and the service provider know these interface definitions.

  2. Explicit types: concrete types such as HelloObject are used. The structure of this parameter type is known to both the consumer and the provider.

  3. Proxy use: you use RpcClientProxy to create a proxy for each interface, making remote calls look like local calls. But you still explicitly call helloService.hello(object) and byeService.bye("Netty"), which means you know the method to call and the parameters to pass.

In generalized calls, invocation is more dynamic and flexible: you may only need the name of the method, the types of the parameters, and the values of the parameters, without knowing the interface or the explicit method definition in advance. In a non-generalized call, as in your example, all of these are explicit and pre-defined.

9.1.5 Can you give me a demo of a simple generalized call?

Sure. Generalized calling is a more flexible and dynamic calling method, mainly used in the following scenarios:

  1. When you don't know the specific service or method to call.
  2. When the service consumer and service provider do not have a shared interface definition.
  3. When it is necessary to build a more flexible system that can decide which service to call at runtime.

The key to generalized calls is that you do not initiate a call based on a fixed, known interface or class, but dynamically construct the service information to be called at runtime.

Here is a simple example of a generic call:

public class GenericInvokeDemo {

    public static void main(String[] args) throws Exception {
        RpcClient client = new NettyClient(CommonSerializer.PROTOBUF_SERIALIZER);
        RpcClientProxy rpcClientProxy = new RpcClientProxy(client);

        // generalized call: no pre-defined interface is needed
        String serviceName = "HelloService";
        String methodName = "hello";

        // for illustration only; in practice the parameter names and types
        // may have to be obtained some other way
        String[] parameterTypes = new String[] { "HelloObject" };
        Object[] arguments = new Object[] { new HelloObject(12, "This is a message") };

        Object result = invoke(rpcClientProxy, serviceName, methodName, parameterTypes, arguments);
        System.out.println(result);
    }

    public static Object invoke(RpcClientProxy proxy, String serviceName, String methodName, String[] parameterTypes, Object[] arguments) throws Exception {
        // build the RpcRequest; real code usually needs more logic
        // to handle parameter types and values
        RpcRequest request = new RpcRequest();
        request.setInterfaceName(serviceName);
        request.setMethodName(methodName);
        request.setParameters(arguments);
        // ... other properties

        // send the request through RpcClientProxy
        RpcResponse response = proxy.sendRequest(request);

        // check the response and return the result
        if (response.getStatus() == RpcResponse.SUCCESS) {
            return response.getData();
        } else {
            throw new Exception("RPC call failed: " + response.getMessage());
        }
    }
}

In this example, instead of using the HelloService interface or its hello method directly, we dynamically build serviceName and methodName to determine which service to call. This lets us make RPC calls without knowing the specific service in advance.

Please note that this is a simplified example; a real generalized call may involve more complex logic, such as handling different parameter types and different return types.

9.2 Why do we need generalized calls?

Answer: The main advantage of generalized calling is that it provides a highly flexible calling method, so the caller does not have to depend on the server's concrete interface definition. Generalized calls have the following main application scenarios and benefits:

  1. Dynamic calls: when the number of interfaces is huge or interfaces change frequently, generalized calls avoid frequent client SDK or code updates.

  2. Bridging different systems: for applications that work across platforms or languages, generalized calls can serve as an intermediate layer bridging different systems, without customizing an API for each one.

  3. Testing and tool development: generalized calls make it easy to build testing tools, letting developers or testers call services dynamically without knowing their implementations in advance.

  4. API gateways and management: for API gateways or service-management platforms, generalized calls allow forwarding requests to specific services while providing unified logging, monitoring, and security policies.

  5. Reduced code redundancy: generalized calls avoid generating and maintaining client code for each service or interface, reducing maintenance cost.

9.3 What are the disadvantages of generalized calls?

Answer: Although generalized calling provides a lot of flexibility, it also has some disadvantages:

  1. Performance overhead: because of the dynamic parsing and reflection involved, generalized calls usually cost more than ordinary typed method calls.

  2. Increased complexity: generalized calls introduce extra parsing and mapping logic, which can increase system complexity.

  3. Type-unsafety: because method signatures and parameters are specified dynamically, errors that a compiler would catch only surface at runtime.

  4. Development experience: without a concrete interface definition, services are harder to understand and use, and IDE features such as code completion and hints cannot help.

In general, generalized calling is a powerful and flexible tool, but its advantages and disadvantages need to be weighed according to specific application scenarios.

10 The Protocol Buffers protocol

Recommended reading: How does protobuf transmit complex data structures?

10.1 What is a .proto file, what is contained in it, and where is it generally stored?

  1. .proto file:
    A .proto file is a language-neutral description file for Protocol Buffers (protobuf). It defines the format of the messages you want to serialize or deserialize with protobuf. Data access classes for languages such as Java, C++, and Python can be generated from the .proto file using the protoc compiler.

  2. .proto file content:
    A .proto file typically includes the following:

    • Message definition: defines message types and their fields.
    message Person {
      string name = 1;
      int32 age = 2;
      string email = 3;
    }
    
    • Field data types: basic data types like int32, string, bool, bytes, etc., as well as custom types (other message types or enumerations).

    • Service definitions (optional): if you use protobuf for RPC communication with gRPC, you can define services and their methods in the .proto file.

    service MyService {
      rpc MyMethod(RequestType) returns (ResponseType);
    }
    
    • Enum definition:
    enum MyEnum {
      UNKNOWN = 0;
      TYPE_A = 1;
      TYPE_B = 2;
    }
    
    • Import other .proto files: definitions in one .proto file can be reused in another .proto file via the import statement.
    import "other_file.proto";
    
    • Packages and namespaces: to avoid naming conflicts, you can define a package for a .proto file.
    package mypackage;
    
    • Other options: such as defining a custom Java package, optimization mode, etc.
    option java_package = "com.mycompany.mycode";
    
  3. Where .proto files are stored:
    Where .proto files are stored depends on the project structure and the development team's choices. When .proto files are shared between multiple projects or languages, they may be stored in a central location. Larger organizations sometimes keep a dedicated git repository for all .proto files to ensure sharing and version control across teams.

    Within a single project, the .proto files are usually stored together with the source code in the version control system, for example under a specific directory such as src/main/proto/ (for Java Maven projects).

Overall, .proto files are a core part of the protobuf system, defining the structure of the messages to be serialized and deserialized.

10.2 The .proto file defines the message format, so both the client and the server will generate corresponding classes based on this file, right?

Yes, your understanding is correct.

After you define a .proto file, you can use the Protocol Buffers compiler protoc to generate the corresponding code. The generated code will contain the class definitions for the message types you defined in the .proto file.

For example, if you define a .proto file with the following contents:

syntax = "proto3";

message Person {
  string name = 1;
  int32 age = 2;
  string email = 3;
}

Using the protoc compiler, you can generate code for multiple programming languages such as Java, C++, Python, and more. For example, to generate Java code you can use the following command:

protoc --java_out=. your_file_name.proto

After executing this command, the compiler will generate a Java file for you, which contains the definition of the Person class. This class will include field getters, setters, and other methods such as serialization and deserialization.
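
A minimal round-trip sketch with the generated class (assuming option java_multiple_files = true so Person is a top-level class, and protobuf-java on the classpath):

import com.google.protobuf.InvalidProtocolBufferException;

public class PersonRoundTrip {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // build an immutable message through the generated builder
        Person p = Person.newBuilder()
                .setName("Alice")
                .setAge(30)
                .setEmail("alice@example.com")
                .build();

        byte[] wire = p.toByteArray();        // serialize to the compact binary format
        Person copy = Person.parseFrom(wire); // deserialize on the receiving side

        System.out.println(copy.getName());   // "Alice"
    }
}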

Therefore, both the server and the client can use this .proto file to generate the corresponding code and ensure the message format between them is consistent. This is the power of Protocol Buffers: you define the message format once and then generate code for multiple programming languages.

10.3 If Protocol Buffers is used, do serialized objects on the client and server still need to implement the Serializable interface? (From 10.1 and 10.2 we know they do not.)

When you use Protocol Buffers (protobuf) as a serialization framework, you don't need to make your objects implement Java's Serializable interface. In fact, protobuf has its own way of defining and serializing data.

In protobuf, you define .proto files, which describe your data structures. Then, the protobuf compiler (protoc) will generate data classes and serialization/deserialization code for the corresponding language. These generated codes do not depend on the Serializable interface in Java.

In summary, if you've chosen to use protobuf, you don't need (and shouldn't) implement the Serializable interface. The serialization and deserialization methods provided by protobuf are more efficient and flexible than Java's native Serializable.

10.4 How to understand the compatibility of Protocol Buffers

One of the design focuses of Protocol Buffers (Protobuf for short) is backward compatibility and forward compatibility. This allows services to be iterated and deployed independently without breaking communication with other services or clients. This compatibility is especially important for microservice architectures and distributed systems, where individual services may be upgraded and deployed independently.

Here are some key points about Protobuf compatibility:

  1. Backwards Compatibility:

    • If you add new fields, when the old version of the code reads the new version of the data, it ignores the new fields.
    • If you delete a field, you should no longer use that field's number. Newer versions of the code will treat the field as the default value (if provided) when reading the older version's data.
  2. Forwards Compatibility:

    • Older versions of the code can parse data produced by newer versions of the code (excluding new fields introduced by the newer code).
    • New fields are ignored in older versions of the code.
  3. Incompatible changes:

    • Changing the number of an existing field is a breaking change.
    • Changing the type of a field (e.g. from int32 to string) is also a breaking change.
  4. Reserved fields:

    • If you delete a field, declare that field as reserved in the .proto file to ensure its number or name will not be reused in the future.

To maximize compatibility, the general advice is to:

  • Do not change the number and type of fields that already exist.
  • Added fields should be optional, or have explicit default values.
  • When deleting fields, use the reserved keyword to prevent future conflicts.

This compatibility mechanism ensures that even if the versions of the server and client do not match exactly, communication between them will work, at least in most cases.

10.5 Can you give an example?

Sure. Let's understand the forward and backward compatibility of Protocol Buffers through concrete examples.

Suppose we have a service that manages user information. The initial version of the .proto file might look like this:

Version 1 .proto file:

message User {
    int32 id = 1;
    string name = 2;
}
When both the server and the client are on this version, they can communicate with each other smoothly.

Now, we decided to extend this message and add a new field.

Version 2 .proto file:

message User {
    int32 id = 1;
    string name = 2;
    string email = 3; // newly added field
}
Backward compatibility: when a new version of the service sends a User message containing the email field to an older-version client, the old client ignores the new field and processes the other fields normally. This is backward compatibility.

Later, we decide that a field is no longer needed, so we delete it.

Version 3 .proto file:

message User {
    int32 id = 1;
    string name = 2;
    reserved 3; // field number 3 is now reserved
}
Forward compatibility: when an older version of the service (one whose schema still contains the email field) sends a User message to a newer-version client, the new client's schema has field number 3 reserved (the name email no longer appears in it), so that field is ignored and the other fields are processed normally. This is forward compatibility.

These examples show how to maintain communication compatibility when the Protobuf message structure changes. This compatibility greatly facilitates the iteration and deployment of distributed systems.
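
To see the same idea in code, a hedged sketch: assume protoc has generated the version-1 and version-2 schemas into separate Java packages, imported here under the illustrative names UserV1 and UserV2 (not real generated names):

import com.google.protobuf.InvalidProtocolBufferException;

public class CompatDemo {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // a "new" peer serializes a User that includes the email field
        byte[] wire = UserV2.newBuilder()
                .setId(1)
                .setName("alice")
                .setEmail("a@example.com")
                .build()
                .toByteArray();

        // an "old" peer parses the same bytes: field number 3 is unknown to its
        // schema, so it is skipped, while id and name are read normally
        UserV1 old = UserV1.parseFrom(wire);
        System.out.println(old.getName()); // "alice"
    }
}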

10.6 What are forward compatibility and backward compatibility?

The answer depends on how we define "forward" and "backward", and from which direction we look at the compatibility.

  1. Backwards Compatibility:

    • When we say something is "backward compatible", we usually mean that newer versions of the system can handle data or requests from older versions. In other words, we are looking "backwards", considering the compatibility between the new version and the previous version.
    • For example, a new version of software can read files created by an older version of it. Or in this case, a newer version of the service can handle requests from older version clients.
    • Actual scenario: The server uses a new version (such as adding a new field), but the client still uses the old version. When the client sends data that does not contain new fields to the server, the server can process the data normally.
  2. Forwards Compatibility:

    • When we say something is "forward compatible", we mean that older versions of the system can handle newer versions of data or requests. We're looking "ahead" and thinking about how to maintain compatibility between older versions and future versions.
    • For example, an older version of software can read files created by a newer version of it. Or in this case, an older version of the client can process the response of the newer version of the server.
    • Actual scenario: the client uses a new version (such as adding a new field), but the server still uses the old version. When the server sends data that does not contain new fields to the client, the client can process the data normally and ignore the new fields.

This nomenclature can be confusing, but it is mostly about the direction in which we look: backward compatibility looks back at earlier versions (newer code handling older data), while forward compatibility looks ahead at future versions (older code handling newer data).

10.7 So the old-version/new-version comparison is between the server and the client, centered on the server?

Yes, your understanding is correct. In many distributed systems and client-server models, especially when considering compatibility, the server is often treated as the center. There are several reasons for this:

  1. Centralized control : The server usually has more control. For example, you can decide when to update server software, but you cannot decide when all clients are updated.

  2. Diverse clients : Clients may come in a wide variety of versions, platforms, and configurations. Considering this diversity, it is particularly important to maintain the stability and compatibility of the server.

  3. Stability and continuity : The server usually needs to be online 24/7 to provide services for various versions of clients. Therefore, it is particularly critical to consider the compatibility of the server.

  4. Release process : In some cases, updates on the server side may be relatively centralized and controlled, while updates on the client side (especially in mobile environments) may be affected by factors such as user choice and device compatibility.

So when we talk about compatibility, we usually do look at it from the server's perspective. For example, backward compatibility ensures that a newer server can interact with older clients, while forward compatibility ensures that an older server can interact with newer clients. This server-centric perspective makes version management and protocol updates simpler and more controllable.


Origin blog.csdn.net/yxg520s/article/details/132286806