[java] basic knowledge 3

Distributed CAP principle

The CAP principle, also known as the CAP theorem, states that in a distributed system, Consistency, Availability, and Partition tolerance cannot all be satisfied at the same time.

The CAP principle is the theoretical cornerstone of NoSQL databases.

Consistency (C): whether all data replicas in the distributed system hold the same value at the same time (equivalent to every node seeing the same, latest copy of the data).
Availability (A): whether the cluster as a whole can still respond to clients' read and write requests after some nodes fail (high availability for data updates).
Partition tolerance (P): in practical terms, a partition is equivalent to a time limit on communication. If the system cannot reach data consistency within that limit, a partition has occurred, and a choice between C and A must be made for the current operation.

Eureka guarantees AP; Zookeeper guarantees CP.

Comparison between Eureka and Zookeeper

As a distributed service registry, how is Eureka better than Zookeeper?

The well-known CAP theorem points out that a distributed system cannot satisfy C (consistency), A (availability), and P (partition tolerance) at the same time. Since partition tolerance P must be guaranteed in a distributed system, the trade-off can only be made between A and C.

  • What Zookeeper guarantees is CP -> it satisfies consistency and partition tolerance, so its performance is usually not particularly high
  • What Eureka guarantees is AP -> it satisfies availability and partition tolerance, so its consistency guarantees are usually weaker

Zookeeper guarantees CP

When querying the service list from the registry, we can tolerate the registry returning registration information from a few minutes ago, but we cannot tolerate the registry going down and becoming unavailable. In other words, the service registration function has higher requirements for availability than for consistency. However, Zookeeper can run into exactly this situation: when the master node loses contact with the other nodes because of a network failure, the remaining nodes re-elect a leader. The problem is that the election takes too long, 30 to 120 seconds, and the entire Zookeeper cluster is unavailable during the election, which paralyzes the registration service for that period. In a cloud deployment environment, it is quite likely that the Zookeeper cluster loses its master node due to network problems. Although the service can eventually recover, the long election time makes registration unavailable for a long time, which is intolerable.

Eureka guarantees AP

Eureka understands this, so it prioritizes availability in its design. All Eureka nodes are equal; the failure of a few nodes does not affect the work of the normal ones, and the remaining nodes can still provide registration and query services. When a Eureka client registers with a Eureka server and finds that the connection fails, it automatically switches to another node. As long as one Eureka server is still alive, the registration service remains available, although the information it returns may not be the latest. In addition, Eureka has a self-protection mechanism: if the proportion of clients with normal heartbeats drops below 85% within 15 minutes, Eureka assumes there is a network failure between the clients and the registry, and the following happens:

  • Eureka no longer removes services from the registration list just because they have not sent heartbeats for a long time (i.e. services that would normally expire)
  • Eureka can still accept registration and query requests for new services, but they will not be synchronized to other nodes (i.e. the current node is guaranteed to remain available)
  • When the network is stable again, the new registration information on the current node is synchronized to the other nodes

Therefore, Eureka can deal well with situations where some nodes lose contact due to network failures, instead of paralyzing the entire registration service the way Zookeeper does.

Having said that, let me briefly talk about REST and RPC, two terms we hear often.

What is REST?

REST is an architectural style: a set of architectural constraints and principles. An application or design that satisfies these constraints and principles is RESTful. The REST specification treats everything as a resource; everything on the Internet is a resource. REST does not create new technologies, components, or services; it just uses the existing features and capabilities of the Web. It can be implemented entirely over HTTP, and data communication is handled using the HTTP protocol. The operations the REST architecture performs on resources, namely acquiring, creating, modifying, and deleting, correspond exactly to the GET, POST, PUT, and DELETE methods provided by HTTP.

Correspondence between HTTP verbs and REST-style CRUD operations:

Method     CRUD operation
POST       create, update, delete
GET        read
PUT        update, create
DELETE     delete
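
To make the mapping concrete, here is a minimal sketch of a Spring MVC REST controller that exposes CRUD operations on an in-memory user resource; the /users path and the User class are invented for this example:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import org.springframework.web.bind.annotation.*;

// Minimal in-memory resource used only for this sketch.
class User {
    public Long id;
    public String name;
}

@RestController
@RequestMapping("/users")
public class UserController {

    private final Map<Long, User> store = new ConcurrentHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    @GetMapping                      // GET    -> read (all users)
    public List<User> list() {
        return new ArrayList<>(store.values());
    }

    @GetMapping("/{id}")             // GET    -> read (one user)
    public User get(@PathVariable Long id) {
        return store.get(id);
    }

    @PostMapping                     // POST   -> create
    public User create(@RequestBody User user) {
        user.id = ids.incrementAndGet();
        store.put(user.id, user);
        return user;
    }

    @PutMapping("/{id}")             // PUT    -> update (or create at a known id)
    public User update(@PathVariable Long id, @RequestBody User user) {
        user.id = id;
        store.put(id, user);
        return user;
    }

    @DeleteMapping("/{id}")          // DELETE -> delete
    public void delete(@PathVariable Long id) {
        store.remove(id);
    }
}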

What is RPC?

RPC, Remote Procedure Call, means calling a remote method as if it were a local method. The RPC architecture is as follows:

An RPC framework contains four core components: the Client, the Server, the Client Stub, and the Server Stub. A stub can be understood as a local placeholder that acts on behalf of the remote party.

  • Client: the caller of the service.
  • Server: the real service provider.
  • Client stub: stores the server's address information, packs the client's request parameters into a network message, and sends it to the server over the network.
  • Server stub: receives the message sent by the client, unpacks it, and calls the local method.

An RPC framework must solve three basic problems; a toy sketch of these three points follows the list:

 1) How does the server determine which function the client wants to call?

    In a remote call, the client and the server maintain a mapping table of [ID -> function], where each ID is unique across all processes. When the client makes a remote procedure call, it attaches this ID; the server looks up the table to determine which function the client wants to call and then executes that function's code.

 2) How to serialize and deserialize?

    When the client and the server interact, parameters and results are converted into byte streams for transmission over the network. Serialization is needed to turn data into a byte stream, and deserialization to turn a byte stream back into a readable, fixed format; the speed of serialization and deserialization also affects the efficiency of remote calls.

 3) How to perform the network transmission?

    Most RPC frameworks choose TCP as the transport protocol, while some choose HTTP; gRPC, for example, uses HTTP/2. Each protocol has pros and cons: TCP is more efficient, while HTTP is more flexible in practice.
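
To make these three points concrete, here is a deliberately simplified, self-contained sketch of the server side, with serialization and transport stubbed out in comments; a real framework such as Dubbo or gRPC handles registration, serialization, and networking far more robustly:

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A toy request: the ID identifies the remote function, args carry the parameters.
// In a real framework this object would be serialized to bytes and sent over TCP/HTTP.
class RpcRequest {
    final String methodId;   // point 1: the agreed [ID -> function] key
    final Object[] args;
    RpcRequest(String methodId, Object[] args) {
        this.methodId = methodId;
        this.args = args;
    }
}

public class ToyRpcServer {

    // Point 1: the server-side [ID -> function] mapping table.
    private final Map<String, Function<Object[], Object>> handlers = new HashMap<>();

    public void register(String methodId, Function<Object[], Object> handler) {
        handlers.put(methodId, handler);
    }

    // Points 2 and 3 are stubbed out: assume the bytes arrived over TCP and were
    // already deserialized into an RpcRequest by the server stub.
    public Object dispatch(RpcRequest request) {
        Function<Object[], Object> handler = handlers.get(request.methodId);
        if (handler == null) {
            throw new IllegalArgumentException("Unknown method id: " + request.methodId);
        }
        return handler.apply(request.args);
    }

    public static void main(String[] args) {
        ToyRpcServer server = new ToyRpcServer();
        server.register("UserService.getName", params -> "user-" + params[0]);

        // Simulates a request that the client stub would have sent over the network.
        Object result = server.dispatch(new RpcRequest("UserService.getName", new Object[]{42}));
        System.out.println(result);   // prints: user-42
    }
}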

REST and RPC comparison

Comparison         REST       RPC
Common protocol    HTTP       TCP/UDP
Performance        lower      higher
Flexibility        higher     lower

REST and RPC application scenarios

Both REST and RPC are commonly used in microservice architectures.

  1) HTTP is more standardized and more general; every language supports the HTTP protocol. If you open APIs to the outside world, for example on an open platform, the external callers use all kinds of programming languages, and you cannot refuse to support any of them. For open-source middleware today, RESTful is basically among the first protocols supported.

 2) As a basic component of a microservice architecture, an RPC framework can greatly reduce the cost of the architecture and improve the development efficiency of both callers and service providers by shielding the complex details of calling functions (services) across processes. It lets the caller feel as if it were calling a local function when calling a remote one, and lets the service provider feel as if it were implementing a local function when implementing the service.

Popular RPC and REST frameworks

RPC: Dubbo + Zookeeper (as the registry)

REST: Eureka is a sub-module of Netflix and one of its core modules.
Eureka is a REST-based service used for service discovery and failover of cloud middle-tier services. Service registration and discovery are very important in a microservice architecture: with them, you only need the service identifier to access a service, without modifying the caller's configuration file.
Its function is similar to Dubbo's registry, for example Zookeeper.
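
As a rough sketch of how this looks with Spring Cloud Netflix, a Eureka server is enabled with a single annotation, and a client registers itself through its configuration file; the port, service name, and registry address below are invented for illustration:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

// Registry side: exposes Eureka's REST endpoints for registration and discovery.
@SpringBootApplication
@EnableEurekaServer
public class RegistryApplication {
    public static void main(String[] args) {
        SpringApplication.run(RegistryApplication.class, args);
    }
}

/*
 * Client side: a Spring Boot service with spring-cloud-starter-netflix-eureka-client
 * on the classpath registers itself via application.yml, for example:
 *
 *   spring:
 *     application:
 *       name: order-service                              # the service identifier callers use
 *   eureka:
 *     client:
 *       service-url:
 *         defaultZone: http://localhost:8761/eureka/     # assumed registry address
 *     instance:
 *       lease-renewal-interval-in-seconds: 30            # heartbeat every 30 seconds (default)
 *       lease-expiration-duration-in-seconds: 90         # evicted after 90 seconds without heartbeats
 */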

Speaking of Eureka, we must also mention Nacos, which is widely used now.

Both Nacos and Eureka are registries, and both have their own load-balancing strategies. Eureka is divided into Eureka Server and Eureka Client. All Eureka Servers synchronize data with each other through replication: no matter which Eureka Server an Eureka Client registers with, eventually every Eureka Server stores the registration information, and that information is cached locally on each Eureka Server.
When an Eureka Client registers information with an Eureka Server, we call it a service provider; when it fetches registration information, we call it a service consumer. Many Eureka Clients are therefore both service providers and service consumers.
After a service provider starts, it sends a heartbeat to the Eureka Server every 30 seconds to prove that it is alive. If the Eureka Server receives no heartbeat from a provider for more than 90 seconds, it considers the provider down and removes the instance. Nacos has its own configuration center, whereas Eureka needs Spring Cloud Config to implement one and provides no management interface. Nacos also refreshes configuration dynamically, using Netty to maintain long-lived connections and push updates in real time, while Eureka needs a message queue to achieve dynamic configuration refresh. Alibaba's Nacos performs well and supports both AP and CP modes: it decides whether to use AP or CP depending on whether a service instance is registered as temporary (ephemeral) or permanent.
With multiple registry nodes, this avoids inconsistent node information.

Spring Cloud

Gateway

Features:

  • Built on Spring Framework 5, Project Reactor, and Spring Boot 2.0
  • Able to match routes on any request attribute
  • Predicates and filters are specific to routes
  • Circuit breaker integration
  • Spring Cloud DiscoveryClient integration
  • Easy-to-write predicates and filters
  • Request rate limiting
  • Path rewriting

Role: routing and filtering
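
For a feel of what routing and filtering look like, here is a minimal sketch using the Java RouteLocatorBuilder DSL; the route id, path, and target service name are invented, and the same route could equally be declared in application.yml under spring.cloud.gateway.routes:

import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GatewayRoutes {

    @Bean
    public RouteLocator customRoutes(RouteLocatorBuilder builder) {
        return builder.routes()
                // Predicate: match anything under /api/orders/** ...
                .route("order-route", r -> r.path("/api/orders/**")
                        // Filter: strip the /api prefix before forwarding
                        .filters(f -> f.stripPrefix(1))
                        // ... and forward to order-service instances found via the registry (load balanced)
                        .uri("lb://order-service"))
                .build();
    }
}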

Service registry (registration center)

Commonly used: Eureka, Zookeeper, and now Nacos.

Eureka mainly uses AP, Zookeeper uses CP, and Nacos supports both AP and CP.

Config (configuration)

Config provides server and client support for externalized configuration in distributed systems. With the Config Server you can manage the external properties of applications across all environments. The concepts on both client and server map to the Spring Environment and PropertySource abstractions, so they fit very well with Spring applications, but they can be used with any application running in any language. As an application moves through the deployment pipeline from development to test and production, you can manage the configuration between those environments and be sure the application has everything it needs to run when it migrates. The default implementation of the server's storage backend uses Git, so it easily supports tagged versions of configuration environments, as well as access to a wide range of tools for managing the content. It is easy to add alternative implementations and plug them in with Spring configuration.

@EnableConfigServer //Enable configuration

Note here that bootstrap.yml is a system-level configuration file, while application.yml is a user-level configuration file.
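
A minimal sketch of the two sides; the Git repository URL, port, service name, and profile are invented for illustration:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.config.server.EnableConfigServer;

// Server side: serves configuration pulled from a Git backend.
@SpringBootApplication
@EnableConfigServer
public class ConfigServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(ConfigServerApplication.class, args);
    }
}

/*
 * Server application.yml (Git backend assumed):
 *
 *   spring:
 *     cloud:
 *       config:
 *         server:
 *           git:
 *             uri: https://github.com/example/config-repo   # hypothetical repository
 *
 * Client bootstrap.yml (system-level, read before application.yml):
 *
 *   spring:
 *     cloud:
 *       config:
 *         uri: http://localhost:8888   # where the Config Server runs
 *         name: order-service          # which configuration to fetch
 *         profile: dev                 # environment, e.g. /dev /test /prod
 *         label: main                  # Git branch
 */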

Role:

  • Centralized management of configuration files
  • Different configurations for different environments, dynamic configuration updates, and deployment by environment, such as /dev /test /prod /beta /release
  • Adjust the configuration dynamically at runtime; there is no longer any need to write configuration files on every machine where a service is deployed, because each service pulls its own configuration from the configuration center
  • When the configuration changes, the service can detect the change and apply the new configuration without restarting
  • Expose configuration information through a REST interface

Hystrix (circuit breaker)

Applications in a complex distributed architecture have dozens of dependencies, and each of them will inevitably fail at some point!

When multiple microservices call each other, suppose microservice A calls microservices B and C, and B and C in turn call other microservices; this is the so-called "fan-out". If the response time of some microservice on the fan-out link is too long, or that service is unavailable, the calls to microservice A occupy more and more system resources, eventually causing the system to crash: the so-called "avalanche effect".

For a high-traffic application, a single failing backend dependency can saturate all resources on all servers within tens of seconds. Worse than the failure itself, these applications may also see increased latency between services, queues backing up, and threads and other system resources being exhausted, leading to even more cascading failures across the system. All of this means that failures and latency need to be isolated and managed, so that the failure of a single dependency cannot bring down the entire application or system.

In short, we need to sacrifice the less important parts to protect the core, as the Chinese chess idiom "abandon the chariot to save the general" puts it!

Hystrix is an open-source library for dealing with latency and fault tolerance in distributed systems. In a distributed system, many dependency calls inevitably fail: timeouts, exceptions, and so on. Hystrix guarantees that when a dependency has problems, it will not bring down the entire service, avoiding cascading failures and improving the resilience of the distributed system.

The " circuit breaker " itself is a switching device. When a service unit fails, through the fault monitoring of the circuit breaker (similar to a blown fuse), an expected and processable alternative response (FallBack) is returned to the caller. , instead of waiting for a long time or throwing an exception that cannot be handled by the calling method, so as to ensure that the thread of the service caller will not be occupied for a long time and unnecessary, thus avoiding the spread of faults in the distributed system. Even an avalanche.

What can Hystrix do?

  • Service degradation
  • Service circuit breaking
  • Service rate limiting
  • Near real-time monitoring

Service degradation, circuit breaking, and monitoring

The circuit-breaking mechanism is a link-protection mechanism for microservices that counters the avalanche effect.

When a microservice on the fan-out link becomes unavailable or its response time is too long, the service is degraded, calls to that node's microservice are broken, and an error response is returned quickly. Once it detects that the node's microservice responds normally again, the call link is restored. In the Spring Cloud framework, circuit breaking is implemented with Hystrix. Hystrix monitors the status of calls between microservices, and when failed calls reach a certain threshold (by default, at least 20 requests in a 10-second rolling window with more than 50% of them failing), the circuit breaker is triggered. The annotation for circuit breaking is @HystrixCommand.
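
A minimal sketch of the annotation in use, assuming the hystrix-javanica dependency and @EnableCircuitBreaker (or @EnableHystrix) on the application class; the downstream service name and endpoint are invented:

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class OrderService {

    // A plain RestTemplate bean is assumed to exist in the application context.
    private final RestTemplate restTemplate;

    public OrderService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    // If the remote call fails or times out often enough to trip the breaker,
    // Hystrix routes calls to the fallback method instead.
    @HystrixCommand(fallbackMethod = "getOrderFallback")
    public String getOrder(Long id) {
        // Hypothetical downstream endpoint, resolved through the registry.
        return restTemplate.getForObject("http://ORDER-PROVIDER/order/" + id, String.class);
    }

    // The fallback must have a signature compatible with the original method.
    public String getOrderFallback(Long id) {
        return "order " + id + " is temporarily unavailable, please try again later";
    }
}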

Service circuit breaking solves the following problems:

  • When a dependency is unstable, it can fail fast;
  • After failing fast, it can probe according to a certain algorithm whether the dependency has recovered.

Therefore, to prevent an exception or error in one microservice's backend from breaking the entire application or web page, a circuit breaker is necessary.

Service degradation

What is service degradation?

Service degradation means that when the pressure on the servers rises sharply, some services and pages are strategically not processed, or processed in a simplified way, according to the actual business situation and traffic, in order to free server resources and guarantee that the core business keeps running normally and efficiently. To put it bluntly, it hands system resources to the high-priority services as much as possible.

Resources are limited, but requests are unlimited. If services are not degraded during peak concurrency, the performance of the overall service will definitely suffer, and in severe cases some important services may go down and become unavailable. Therefore, during peak periods, some services are usually degraded to guarantee the availability of the core features. For example, during the Double 11 event, all services unrelated to transactions are degraded, such as viewing Ant Forest or viewing historical orders.

What scenarios is service degradation mainly used for? When the overall load of the microservice architecture exceeds a preset threshold, or the expected incoming traffic is about to exceed that threshold, some unimportant or non-urgent services or tasks are delayed or suspended in order to ensure that the important or basic services can run normally.

How to degrade can be decided by the business. You can delay a service, for example delay adding points to a user by putting the operation in a cache and executing it once the service stabilizes; or you can switch off a service within some scope, for example turn off recommendations of related articles.

Considerations for service degradation

  • 1) Which services are core services and which are non-core services
  • 2) Which services can be degraded, which cannot, and what the degradation strategy is
  • 3) Besides service degradation, whether there are more complex business release scenarios and what their strategies are

Types of automatic degradation

1) Timeout degradation: mainly configure the timeout period, the number of timeout retries and their mechanism, and use an asynchronous mechanism to detect recovery

2) Failure-count degradation: mainly for unstable APIs; when the number of failed calls reaches a certain threshold, the service is automatically degraded, again using an asynchronous mechanism to detect recovery

3) Fault degradation: for example, when the remote service to be called is down (network failure, DNS failure, the HTTP service returns an error status code, the RPC service throws an exception), the call can be degraded directly. Options after degrading include: default values (for example, if the inventory service is down, report items as in stock by default), fallback data (for example, if the advertising service is down, return static pages prepared in advance), or cached data (data cached earlier)

4) Rate-limit degradation: during flash sales or purchases of limited products, the sudden traffic may crash the system, so rate limiting is applied; once the rate-limit threshold is reached, subsequent requests are degraded. Options after degrading include: a queueing page (directing the user to a queue and asking them to retry in a while), an out-of-stock notice (telling the user the item is sold out), or an error page (the event is too popular, please retry later).

The difference between service circuit breaking and service degradation

  • Service circuit breaking -> server side: a service timeout or exception triggers the break, similar to a fuse blowing (it breaks itself)
  • Service degradation -> client side: considering the overall request load on the site, when a service is broken or shut down, it is no longer called. On the client side we can prepare a FallBackFactory that returns a default value (see the sketch after this list). The overall service quality drops, but the system remains usable, which is better than going down completely.
  • The triggers are different: circuit breaking is generally caused by a fault in a specific (downstream) service, while degradation is decided from the overall load. The management level also differs: circuit breaking is essentially a framework-level concern that every microservice needs (no hierarchy), while degradation generally follows a business hierarchy (for example, degradation usually starts from the most peripheral services).
  • The implementation also differs: service degradation is code-invasive (done in the controller, or triggered automatically), while circuit breaking is generally self-triggered (the service breaks itself).
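
As mentioned in the second point above, client-side degradation can be prepared with OpenFeign's FallbackFactory. A minimal sketch follows, assuming feign.hystrix.enabled=true and @EnableFeignClients on the application class; the service name and endpoint are invented (the Feign client itself is introduced in the load-balancing section below):

import feign.hystrix.FallbackFactory;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// Declarative client for a hypothetical payment service; on failure the factory's fallback is used.
@FeignClient(value = "PAYMENT-PROVIDER", fallbackFactory = PaymentClientFallbackFactory.class)
interface PaymentClient {
    @GetMapping("/payment/{id}")
    String getPayment(@PathVariable("id") Long id);
}

// The factory receives the cause of the failure and returns a degraded implementation.
@Component
class PaymentClientFallbackFactory implements FallbackFactory<PaymentClient> {
    @Override
    public PaymentClient create(Throwable cause) {
        return id -> "degraded response: payment " + id + " is currently unavailable";
    }
}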

Circuit breaking, degradation, and rate limiting:

Rate limiting: limit the number of concurrent requests; requests exceeding the threshold are rejected;

Degradation: prioritize services, sacrificing non-core services (making them unavailable) to ensure the stability of core services; decided from the overall load;

Circuit breaking: a failure in a dependent downstream service triggers the break to avoid crashing the system; the system breaks and recovers automatically

Dashboard Flow Monitoring

Load balancing (Ribbon and Feign)

Ribbon

  • Spring Cloud Ribbon is a client-side load-balancing toolkit based on Netflix Ribbon.
  • Simply put, Ribbon is an open-source project released by Netflix. Its main function is to provide client-side software load-balancing algorithms and to connect Netflix's middle-tier services together. Ribbon's client component provides a complete set of configuration options, such as connection timeouts and retries. In essence, you list all of the machines behind the load balancer (LB) in a configuration file, and Ribbon automatically connects to them according to certain rules (simple round-robin, random connection, and so on). We can also easily implement a custom load-balancing algorithm with Ribbon!

What can Ribbon do?

  • LB, or Load Balancer, is a component frequently used in microservices and distributed clusters.
  • Simply put, load balancing distributes user requests evenly across multiple services, achieving high availability (HA) of the system.
  • Common load-balancing software includes Nginx, LVS, and so on.
  • Both Dubbo and Spring Cloud provide load balancing for us, and Spring Cloud's load-balancing algorithm can be customized (see the sketch after this list).
  • A simple classification of load balancing:

    • Centralized LB

      • An independent LB facility sits between the service provider and the consumer, such as Nginx (a reverse proxy server), which forwards access requests to a service provider according to some policy!
    • In-process LB

      • The LB logic is integrated into the consumer: the consumer learns which addresses are available from the service registry and then picks a suitable server from them.
      • Ribbon is an in-process LB; it is just a class library integrated into the consumer's process, through which the consumer obtains the service provider's address!
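
A minimal sketch of in-process load balancing with Ribbon, calling a service by its registered name through a load-balanced RestTemplate; the service name and path are invented:

import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RibbonConfig {

    // @LoadBalanced lets the RestTemplate resolve service names via the registry
    // and spread requests across the instances (round-robin by default).
    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

// Usage in a caller: the host part of the URL is the service name, not a real address.
//   String order = restTemplate.getForObject("http://ORDER-PROVIDER/order/1", String.class);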

Feign

Feign is a declarative web service client that makes calls between microservices easier, similar to how a controller calls a service. Spring Cloud integrates Ribbon and Eureka, so Feign can be used to provide a load-balanced HTTP client.

Just create an interface and add annotations~

Feign is community-driven, and interface-oriented programming is the norm for many developers. There are two ways to call a microservice:

  1. By microservice name [Ribbon]
  2. By interface and annotations [Feign]

What can Feign do?

  • Feign aims to make writing Java HTTP clients easier
  • When using Ribbon + RestTemplate earlier, RestTemplate wraps HTTP requests into a set of templated calling methods. In real development, however, a service dependency is usually called in more than one place, and the same interface is often called from multiple locations, so we typically wrap a client class around each microservice to encapsulate these dependent calls. Feign takes this one step further: it helps us define and implement the dependent service interfaces. With Feign, we only need to create an interface and configure it with annotations (similar to how a Mapper annotation used to be put on a DAO interface, a Feign annotation is now put on a microservice interface) to bind the interface to the service provider, which greatly simplifies the work of wrapping a service-calling client ourselves when using Spring Cloud Ribbon. A sketch of such an interface follows this list.
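
A minimal sketch of such a declarative interface, assuming @EnableFeignClients on the application class; the service name and endpoint are invented:

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// Binds this interface to whatever ORDER-PROVIDER instances the registry knows about;
// Ribbon load-balances the calls across them.
@FeignClient("ORDER-PROVIDER")
public interface OrderClient {

    // Mirrors the provider's controller mapping; calling this method issues the HTTP request.
    @GetMapping("/order/{id}")
    String getOrder(@PathVariable("id") Long id);
}

// Usage in another bean:
//   @Autowired private OrderClient orderClient;
//   String order = orderClient.getOrder(1L);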

Feign integrates Ribbon by default

  • Ribbon is used to maintain the service list of MicroServiceCloud-Dept and implements client-side load balancing through polling. Unlike using Ribbon directly, with Feign you only need to define the service-binding interface, and the service call is implemented in a declarative, elegant, and simple way.

How to choose between Feign and Ribbon?

It comes down to personal preference: if you like the REST (RestTemplate) style, use Ribbon; if you prefer the community's interface-oriented style, use Feign.


Origin blog.csdn.net/weixin_46601559/article/details/126640834