A record of distributed-systems knowledge points

First, SOA and Microservices Architecture
1. SOA (Service Oriented Architecture): SOA is a design methodology in which a system is composed of multiple services; the services depend on one another and together provide a complete set of functions. A service usually runs as an independent operating-system process, and the services call one another over the network.

2. Microservices architecture: it is in fact similar to SOA, and can be seen as a refinement of SOA. What the microservices architecture emphasizes is "splitting the business completely into services": the original monolithic business system is broken up into several small applications that can be developed, designed, and run independently. These small applications interact and integrate with one another through services.

Microservices architecture = 80% of the SOA service-architecture idea + 100% of the component-based architecture idea + 80% of the domain-modeling idea.

API Gateway: an API gateway is a server and the single entry point into the system. From an object-oriented design perspective it is similar to the Facade pattern. The API gateway encapsulates the internal system architecture and provides a tailored API for each client. It can also take on other responsibilities, such as authentication, monitoring, load balancing, caching, request shaping and management, and static response handling. The core idea of the API gateway approach is that all clients and consumers access the microservices through a single unified gateway, and all non-business concerns are handled at the gateway layer. Typically the gateway provides REST/HTTP access to the APIs, and the server side registers and manages its services through the API gateway.
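As a rough illustration of this idea (a minimal sketch, not from the original article; the route table, service URLs, and the auth check are made-up placeholders), a gateway boils down to one entry point that resolves a request to a backend service and applies cross-cutting concerns in one place:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch of the API-gateway idea: one entry point, a routing table,
 *  and a single place to apply cross-cutting (non-business) concerns. */
public class SimpleApiGateway {

    // Hypothetical route table: path prefix -> backend microservice base URL.
    private final Map<String, String> routes = new LinkedHashMap<>();

    public SimpleApiGateway() {
        routes.put("/orders", "http://order-service:8080");
        routes.put("/users",  "http://user-service:8080");
    }

    /** Resolve an incoming path to a backend URL, or null if no route matches. */
    public String route(String path, String authToken) {
        if (!isAuthenticated(authToken)) {            // auth handled at the gateway layer
            throw new SecurityException("unauthenticated");
        }
        for (Map.Entry<String, String> e : routes.entrySet()) {
            if (path.startsWith(e.getKey())) {
                return e.getValue() + path;           // forward to the matching service
            }
        }
        return null;
    }

    private boolean isAuthenticated(String token) {
        return token != null && !token.isEmpty();     // placeholder check
    }

    public static void main(String[] args) {
        SimpleApiGateway gw = new SimpleApiGateway();
        System.out.println(gw.route("/orders/42", "demo-token"));
    }
}
```

A real gateway (Zuul, Spring Cloud Gateway, Kong, and so on) adds routing rules, filters, rate limiting, and service discovery on top of this basic shape.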

Characteristics of the SOA architecture: system integration; making the system service-oriented (the system's existing business functions are turned into general-purpose business services so that business logic can be reused quickly — the core problem solved is [reuse]); making the business service-oriented (driven by the business, business units are packaged as services — the core problem solved is [efficiency]).

Characteristics of the microservices architecture: componentization through services, services and development teams divided along business-capability boundaries, decentralization, and infrastructure automation.


Second, Distributed Transactions
A distributed transaction means that the transaction participants, the transaction-supporting servers, the resource servers, and the transaction manager are located on different nodes of a distributed system.
One large operation is composed of several small operations that are spread across different servers and belong to different applications; a distributed transaction must guarantee that these small operations either all succeed or all fail. In essence, distributed transactions exist to keep data consistent across different databases.

The CAP theorem covers consistency, availability, and partition tolerance. A CA architecture is theoretically impossible for a distributed system, so one has to choose either a CP or an AP architecture.
BASE is the abbreviation of three phrases: Basically Available, Soft state, and Eventually consistent; it is an extension of the AP choice in CAP.
BASE does not solve the network-latency problem in CAP theory; through soft state and eventual consistency it guarantees consistency only after a delay.

Some common schemes for distributed transactions are described below: 2PC (XA), TCC, MQ transactional messages, and Saga transactions.

The XA protocol is divided into two phases:
Phase 1: the transaction manager asks every database involved in the transaction to pre-commit (precommit) and to report whether it is able to commit.
Phase 2: the transaction coordinator asks every database either to commit its data or to roll it back.
Advantages: it tries to guarantee strong data consistency, and the implementation cost is relatively low.
Disadvantages: Single point of failure: the transaction manager plays a critical role throughout the whole process; if it goes down — for example, the first phase has completed and the transaction manager crashes just as the second phase is about to commit — the resource managers stay blocked and the databases become unusable. Synchronous blocking: after the prepare step, the resources held by the resource managers remain blocked until the commit completes and the resources are released. Data inconsistency: although the two-phase commit protocol is designed for strong consistency of distributed data, data inconsistencies can still occur (for example, when only some participants receive the commit request).

The XA protocol is relatively simple and cheap to implement, but its single-point problem and its inability to support high concurrency (because of the synchronous blocking) remain its biggest weaknesses.
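To make the two phases concrete, here is a minimal coordinator sketch (an illustration only, not a real XA implementation; the Participant interface and the class names are invented for this example):

```java
import java.util.List;

/** Simplified illustration of a 2PC coordinator (not a real XA implementation). */
public class TwoPhaseCommitCoordinator {

    /** Hypothetical participant: one resource manager (e.g. a database). */
    public interface Participant {
        boolean prepare();   // phase 1: pre-commit and vote yes/no
        void commit();       // phase 2: make the prepared changes durable
        void rollback();     // phase 2: undo the prepared changes
    }

    public boolean execute(List<Participant> participants) {
        // Phase 1: ask every participant to prepare; any "no" aborts the transaction.
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
                break;
            }
        }
        // Phase 2: commit everywhere, or roll back everywhere.
        if (allPrepared) {
            for (Participant p : participants) p.commit();
        } else {
            for (Participant p : participants) p.rollback();
        }
        // Note the weaknesses described above: the coordinator is a single point of
        // failure, and participants stay blocked between prepare() and commit().
        return allPrepared;
    }
}
```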

Compared with the XA mechanism described above, TCC solves several of its drawbacks: the coordinator is no longer a single point, because the main business activity is initiated and completed by the business side itself, and the business activity manager becomes multi-node through the introduction of a cluster.
Synchronous blocking: a timeout is introduced and compensation is performed on timeout; the whole resource is no longer locked — instead the resource is reserved in the form of business logic, at a much finer granularity.
Data consistency: with the compensation mechanism in place, consistency is controlled by the business activity manager.

Try phase: attempt the operation — complete all business checks (consistency) and reserve the necessary business resources (quasi-isolation). Confirm phase: actually execute the business operation, without performing any further business checks, using only the business resources reserved in the Try phase; the Confirm operation must satisfy idempotency, so it requires an idempotent design and failed Confirms must be retried. Cancel phase: cancel the execution and release the business resources reserved in the Try phase; the Cancel operation is also idempotent. The exception-handling scheme of the Cancel phase is basically the same as that of the Confirm phase.
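As an illustration only (the interface and method names below are hypothetical and do not come from any specific TCC framework), the three phases for a "deduct balance" sub-service might be declared like this, with the transaction id allowing Confirm and Cancel to be retried idempotently:

```java
/** Hypothetical TCC participant for a "deduct balance" sub-service (illustration only). */
public interface AccountTccService {

    /** Try: check the balance and freeze (reserve) the amount, without actually deducting it. */
    boolean tryDeduct(String txId, String accountId, long amount);

    /** Confirm: really deduct the frozen amount; must be idempotent so retries are safe. */
    boolean confirm(String txId);

    /** Cancel: release the amount frozen in the Try phase; must also be idempotent. */
    boolean cancel(String txId);
}
```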

TCC guarantees strong consistency while maximizing system availability and scalability.
TCC has the following characteristics: it sits at the business service layer rather than the resource layer, and atomicity is guaranteed by the business itself;
  there is no separate prepare (Prepare) phase, which reduces the cost of the commit protocol;
  the Try operation serves both as the resource operation and as the preparation step;
  the Try operation can flexibly choose the locking granularity of the business resources instead of locking the whole resource, which improves concurrency.
TCC has a high development cost: every sub-service must implement the corresponding Confirm and Cancel operations, i.e. implement the corresponding compensation logic.

Message-based distributed transactions (transactional messages)
This kind of distributed transaction mechanism splits a transaction into several local transactions, referred to here as the primary transaction and the secondary transactions. The primary transaction is committed locally first; the secondary transactions are then triggered by a message sent after the primary transaction commits. This is clearly an asynchronous mechanism, so it can only guarantee eventual consistency; but its availability is very high, because failures do not cause blocking. Note that since the primary transaction has already been committed, rolling it back when a secondary transaction cannot commit is quite troublesome, so this mode only suits business cases that succeed with high probability — a secondary transaction may fail because of a fault, but should be very unlikely to fail because of a logical error.
  There are two main ways to implement a transaction mechanism based on asynchronous messages: the local message table and transactional messages. They differ in how they guarantee that committing the primary transaction and sending the message are atomic.
  Take an asynchronous transfer as an example. The operation is split into four steps: deduct money from user A, send a message, receive the message, add money to user B. The first two steps must be atomic: if A's money is deducted but the message is never sent, user A loses money; if the message is sent but the deduction fails, user B ends up with extra money, which the bank will certainly not accept.
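For the local message table approach, a minimal JDBC sketch (the table names, columns, and JSON payload are assumptions made for this illustration) shows the key point: the business update and the outgoing message record are written in the same local database transaction, so they are atomic:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

/** Local message table sketch: the business update and the outgoing message
 *  are written in the SAME local transaction, so they are atomic (illustration only). */
public class LocalMessageTableDemo {

    public static void transfer(String jdbcUrl, long fromUser, long amount) throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);
            try (PreparedStatement debit = conn.prepareStatement(
                         "UPDATE account SET balance = balance - ? WHERE user_id = ?");
                 PreparedStatement msg = conn.prepareStatement(
                         "INSERT INTO local_message (payload, status) VALUES (?, 'NEW')")) {
                debit.setLong(1, amount);
                debit.setLong(2, fromUser);
                debit.executeUpdate();
                msg.setString(1, "{\"to\":\"B\",\"amount\":" + amount + "}");
                msg.executeUpdate();
                conn.commit();   // both changes become visible together, or neither does
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
        // A separate background job scans local_message for 'NEW' rows, publishes them
        // to the MQ, and marks them 'SENT'; the consumer must be idempotent.
    }
}
```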

The basic flow of a transactional message (as in RocketMQ) is as follows: in the first phase a Prepared message is sent, and the address of the message is obtained. In the second phase the local transaction is executed. In the third phase the address obtained in the first phase is used to access the message and change its state, at which point the message becomes consumable by receivers. If confirming the message fails, the RocketMQ broker periodically scans for messages whose state was never updated; when it finds an unconfirmed message, it asks the message sender whether the message should be committed — in RocketMQ this takes the form of a Listener registered by the sender to handle the check.

If consumption times out, the message keeps being retried, so the message receiver must guarantee idempotency. If message consumption fails outright, it has to be handled manually; since the probability of this is low, designing a complicated flow for such an unlikely case would not be worth the effort.
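As a hedged sketch of the sending side (based on the RocketMQ 4.x Java client; the group name, topic, name-server address, and local-transaction logic are placeholders for this example), the prepared message, the local transaction, and the broker's check-back map onto the API roughly like this:

```java
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;

public class TransferProducer {

    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("transfer_producer_group");
        producer.setNamesrvAddr("localhost:9876");   // placeholder address
        producer.setTransactionListener(new TransactionListener() {
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                // Phase 2: execute the local transaction (e.g. deduct money from user A).
                boolean ok = deductFromUserA();
                return ok ? LocalTransactionState.COMMIT_MESSAGE
                          : LocalTransactionState.ROLLBACK_MESSAGE;
            }
            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                // Called by the broker when it finds an unconfirmed (half) message:
                // report whether the local transaction actually committed.
                return LocalTransactionState.COMMIT_MESSAGE;
            }
        });
        producer.start();

        // Phase 1: send the half (prepared) message; it stays invisible to consumers
        // until the local transaction result is confirmed.
        Message msg = new Message("TransferTopic", "transfer A->B 100".getBytes());
        producer.sendMessageInTransaction(msg, null);
        producer.shutdown();
    }

    private static boolean deductFromUserA() {
        return true;   // placeholder for the real local database transaction
    }
}
```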

Saga transactions
Composition of a Saga: a Saga consists of a series of sub-transactions Ti, and each Ti has a corresponding compensation operation Ci that is used to undo the effect of Ti. Each Ti here is a local transaction.
Saga defines two recovery strategies. Backward recovery: the execution order is T1, T2, ..., Tj, Cj, ..., C2, C1, where Tj is the sub-transaction in which the error occurred; the effect is that all sub-transactions that succeeded before the failure are undone, so the result of the whole Saga is revoked. Forward recovery: for scenarios that must eventually succeed, the execution order is T1, T2, ..., Tj (failure), Tj (retry), ..., Tn, where Tj is the failing sub-transaction; in this case no Ci is needed. Note that the Saga mode provides no isolation guarantee: since no resources are locked, other transactions can still overwrite or otherwise affect the data used by the current transaction.
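A minimal orchestration sketch of backward recovery (an illustration with invented names, not a full Saga framework): each completed step registers its compensation, and when a later step fails the registered compensations run in reverse order:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

/** Minimal sketch of a Saga with backward recovery (illustration only). */
public class SagaOrchestrator {

    private final Deque<Runnable> compensations = new ArrayDeque<>();

    /** Execute sub-transaction Ti; remember its compensation Ci in case a later step fails. */
    public boolean step(BooleanSupplier ti, Runnable ci) {
        if (ti.getAsBoolean()) {
            compensations.push(ci);
            return true;
        }
        // Backward recovery: run C(j-1), ..., C1 for every step that already succeeded.
        while (!compensations.isEmpty()) {
            compensations.pop().run();
        }
        return false;
    }

    public static void main(String[] args) {
        SagaOrchestrator saga = new SagaOrchestrator();
        boolean ok = saga.step(() -> { System.out.println("T1: create order");          return true;  },
                               () -> System.out.println("C1: cancel order"))
                  && saga.step(() -> { System.out.println("T2: deduct stock (fails)");  return false; },
                               () -> System.out.println("C2: restore stock"));
        System.out.println("saga succeeded = " + ok);   // false; C1 was executed
    }
}
```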

Third, Distributed Algorithms (the Consistent Hashing Algorithm)
When doing server-side load balancing, there are many load-balancing algorithms to choose from, including round robin (Round Robin), hashing (Hash), least connections (Least Connection), response time (Response Time), weighting (Weighted), and so on. Among them, the hash algorithm is the most commonly used.

A typical scenario: N servers provide a caching service, and the load needs to be balanced across them so that requests are distributed evenly and each machine serves 1/N of the traffic.

A commonly used method is to take the hash result modulo N (hash() mod N): number the machines 0 to N-1, compute the hash() value of each request with a custom hash() algorithm, take it modulo N to obtain the remainder i, and dispatch the request to machine number i. This method has a fatal problem, however: if one machine goes down, the requests that should have landed on it can no longer be handled correctly, so the crashed server has to be removed from the algorithm; at that point (N-1)/N of the servers' cached data must be recomputed. If a new machine is added, N/(N+1) of the servers' cached data must be recomputed. For a system this kind of churn is usually unacceptable, because it means a large amount of cache is invalidated or data has to be migrated. So how do we design a load-balancing strategy so that as few requests as possible are affected?
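A small experiment (a sketch with an arbitrary key set and Java's built-in integer hash standing in for a real request hash) makes the problem visible: when the number of servers changes, most keys map to a different server under hash() mod N:

```java
public class ModuloRemapDemo {
    public static void main(String[] args) {
        int keys = 10_000;
        int before = 4, after = 5;          // add one server: N goes from 4 to 5
        int moved = 0;
        for (int k = 0; k < keys; k++) {
            int h = Integer.hashCode(k);    // stand-in for a real hash of the request key
            if (Math.floorMod(h, before) != Math.floorMod(h, after)) {
                moved++;
            }
        }
        // Roughly N/(N+1) = 80% of the keys now map to a different server.
        System.out.printf("%d of %d keys (%.1f%%) remapped%n",
                moved, keys, 100.0 * moved / keys);
    }
}
```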

Memcached, key-value stores, BitTorrent DHTs, and LVS all use the Consistent Hashing algorithm; it is fair to say that Consistent Hashing is the first choice for load balancing in distributed systems.

The consistent hashing algorithm (Consistent Hashing Algorithm) is a distributed algorithm commonly used for load balancing. Consistent hashing organizes the entire hash value space into a virtual ring. Suppose the value space of the hash function H is 0 to 2^32 - 1 (i.e. the hash value is a 32-bit unsigned integer); the whole space is organized in the clockwise direction, and 0 and 2^32 - 1 coincide, so the space wraps around into a ring.
Next, the following algorithm is used to locate the server for a given piece of data: compute the hash value of the data's key with the same function H, use that value to determine the data's position on the ring, and then walk clockwise along the ring from that position; the first server encountered is the server on which the data should be located.
For example, suppose we have four data objects A, B, C, and D; after hashing, their positions on the ring are such that data A maps to Server 1, D maps to Server 3, and B and C both map to Server 2.

The principle of the consistent hashing algorithm: first construct an integer ring of length 2^32 (called the consistent hashing ring), and place the server nodes on the ring according to the hash values of their node names (which are distributed over [0, 2^32 - 1]); then compute the hash value of each data key (also distributed over [0, 2^32 - 1]) and look up, clockwise along the ring, the server node whose hash value is nearest to the key's hash value. This completes the mapping from keys to servers.

Although the consistent hashing algorithm improves the flexibility of the system, it may also lead to an uneven load distribution; the solution is to use virtual nodes on the ring instead of mapping the real nodes directly.
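The sketch below is an illustration only (the node names, the number of virtual nodes, and the use of CRC32 as the hash are choices made for this example, not part of the original article): several virtual nodes per physical server are placed on a sorted ring, and keys are looked up clockwise:

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Sketch of a consistent hash ring with virtual nodes (illustration only). */
public class ConsistentHashRing {

    private final TreeMap<Long, String> ring = new TreeMap<>();   // position -> physical node
    private final int virtualNodesPerServer;

    public ConsistentHashRing(int virtualNodesPerServer) {
        this.virtualNodesPerServer = virtualNodesPerServer;
    }

    /** Hash into [0, 2^32 - 1]; CRC32 is used only to keep the sketch dependency-free. */
    private long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Place one physical server on the ring as several virtual nodes. */
    public void addServer(String server) {
        for (int i = 0; i < virtualNodesPerServer; i++) {
            ring.put(hash(server + "#VN" + i), server);
        }
    }

    public void removeServer(String server) {
        for (int i = 0; i < virtualNodesPerServer; i++) {
            ring.remove(hash(server + "#VN" + i));
        }
    }

    /** Walk clockwise from the key's position; the first (virtual) node found owns the key. */
    public String getServer(String key) {
        if (ring.isEmpty()) return null;
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing(100);
        ring.addServer("Server1");
        ring.addServer("Server2");
        ring.addServer("Server3");
        System.out.println("key 'A' -> " + ring.getServer("A"));
        ring.removeServer("Server2");                  // only keys owned by Server2 move
        System.out.println("key 'A' -> " + ring.getServer("A"));
    }
}
```

Removing a server only remaps the keys that belonged to its virtual nodes; the rest of the ring is untouched, which is exactly the property the modulo approach lacks.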


Origin blog.csdn.net/zhanglinlove/article/details/89741607