https://segmentfault.com/a/1190000004468442

Preface

This article covers some of the theoretical foundations of distributed system development.

1. ACID

Four characteristics of a transaction:

1. Atomicity

A transaction must be an atomic unit of work. Within one execution, the operations it contains either all succeed or none of them take effect. If any operation fails, the entire transaction is rolled back; the transaction counts as successful only when every operation succeeds.
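The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. The `account` table, the names, and the `transfer` helper are assumptions made up for this illustration; they are not part of the original article.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


# Hypothetical in-memory schema and data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A transfer that would overdraw the source account raises inside the `with` block, so the debit that was already applied is rolled back and both balances stay as they were.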

2. Consistency

A transaction must not violate the integrity and consistency of the data in the database. The database must be in a consistent state both before and after the transaction executes.

3. Isolation

In a concurrent environment, concurrent transactions are isolated from each other, and the execution of one transaction must not be interfered with by other transactions. That is, when different transactions operate on the same data concurrently, each transaction has its own complete data space: the operations and data used inside one transaction are hidden from the other concurrent transactions.

The four transaction isolation levels in SQL:
(1) Read uncommitted

Dirty reads are allowed. If a transaction has updated some data but has not yet committed, another transaction is still allowed to access that data. For example, transaction A increments the variable n from 0 to 10 before committing. During that time, transaction B may read any of the intermediate values of n between 0 and 10.
(2) Read committed

Non-repeatable reads are allowed; dirty reads are prohibited. Only committed data can be read. That is, while transaction A increments n from 0 to 10, B cannot see the intermediate values of n; it sees 10 only after A commits. If transaction C then increments n from 10 to 20 and commits, B reads 20 when it reads again within the same transaction.
(3) Repeatable read

Phantom reads are allowed; dirty reads and non-repeatable reads are prohibited. When the same data is read several times within one transaction, its value is guaranteed to match what it was when the transaction started. In the example, every read of n that B performs within the same transaction returns the initial value 0, even after A and C commit. A phantom read concerns rows rather than values: if the same range query runs twice within one transaction, rows inserted or deleted by another committed transaction may make the two result sets differ. Separate transactions started at different times may of course read n as 0, 10, or 20.
(4) Serializable

The strictest level: transactions must behave as if they were executed one after another, with no concurrent interleaving.
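The n = 0, 10, 20 example can be replayed with a toy model. This is only a sketch under a strong assumption: it models the levels with snapshots of committed versions, whereas real databases may use locks or other mechanisms. `ToyStore`, `Reader`, and `observe` are hypothetical names invented for the illustration.

```python
class ToyStore:
    """One variable n with a committed history and at most one uncommitted write."""

    def __init__(self, value):
        self.committed = [value]  # committed versions, oldest first
        self.uncommitted = None   # value written by a transaction that is still open

    def write(self, value):
        self.uncommitted = value

    def commit(self):
        self.committed.append(self.uncommitted)
        self.uncommitted = None


class Reader:
    """Transaction B: reads n under a given isolation level."""

    def __init__(self, store, level):
        self.store = store
        self.level = level
        self.start = len(store.committed) - 1  # committed version visible at start

    def read(self):
        if self.level == "read uncommitted" and self.store.uncommitted is not None:
            return self.store.uncommitted            # dirty read
        if self.level == "repeatable read":
            return self.store.committed[self.start]  # same value for the whole transaction
        return self.store.committed[-1]              # latest committed value


def observe(level):
    """Return what B sees while A goes 0 -> 10 and C goes 10 -> 20."""
    store = ToyStore(0)
    b = Reader(store, level)
    seen = []
    store.write(5)          # A is halfway from 0 to 10, not committed yet
    seen.append(b.read())
    store.write(10)
    store.commit()          # A commits 10
    seen.append(b.read())
    store.write(20)
    store.commit()          # C commits 20
    seen.append(b.read())
    return seen
```

Under this model, `observe("read uncommitted")` exposes the intermediate value 5, `observe("read committed")` shows only committed values but changes between reads, and `observe("repeatable read")` keeps returning the initial 0.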

Without any concurrency control on transactions, the following anomalies can occur when the database is operated on concurrently:

(1) First type of lost update: two transactions read the same data, one modifies field 1 and the other modifies field 2, and the one that commits later restores the field changed by the earlier commit to its old value.

(2) Second type of lost update: two transactions read the same data and both modify the same field, and the one that commits later overwrites the change committed earlier.

(3) Dirty read: a transaction reads a value that another transaction has not yet committed. If that other transaction is rolled back, the value that was read never officially existed.

(4) Non-repeatable read: Between two queries, the content of the data is modified by another transaction, resulting in inconsistency of the content.

(5) Phantom read: Between two queries, records are inserted or deleted by another transaction, resulting in inconsistent result sets.
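The second type of lost update is easy to reproduce in a few lines. The sketch below also shows one common countermeasure, a version check before writing (optimistic locking); that technique is not discussed in the article and is brought in here purely for illustration. `VersionedRow` and both helper functions are hypothetical.

```python
class VersionedRow:
    """A single value with a version counter that grows on every write."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def write_blind(self, value):
        # No concurrency control: the last writer simply wins.
        self.value = value
        self.version += 1

    def write_if_unchanged(self, value, expected_version):
        # Refuse the write if someone else has written since we read.
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True


def blind_updates():
    row = VersionedRow(100)
    v1, _ = row.read()         # transaction 1 reads 100
    v2, _ = row.read()         # transaction 2 reads 100
    row.write_blind(v1 + 10)   # transaction 1 writes 110
    row.write_blind(v2 + 20)   # transaction 2 writes 120, the +10 is lost
    return row.value


def checked_updates():
    row = VersionedRow(100)
    v1, ver1 = row.read()
    v2, ver2 = row.read()
    row.write_if_unchanged(v1 + 10, ver1)          # succeeds, value is 110
    if not row.write_if_unchanged(v2 + 20, ver2):  # rejected: stale version
        v2, ver2 = row.read()                      # re-read 110 and retry
        row.write_if_unchanged(v2 + 20, ver2)
    return row.value
```

With blind writes the final value reflects only the second transaction's change; with the version check the stale write is rejected and retried, so both increments survive.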

4. Durability

Once a transaction is committed, the changes it made to the data in the database must be permanent. Even if the system crashes or the machine goes down, once the database restarts it must recover to the state it was in when the transaction completed successfully.

2. CAP theorem

It is impossible for a distributed system to meet the three basic requirements of consistency, availability, and partition tolerance at the same time. At most, it can only meet two of them at the same time.

1. Consistency

In a distributed environment, consistency refers to whether data stays consistent across multiple replicas. Under the consistency requirement, when an update is performed on a system whose data is in a consistent state, the system's data must still be in a consistent state afterwards.

2. Availability

The services provided by the system must always be available, and the results can always be returned within a limited time for each operation request of the user.

(1) Limited time
For an operation request from the user, the system must be able to return the corresponding processing result within the specified time (response time). If this time range is exceeded, the system is considered unavailable. That is, the response time must be within a reasonable value and not disappoint the user.

(2) Return a normal result
The system is required to return a normal response after it finishes processing the user's request. A normal response usually reflects the outcome of the request clearly, success or failure, rather than returning a result that confuses the user. For example, if a system error such as OutOfMemory is returned, the system is considered unavailable.

3. Partition tolerance

That is, when a distributed system encounters any network partition failure, it must still be able to provide services that satisfy consistency and availability to the outside world, unless the entire network environment has failed.

A network partition occurs when the nodes of a distributed system are spread across different sub-networks (machine rooms or remote networks) and, for some reason, the links between those sub-networks are cut while the network inside each sub-network still works normally. The network environment of the whole system is thereby split into several isolated regions. A node joining or leaving the distributed system can be regarded as a special kind of network partition.

3. Application of CAP

1. Give up P

If you give up partition tolerance, you give up distribution itself, and with it the scalability of the system.

2. Give up A

If availability is given up, then when a network partition or another failure occurs, the affected service has to wait for some time, and during that period it cannot serve external requests; that is, it is unavailable.

3. Give up C

If consistency is given up (strong consistency, specifically), the system cannot guarantee that the data is consistent in real time. The data does reach eventual consistency, but there is a time window during which it is inconsistent.

For distributed systems, P cannot be relinquished, so architects usually make a trade-off between availability and consistency.
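The trade-off between giving up A and giving up C can be sketched with a toy two-replica cluster. Everything here (`TwoNodeCluster`, the `"CP"` and `"AP"` mode strings, the `Unavailable` exception) is a made-up model for illustration, not a real replication protocol.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""


class TwoNodeCluster:
    def __init__(self, mode):
        self.mode = mode                    # "CP" or "AP" (hypothetical labels)
        self.values = {"n1": 0, "n2": 0}    # one copy of the data per node
        self.partitioned = False

    def write(self, node, value):
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            for name in self.values:
                self.values[name] = value
            return
        if self.mode == "CP":
            # Give up availability: reject rather than let replicas diverge.
            raise Unavailable("cannot reach the other replica")
        # Give up consistency: accept the write on the reachable node only.
        self.values[node] = value

    def read(self, node):
        return self.values[node]
```

During a partition the CP cluster rejects writes but its replicas never disagree, while the AP cluster keeps accepting writes and its replicas return different values until they are synchronized again.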

4. BASE theory

BASE stands for Basically Available, Soft state, and Eventually consistent. It evolved from the CAP theorem. Its core idea is that even when strong consistency cannot be achieved, each application can use means suited to its own business characteristics to bring the system to eventual consistency.

1. Basically Available

Basic availability means that a distributed system is allowed to lose some availability in the event of an unpredictable failure, but it does not mean that the system is unavailable.
(1) Loss in response time

Increased response time when failure occurs
(2) Loss of functionality

During traffic peaks, some features are disabled to keep the system stable (service degradation).
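Service degradation can be as simple as skipping a non-essential feature when load is high. The sketch below is a hypothetical example: the threshold, the product data, and the function names are all assumptions chosen for illustration.

```python
PEAK_LOAD = 0.8  # hypothetical threshold: fraction of capacity in use


def lookup_price(product_id):
    # Stand-in for the core function that must keep working.
    return {"book": 30, "pen": 2}[product_id]


def recommend(product_id):
    # Stand-in for an expensive, non-essential feature.
    return ["notebook"] if product_id == "pen" else ["bookmark"]


def product_page(product_id, load):
    """Build a page; drop recommendations when the system is under peak load."""
    page = {"product": product_id, "price": lookup_price(product_id)}
    if load > PEAK_LOAD:
        page["recommendations"] = []   # degraded: feature switched off
        page["degraded"] = True
    else:
        page["recommendations"] = recommend(product_id)
        page["degraded"] = False
    return page
```

The core result (the price) is returned in both cases; only the optional part of the response is lost while the system is under pressure.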

2. Soft state

In contrast to hard state, soft state means that data in the system is allowed to be in an intermediate state, and that this intermediate state is considered not to affect the overall availability of the system. In other words, the system is allowed a delay while it synchronizes data between the replicas on different nodes.

3. Eventually consistent

Emphasizes that all data copies in the system can eventually reach a consistent state after a period of synchronization. Its essence is that the system needs to ensure that the final data can be consistent, and there is no need to ensure the strong consistency of system data in real time.
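One way replicas can converge is by repeatedly exchanging state and keeping the newer value. The sketch below assumes a last-writer-wins rule on `(timestamp, value)` pairs and a ring-shaped exchange pattern; both are assumptions for illustration, and the article does not prescribe any particular synchronization scheme.

```python
def sync_pass(replicas):
    """One pass around a ring: each replica exchanges state with its neighbour.

    Each replica holds a (timestamp, value) pair, and the pair with the
    larger timestamp wins (last-writer-wins, an assumed rule).
    """
    n = len(replicas)
    for i in range(n):
        j = (i + 1) % n
        newest = max(replicas[i], replicas[j])
        replicas[i] = newest
        replicas[j] = newest


def sync_until_consistent(replicas):
    """Repeat passes until every replica holds the same pair; return the pass count."""
    passes = 0
    while len(set(replicas)) > 1:
        sync_pass(replicas)
        passes += 1
    return passes
```

Before synchronization the replicas may disagree (the inconsistency window); once the passes finish, every replica holds the most recent write.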

Eventual consistency has the following variants:
(1) Causal consistency
If process A notifies process B after updating a piece of data, then B's subsequent accesses to that data return the latest value written by A.

(2) Read your writes
After process A updates a piece of data, it can always access the latest value it has written.

(3) Session consistency
Scopes consistency to a session and provides read-your-writes consistency within it. That is, after performing an update, the client can always read the latest value of the data within the same session.

(4) Monotonic read consistency
Once a process has read a certain value of a data item from the system, the system should not return an older value to any subsequent access by that process.

(5) Monotonic write consistency
The system must ensure that write operations from the same process are executed in order.
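Read-your-writes and monotonic reads can both be approximated by letting the client remember the newest version it has seen and refusing answers from replicas that lag behind it. This is a minimal sketch under that assumption; `Replica`, `Session`, and the version-number scheme are hypothetical and not taken from the article.

```python
class Replica:
    """Holds one value plus the version number of the write that produced it."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Ignore writes that are older than what this replica already has.
        if version > self.version:
            self.version = version
            self.value = value


class Session:
    """Client session that remembers the highest version it has written or read."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = 0

    def write(self, primary, value):
        primary.apply(primary.version + 1, value)
        self.min_version = max(self.min_version, primary.version)

    def read(self):
        # Accept only a replica that is at least as fresh as anything
        # this session has already written or read.
        for replica in self.replicas:
            if replica.version >= self.min_version:
                self.min_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough yet")
```

If the session writes through one replica and then tries a lagging replica first, the lagging replica is skipped, so the session never sees a value older than its own write or older than a value it has already read.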

BASE theory proposes gaining availability by sacrificing consistency: data is allowed to be inconsistent for a period of time, as long as it eventually reaches a consistent state.
