Deadlock Processing in Distributed Database System

Global deadlock and wait graph

Conditions under which the deadlock occurs

In a distributed environment, factors such as unpredictable communication delays, geographical distribution, and the high degree of resource and data sharing make deadlocks extremely difficult to prevent and detect. A deadlock arises in a distributed computing system when two or more concurrently executing processes each wait for resources held by the others, so that none of them can continue to run.

A deadlock can occur only when the following four conditions hold simultaneously:

  • Mutual exclusion. A resource can be held by at most one process at a time.

  • Hold and wait. Some process holds at least one resource while waiting to acquire resources held by other processes.

  • No preemption. A resource cannot be forcibly taken away from the process that holds it.

  • Circular wait. There is a cycle in the wait-for graph.

Deadlock classification

  • Local deadlock: a deadlock that occurs within a single site
  • Global deadlock: a deadlock involving multiple sites (that is, the wait cycle spans several sites)

In the figure, transaction T1 holds a lock on data item x at site A and requests a lock on data item y at site B, while transaction T2 holds a lock on y at site B and requests a lock on x at site A. The two transactions wait for each other and neither can continue. Because they are not at the same site, the result is a global deadlock.

Figure: T1 at site A holds the x lock and wants the y lock; T2 at site B holds the y lock and wants the x lock. Waiting for each other causes a global deadlock.

Data redundancy in a distributed database also increases the chance of deadlock when data is updated. An update must place exclusive locks on all replicas, so a transaction at each site holding a replica may be waiting for another transaction to release its lock. Each transaction releases its locks only after it has fully completed, which causes a global deadlock.

Site A: stores a copy of x and y, issues transaction T1: read(x), write(y)
Site B: stores a copy of y and z, issues transaction T2: read(y), write(z)
Site C: stores a copy of z, issues transaction T3: read(z), write(x)

Write operations must hold locks to ensure consistency, which leads to the following situation:

Site A: holds the x lock and wants the y lock
Site B: holds the y lock and wants the z lock
Site C: holds the z lock and wants the x lock

Concept expansion (lock)

What is a lock? What are the two types of locks?

Answer: Locking is the most commonly used concurrency control mechanism. A lock prevents other transactions from accessing a specified resource and thereby achieves concurrency control.
Exclusive lock (X lock, write lock): while a data item is write-locked, other transactions can neither read nor write it.
Shared lock (S lock, read lock): while a data item is read-locked, other transactions may read it but may not write it.
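
The compatibility rule implied by these two definitions can be written as a small lookup table. This is only a minimal sketch: the names COMPATIBLE and can_grant are illustrative assumptions and do not come from any particular DBMS.

```python
# Lock modes: "S" = shared (read) lock, "X" = exclusive (write) lock.
# COMPATIBLE[(held, requested)] is True when both locks may coexist.
# The table and function names are hypothetical, chosen for illustration.
COMPATIBLE = {
    ("S", "S"): True,   # several readers may share a data item
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # a reader must wait for the writer
    ("X", "X"): False,  # only one writer at a time
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every lock held by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # True
print(can_grant("X", ["S"]))       # False
print(can_grant("X", []))          # True
```

A request that cannot be granted is what puts a transaction into the waiting state discussed in the rest of this article.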

Wait-for graph

  • A directed graph that represents the waiting relationships between transactions; a useful tool for analyzing deadlocks
  • Each node represents a transaction
  • A directed edge from Ti to Tj means that Ti is waiting for Tj
  • If the wait-for graph contains a cycle, there is a deadlock
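
The last point can be checked mechanically with a depth-first search. The following is a minimal sketch, assuming the graph is stored as a dictionary that maps each transaction to the set of transactions it waits for; the function name find_cycle is an illustrative choice.

```python
def find_cycle(wfg):
    """Return one cycle as a list of transactions, or None if the graph is acyclic.

    wfg maps each transaction to the set of transactions it waits for.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    nodes = set(wfg) | {t for targets in wfg.values() for t in targets}
    color = {t: WHITE for t in nodes}
    path = []

    def visit(t):
        color[t] = GREY
        path.append(t)
        for nxt in sorted(wfg.get(t, ())):
            if color[nxt] == GREY:            # back edge: a cycle closes here
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[t] = BLACK
        return None

    for t in sorted(nodes):
        if color[t] == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None


print(find_cycle({"T1": {"T2"}, "T2": {"T1"}}))  # ['T1', 'T2']
print(find_cycle({"T1": {"T2"}}))                # None
```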

Types of wait-for graph

  • Local wait-for graph (LWFG)
  • Global wait-for graph (GWFG)

Case study

Figure: global wait-for graph (GWFG)

T1 waits for T2 to release the shared (S) lock on y
T2 waits for T3 to release the shared (S) lock on z
T3 waits for T1 to release the shared (S) lock on x

In other words

  • T1 holds the x lock and wants the y lock (T1 can complete only once it holds both locks at the same time; it then releases the x and y locks)
  • T2 holds the y lock and wants the z lock
  • T3 holds the z lock and wants the x lock

Example: suppose the waiting relationship among four transactions T1, T2, T3, and T4 is T1 -> T2 -> T3 -> T4 -> T1. If T1 and T2 run at site 1 and T3 and T4 run at site 2, the local wait-for graphs of the two sites are as shown in the figure. Note: the deadlock cannot be found by checking these two local wait-for graphs alone, because it is global. It is easily found, however, by checking the global wait-for graph. To distinguish them, waits between sites can be drawn as dashed lines in the global wait-for graph, as shown in the figure.

Figure: local wait-for graphs at sites 1 and 2, and the global wait-for graph with inter-site waits drawn as dashed lines
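
The example can be reproduced in a few lines. This sketch assumes that the wait chain closes with T4 waiting for T1, and that each wait is written as a (waiter, holder) pair; the variable names and the helper has_cycle are illustrative.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (waiter, holder) pairs has a cycle."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
        graph.setdefault(holder, set())
    # Repeatedly drop transactions that wait for nobody.
    # Whatever remains lies on a cycle or waits for one.
    changed = True
    while changed:
        changed = False
        for t in [t for t, out in graph.items() if not out]:
            del graph[t]
            for out in graph.values():
                out.discard(t)
            changed = True
    return bool(graph)


site1 = [("T1", "T2")]                     # local waits at site 1
site2 = [("T3", "T4")]                     # local waits at site 2
inter_site = [("T2", "T3"), ("T4", "T1")]  # waits that cross sites (assumed)

print(has_cycle(site1))                       # False
print(has_cycle(site2))                       # False
print(has_cycle(site1 + site2 + inter_site))  # True
```

Neither local graph contains a cycle on its own; the cycle becomes visible only once the inter-site edges are merged in.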

Strategies for dealing with deadlocks

There are three strategies for dealing with deadlocks:

  1. Deadlock prevention. Restrict requests so that at least one of the four deadlock conditions above cannot hold, thereby preventing deadlocks.

  2. Deadlock avoidance. Allocate resources to a process dynamically, and only if the resulting state is safe. A state is safe if there is at least one execution order in which all processes can run to completion.

  3. Deadlock detection and recovery. Allow deadlocks to occur, then detect and remove them.
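
The notion of a safe state in strategy 2 can be made concrete with a check in the style of the banker's algorithm. This is a minimal sketch under assumptions not stated in the article: each process declares in advance how many further units of each resource type it may need, and the names is_safe, available, allocation, and need are illustrative.

```python
def is_safe(available, allocation, need):
    """Return True if some execution order lets every process finish.

    available: free units per resource type
    allocation[p]: units currently held by process p
    need[p]: further units p may still request
    """
    work = list(available)
    unfinished = set(allocation)
    progressed = True
    while unfinished and progressed:
        progressed = False
        for p in sorted(unfinished):
            if all(n <= w for n, w in zip(need[p], work)):
                # p can run to completion, then returns everything it holds
                work = [w + a for w, a in zip(work, allocation[p])]
                unfinished.remove(p)
                progressed = True
                break
    return not unfinished


# One resource type, one free unit: P1 can finish first, then P2.
print(is_safe([1], {"P1": [1], "P2": [1]}, {"P1": [1], "P2": [2]}))  # True
# No free units and both processes need one more: nobody can finish.
print(is_safe([0], {"P1": [1], "P2": [1]}, {"P1": [1], "P2": [1]}))  # False
```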

Deadlock prevention and avoidance are pessimistic methods: they assume that deadlocks will happen frequently and try to prevent or avoid them. Although deadlock avoidance is widely used in centralized systems and many algorithms exist for it, it is rarely used in distributed systems.
This is because a distributed system has no global clock, and checking for a safe state involves computation over a large number of processes and resources, which is expensive. Deadlock detection and recovery is an optimistic method, but it may not be effective for applications in which deadlocks occur frequently.

Solutions to deadlock

  • Deadlock prevention: ensure that at least one necessary condition for deadlock cannot hold
  • Order all resources and request them in that order
  • Order all concurrent transactions by identifier or start time
  • When there is a danger of deadlock, there are two methods for making a transaction give up the resources it holds:
    - Wait-Die (non-preemptive): always restart the younger transaction
    - Wound-Wait (preemptive): the young wait for the old and the younger transaction is restarted; the restarted transaction is not necessarily the one currently making the request

Deadlock prevention

Wait-Die scheme (non-preemptive)

  • If Ti requests a lock on a data item already locked by Tj, Ti is allowed to wait only if Ti is older than Tj (ts(Ti) < ts(Tj))
  • If Ti is younger than Tj (ts(Ti) > ts(Tj)), Ti is aborted and restarted with the same timestamp
  • The transaction that is restarted is always the younger one
  • Older transactions are allowed to wait for younger transactions that already hold resources
  • Younger transactions are not allowed to wait for older ones
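
The Wait-Die rule reduces to a single timestamp comparison. A minimal sketch, assuming distinct timestamps where a smaller value means an older transaction; the function name wait_die and the returned strings are illustrative.

```python
def wait_die(ts_requester, ts_holder):
    """Decide what happens to the requester Ti when Tj already holds the lock.

    A smaller timestamp means an older transaction.
    """
    if ts_requester < ts_holder:
        return "wait"     # the older requester may wait for the younger holder
    return "restart"      # the younger requester dies and restarts with its original timestamp


print(wait_die(1, 2))  # wait
print(wait_die(2, 1))  # restart
```

Keeping the original timestamp on restart means the transaction keeps ageing, so it eventually becomes old enough to be allowed to wait.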

Wound-Wait scheme (preemptive)

  • If Ti requests a lock on a data item already locked by Tj, Ti is allowed to wait only if Ti is younger than Tj (ts(Ti) > ts(Tj))
  • Otherwise Ti is older than Tj (ts(Ti) < ts(Tj)): Tj is aborted and restarted with the same timestamp, and Ti obtains the lock and continues
  • Only the young wait for the old
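
Wound-Wait uses the same comparison with the opposite outcome. A minimal sketch, again assuming distinct timestamps where a smaller value means an older transaction; the function name wound_wait and the returned strings are illustrative.

```python
def wound_wait(ts_requester, ts_holder):
    """Decide the outcome when requester Ti asks for a lock held by Tj.

    A smaller timestamp means an older transaction.
    """
    if ts_requester > ts_holder:
        return "requester waits"   # the younger requester waits for the older holder
    return "holder restarts"       # the older requester wounds the younger holder


print(wound_wait(2, 1))  # requester waits
print(wound_wait(1, 2))  # holder restarts
```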

Summary

Non-preemptive (Wait-Die):

An old transaction is allowed to wait for a young one. If a young transaction needs a lock held by an old transaction, it is aborted and restarted immediately. In a sense, the older a transaction gets, the more it tends to wait.
Because the old transaction may hold its lock for a long time, the young transaction may be restarted many times.

Preemptive (Wound-Wait):

When an old transaction needs a lock held by a young transaction, the young transaction is unconditionally aborted and restarted, releasing its locks so that the old transaction can proceed. In a sense, a transaction's priority rises as it gets older.
After restarting, the young transaction waits for the older one instead of being aborted again, so it is generally restarted far fewer times than under Wait-Die.

Detection and resolution of deadlock

Centralized deadlock detection

  • Choose one site to be responsible for deadlock detection for the entire system, and place the deadlock detector at this site
  • The lock manager at each site periodically sends its LWFG to the deadlock detector, which constructs the GWFG and looks for cycles in it
  • Alternatively, the lock manager at every other site periodically sends a dynamic table recording transaction start times, locks held, and changes in request status to the site responsible for lock processing, which maintains a global lock table, builds the GWFG from it, and looks for cycles
  • If the GWFG contains at least one cycle, the detector selects one or more transactions, aborts and rolls them back, and releases their resources so that the other transactions can continue
    - The victim is chosen to keep the cost of undo and recovery as low as possible, for example by aborting the youngest transaction, the transaction holding the fewest resources, the transaction with the shortest running time, or the transaction with the longest running time, depending on the system's circumstances
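
The steps above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: each site reports its LWFG as a dictionary mapping a transaction to the set of transactions it waits for, and the victim is always the youngest transaction on the cycle, which is only one of the criteria listed. The names find_cycle, detect_and_resolve, and start_time are hypothetical.

```python
def find_cycle(graph):
    """Return the transactions on one cycle of `graph`, or None."""
    visiting, done = [], set()

    def dfs(t):
        if t in visiting:                      # t is on the current path: cycle found
            return visiting[visiting.index(t):]
        if t in done:
            return None
        visiting.append(t)
        for nxt in sorted(graph.get(t, ())):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(t)
        return None

    for t in sorted(graph):
        cycle = dfs(t)
        if cycle:
            return cycle
    return None


def detect_and_resolve(lwfgs, start_time):
    """Merge the reported LWFGs into a GWFG and abort victims until no cycle remains.

    lwfgs: one dict per site, mapping a transaction to the set it waits for
    start_time: transaction -> start timestamp (larger means younger)
    Returns the list of aborted transactions.
    """
    gwfg = {}
    for lwfg in lwfgs:
        for waiter, holders in lwfg.items():
            gwfg.setdefault(waiter, set()).update(holders)

    victims = []
    cycle = find_cycle(gwfg)
    while cycle:
        victim = max(cycle, key=lambda t: start_time[t])  # youngest on the cycle
        victims.append(victim)
        gwfg.pop(victim, None)             # the victim's own waits disappear
        for holders in gwfg.values():
            holders.discard(victim)        # nobody waits for it any more
        cycle = find_cycle(gwfg)
    return victims


site1 = {"T1": {"T2"}, "T2": {"T3"}}
site2 = {"T3": {"T4"}, "T4": {"T1"}}
print(detect_and_resolve([site1, site2], {"T1": 1, "T2": 2, "T3": 3, "T4": 4}))  # ['T4']
```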

Hierarchical deadlock detection

  • Organize the deadlock detectors of the member DBMSs hierarchically
  • When a deadlock occurs, it often involves only some of the sites
  • The detection hierarchy is related to the network topology
  • Dependence on a central site is reduced, which lowers the transmission overhead

Sites are divided into groups according to their pattern of resource access, and the groups are organized hierarchically. Within each group, a designated site is responsible for detecting deadlocks in that group using a centralized algorithm.

Steps of hierarchical deadlock detection

  • The leaves are the local deadlock detectors, one per site, each of which builds the local wait-for graph of its site
  • The deadlock detector at a site looks for cycles in its local wait-for graph and sends information about potential global cycles to the detector at the next level up
  • Each non-local deadlock detector performs deadlock detection only for the level immediately below it: it merges the information it receives about potential global cycles and looks for cycles
  • If a higher-level deadlock detector exists, the detector sends it the reduced information about potential global cycles, and the higher-level detector merges this information to find global cycles
  • Checking proceeds in this way level by level up to the top
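
The level-by-level procedure can be sketched as a recursive walk over a tree of detectors. This is a deliberately simplified sketch: each detector passes all of its wait-for edges upward instead of the reduced information about potential global cycles described above, and the tree layout, the dictionary keys, and the names has_cycle and detect are illustrative assumptions. It relies on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError


def has_cycle(edges):
    """Return True if the (waiter, holder) pairs contain a cycle."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
    try:
        TopologicalSorter(graph).prepare()   # raises CycleError on a cycle
    except CycleError:
        return True
    return False


def detect(node):
    """Return (edges, finder) for the subtree rooted at `node`.

    edges: all wait-for edges reported from this subtree
    finder: name of the lowest detector whose merged graph has a cycle, or None
    """
    edges = set(node.get("edges", ()))
    finder = None
    for child in node.get("children", ()):
        child_edges, child_finder = detect(child)
        edges |= child_edges
        finder = finder or child_finder
    if finder is None and has_cycle(edges):
        finder = node["name"]
    return edges, finder


site1 = {"name": "site 1", "edges": [("T1", "T2"), ("T2", "T3")]}
site2 = {"name": "site 2", "edges": [("T3", "T4"), ("T4", "T1")]}
site3 = {"name": "site 3", "edges": [("T5", "T6")]}
group_a = {"name": "group A", "children": [site1, site2]}
root = {"name": "root", "children": [group_a, site3]}

print(detect(root)[1])  # group A
```

In this example the deadlock involves only sites 1 and 2, so it is already found by the detector of their group and the top-level detector has nothing left to resolve.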

Figure: schematic diagram of the hierarchical deadlock detection method

Distributed detection

Detecting distributed deadlocks is more difficult because each site has only a partial view of the entire system. Detecting a deadlock that involves more than one site therefore requires cooperation between sites. In a distributed deadlock detection algorithm, all sites work together to detect and resolve deadlocks. Unlike centralized algorithms, distributed deadlock detection algorithms are not vulnerable to the failure of a single site, and no site is overwhelmed by deadlock detection activity.

  • Every site has the same responsibility for detecting deadlocks. Sites exchange information (LWFGs) in order to determine whether a global deadlock exists
  • EX (external) stands for an unspecified remote site: a local transaction that waits on EX is waiting for some transaction at another site

Figure: example of distributed deadlock detection across two sites

The local deadlock detector looks for two things:
  • If there is a cycle that contains no edges involving other sites, it is a local deadlock and can be resolved locally
  • If there is a cycle that contains edges involving other sites, information about this potential distributed deadlock must be sent to the deadlock detectors of the other sites involved. In the figure above, both sites can detect this potential distributed deadlock.

Optimization

The distributed deadlock detection algorithm requires the same modification to the lock manager at every site, and this uniformity makes it easier to implement. However, the algorithm transmits a lot of information. In the example diagram, site 1 sends its potential deadlock information to site 2, and site 2 also sends its potential deadlock information to site 1, so the deadlock detectors at both sites detect the deadlock. Besides the unnecessary transmission, there is a second problem: each site may choose a different victim transaction to abort.

To solve this problem, use the following rule: if the LWFG at a site contains a path Ti -> ... -> Tj that may be part of a distributed deadlock, the local deadlock detector transmits the cycle information forward only if ts(Ti) < ts(Tj). This halves the number of transmissions. In the example diagram, the path at site 1 is T1 -> T2 -> T3 and the path at site 2 is T3 -> T4 -> T1. Assuming that the subscript of each transaction represents its timestamp, only site 1 sends information to site 2.
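
The forwarding rule can be expressed as one comparison on the two ends of the path. A minimal sketch, assuming each path is a list of transaction names and each timestamp equals the transaction's subscript as in the example; the function name should_forward is illustrative.

```python
def should_forward(path, ts):
    """Return True if the site holding `path` (Ti -> ... -> Tj) should send it on.

    path: transactions on the local path, in wait order
    ts: transaction -> timestamp
    """
    first, last = path[0], path[-1]
    return ts[first] < ts[last]


ts = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}
print(should_forward(["T1", "T2", "T3"], ts))  # True:  site 1 sends to site 2
print(should_forward(["T3", "T4", "T1"], ts))  # False: site 2 stays silent
```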

Origin blog.csdn.net/qq_40649503/article/details/109923117