Database-Based Implementations of Distributed Locks

Overview

In the era of single-machine applications, distributed locks were not needed, but a similar problem already existed: when multiple threads in one process access a shared resource at the same time, we use a thread-level locking mechanism. When a thread acquires the resource, the resource is locked immediately; once the thread has finished with it, the lock is released and other threads can use it. Java, for example, provides dedicated APIs for this (synchronized, Lock, and so on).

In the era of distributed systems, however, locks between threads no longer help. A system may run as multiple copies deployed on different machines, so the resource is no longer shared between threads but between processes.

To solve this problem, we have to introduce a distributed lock. A distributed lock is a locking mechanism that lets multiple clients in a distributed deployment access a shared resource with mutual exclusion.

The more common distributed lock implementations today are:

  1. Database-based, such as MySQL
  2. Cache-based, such as Redis
  3. Based on ZooKeeper, etcd, and similar systems

When we discuss distributed locks, the database-based approach is often the first one to be ruled out, because it instinctively feels not "advanced" enough. In terms of performance, the database approach is indeed not the best; the overall ordering is: cache > ZooKeeper, etcd > database. Some also argue that the database approach has many problems and is unreliable. My view is that the choice should be driven by the usage scenario: whichever approach fits is the right one.

Here I reuse a scenario from an earlier article: task assignment. The business is the company's back-office system, used mainly by reviewers, so concurrency is not very high. The assignment rule is designed so that each reviewer actively pulls tasks by sending a request, and the server then picks tasks at random from the task pool and assigns them. The scenario looks simple. The real assignment process is more complex than described, because it also involves clustering by user, but to illustrate the point we can keep it simple. The main thing to avoid is the same task being acquired by two reviewers at the same time. In this scenario, a database-based approach is a reasonable choice.

To add to this, consider a service that relies on a database to read and write some downstream data. The model is shown below:
[Figure: model of a service that depends on a database]
A service is usually deployed as multiple instances. If several instances need to operate on the same data at the same time (such as the earlier problem of the same task being acquired by two reviewers at once), a distributed lock is the natural answer. Suppose that this time we do not use the database approach but introduce Redis instead, giving the model shown below:
[Figure: model of the same service with Redis introduced for locking]
I will not go into the benefits of introducing Redis. The cost is added system complexity: the service as a whole must now also handle failure cases 1 and 2. Failure 1 is an abnormal interaction between the service module and Redis. This covers not only a failure to communicate, but also an error while the service module is sending a request to Redis or while Redis is responding to it. The service has to decide what to do in this case: retry, discard, or take some other measure. Failure 2 is Redis itself failing. Once the data path gets longer and the system more complex, problems become harder to troubleshoot and service becomes harder to restore, which lowers overall availability.

By contrast, the database approach avoids this extra complexity. If the database approach meets the needs of the current scenario and of foreseeable growth, why add complexity to the system for no reason? We should choose the right technical solution for the specific business scenario, rather than reach for a solution just because it is complex and trendy enough.

Let's look at the database (MySQL) approaches, which generally fall into three categories: table records, optimistic locking, and pessimistic locking.

Based on table records

The simplest way to implement a distributed lock is probably to create a lock table and operate on the rows in it. To acquire a lock, insert a record into the table; to release the lock, delete that record.

To demonstrate, we first create a database table like the following:

CREATE TABLE `database_lock` (
	`id` BIGINT NOT NULL AUTO_INCREMENT,
	`resource` int NOT NULL COMMENT 'locked resource',
	`description` varchar(1024) NOT NULL DEFAULT '' COMMENT 'description',
	PRIMARY KEY (`id`),
	UNIQUE KEY `uiq_idx_resource` (`resource`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='database distributed lock table';

To acquire a lock, insert a row:

INSERT INTO database_lock(resource, description) VALUES (1, 'lock');

Note: the resource column of database_lock has a unique constraint, so if multiple requests are submitted to the database at the same time, the database guarantees that only one of them succeeds (the others fail with: ERROR 1062 (23000): Duplicate entry '1' for key 'uiq_idx_resource'). We can therefore treat the request whose INSERT succeeded as the one that holds the lock.

To release the lock, delete the row:

DELETE FROM database_lock WHERE resource=1;
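
The acquire and release statements above can be wrapped in two small helpers. The following is a minimal sketch rather than a production implementation. It assumes an in-memory SQLite database in place of MySQL so that it runs on its own, and the helper names try_lock and unlock are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the table mirrors
# database_lock from the text (the unique constraint on resource is the key part).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE database_lock ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " resource INTEGER NOT NULL UNIQUE,"
    " description TEXT NOT NULL DEFAULT '')"
)


def try_lock(conn, resource, description="lock"):
    """Hypothetical helper: True if the INSERT succeeded, i.e. we hold the lock."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO database_lock(resource, description) VALUES (?, ?)",
                (resource, description),
            )
        return True
    except sqlite3.IntegrityError:
        # Unique constraint violated: another caller already holds the lock.
        return False


def unlock(conn, resource):
    """Hypothetical helper: release the lock by deleting its row."""
    with conn:
        conn.execute("DELETE FROM database_lock WHERE resource = ?", (resource,))


first = try_lock(conn, 1)   # the lock is free
second = try_lock(conn, 1)  # the lock is already held
unlock(conn, 1)
third = try_lock(conn, 1)   # free again after the release
print(first, second, third)  # True False True
```

With MySQL the same shape applies, except that the duplicate-key failure surfaces as error 1062 through whichever driver is in use.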

This implementation is very simple, but note the following:

  1. The lock has no expiry time. If releasing the lock fails, the record stays in the database and no other thread can acquire the lock. This is easy to address, for example with a scheduled task that periodically cleans up stale records.
  2. The lock depends on the reliability of the database. A standby database is recommended to avoid a single point of failure and further improve reliability.
  3. The lock is non-blocking: a failed INSERT returns an error immediately, and the caller has to try again to get the lock. If blocking behavior is needed, retry in a for or while loop and return only once the INSERT succeeds.
  4. The lock is also non-reentrant: the same thread cannot acquire it again before releasing it, because the record already exists in the database. To make it reentrant, add columns such as the host and thread that hold the lock. When acquiring the lock, query the row first; if the host and thread recorded there match the current ones, grant the lock directly.
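
Points 1, 3 and 4 above can be combined in one sketch. Everything beyond the resource column here is an assumption made for illustration: the owner, holds and expires_at columns, the helper names, and the use of in-memory SQLite in place of MySQL. Expired records are cleared inline on each attempt, which stands in for the scheduled cleanup task.

```python
import sqlite3
import time

# Assumption: in-memory SQLite stands in for MySQL. The extra columns are
# hypothetical: owner (host/thread identity), holds (re-entry count),
# expires_at (expiry as a Unix timestamp).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE database_lock ("
    " resource INTEGER NOT NULL UNIQUE,"
    " owner TEXT NOT NULL,"
    " holds INTEGER NOT NULL,"
    " expires_at REAL NOT NULL)"
)


def try_lock(conn, resource, owner, ttl=30.0):
    """One non-blocking attempt; True if owner now holds the lock."""
    now = time.time()
    with conn:
        # Point 1: drop an expired record so a crashed holder cannot block forever.
        conn.execute(
            "DELETE FROM database_lock WHERE resource = ? AND expires_at <= ?",
            (resource, now),
        )
        # Point 4: re-entry by the same owner increments the hold count.
        cur = conn.execute(
            "UPDATE database_lock SET holds = holds + 1, expires_at = ?"
            " WHERE resource = ? AND owner = ?",
            (now + ttl, resource, owner),
        )
        if cur.rowcount == 1:
            return True
        try:
            conn.execute(
                "INSERT INTO database_lock(resource, owner, holds, expires_at)"
                " VALUES (?, ?, 1, ?)",
                (resource, owner, now + ttl),
            )
            return True
        except sqlite3.IntegrityError:
            return False  # held by a different owner


def lock(conn, resource, owner, timeout=1.0, interval=0.05):
    """Point 3: blocking acquire, retrying until success or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if try_lock(conn, resource, owner):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def unlock(conn, resource, owner):
    """Undo one acquire; the row is deleted when the hold count reaches zero."""
    with conn:
        conn.execute(
            "UPDATE database_lock SET holds = holds - 1"
            " WHERE resource = ? AND owner = ?",
            (resource, owner),
        )
        conn.execute(
            "DELETE FROM database_lock"
            " WHERE resource = ? AND owner = ? AND holds <= 0",
            (resource, owner),
        )


a1 = try_lock(conn, 1, "host-a/thread-1")            # free: acquired
a2 = try_lock(conn, 1, "host-a/thread-1")            # same owner: re-entered
b1 = lock(conn, 1, "host-b/thread-7", timeout=0.2)   # other owner: times out
unlock(conn, 1, "host-a/thread-1")
unlock(conn, 1, "host-a/thread-1")
b2 = try_lock(conn, 1, "host-b/thread-7")            # released: acquired
print(a1, a2, b1, b2)  # True True False True
```

The owner string would need to be unique per host and thread in a real deployment, and the expiry time has to be longer than the longest expected critical section, otherwise a slow holder can lose the lock while still working.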

Optimistic locking

As the name suggests, optimistic locking assumes that in most cases updates do not conflict, and it checks for conflicts only when the update is submitted to the database. If the check finds that the data differs from what was expected, a failure is returned.

Optimistic locking is mostly implemented with a data version mechanism. What is a data version? It is a version identifier attached to the data. In a table-based solution this is usually done by adding a "version" column to the table: the version is read together with the data, and each update increments it. During the update the version numbers are compared. If the stored version matches the one that was read, nothing has changed in the meantime and the update succeeds; if it does not match, the update fails.

To show how optimistic locking is used in a real project, here is a typical e-commerce inventory example. An e-commerce platform keeps stock, and each purchase operates on the inventory (inventory minus 1 means one item has been sold). We model this inventory with the table optimistic_lock below:

CREATE TABLE `optimistic_lock` (
	`id` BIGINT NOT NULL AUTO_INCREMENT,
	`resource` int NOT NULL COMMENT 'locked resource',
	`version` int NOT NULL COMMENT 'version',
	`created_at` datetime COMMENT 'creation time',
	`updated_at` datetime COMMENT 'update time',
	`deleted_at` datetime COMMENT 'deletion time',
	PRIMARY KEY (`id`),
	UNIQUE KEY `uiq_idx_resource` (`resource`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='database distributed lock table';

Here id is the primary key; resource is the resource being operated on, which in this example is the inventory count; version is the version number.

Make sure the table contains the corresponding row before using optimistic locking, for example:

INSERT INTO optimistic_lock(resource, version, created_at, updated_at) VALUES(20, 1, NOW(), NOW());

With a single thread, the database alone is enough to guarantee correct operation. The main steps are:

  • STEP1 - read the resource: SELECT resource FROM optimistic_lock WHERE id = 1
  • STEP2 - execute business logic
  • STEP3 - update the resource: UPDATE optimistic_lock SET resource = resource - 1 WHERE id = 1

Under concurrency, however, this produces unexpected results. Suppose two threads buy the same product at the same time, so the inventory (resource) in the database should end up reduced by 2. Under high concurrency, after the first thread has run STEP1 and STEP2 but before it completes STEP3, the second thread runs STEP1 for the same product and reads the inventory before the first decrement has been applied. Any decision made in STEP2 is then based on stale data, and if STEP3 writes back a value computed from that stale read, two purchases reduce the inventory by only 1.

With the version column introduced, the steps become:

  • STEP1 - read the resource: SELECT resource, version FROM optimistic_lock WHERE id = 1
  • STEP2 - execute business logic
  • STEP3 - update the resource: UPDATE optimistic_lock SET resource = resource - 1, version = version + 1 WHERE id = 1 AND version = oldVersion
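
The versioned steps can be sketched as follows. This is an illustration under stated assumptions, not the author's code: in-memory SQLite stands in for MySQL, the table is reduced to the three columns that matter, and the helper names read_stock and buy_one are hypothetical. The point to notice is that success is decided by the number of rows the UPDATE affected.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only id, resource and
# version are kept from the optimistic_lock table in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE optimistic_lock ("
    " id INTEGER PRIMARY KEY,"
    " resource INTEGER NOT NULL,"
    " version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO optimistic_lock(id, resource, version) VALUES (1, 20, 1)")
conn.commit()


def read_stock(conn, row_id):
    """STEP1: read the inventory together with its version."""
    return conn.execute(
        "SELECT resource, version FROM optimistic_lock WHERE id = ?", (row_id,)
    ).fetchone()


def buy_one(conn, row_id, old_version):
    """STEP3: decrement only if the version is still the one we read."""
    with conn:
        cur = conn.execute(
            "UPDATE optimistic_lock"
            " SET resource = resource - 1, version = version + 1"
            " WHERE id = ? AND version = ?",
            (row_id, old_version),
        )
    return cur.rowcount == 1  # 0 rows affected means someone else updated first


# Two buyers read the same snapshot before either one updates.
_, version_a = read_stock(conn, 1)
_, version_b = read_stock(conn, 1)

ok_a = buy_one(conn, 1, version_a)  # succeeds, version becomes 2
ok_b = buy_one(conn, 1, version_b)  # fails, version_b is stale

# The losing buyer re-reads and retries.
_, version_b = read_stock(conn, 1)
ok_b_retry = buy_one(conn, 1, version_b)

final = read_stock(conn, 1)
print(ok_a, ok_b, ok_b_retry, final)  # True False True (18, 3)
```

Whether a failed update is retried, as here, or reported back to the user is a business decision; the text above only requires that the failure is detected.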

Optimistic locking can also be implemented with the update timestamp (updated_at), in much the same way as the version column: read the current update time before updating, and when submitting the update, check that the stored update time still equals the one read at the start.

The advantage of optimistic locking is clear: conflict detection does not rely on the database's own locking mechanism, so it does not hurt request performance, and when concurrency is low only a small fraction of requests fail. The disadvantage is that the table design needs extra columns, which adds redundancy to the database. In addition, under high concurrency the version value changes frequently, so a large number of requests fail, which hurts system availability. As the SQL statements above show, all the updates act on the same row. This is a marked disadvantage in special scenarios such as big promotions and flash sales: when a large number of requests compete for the row lock of the same record at once, the database comes under heavy write pressure. Weighing these points, optimistic locking suits scenarios where concurrency is not high and writes are infrequent.

Pessimistic locking

Besides inserting and deleting records in a table, we can also implement a distributed lock with the locks built into the database. Appending FOR UPDATE to a query makes the database take a pessimistic lock, also called an exclusive lock, on the queried rows. Once a record holds a pessimistic lock, other threads cannot take a pessimistic lock on it.

Pessimistic locking is the opposite of optimistic locking: it always assumes the worst case and considers that updates will conflict most of the time.

When using pessimistic locking, we need to pay attention to the lock level. InnoDB sets row locks on the index records it scans, so only when the query explicitly specifies a primary key (or an index) does it lock just the selected rows. Otherwise every row in the table is scanned and locked, which behaves like a table lock (the whole table is locked).

When using pessimistic locking, we must turn off MySQL's autocommit (see the example below), because MySQL runs in autocommit mode by default: as soon as you execute a statement, MySQL commits the result immediately.

mysql> SET AUTOCOMMIT = 0;
Query OK, 0 rows affected (0.00 sec)

After that, use FOR UPDATE to acquire the lock, run the corresponding business logic, and then COMMIT to release the lock.

Let's reuse the earlier database_lock table to show the usage. Suppose thread A needs to acquire the lock and perform some operation. The steps are:

  • STEP1 - acquire the lock: SELECT * FROM database_lock WHERE id = 1 FOR UPDATE;
  • STEP2 - execute business logic.
  • STEP3 - release the lock: COMMIT.

If another thread B runs STEP1 before thread A releases the lock, it blocks until thread A releases the lock and then continues. Note that if thread A holds the lock for too long, thread B gets the following error (the lock wait time is configured with innodb_lock_wait_timeout):

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
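
The acquire, wait, time out, and COMMIT flow can be tried out locally with the sketch below. It is an analogy under a loudly stated assumption: SQLite has no SELECT ... FOR UPDATE, so BEGIN IMMEDIATE, which takes a database-wide write lock, is swapped in for the row lock, and the connection timeout plays the role of innodb_lock_wait_timeout. The two connections stand for threads A and B.

```python
import os
import sqlite3
import tempfile

# Assumption: a temporary SQLite file stands in for MySQL. BEGIN IMMEDIATE
# (a database-wide write lock) replaces SELECT ... FOR UPDATE (a row lock).
path = os.path.join(tempfile.mkdtemp(), "lock_demo.db")


def connect():
    # isolation_level=None: transactions are controlled explicitly with
    # BEGIN/COMMIT. timeout is how long a connection waits for a lock.
    return sqlite3.connect(path, timeout=0.2, isolation_level=None)


conn_a = connect()  # plays thread A
conn_b = connect()  # plays thread B
conn_a.execute(
    "CREATE TABLE database_lock (id INTEGER PRIMARY KEY, resource INTEGER NOT NULL UNIQUE)"
)
conn_a.execute("INSERT INTO database_lock(id, resource) VALUES (1, 1)")

# STEP1 (A): acquire the lock by opening a write transaction.
conn_a.execute("BEGIN IMMEDIATE")

# B tries to acquire while A still holds the lock: it waits, then times out.
try:
    conn_b.execute("BEGIN IMMEDIATE")
    b_timed_out = False
except sqlite3.OperationalError:  # "database is locked"
    b_timed_out = True

# STEP2 (A): business logic would run here.

# STEP3 (A): COMMIT releases the lock.
conn_a.execute("COMMIT")

# B can now acquire the lock.
conn_b.execute("BEGIN IMMEDIATE")
b_acquired_after_commit = True
conn_b.execute("COMMIT")

print(b_timed_out, b_acquired_after_commit)  # True True

conn_a.close()
conn_b.close()
```

With MySQL the equivalent would be two sessions with autocommit off, the first holding SELECT ... FOR UPDATE on the row and the second receiving error 1205 once innodb_lock_wait_timeout elapses.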

The example above queries by primary key and the query matches a row (which triggers a row lock). If no row is found, there is nothing to "lock".

If the query does not specify a primary key (or an index) but still matches rows, it triggers a table lock, for example when STEP1 is changed to the following (description here is just an ordinary column with no index):

SELECT * FROM database_lock WHERE description='lock' FOR UPDATE;

A query whose primary key condition is not specific also triggers a table lock, for example when STEP1 is changed to:

SELECT * FROM database_lock WHERE id>0 FOR UPDATE;

Note that although we can request row-level locking explicitly (by specifying a primary key or an index in the query), MySQL optimizes queries. Even when the condition uses an indexed column, whether the index is actually used to retrieve the data is decided by MySQL comparing the cost of different execution plans. If MySQL judges a full table scan to be more efficient, as with some small tables, it may not use the index, and in that case InnoDB ends up locking the whole table instead of individual rows.

With pessimistic locking, access to each row is exclusive: other requests can access the row only after the transaction that holds it commits, and until then they block waiting for the lock. Pessimistic locking strictly guarantees the safety of data access. The disadvantages are also obvious: every request pays the extra cost of locking, and requests that fail to get the lock block while waiting for it. In a highly concurrent environment a large number of requests can easily pile up, which hurts system availability. In addition, improper use of pessimistic locking can lead to deadlock.

Origin blog.csdn.net/u013256816/article/details/92854794