A simple introduction to distributed locks and three implementation methods

Many people learning Java feel that multithreading is rarely used in real business code, so they do not spend much time on it, and the technical debt keeps accumulating. To be fair, Java multithreading is not easy to understand. Today's topic, distributed locks, is closely related to it, so let's get started!

If you have studied Java multithreading, you already know what a lock is; if not, don't worry. A lock in Java can be understood simply as a synchronization mechanism that controls how multiple threads access a critical resource.

While learning or using Java, you will run into many lock concepts: fair locks, unfair locks, spin locks, reentrant locks, biased locks, lightweight locks, heavyweight locks, read-write locks, mutex locks, and so on.

Feeling overwhelmed? It doesn't matter if you don't know any of these, because they have little to do with today's discussion. If you do want to learn them, the 19-article series "Java Multithreading Core Technology" may help; there is a free version and a paid version.

1. Why use distributed locks

When we develop an application and need to synchronize multi-threaded access to a shared variable, the Java multithreading tools we have learned are enough to handle it correctly.

Note that this applies to a stand-alone application: all requests are handled inside the JVM of a single server and mapped to operating system threads for processing, and the shared variable is just a piece of memory inside that JVM.

Later, as the business grows, the application needs to be clustered: it is deployed on several machines behind a load balancer, as shown in the figure below:

[Figure: one application deployed on three load-balanced machines, with variable A present in JVM1, JVM2, and JVM3]

As the figure above shows, variable A exists in the memory of each of the three JVMs: JVM1, JVM2, and JVM3. (Variable A is typically a member variable of a stateful object, for example an integer member variable of a UserController class.) Without any control, each JVM allocates its own memory for variable A. If three requests operate on this variable at the same time, the result is obviously wrong. Even if they do not arrive at the same time, the three requests operate on three different JVM memory areas; the copies of variable A are neither shared nor visible to one another, so the result is still wrong.

If this scenario does exist in our business, we need a way to solve this problem!

To ensure that a method or property is accessed by only one thread at a time under high concurrency, a traditional monolithic application deployed on a single machine can use Java's concurrency APIs (such as ReentrantLock or the synchronized keyword) for mutual exclusion. After the system evolves from single-machine deployment into a distributed cluster, however, its threads run in multiple processes on different machines, so the single-machine locking strategy no longer works: these Java APIs only coordinate threads inside one JVM and cannot provide a distributed lock. Solving this requires a mutual exclusion mechanism that works across JVMs to control access to shared resources, and that is the problem distributed locks address.
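As a minimal sketch of the single-JVM case, the following guards a counter with ReentrantLock. The classes LocalCounter and LocalCounterDemo and the helper runThreads are hypothetical names introduced here for illustration. Two separate LocalCounter instances stand in for two JVMs: each lock protects only its own copy of the value, so neither instance sees the other's updates.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: a counter guarded by a lock that lives in one JVM.
class LocalCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    LocalCounter(int initial) {
        this.value = initial;
    }

    // Only one thread of THIS JVM can be inside the critical section at a time.
    int decrement() {
        lock.lock();
        try {
            return --value;
        } finally {
            lock.unlock();
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Starts `threads` threads that each decrement once, waits for all of them,
    // and returns the final value.
    static int runThreads(LocalCounter counter, int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(counter::decrement);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }
}

class LocalCounterDemo {
    public static void main(String[] args) {
        // One JVM: all 50 decrements are applied to the same memory.
        System.out.println(LocalCounter.runThreads(new LocalCounter(500), 50)); // 450

        // Two instances stand in for two JVMs: each sees only its own 25 requests.
        LocalCounter jvm1 = new LocalCounter(500);
        LocalCounter jvm2 = new LocalCounter(500);
        System.out.println(LocalCounter.runThreads(jvm1, 25)); // 475
        System.out.println(LocalCounter.runThreads(jvm2, 25)); // 475
    }
}
```

With 50 requests split across two instances, each copy ends at 475 instead of a shared 450, even though every individual decrement was correctly locked.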

2. What conditions should a distributed lock have

Before analyzing the three implementations of distributed locks, let's first understand what conditions a distributed lock should have:

1. In a distributed environment, a method can be executed by only one thread on one machine at a time;
2. Lock acquisition and release are highly available;
3. Lock acquisition and release are high-performance;
4. It is reentrant;
5. It has a lock expiration mechanism to prevent deadlock;
6. It supports non-blocking acquisition: if the lock cannot be acquired, the call returns failure immediately.

3. Three implementations of distributed locks

Today most large websites and applications are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that no distributed system can satisfy Consistency, Availability, and Partition tolerance at the same time; at most two of the three can be satisfied together. Many systems therefore have to choose among the three at design time. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high availability, and the system only needs to guarantee "eventual consistency", as long as the time to reach it stays within a range acceptable to users.

In many scenarios, ensuring eventual consistency requires supporting techniques such as distributed transactions and distributed locks. Sometimes we need to ensure that a method is executed by only one thread at a time. There are three common ways to implement such a lock:

1. Implement the distributed lock based on a database;
2. Implement the distributed lock based on a cache (Redis, etc.);
3. Implement the distributed lock based on ZooKeeper.

Whichever of these three schemes you use, choose according to your own business; none of them is the best, only more or less suitable.

4. Database-based implementation

The core idea of the database-based implementation is: create a table that contains fields such as the method name, and create a unique index on the method name field. To execute a method, insert a row with that method name into the table; a successful insert means the lock is acquired, and deleting the row after execution releases the lock.

(1) Create a table:

DROP TABLE IF EXISTS `method_lock`;
CREATE TABLE `method_lock` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL COMMENT 'name of the locked method',
  `desc` varchar(255) NOT NULL COMMENT 'remarks',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COMMENT='methods currently locked';

(2) To execute a method, use the method name to insert data into the table:

INSERT INTO method_lock (method_name, `desc`) VALUES ('methodName', 'methodName under test');

Because method_name has a unique constraint, if multiple requests are submitted to the database at the same time, the database guarantees that only one insert succeeds. We can then consider the thread whose insert succeeded to hold the lock for the method, and it may execute the method body.

(3) The lock is acquired if the insertion is successful, and the corresponding row data is deleted after the execution is completed to release the lock:

delete from method_lock where method_name ='methodName';

Note: This is just one database-based approach; there are many other ways to implement distributed locks with a database.
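The acquire-and-release flow above can be sketched without a real database. In this minimal, hedged sketch the hypothetical class MethodLockTable stands in for the method_lock table: ConcurrentHashMap.putIfAbsent plays the role of the INSERT against the unique index, and remove plays the role of the DELETE. The helper runLocked is also an assumption made for illustration. A real implementation would issue the SQL statements shown above and treat a unique-constraint violation as "lock not acquired".

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the method_lock table; an assumption for illustration,
// not a real database client.
class MethodLockTable {
    // Key = method_name (unique), value = desc.
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    // Mimics the INSERT: succeeds only if no row with this method_name exists.
    boolean insert(String methodName, String desc) {
        return rows.putIfAbsent(methodName, desc) == null;
    }

    // Mimics the DELETE; returns true if a row was actually removed.
    boolean delete(String methodName) {
        return rows.remove(methodName) != null;
    }

    // Runs `body` only if the lock row could be inserted, and always deletes
    // the row afterwards. Returns false if another caller holds the lock.
    boolean runLocked(String methodName, Runnable body) {
        if (!insert(methodName, "locked by " + Thread.currentThread().getName())) {
            return false;
        }
        try {
            body.run();
            return true;
        } finally {
            delete(methodName);
        }
    }
}
```

Calling runLocked for the same method name from inside body returns false, because the row already exists. The sketch is therefore neither reentrant nor blocking, and a caller that dies before the delete leaves the row behind.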

This database-based implementation is very simple, but measured against the conditions a distributed lock should satisfy, it has some problems that need to be solved or optimized:

1. Because it is built on the database, the database's availability and performance directly determine those of the distributed lock. The database therefore needs a dual-machine deployment with data synchronization and active-standby switching;

2. It is not reentrant: the row exists until the lock is released, so the same thread cannot insert it again. A new column is needed to record the machine and thread that currently hold the lock; when acquiring the lock again, first check whether the recorded machine and thread match the current ones, and if they do, grant the lock directly;

3. It has no lock expiration mechanism: if the server goes down after the insert succeeds, the row is never deleted, and the lock cannot be acquired after the service recovers. A new column is needed to record the expiration time, along with a scheduled task that clears expired rows;

4. It is not a blocking lock: if the lock cannot be acquired, the call fails immediately. To wait for the lock, the acquisition logic has to retry in a loop;

5. Other problems come up during implementation, and solving them makes the implementation more and more complex. Relying on the database also has a resource cost, so performance needs to be considered.
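The fixes for items 2 to 4 can be sketched in memory. In this hedged sketch the hypothetical class ReentrantExpiringLockTable keeps one row per method name with an owner, a hold count, and an expiration time; all of these names are assumptions, and the owner string stands for a "machine:thread" identifier. Expired rows are discarded lazily on the next attempt rather than by a scheduled task, which is a simplification. The current time is passed into tryLock so that expiry is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of the extra columns discussed above; an assumption for
// illustration, not a real database client.
class ReentrantExpiringLockTable {
    // One "row": who holds the lock, how many times, and when it expires.
    private static final class Row {
        final String owner;
        int holdCount = 1;
        long expiresAtMillis;

        Row(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    // Single non-blocking attempt.
    synchronized boolean tryLock(String methodName, String owner, long ttlMillis, long nowMillis) {
        Row row = rows.get(methodName);
        if (row != null && row.expiresAtMillis <= nowMillis) {
            rows.remove(methodName); // expired row: treat the lock as abandoned
            row = null;
        }
        if (row == null) {
            rows.put(methodName, new Row(owner, nowMillis + ttlMillis));
            return true;
        }
        if (row.owner.equals(owner)) { // same owner: reentrant acquisition
            row.holdCount++;
            row.expiresAtMillis = nowMillis + ttlMillis;
            return true;
        }
        return false;
    }

    // Releases one hold; the row is deleted when the hold count reaches zero.
    synchronized boolean unlock(String methodName, String owner) {
        Row row = rows.get(methodName);
        if (row == null || !row.owner.equals(owner)) {
            return false;
        }
        if (--row.holdCount == 0) {
            rows.remove(methodName);
        }
        return true;
    }

    // Blocking variant: retries every 10 ms until waitMillis has elapsed.
    boolean lock(String methodName, String owner, long ttlMillis, long waitMillis) {
        long deadline = System.currentTimeMillis() + waitMillis;
        do {
            if (tryLock(methodName, owner, ttlMillis, System.currentTimeMillis())) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } while (System.currentTimeMillis() < deadline);
        return false;
    }
}
```

In a real table, the check and the update inside tryLock would have to be a single atomic database operation (for example a transaction or a conditional UPDATE); the synchronized keyword only plays that role inside this sketch.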

5. Redis-based implementation

1. Reasons for choosing Redis to implement distributed locks:

(1) Redis has high performance;
(2) Redis commands support this use case well, so it is convenient to implement.

2. Commands used:

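(1) SETNX

SETNX key val: if and only if the key does not exist, sets the key to the string val and returns 1; if the key already exists, does nothing and returns 0.

(2) EXPIRE

EXPIRE key timeout: sets a timeout on the key, in seconds; once it elapses the lock is released automatically, which avoids deadlock.

(3) DEL

DEL key: deletes the key.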

When using Redis to implement distributed locks, these three commands are mainly used.

3. Implementation idea:

(1) When acquiring the lock, use SETNX to take it and EXPIRE to give it a timeout, after which the lock is released automatically. The value stored under the lock key is a randomly generated UUID, which is checked when the lock is released.

(2) Acquisition itself also has a timeout: if the lock cannot be acquired within this time, the attempt is abandoned.

(3) When releasing the lock, compare the UUID to decide whether the lock is still the one that was acquired; if it is, execute DEL to release it.
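Why the UUID matters can be shown with a small in-memory sketch. The hypothetical classes InMemoryTokenLock and InMemoryTokenLockDemo are assumptions made for illustration and do not talk to Redis: tryAcquire mirrors the SETNX plus EXPIRE step, and release mirrors the "compare the UUID, then DEL" step. The current time is passed in explicitly so the expiry scenario is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// In-memory stand-in for the Redis keys used by the lock; an assumption for
// illustration, not a Redis client.
class InMemoryTokenLock {
    private static final class Entry {
        final String token;
        final long expiresAtMillis;

        Entry(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> keys = new HashMap<>();

    // Mirrors SETNX + EXPIRE: stores a fresh UUID under the key if the key is
    // absent (or has expired) and returns it; returns null if the lock is held.
    synchronized String tryAcquire(String lockName, long ttlMillis, long nowMillis) {
        Entry current = keys.get(lockName);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return null;
        }
        String token = UUID.randomUUID().toString();
        keys.put(lockName, new Entry(token, nowMillis + ttlMillis));
        return token;
    }

    // Mirrors "compare the UUID, then DEL": deletes the key only if it still
    // holds the caller's token and has not expired.
    synchronized boolean release(String lockName, String token, long nowMillis) {
        Entry current = keys.get(lockName);
        if (current == null || current.expiresAtMillis <= nowMillis
                || !current.token.equals(token)) {
            return false;
        }
        keys.remove(lockName);
        return true;
    }
}

class InMemoryTokenLockDemo {
    public static void main(String[] args) {
        InMemoryTokenLock lock = new InMemoryTokenLock();
        String a = lock.tryAcquire("resource", 1000, 0);    // A takes the lock
        String b = lock.tryAcquire("resource", 1000, 2000); // A's lock expired, B takes it
        System.out.println(lock.release("resource", a, 2100)); // false: no longer A's lock
        System.out.println(lock.release("resource", b, 2200)); // true
    }
}
```

Without the token check, A's late release would delete the lock that B now holds, and a third client could then enter the critical section while B is still working.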

4. Simple implementation code of distributed lock:

import java.util.List;
import java.util.UUID;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.Transaction;
import redis.clients.jedis.exceptions.JedisException;

/**
 * A simple distributed lock implementation.
 * Created by liuyang on 2017/4/20.
 */
public class DistributedLock {

    private final JedisPool jedisPool;

    public DistributedLock(JedisPool jedisPool) {
        this.jedisPool = jedisPool;
    }

    /**
     * Acquire the lock.
     *
     * @param lockName       key of the lock
     * @param acquireTimeout how long to keep trying to acquire the lock, in milliseconds
     * @param timeout        expiration time of the lock, in milliseconds
     * @return the lock identifier, or null if the lock was not acquired
     */
    public String lockWithTimeout(String lockName, long acquireTimeout, long timeout) {
        Jedis conn = null;
        String retIdentifier = null;
        try {
            // Get a connection
            conn = jedisPool.getResource();
            // Randomly generate a value
            String identifier = UUID.randomUUID().toString();
            // Lock name, i.e. the key
            String lockKey = "lock:" + lockName;
            // Expiration in seconds; the lock is released automatically after this time
            int lockExpire = (int) (timeout / 1000);
            // Deadline for acquiring the lock; give up after this time
            long end = System.currentTimeMillis() + acquireTimeout;
            while (System.currentTimeMillis() < end) {
                if (conn.setnx(lockKey, identifier) == 1) {
                    conn.expire(lockKey, lockExpire);
                    // Return the value so it can be checked when the lock is released
                    retIdentifier = identifier;
                    return retIdentifier;
                }
                // A TTL of -1 means the key has no expiration; set one
                if (conn.ttl(lockKey) == -1) {
                    conn.expire(lockKey, lockExpire);
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    // Stop retrying once interrupted
                    break;
                }
            }
        } catch (JedisException e) {
            e.printStackTrace();
        } finally {
            if (conn != null) {
                conn.close();
            }
        }
        return retIdentifier;
    }

    /**
     * Release the lock.
     *
     * @param lockName   key of the lock
     * @param identifier identifier returned when the lock was acquired
     * @return true if the lock was released
     */
    public boolean releaseLock(String lockName, String identifier) {
        Jedis conn = null;
        String lockKey = "lock:" + lockName;
        boolean retFlag = false;
        try {
            conn = jedisPool.getResource();
            while (true) {
                // Watch the lock key before starting the transaction
                conn.watch(lockKey);
                // Use the value returned earlier to check that this is still our lock; if so, delete it
                if (identifier.equals(conn.get(lockKey))) {
                    Transaction transaction = conn.multi();
                    transaction.del(lockKey);
                    List<Object> results = transaction.exec();
                    if (results == null) {
                        // The key changed while being watched; retry
                        continue;
                    }
                    retFlag = true;
                }
                conn.unwatch();
                break;
            }
        } catch (JedisException e) {
            e.printStackTrace();
        } finally {
            if (conn != null) {
                conn.close();
            }
        }
        return retFlag;
    }
}

5. Test the distributed lock just implemented

In the example, 50 threads simulate a flash sale (seckill) of one commodity, using the -- operator to decrement the stock. Whether the lock is working can be seen from whether the printed results are in order.

The class below simulates the flash-sale service. It configures a Jedis connection pool and passes it to the distributed lock during initialization.

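import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

/**
 * Created by liuyang on 2017/4/20.
 */
public class Service {

    private static JedisPool pool = null;

    private DistributedLock lock = new DistributedLock(pool);

    int n = 500;

    static {
        JedisPoolConfig config = new JedisPoolConfig();
        // Maximum number of connections
        config.setMaxTotal(200);
        // Maximum number of idle connections
        config.setMaxIdle(8);
        // Maximum wait time, in milliseconds
        config.setMaxWaitMillis(1000 * 100);
        // Validate each Jedis instance when it is borrowed, so that every borrowed instance is usable
        config.setTestOnBorrow(true);
        pool = new JedisPool(config, "127.0.0.1", 6379, 3000);
    }

    public void seckill() {
        // The returned lock value is checked when the lock is released
        String identifier = lock.lockWithTimeout("resource", 5000, 1000);
        if (identifier == null) {
            // The lock was not acquired within the acquisition timeout
            System.out.println(Thread.currentThread().getName() + " failed to acquire the lock");
            return;
        }
        System.out.println(Thread.currentThread().getName() + " acquired the lock");
        System.out.println(--n);
        lock.releaseLock("resource", identifier);
    }
}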

Threads that simulate calls to the flash-sale service:

public class ThreadA extends Thread {

    private Service service;

    public ThreadA(Service service) {
        this.service = service;
    }

    @Override
    public void run() {
        service.seckill();
    }
}

public class Test {

    public static void main(String[] args) {
        Service service = new Service();
        for (int i = 0; i < 50; i++) {
            ThreadA threadA = new ThreadA(service);
            threadA.start();
        }
    }
}

The results are as follows; the values are printed in order:

[Figure: console output with the remaining count decreasing in order]

If you comment out the part that uses locks:

public void seckill() {
    // The returned lock value is checked when the lock is released
    // String identifier = lock.lockWithTimeout("resource", 5000, 1000);
    System.out.println(Thread.currentThread().getName() + " acquired the lock");
    System.out.println(--n);
    // lock.releaseLock("resource", identifier);
}

As you can see from the results, some of the operations interleave and the output is no longer in order:

[Figure: console output with the remaining count out of order]

6. Implementation based on ZooKeeper

ZooKeeper is an open source component that provides consistency services for distributed applications. Internally it is a hierarchical directory tree, similar to a file system, in which node names within the same directory must be unique. The steps to implement a distributed lock based on ZooKeeper are as follows:

(1) Create a directory mylock;
(2) To acquire the lock, thread A creates an ephemeral sequential node under the mylock directory;
(3) Thread A gets all child nodes under mylock and looks for a sibling node smaller than its own; if none exists, its sequence number is the smallest and it holds the lock;
(4) Thread B gets all the nodes, finds that its own is not the smallest, and sets a watch on the node immediately smaller than its own;
(5) When thread A finishes, it deletes its own node; thread B receives the change event, checks whether its node is now the smallest, and if so acquires the lock.
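The ordering logic in steps (2) to (5) can be sketched with a sorted set. This is a hedged, in-memory sketch: the hypothetical class SequenceLockDirectory stands in for the children of the mylock directory and is not a ZooKeeper client. Watches are represented only by returning the node a waiter would watch; no events are delivered.

```java
import java.util.TreeSet;

// In-memory stand-in for the children of the mylock directory; an assumption
// for illustration, not a ZooKeeper client.
class SequenceLockDirectory {
    private final TreeSet<Long> children = new TreeSet<>();
    private long nextSequence = 0;

    // Step (2): create a sequential node and return its sequence number.
    synchronized long createSequentialNode() {
        long sequence = nextSequence++;
        children.add(sequence);
        return sequence;
    }

    // Step (3): a node holds the lock when it is the smallest child.
    synchronized boolean holdsLock(long node) {
        return !children.isEmpty() && children.first() == node;
    }

    // Step (4): the node to watch is the next smaller sibling, or null if the
    // given node is already the smallest.
    synchronized Long nodeToWatch(long node) {
        return children.lower(node);
    }

    // Step (5): deleting the node releases the lock (or abandons the wait).
    synchronized void deleteNode(long node) {
        children.remove(node);
    }
}

class SequenceLockDirectoryDemo {
    public static void main(String[] args) {
        SequenceLockDirectory mylock = new SequenceLockDirectory();
        long a = mylock.createSequentialNode(); // thread A
        long b = mylock.createSequentialNode(); // thread B
        System.out.println(mylock.holdsLock(a));   // true
        System.out.println(mylock.holdsLock(b));   // false
        System.out.println(mylock.nodeToWatch(b)); // 0 (A's node)
        mylock.deleteNode(a);                      // A finishes and releases
        System.out.println(mylock.holdsLock(b));   // true
    }
}
```

Because each waiter watches only the node immediately before its own, a release wakes a single waiter instead of every client waiting on the lock.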

The Apache open source library Curator, a ZooKeeper client, is recommended here. Its InterProcessMutex is a distributed lock implementation: the acquire method acquires the lock and the release method releases it.

Advantages: It is highly available, reentrant, and blocking, and it avoids the deadlock problem, because an ephemeral node is removed when the session of the client that created it ends.

Disadvantages: Because nodes are created and deleted frequently, performance is not as good as the Redis approach.

7. Summary

None of the three implementations above is perfect for every situation, so choose the one that best fits your application scenario.

In a distributed environment it is sometimes important to lock a resource, for example in a flash sale. In such cases a distributed lock controls access to the resource well.
Of course, real use involves many factors, such as the choice of lock expiration time and acquisition timeout, which have a large impact on how much concurrency the system can sustain. The distributed lock implemented above is only a simple implementation meant to convey the idea; the code and text here may not be suitable for a production environment and are for introductory reference only.

Reference article:

1. https://yq.aliyun.com/articles/60663

2. http://www.hollischuang.com/archives/1716

3. https://www.cnblogs.com/liuyang0/p/6744076.html
