Sike Java Synchronization Series: ZooKeeper Distributed Lock

Questions

(1) How does ZooKeeper implement a distributed lock?

(2) What are the advantages of a ZooKeeper distributed lock?

(3) What are the disadvantages of a ZooKeeper distributed lock?

Introduction

ZooKeeper is a distributed, open-source coordination service for distributed applications. It provides consistency services for distributed applications, is an important component of Hadoop and HBase, and can also serve as a configuration center or as a service registry in a microservice system.

In this chapter we describe how to use ZooKeeper to implement a distributed lock in a distributed system.

Basics

What is a znode?

The data ZooKeeper maintains is organized as nodes, called znodes, arranged in a hierarchical tree structure similar to a file system. The data a znode contains is stored as a byte array.

Moreover, if multiple clients try to create the same node at the same time, only one of them will succeed; the other clients' creates will fail. [This article is original content from the public account "Tong brother reads the source code".]
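This create-once behavior is the foundation of the lock. As a rough model in plain Java (no ZooKeeper involved; the ConcurrentHashMap and the `create` helper are only stand-ins for the zk namespace and create call):

```java
import java.util.concurrent.ConcurrentHashMap;

public class CreateOnceDemo {
    // The map stands in for the zk namespace: putIfAbsent succeeds for
    // exactly one caller, just as only one client can create a given znode.
    static final ConcurrentHashMap<String, byte[]> namespace = new ConcurrentHashMap<>();

    static boolean create(String path, byte[] data) {
        // putIfAbsent returns null only for the first creator; every later
        // caller sees the existing value, mirroring zk's NodeExistsException.
        return namespace.putIfAbsent(path, data) == null;
    }

    public static void main(String[] args) {
        System.out.println(create("/locker/user_1", new byte[0])); // true
        System.out.println(create("/locker/user_1", new byte[0])); // false
    }
}
```

Whoever "creates" first wins; everyone else must wait for the node to disappear and try again.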


Node Type

There are four types of znode:

  • Persistent (unordered)

  • Persistent sequential

  • Ephemeral (unordered)

  • Ephemeral sequential

A persistent node exists until it is deleted manually; an ephemeral node is deleted automatically when the client session that created it expires.

What is a watcher?

A watcher (event listener) is a very important feature of ZooKeeper.

ZooKeeper allows users to register watchers on specified nodes; when certain events are triggered, the ZooKeeper server notifies the interested clients. This mechanism is an important feature of ZooKeeper's distributed coordination service.

| KeeperState | EventType | Triggering condition | Explanation | Operation |
| --- | --- | --- | --- | --- |
| SyncConnected(3) | None(-1) | The connection between client and server is successfully established | The client and server are in the connected state | - |
| SyncConnected(3) | NodeCreated(1) | The data node the Watcher monitors is created | Same as above | create /znode |
| SyncConnected(3) | NodeDeleted(2) | The data node the Watcher monitors is deleted | Same as above | delete /znode |
| SyncConnected(3) | NodeDataChanged(3) | The content of the data node the Watcher monitors is changed | Same as above | setData /znode |
| SyncConnected(3) | NodeChildrenChanged(4) | The child list of the data node the Watcher monitors is changed | Same as above | create /child |
| Disconnected(0) | None(-1) | The client is disconnected from the server | The client and server are in the disconnected state | - |
| Expired(-112) | None(-1) | Session timeout | The client session has expired; a SessionExpiredException is usually received as well | - |
| AuthFailed(4) | None(-1) | Usually two cases: 1. the wrong schema is used for a permission check; 2. a SASL permission check fails | An AuthFailedException is usually received as well | - |
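One detail worth remembering: a standard watch fires at most once and must be re-registered to keep listening. A simplified standalone model of this one-shot delivery (plain Java, no ZooKeeper involved; the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OneShotWatchers {
    // Watchers registered per path; like the zk server, we deliver each
    // watcher at most once, removing it as we fire.
    static final Map<String, List<Runnable>> watchers = new HashMap<>();
    static int fired = 0;

    static void watch(String path, Runnable callback) {
        watchers.computeIfAbsent(path, k -> new ArrayList<>()).add(callback);
    }

    static void fireNodeDeleted(String path) {
        List<Runnable> list = watchers.remove(path); // one-shot: deregister first
        if (list != null) list.forEach(Runnable::run);
    }

    public static void main(String[] args) {
        watch("/locker/user_1", () -> fired++);
        fireNodeDeleted("/locker/user_1"); // callback runs once
        fireNodeDeleted("/locker/user_1"); // watcher already consumed: no-op
        System.out.println(fired); // 1
    }
}
```

This is why the lock implementations later in this article re-check state after every wake-up instead of assuming one notification per event of interest.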

Principle Analysis

Option One

As noted above, the same node can only be created once. So, to acquire the lock, check whether the node exists: if it does not, create it; if creation fails (because the node exists), register a listener for the node's delete event. When the lock is released and the node is deleted, the listening clients wake up and compete to create the node again; whoever succeeds acquires the lock, and the others go back to listening on the node.


For example, suppose three clients, client1, client2, and client3, try to acquire the /locker/user_1 lock at the same time. The flow is as follows:

(1) All three try to create the /locker/user_1 node at the same time;

(2) client1 succeeds in creating it and thereby acquires the lock;

(3) client2 and client3 fail to create it, so they listen for the delete event of /locker/user_1;

(4) client1 executes its business logic inside the lock;

(5) client1 releases the lock by deleting the /locker/user_1 node;

(6) client2 and client3 both receive the delete event of /locker/user_1 and wake up;

(7) client2 and client3 try to create /locker/user_1 again at the same time;

(8) back to step (2), and so on;

However, this scheme has a very serious drawback: the thundering herd effect.

Under high concurrency, many clients listen on the same node, so all of them are woken at the same time when the lock is released and then compete again. In the end only one gets the lock and the other clients go back to sleep; waking them accomplished nothing and wastes a great deal of system resources. Is there a better scheme, then? Of course: see Option Two.
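The thundering herd can be demonstrated without ZooKeeper at all. In this sketch (names are illustrative), a CountDownLatch plays the role of the delete event and an AtomicBoolean plays the role of the lock node: every client wakes up, but only one "create" can succeed:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class HerdDemo {
    // Returns {how many clients woke, how many acquired the lock}.
    static int[] race(int clients) {
        CountDownLatch deleteEvent = new CountDownLatch(1); // the watched delete
        AtomicBoolean lockNode = new AtomicBoolean(false);  // the znode slot
        AtomicInteger attempts = new AtomicInteger();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(clients);
        for (int i = 0; i < clients; i++) {
            new Thread(() -> {
                try {
                    deleteEvent.await();                  // all clients wake together
                    attempts.incrementAndGet();
                    if (lockNode.compareAndSet(false, true)) {
                        winners.incrementAndGet();        // only one "create" succeeds
                    }
                } catch (InterruptedException ignored) {
                } finally {
                    done.countDown();
                }
            }).start();
        }
        deleteEvent.countDown();                          // the holder deletes the node
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[]{attempts.get(), winners.get()};
    }

    public static void main(String[] args) {
        int[] r = race(10);
        System.out.println(r[0] + " clients woke, " + r[1] + " acquired the lock");
    }
}
```

Ten wake-ups for one successful acquisition: the other nine context switches are pure waste, and the waste grows with the number of waiting clients.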

Option Two

To solve the thundering herd effect of Option One, we can implement the distributed lock with sequential child nodes; and to avoid the risk of a client that holds the lock suddenly disconnecting, we use ephemeral sequential nodes.


Again, suppose three clients, client1, client2, and client3, try to acquire the /locker/user_1 lock at the same time. The flow is as follows:

(1) All three create ephemeral sequential child nodes under /locker/user_1/ at the same time;

(2) All three creations succeed, producing /locker/user_1/0000000001, /locker/user_1/0000000003, and /locker/user_1/0000000002 respectively;

(3) Each client checks whether the node it created is the smallest child node;

(4) client1 finds that its node is the smallest, so it acquires the lock;

(5) client2 and client3 find that their nodes are not the smallest, so they cannot acquire the lock yet;

(6) client2's node is /locker/user_1/0000000003, so it listens for the delete event of the previous node, /locker/user_1/0000000002;

(7) client3's node is /locker/user_1/0000000002, so it listens for the delete event of the previous node, /locker/user_1/0000000001;

(8) client1 executes its business logic inside the lock;

(9) client1 releases the lock by deleting /locker/user_1/0000000001;

(10) client3 receives the delete event of /locker/user_1/0000000001 and wakes up;

(11) client3 checks again whether its node is now the smallest; it is, so it acquires the lock;

(12) client3 executes its business logic inside the lock;

(13) client3 releases the lock by deleting /locker/user_1/0000000002;

(14) client2 receives the delete event of /locker/user_1/0000000002 and wakes up;

(15) client2 executes its business logic inside the lock;

(16) client2 releases the lock by deleting /locker/user_1/0000000003;

(17) client2 checks whether there are any remaining children under /locker/user_1/; there are none, so it also deletes the /locker/user_1 node;

(18) the flow ends;

Compared with Option One, this scheme wakes only one client each time the lock is released, reducing the cost of thread wake-ups and improving efficiency.
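The heart of Option Two is the "sort the children, then watch your predecessor" decision in steps (3) to (7). A standalone sketch of just that step (the helper below is hypothetical, not the article's actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class WatchTarget {
    // Given the children of the lock node and our own child name, decide
    // what to do: null means we hold the lock; otherwise the returned name
    // is the predecessor node whose delete event we should watch.
    static String previousNode(List<String> children, String mine) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // getChildren() gives no ordering guarantee
        int idx = sorted.indexOf(mine);
        return idx == 0 ? null : sorted.get(idx - 1);
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("0000000003", "0000000001", "0000000002");
        System.out.println(previousNode(children, "0000000001")); // null: client1 holds the lock
        System.out.println(previousNode(children, "0000000003")); // 0000000002: client2 watches it
        System.out.println(previousNode(children, "0000000002")); // 0000000001: client3 watches it
    }
}
```

Because each client watches only its immediate predecessor, a release wakes exactly one waiter, which is precisely what eliminates the thundering herd.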

Implementation with the native ZooKeeper API

pom file

Introduce the jar package in the pom:

<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.5.5</version>
</dependency>

The Locker interface

Define a Locker interface, the same one used in the previous chapter on the MySQL distributed lock.

public interface Locker {
    void lock(String key, Runnable command);
}

The ZooKeeper distributed lock implementation

The ZooKeeper-related operations here are handled internally by ZkLockerWatcher. Note the following points:

(1) No operation can be performed before the zk connection is fully established, or a ConnectionLoss exception is thrown. Here we block the connecting thread with LockSupport.park() and wake it up from the watcher thread;

(2) The client thread and the watcher thread are different threads, which is why LockSupport.park() and LockSupport.unpark(thread) can be used to coordinate them;

(3) Many of the intermediate steps are not atomic (a pitfall), so re-checking is needed in several places; see the comments in the code;
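Points (1) and (2) rest on LockSupport's permit semantics: an unpark "banks" one permit even if it arrives before the park, so the wake-up is not lost when the watcher fires early. A minimal standalone sketch (thread names and the helper method are illustrative):

```java
import java.util.concurrent.locks.LockSupport;

public class ParkDemo {
    // The lock-acquiring thread parks; the "watcher" thread unparks it.
    static String parkAndWake() {
        StringBuilder log = new StringBuilder();
        Thread client = new Thread(() -> {
            LockSupport.park();   // blocks, unless a permit is already pending
            log.append("woken");
        });
        client.start();
        LockSupport.unpark(client); // the watcher-callback side; safe even if it runs first
        try {
            client.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(parkAndWake()); // prints "woken"
    }
}
```

The full locker below uses exactly this pattern between the thread calling getLock() and the watcher callback in process().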

@Slf4j
@Component
public class ZkLocker implements Locker {
    @Override
    public void lock(String key, Runnable command) {
        ZkLockerWatcher watcher = ZkLockerWatcher.conn(key);
        try {
            if (watcher.getLock()) {
                command.run();
            }
        } finally {
            watcher.releaseLock();
        }
    }

    private static class ZkLockerWatcher implements Watcher {
        public static final String connAddr = "127.0.0.1:2181";
        public static final int timeout = 6000;
        public static final String LOCKER_ROOT = "/locker";

        ZooKeeper zooKeeper;
        String parentLockPath;
        String childLockPath;
        Thread thread;

        public static ZkLockerWatcher conn(String key) {
            ZkLockerWatcher watcher = new ZkLockerWatcher();
            try {
                ZooKeeper zooKeeper = watcher.zooKeeper = new ZooKeeper(connAddr, timeout, watcher);
                watcher.thread = Thread.currentThread();
                // Block until the connection is established
                LockSupport.park();
                // Create the root node if it does not exist (race: if two threads both find it missing and both create it, one must fail)
                if (zooKeeper.exists(LOCKER_ROOT, false) == null) {
                    try {
                        zooKeeper.create(LOCKER_ROOT, "".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                    } catch (KeeperException e) {
                        // If the node already exists the create fails; catch the exception here so the program keeps running
                        log.info("failed to create node {}", LOCKER_ROOT);
                    }
                }
                // Does the node for the current lock exist yet?
                watcher.parentLockPath = LOCKER_ROOT + "/" + key;
                if (zooKeeper.exists(watcher.parentLockPath, false) == null) {
                    try {
                        zooKeeper.create(watcher.parentLockPath, "".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                    } catch (KeeperException e) {
                        // If the node already exists the create fails; catch the exception here so the program keeps running
                        log.info("failed to create node {}", watcher.parentLockPath);
                    }
                }

            } catch (Exception e) {
                log.error("conn to zk error", e);
                throw new RuntimeException("conn to zk error");
            }
            return watcher;
        }

        public boolean getLock() {
            try {
                // Create an ephemeral sequential child node
                this.childLockPath = zooKeeper.create(parentLockPath + "/", "".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
                // Check whether our node is the smallest: if so the lock is acquired; if not, watch the previous node
                return getLockOrWatchLast();
            } catch (Exception e) {
                log.error("get lock error", e);
                throw new RuntimeException("get lock error");
            } finally {
//                System.out.println("getLock: " + childLockPath);
            }
        }

        public void releaseLock() {
            try {
                if (childLockPath != null) {
                    // Release the lock: delete our node
                    zooKeeper.delete(childLockPath, -1);
                }
                // The last client to release also deletes the parent lock node
                List<String> children = zooKeeper.getChildren(parentLockPath, false);
                if (children.isEmpty()) {
                    try {
                        zooKeeper.delete(parentLockPath, -1);
                    } catch (KeeperException e) {
                        // If a new child node was added just before the delete, the delete fails
                        log.info("failed to delete node {}", parentLockPath);
                    }
                }
                // Close the zk connection
                if (zooKeeper != null) {
                    zooKeeper.close();
                }
            } catch (Exception e) {
                log.error("release lock error", e);
                throw new RuntimeException("release lock error");
            } finally {
//                System.out.println("releaseLock: " + childLockPath);
            }
        }

        private boolean getLockOrWatchLast() throws KeeperException, InterruptedException {
            List<String> children = zooKeeper.getChildren(parentLockPath, false);
            // Must sort: the children may come back in arbitrary order
            Collections.sort(children);
            // If our node is the first child, the lock is acquired
            if ((parentLockPath + "/" + children.get(0)).equals(childLockPath)) {
                return true;
            }

            // If not the first child, watch the node immediately before ours
            String last = "";
            for (String child : children) {
                if ((parentLockPath + "/" + child).equals(childLockPath)) {
                    break;
                }
                last = child;
            }

            if (zooKeeper.exists(parentLockPath + "/" + last, true) != null) {
                this.thread = Thread.currentThread();
                // Block the current thread
                LockSupport.park();
                // After waking, re-check whether we are the smallest node, because the previous node may have disconnected rather than released the lock
                return getLockOrWatchLast();
            } else {
                // If the previous node no longer exists, it was released before we managed to watch it; check again
                return getLockOrWatchLast();
            }
        }

        @Override
        public void process(WatchedEvent event) {
            if (this.thread != null) {
                // Wake the blocked thread (this runs on the watcher thread, which is not the thread acquiring the lock)
                LockSupport.unpark(this.thread);
                this.thread = null;
            }
        }
    }
}

Test code

Here we start two groups of threads: one group acquires the user_1 lock, and the other acquires the user_2 lock.

@RunWith(SpringRunner.class)
@SpringBootTest(classes = Application.class)
public class ZkLockerTest {

    @Autowired
    private Locker locker;

    @Test
    public void testZkLocker() throws IOException {
        for (int i = 0; i < 1000; i++) {
            new Thread(()->{
                locker.lock("user_1", ()-> {
                    try {
                        System.out.println(String.format("user_1 time: %d, threadName: %s", System.currentTimeMillis(), Thread.currentThread().getName()));
                        Thread.sleep(500);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                });
            }, "Thread-"+i).start();
        }
        for (int i = 1000; i < 2000; i++) {
            new Thread(()->{
                locker.lock("user_2", ()-> {
                    try {
                        System.out.println(String.format("user_2 time: %d, threadName: %s", System.currentTimeMillis(), Thread.currentThread().getName()));
                        Thread.sleep(500);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                });
            }, "Thread-"+i).start();
        }

        System.in.read();
    }
}

Run result:

From the output we can see that the two locks each print steadily at intervals of about 500 ms.

user_1 time: 1568973299578, threadName: Thread-10
user_2 time: 1568973299579, threadName: Thread-1780
user_1 time: 1568973300091, threadName: Thread-887
user_2 time: 1568973300091, threadName: Thread-1542
user_1 time: 1568973300594, threadName: Thread-882
user_2 time: 1568973300594, threadName: Thread-1539
user_2 time: 1568973301098, threadName: Thread-1592
user_1 time: 1568973301098, threadName: Thread-799
user_1 time: 1568973301601, threadName: Thread-444
user_2 time: 1568973301601, threadName: Thread-1096
user_1 time: 1568973302104, threadName: Thread-908
user_2 time: 1568973302104, threadName: Thread-1574
user_2 time: 1568973302607, threadName: Thread-1515
user_1 time: 1568973302607, threadName: Thread-80
user_1 time: 1568973303110, threadName: Thread-274
user_2 time: 1568973303110, threadName: Thread-1774
user_1 time: 1568973303615, threadName: Thread-324
user_2 time: 1568973303615, threadName: Thread-1621

Curator implementation

The native-API implementation above makes the logic of a ZooKeeper distributed lock easy to follow, but it is hard to guarantee that it is free of problems; for example, the lock is not reentrant, and read-write locks are not supported.

Now let's look at how an existing wheel, Curator, implements these.

pom file

Introduce the following jar packages in the pom file:

<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>4.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-framework</artifactId>
    <version>4.0.0</version>
</dependency>

Code

The following is an implementation based on the mutex lock (InterProcessMutex):

@Component
@Slf4j
public class ZkCuratorLocker implements Locker {
    public static final String connAddr = "127.0.0.1:2181";
    public static final int timeout = 6000;
    public static final String LOCKER_ROOT = "/locker";

    private CuratorFramework cf;

    @PostConstruct
    public void init() {
        this.cf = CuratorFrameworkFactory.builder()
                .connectString(connAddr)
                .sessionTimeoutMs(timeout)
                .retryPolicy(new ExponentialBackoffRetry(1000, 3))
                .build();

        cf.start();
    }

    @Override
    public void lock(String key, Runnable command) {
        String path = LOCKER_ROOT + "/" + key;
        InterProcessLock lock = new InterProcessMutex(cf, path);
        try {
            // Acquire the lock; blocks until it is available
            lock.acquire();
            command.run();
        } catch (Exception e) {
            log.error("get lock error", e);
            throw new RuntimeException("get lock error", e);
        } finally {
            try {
                lock.release();
            } catch (Exception e) {
                log.error("release lock error", e);
                throw new RuntimeException("release lock error", e);
            }
        }
    }
}

Besides the mutex, Curator also provides implementations of read-write locks, multi-locks, and semaphores, and its locks are reentrant.

Summary

(1) ZooKeeper has four node types: persistent, persistent sequential, ephemeral, and ephemeral sequential;

(2) ZooKeeper provides a very important feature, the watcher mechanism, which can be used to monitor changes to nodes;

(3) The ZooKeeper distributed lock is built on ephemeral sequential nodes and the watcher mechanism;

(4) To lock, the ZooKeeper distributed lock creates an ephemeral sequential node under the lock path;

(5) If the client's node is the first child, it acquires the lock;

(6) If not, the client watches the previous node and blocks the current thread;

(7) When the delete event of the previous node arrives, the blocked thread wakes up and checks again whether its node is now the first child;

(8) Ephemeral sequential nodes are used instead of persistent sequential nodes so that the lock is released automatically when a client disconnects unexpectedly;

Bonus

What are the advantages of a ZooKeeper distributed lock?

A: (1) ZooKeeper itself can be deployed as a cluster, which is more reliable than a single-point MySQL;

(2) It does not occupy MySQL connections and adds no pressure on MySQL;

(3) The watcher mechanism reduces the number of thread context switches;

(4) The lock is released automatically when a client disconnects, which is very safe;

(5) There is a ready-made wheel, Curator, available;

(6) Curator's locks are reentrant, so the cost of migrating existing code is small;

What are the disadvantages of a ZooKeeper distributed lock?

A: (1) Locking "writes" to ZooKeeper frequently, which adds pressure on ZooKeeper;

(2) Writes to ZooKeeper are synchronized across the cluster; the more nodes there are, the slower the synchronization and the slower lock acquisition becomes;

(3) It introduces an extra dependency on ZooKeeper, which most services do not otherwise use, increasing system complexity;

(4) Performance is slightly worse than that of a Redis distributed lock;

Recommended Reading

1. Sike Java synchronization series: opening chapter

2. Sike Java magic class Unsafe analysis

3. Sike Java synchronization series: JMM (Java Memory Model)

4. Sike Java synchronization series: volatile analysis

5. Sike Java synchronization series: synchronized analysis

6. Sike Java synchronization series: write your own lock with Lock

7. Sike Java synchronization series: starting from AQS

8. Sike Java synchronization series: ReentrantLock source analysis (part one) - fair and unfair locks

9. Sike Java synchronization series: ReentrantLock source analysis (part two) - condition lock

10. Sike Java synchronization series: ReentrantLock VS synchronized

11. Sike Java synchronization series: ReentrantReadWriteLock source analysis

12. Sike Java synchronization series: Semaphore source analysis

13. Sike Java synchronization series: CountDownLatch source analysis

14. Sike Java synchronization series: the final chapter of AQS

15. Sike Java synchronization series: StampedLock source analysis

16. Sike Java synchronization series: CyclicBarrier source analysis

17. Sike Java synchronization series: Phaser source analysis

18. Sike Java synchronization series: MySQL distributed lock


Welcome to follow the public account "Tong brother reads the source code" for more source-code series, and swim in the ocean of source code together with Tong brother.



Origin www.cnblogs.com/tong-yuan/p/11619006.html