ZooKeeper Notes: Implementing Cluster Leader Election with zk

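I. Requirements

In a master/slave cluster we assume the hardware is fragile and any machine may go down at any time. When the master dies, a new master has to be chosen from the slaves. With ZooKeeper, this kind of leader election is simple to implement.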

II. Analysis

Leader election involves two questions:

1. Who becomes the leader?

2. When the leader dies, how do the followers find out?

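Start with the first question: who becomes the leader. This can be viewed as contention for a mutex in multithreaded code: there is only one lock, and only one party can hold it. Here a single ZooKeeper node, /leader-info, plays the role of the lock. Every machine in the cluster tries to create this node. Because node creation in ZooKeeper is atomic, exactly one machine succeeds and all the others fail. The machine that succeeds becomes the leader, and the rest become followers. Usually /leader-info also stores some information about the leader so that followers can connect to it for data exchange, command and control, and so on, but that happens after the election and is outside the scope of this article.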
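The "node as a lock" idea can be illustrated without a ZooKeeper server. The following is only a minimal sketch under stated assumptions: the class AtomicCreateSketch and its methods tryCreate and raceForLeader are hypothetical names, and a ConcurrentHashMap stands in for the znode namespace. It is not the ZooKeeper client API; it only shows that an atomic create-if-absent lets exactly one contender win.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the znode namespace (an assumption for illustration, NOT the ZooKeeper API).
class AtomicCreateSketch {

    // Maps a node path to the data written by whoever created it first.
    private final ConcurrentHashMap<String, String> nodes = new ConcurrentHashMap<>();

    // Atomic create-if-absent: returns true only for the single caller that creates the path.
    boolean tryCreate(String path, String data) {
        return nodes.putIfAbsent(path, data) == null;
    }

    // Lets `candidates` threads race for /leader-info and returns how many of them won.
    static int raceForLeader(int candidates) {
        AtomicCreateSketch store = new AtomicCreateSketch();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[candidates];
        for (int i = 0; i < candidates; i++) {
            String name = "node-" + i;
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // hold every thread back so they all contend at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.tryCreate("/leader-info", name)) {
                    winners.incrementAndGet(); // this thread would become the leader
                }
            }, name);
            threads[i].start();
        }
        start.countDown();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for candidates", e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("winners = " + raceForLeader(10));
    }
}
```

However the threads are scheduled, only one tryCreate call can observe an empty slot, so the winner count is always 1. Which thread wins is not deterministic, which matches the election described above.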

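The second question is how the followers learn that the leader has died. ZooKeeper nodes are classified by lifetime as persistent or ephemeral. An ephemeral node is bound to a session: when the session expires, the ephemeral nodes it created are deleted. This property can be used to detect whether a machine is still alive, and so lets followers notice that the leader has gone offline. All that is needed is to create /leader-info as an ephemeral node and have each follower set a watcher on it for the delete event. When the leader dies, ZooKeeper deletes /leader-info and sends an event notification to every follower. As soon as a follower sees that the leader is gone, it springs into action, sets its own status to looking, and starts a new round of election.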
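The interplay between session-bound nodes and delete watchers can be sketched in plain Java. This is an illustrative in-memory model only: EphemeralNodeSketch, createEphemeral, watchDelete, and closeSession are hypothetical names, not ZooKeeper calls. The model treats a watcher as a one-shot callback, which is an assumption that mirrors how standard ZooKeeper watches behave.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of ephemeral nodes and delete watchers (illustration only, NOT the ZooKeeper API).
class EphemeralNodeSketch {

    // path -> id of the session that owns the node
    private final Map<String, String> owners = new HashMap<>();
    // path -> one-shot callbacks to run when the node is deleted
    private final Map<String, List<Runnable>> deleteWatchers = new HashMap<>();

    // Creates the node for the given session; returns false if the node already exists.
    synchronized boolean createEphemeral(String path, String sessionId) {
        return owners.putIfAbsent(path, sessionId) == null;
    }

    synchronized String ownerOf(String path) {
        return owners.get(path);
    }

    // Registers a one-shot callback for the deletion of `path`.
    synchronized void watchDelete(String path, Runnable callback) {
        deleteWatchers.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    // Ends a session: removes every node it owns, then fires the watchers of those nodes.
    void closeSession(String sessionId) {
        List<Runnable> toFire = new ArrayList<>();
        synchronized (this) {
            owners.entrySet().removeIf(entry -> {
                if (!entry.getValue().equals(sessionId)) {
                    return false;
                }
                List<Runnable> watchers = deleteWatchers.remove(entry.getKey());
                if (watchers != null) {
                    toFire.addAll(watchers);
                }
                return true;
            });
        }
        // Run callbacks outside the lock, after the nodes are already gone.
        toFire.forEach(Runnable::run);
    }

    // Session A is the leader, session B watches it; returns the owner after A's session ends.
    static String demo() {
        EphemeralNodeSketch zk = new EphemeralNodeSketch();
        zk.createEphemeral("/leader-info", "session-A");
        if (!zk.createEphemeral("/leader-info", "session-B")) {
            // B lost the race, so it waits for the node to disappear and then retries.
            zk.watchDelete("/leader-info", () -> zk.createEphemeral("/leader-info", "session-B"));
        }
        zk.closeSession("session-A"); // the leader "dies"
        return zk.ownerOf("/leader-info");
    }

    public static void main(String[] args) {
        System.out.println("new leader session: " + demo());
    }
}
```

In this model the follower's callback runs only after the old node has been removed, so its retry succeeds and session-B ends up owning /leader-info.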

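To summarize the election flow:

1. Every machine in the cluster sets itself to the looking state and gets ready for the election.

2. Every machine in the looking state tries to create the /leader-info node.

3. The machine whose create succeeds changes its status to leader and writes some information about itself into the node. Machines whose create fails set their status to follower and try to read the leader information from /leader-info so they can run whatever logic a leader change requires.

4. When a follower reads the data of /leader-info, it may get a KeeperException.NoNodeException, because the leader died right after becoming leader (or because of network jitter; either way, its session has expired). When a follower catches KeeperException.NoNodeException, the cluster has no leader, so the follower sets its status back to looking and starts a new round of election.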
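The four steps above amount to a small state machine. The sketch below captures just the transitions, with the outcome of the create call and of the follow-up read passed in as booleans. ElectionFlowSketch, nextStatus, and runRounds are hypothetical names introduced for illustration; no ZooKeeper calls are made.

```java
// State transitions of the election flow (illustration only; outcomes are supplied by the caller).
class ElectionFlowSketch {

    enum Status { LOOKING, LEADER, FOLLOWER }

    // Result of one election attempt made while in the LOOKING state.
    static Status nextStatus(boolean createSucceeded, boolean leaderInfoReadable) {
        if (createSucceeded) {
            return Status.LEADER;   // step 3: this machine created /leader-info
        }
        if (leaderInfoReadable) {
            return Status.FOLLOWER; // step 3: someone else is leader and its info can be read
        }
        return Status.LOOKING;      // step 4: the node vanished before it could be read, so retry
    }

    // Replays a sequence of attempts; each row is {createSucceeded, leaderInfoReadable}.
    // Stops as soon as the machine has left the LOOKING state.
    static Status runRounds(boolean[][] rounds) {
        Status status = Status.LOOKING; // step 1
        for (boolean[] round : rounds) {
            if (status != Status.LOOKING) {
                break;
            }
            status = nextStatus(round[0], round[1]); // step 2 plus its outcome
        }
        return status;
    }

    public static void main(String[] args) {
        // First attempt: create fails and the leader disappears before the read; second attempt wins.
        boolean[][] rounds = { {false, false}, {true, false} };
        System.out.println(runRounds(rounds));
    }
}
```

Keeping the transitions separate from the I/O makes the short-lived-leader case in step 4 easy to see: a failed create followed by a failed read leaves the machine in LOOKING, so it simply tries again.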

III. Implementation

Node.java:

package cc11001100.zookeeper.leaderElection;

import cc11001100.zookeeper.utils.ZooKeeperUtil;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

import java.io.IOException;
import java.io.UnsupportedEncodingException;

/**
 * Represents one node in the cluster; an election decides whether it is the leader or a follower
 *
 * @author CC11001100
 */
public class Node {

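	// volatile: written by the ZooKeeper event thread during re-election and read by the node's own thread
	private volatile Status status;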
	private String nodeForLeaderInfo;
	private ZooKeeper zooKeeper;

	public Node(String listenerNodeForLeader) throws IOException {
		this.nodeForLeaderInfo = listenerNodeForLeader;
		this.zooKeeper = ZooKeeperUtil.getZooKeeper();
		lookingForLeader();
	}

	public void lookingForLeader() {
		status = Status.LOOKING;
		try {
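			// Use the current thread's name as the leader info. When re-election is triggered from the
			// watcher callback, this runs on the ZooKeeper client's event thread, so the name ends in "-EventThread".
			String leaderInfo = Thread.currentThread().getName();
			// Encode explicitly as UTF-8 so it matches the decoding done by followers
			zooKeeper.create(nodeForLeaderInfo, leaderInfo.getBytes(java.nio.charset.StandardCharsets.UTF_8), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
			// If the previous call did not throw, this node is now the leader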
			status = Status.LEADER;
			String logMsg = Thread.currentThread().getName() + " is leader";
			System.out.println(logMsg);
		} catch (KeeperException.NodeExistsException e) {
			// The node already exists, so someone else has registered as leader; this node is a follower
			status = Status.FOLLOWER;
			try {
				byte[] leaderInfoBytes = zooKeeper.getData(nodeForLeaderInfo, event -> {
					if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
						lookingForLeader();
					}
				}, null);
				String logMsg = Thread.currentThread().getName() + " is follower, master is " + new String(leaderInfoBytes, "UTF-8");
				System.out.println(logMsg);
			} catch (KeeperException.NoNodeException e1) {
				// A missing node while reading the leader info means the leader was short-lived: it died right after winning
				lookingForLeader();
			} catch (KeeperException | InterruptedException | UnsupportedEncodingException e1) {
				e1.printStackTrace();
			}
		} catch (KeeperException | InterruptedException e) {
			e.printStackTrace();
		}
	}

	public void shutdown() {
		try {
			if (zooKeeper != null) {
				zooKeeper.close();
			}
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}

	public Status getStatus() {
		return status;
	}

	// Role of the current node
	public enum Status {
		LOOKING, // election in progress
		LEADER, // election finished, this node is the leader
		FOLLOWER; // election finished, this node is a follower
	}

}

LeaderElectionTest.java:

package cc11001100.zookeeper.leaderElection;

import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/**
 * @author CC11001100
 */
public class LeaderElectionTest {

	private static void sleep(long mils) {
		try {
			TimeUnit.MILLISECONDS.sleep(mils);
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}

	public static void main(String[] args) throws IOException {

		final String LEADER_INFO_NODE = "/leader-info";
		int nodeNum = 10;
		AtomicLong idGenerator = new AtomicLong();
		AtomicInteger activeNodeCount = new AtomicInteger();
		while (true) {
			if (activeNodeCount.get() >= nodeNum) {
				sleep(10);
				continue;
			}

			// Starting a thread takes some time; treat it as a machine booting, and count the machine as joined before it boots
			activeNodeCount.incrementAndGet();
			new Thread(() -> {
				try {
					Node node = new Node(LEADER_INFO_NODE);
					while (true) {
						sleep(1000);
						// For the sake of the experiment, give the leader a slight suicidal tendency...
						if (node.getStatus() == Node.Status.LEADER && Math.random() < 0.3) {
							String logMsg = "----------------------------- " + Thread.currentThread().getName() + " shutdown -----------------------------";
							System.out.println(logMsg);
							node.shutdown();
							break;
						}
					}
				} catch (IOException e) {
					e.printStackTrace();
				} finally {
					activeNodeCount.decrementAndGet();
				}
			}, "node-" + idGenerator.getAndIncrement()).start();
		}
	}

}

Console output:

...
node-4 is leader
node-3 is follower, master is node-4
node-0 is follower, master is node-4
node-9 is follower, master is node-4
node-7 is follower, master is node-4
node-5 is follower, master is node-4
node-1 is follower, master is node-4
node-6 is follower, master is node-4
node-8 is follower, master is node-4
node-2 is follower, master is node-4
----------------------------- node-4 shutdown -----------------------------
node-0-EventThread is leader
node-6-EventThread is follower, master is node-0-EventThread
node-3-EventThread is follower, master is node-0-EventThread
node-7-EventThread is follower, master is node-0-EventThread
node-1-EventThread is follower, master is node-0-EventThread
node-5-EventThread is follower, master is node-0-EventThread
node-9-EventThread is follower, master is node-0-EventThread
node-2-EventThread is follower, master is node-0-EventThread
node-8-EventThread is follower, master is node-0-EventThread
node-10 is follower, master is node-0-EventThread
----------------------------- node-0 shutdown -----------------------------
node-6-EventThread is leader
node-7-EventThread is follower, master is node-6-EventThread
node-1-EventThread is follower, master is node-6-EventThread
node-3-EventThread is follower, master is node-6-EventThread
node-10-EventThread is follower, master is node-6-EventThread
node-9-EventThread is follower, master is node-6-EventThread
node-5-EventThread is follower, master is node-6-EventThread
node-2-EventThread is follower, master is node-6-EventThread
node-8-EventThread is follower, master is node-6-EventThread
node-11 is follower, master is node-6-EventThread
----------------------------- node-6 shutdown -----------------------------
node-1-EventThread is leader
node-10-EventThread is follower, master is node-1-EventThread
node-7-EventThread is follower, master is node-1-EventThread
node-11-EventThread is follower, master is node-1-EventThread
node-8-EventThread is follower, master is node-1-EventThread
node-5-EventThread is follower, master is node-1-EventThread
node-9-EventThread is follower, master is node-1-EventThread
node-3-EventThread is follower, master is node-1-EventThread
node-2-EventThread is follower, master is node-1-EventThread
node-12 is follower, master is node-1-EventThread
----------------------------- node-1 shutdown -----------------------------
node-3-EventThread is leader
node-12-EventThread is follower, master is node-3-EventThread
node-11-EventThread is follower, master is node-3-EventThread
node-5-EventThread is follower, master is node-3-EventThread
node-7-EventThread is follower, master is node-3-EventThread
node-9-EventThread is follower, master is node-3-EventThread
node-2-EventThread is follower, master is node-3-EventThread
node-10-EventThread is follower, master is node-3-EventThread
node-8-EventThread is follower, master is node-3-EventThread
node-13 is follower, master is node-3-EventThread
----------------------------- node-3 shutdown -----------------------------
node-5-EventThread is leader
node-13-EventThread is follower, master is node-5-EventThread
node-12-EventThread is follower, master is node-5-EventThread
node-7-EventThread is follower, master is node-5-EventThread
node-11-EventThread is follower, master is node-5-EventThread
node-10-EventThread is follower, master is node-5-EventThread
node-9-EventThread is follower, master is node-5-EventThread
node-2-EventThread is follower, master is node-5-EventThread
node-8-EventThread is follower, master is node-5-EventThread
node-14 is follower, master is node-5-EventThread
----------------------------- node-5 shutdown -----------------------------
node-7-EventThread is leader
node-13-EventThread is follower, master is node-7-EventThread
node-12-EventThread is follower, master is node-7-EventThread
node-9-EventThread is follower, master is node-7-EventThread
node-11-EventThread is follower, master is node-7-EventThread
node-14-EventThread is follower, master is node-7-EventThread
node-10-EventThread is follower, master is node-7-EventThread
node-8-EventThread is follower, master is node-7-EventThread
node-2-EventThread is follower, master is node-7-EventThread
node-15 is follower, master is node-7-EventThread
----------------------------- node-7 shutdown -----------------------------
node-14-EventThread is leader
node-13-EventThread is follower, master is node-14-EventThread
node-11-EventThread is follower, master is node-14-EventThread
node-2-EventThread is follower, master is node-14-EventThread
node-12-EventThread is follower, master is node-14-EventThread
node-15-EventThread is follower, master is node-14-EventThread
node-10-EventThread is follower, master is node-14-EventThread
node-9-EventThread is follower, master is node-14-EventThread
node-8-EventThread is follower, master is node-14-EventThread
node-16 is follower, master is node-14-EventThread
----------------------------- node-14 shutdown -----------------------------
node-13-EventThread is leader
node-12-EventThread is follower, master is node-13-EventThread
node-15-EventThread is follower, master is node-13-EventThread
node-9-EventThread is follower, master is node-13-EventThread
node-10-EventThread is follower, master is node-13-EventThread
node-2-EventThread is follower, master is node-13-EventThread
node-8-EventThread is follower, master is node-13-EventThread
node-11-EventThread is follower, master is node-13-EventThread
node-16-EventThread is follower, master is node-13-EventThread
node-17 is follower, master is node-13-EventThread
...


Reposted from www.cnblogs.com/cc11001100/p/10231242.html