zookeeper源码分析-事件注册及通信原理


title: zookeeper源码分析-事件注册及通信原理

欢迎查看Eetal的第二十六篇博客–zookeeper源码分析-事件注册及通信原理

zookeeper基本命令代码

依赖

  		<groupId>org.apache.zookeeper</groupId>
  		<artifactId>zookeeper</artifactId>
  		<version>3.5.5</version>

demo代码

		final CountDownLatch coutnDownLatch = new CountDownLatch(1);
		ZooKeeper zookeeper = new ZooKeeper("192.168.156.171:2181,192.168.156.171:2182,192.168.156.171:2183",
								3000,
								new Watcher() {

									public void process(WatchedEvent event) {
										//注册默认的watcher,当连接上时,会触发该watcher
										//对该zookeeper实例的节点进行操作,watcher填true使用此默认watcher
										System.out.println("defaultWatcher process eventType : "+event.getPath()+"-"+event.getType());
										coutnDownLatch.countDown();
									}
			
		});
		System.out.println(zookeeper.getState());//connecting---连接中
		coutnDownLatch.await();
		System.out.println(zookeeper.getState());//connected---已连接
		//createMode 模式,PERSISTENT---持久化节点
		Stat stat = new Stat();
		/**
		 * 节点状态,包含version
		 * create、delete、setData语句会更新节点stat
		 * 并且create、delete子节点会更新父节点的stat
		 * 在create、getData语句传入stat,执行后会把最新的stat属性复制到传入的stat对象中
		 * setData、delete语句传入stat的version,通过乐观锁验证版本号保证数据同步,版本不对时抛出BadVersionException(如果version是-1,代表不进行版本验证)
		 * setData、exists语句返回值为最新的stat
		 */
		zookeeper.delete("/java",-1);
		String path = zookeeper.create("/java", "2019-07-29".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT,stat);//返回路径
		System.out.println("path : "+path);
		byte[] byteV = zookeeper.getData(path, null, stat);
		System.out.println("version:"+stat.getVersion()+"-"+new String(byteV));
		/**
		 * 
		 * OPEN_ACL_UNSAFE---完全开放
		 * CREATOR_ALL_ACL---创建者拥有权限
		 * READ_ACL_UNSAFE---开放读权限
		 * ANYONE_ID_UNSAFE---有验证的用户就不被全部权限
		 * AUTH_IDS---只对指定用户开放
		 */
		stat = zookeeper.setData(path, "2019-07-30".getBytes(), stat.getVersion());
		zookeeper.getData(path, true, stat);//true代表注册watcher,使用默认watcher即创建Zookeeper实例时注册的watcher,只响应一次事件
		zookeeper.getData(path, new Watcher() {//注册专用的watcher,只响应一次事件
			public void process(WatchedEvent event) {
			}
		}, stat);
		
		zookeeper.setData(path, "2019-07-30".getBytes(), stat.getVersion());
		byteV = zookeeper.getData(path, null, stat);
		System.out.println("version:"+stat.getVersion()+"-"+new String(byteV)); 

源码分析

Zookeeper对于集群信息提供jmx的监控支持,详情欢迎查看本人另一篇博客——zookeeper源码分析-选举算法
而对于客户端连接使用的是netty做socket的io处理,实现类为ClientCnxnSocketNetty,还有另一个可选的nio实现类是ClientCnxnSocketNIO
Zookeeper构造函数,实例化最终会到下面的构造函数,根据默认配置,创建一个ZKWatchManager对象,将默认wathcer注入该wathcerManager的defaultWatcher成员
将该watcherManager以及包含了集群信息的zookeeper对象传递用于实例化一个ClientCnxn对象,并启动

    protected ZKWatchManager defaultWatchManager() {
        return new ZKWatchManager(getClientConfig().getBoolean(ZKClientConfig.DISABLE_AUTO_WATCH_RESET));
    }
    public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher,
            boolean canBeReadOnly, HostProvider aHostProvider,
            ZKClientConfig clientConfig) throws IOException {

        if (clientConfig == null) {
            clientConfig = new ZKClientConfig();
        }
        this.clientConfig = clientConfig;
        watchManager = defaultWatchManager();
        watchManager.defaultWatcher = watcher;
        ConnectStringParser connectStringParser = new ConnectStringParser(
                connectString);
        hostProvider = aHostProvider;

        cnxn = createConnection(connectStringParser.getChrootPath(),
                hostProvider, sessionTimeout, this, watchManager,
                getClientCnxnSocket(), canBeReadOnly);
        cnxn.start();
    }

    // @VisibleForTesting
    protected ClientCnxn createConnection(String chrootPath,
            HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            boolean canBeReadOnly) throws IOException {
        return new ClientCnxn(chrootPath, hostProvider, sessionTimeout, this,
                watchManager, clientCnxnSocket, canBeReadOnly);
    }

ClientCnxn在实例化时会实例化两个线程成员,作为发送消息的线程以及处理事件的线程
因为zookeeper内部使用netty所以发送消息的响应式异步的,当netty收到响应时,将响应封装为一个event,数据的封装返回在evetThread处理
要发送的包由一个阻塞队列保存,已发送待确认的包由一个链表保存

    private final LinkedList<Packet> pendingQueue = new LinkedList<Packet>();//等待响应的队列

    private final LinkedBlockingDeque<Packet> outgoingQueue = new LinkedBlockingDeque<Packet>();//要发送的队列

    public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
        this.zooKeeper = zooKeeper;
        this.watcher = watcher;
        this.sessionId = sessionId;
        this.sessionPasswd = sessionPasswd;
        this.sessionTimeout = sessionTimeout;
        this.hostProvider = hostProvider;
        this.chrootPath = chrootPath;

        connectTimeout = sessionTimeout / hostProvider.size();
        readTimeout = sessionTimeout * 2 / 3;
        readOnly = canBeReadOnly;

        sendThread = new SendThread(clientCnxnSocket);
        eventThread = new EventThread();
        this.clientConfig=zooKeeper.getClientConfig();
        initRequestTimeout();
    }

    public void start() {
        sendThread.start();
        eventThread.start();
    }

sendThread的处理逻辑为,当服务器状态为活跃时,循环检查计算心跳间隔时间,添加pin包到待发送队列,后面发送以保持连接
每一轮最后调用clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this)去开始传输

 public void run() {
            clientCnxnSocket.introduce(this, sessionId, outgoingQueue);
            clientCnxnSocket.updateNow();
            clientCnxnSocket.updateLastSendAndHeard();
            int to;
            long lastPingRwServer = Time.currentElapsedTime();
            final int MAX_SEND_PING_INTERVAL = 10000; //10 seconds
            InetSocketAddress serverAddress = null;
            while (state.isAlive()) {
                try {
                    if (!clientCnxnSocket.isConnected()) {
                        // don't re-establish connection if we are closing
                        if (closing) {
                            break;
                        }
                        if (rwServerAddress != null) {
                            serverAddress = rwServerAddress;
                            rwServerAddress = null;
                        } else {
                            serverAddress = hostProvider.next(1000);
                        }
                        startConnect(serverAddress);
                        clientCnxnSocket.updateLastSendAndHeard();
                    }

                    if (state.isConnected()) {
                        // determine whether we need to send an AuthFailed event.
                        if (zooKeeperSaslClient != null) {
                            boolean sendAuthEvent = false;
                            if (zooKeeperSaslClient.getSaslState() == ZooKeeperSaslClient.SaslState.INITIAL) {
                                try {
                                    zooKeeperSaslClient.initialize(ClientCnxn.this);
                                } catch (SaslException e) {
                                   LOG.error("SASL authentication with Zookeeper Quorum member failed: " + e);
                                    state = States.AUTH_FAILED;
                                    sendAuthEvent = true;
                                }
                            }
                            KeeperState authState = zooKeeperSaslClient.getKeeperState();
                            if (authState != null) {
                                if (authState == KeeperState.AuthFailed) {
                                    // An authentication error occurred during authentication with the Zookeeper Server.
                                    state = States.AUTH_FAILED;
                                    sendAuthEvent = true;
                                } else {
                                    if (authState == KeeperState.SaslAuthenticated) {
                                        sendAuthEvent = true;
                                    }
                                }
                            }

                            if (sendAuthEvent) {
                                eventThread.queueEvent(new WatchedEvent(
                                      Watcher.Event.EventType.None,
                                      authState,null));
                                if (state == States.AUTH_FAILED) {
                                  eventThread.queueEventOfDeath();
                                }
                            }
                        }
                        to = readTimeout - clientCnxnSocket.getIdleRecv();
                    } else {
                        to = connectTimeout - clientCnxnSocket.getIdleRecv();
                    }
                    
                    if (to <= 0) {
                        String warnInfo;
                        warnInfo = "Client session timed out, have not heard from server in "
                            + clientCnxnSocket.getIdleRecv()
                            + "ms"
                            + " for sessionid 0x"
                            + Long.toHexString(sessionId);
                        LOG.warn(warnInfo);
                        throw new SessionTimeoutException(warnInfo);
                    }
                    if (state.isConnected()) {
                    	//1000(1 second) is to prevent race condition missing to send the second ping
                    	//also make sure not to send too many pings when readTimeout is small 
                        int timeToNextPing = readTimeout / 2 - clientCnxnSocket.getIdleSend() - 
                        		((clientCnxnSocket.getIdleSend() > 1000) ? 1000 : 0);
                        //send a ping request either time is due or no packet sent out within MAX_SEND_PING_INTERVAL
                        if (timeToNextPing <= 0 || clientCnxnSocket.getIdleSend() > MAX_SEND_PING_INTERVAL) {
                            sendPing();
                            clientCnxnSocket.updateLastSend();
                        } else {
                            if (timeToNextPing < to) {
                                to = timeToNextPing;
                            }
                        }
                    }

                    // If we are in read-only mode, seek for read/write server
                    if (state == States.CONNECTEDREADONLY) {
                        long now = Time.currentElapsedTime();
                        int idlePingRwServer = (int) (now - lastPingRwServer);
                        if (idlePingRwServer >= pingRwTimeout) {
                            lastPingRwServer = now;
                            idlePingRwServer = 0;
                            pingRwTimeout =
                                Math.min(2*pingRwTimeout, maxPingRwTimeout);
                            pingRwServer();
                        }
                        to = Math.min(to, pingRwTimeout - idlePingRwServer);
                    }

                    clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this);
                } catch (Throwable e) {
                    if (closing) {
                        if (LOG.isDebugEnabled()) {
                            // closing so this is expected
                            LOG.debug("An exception was thrown while closing send thread for session 0x"
                                    + Long.toHexString(getSessionId())
                                    + " : " + e.getMessage());
                        }
                        break;
                    } else {
                        // this is ugly, you have a better way speak up
                        if (e instanceof SessionExpiredException) {
                            LOG.info(e.getMessage() + ", closing socket connection");
                        } else if (e instanceof SessionTimeoutException) {
                            LOG.info(e.getMessage() + RETRY_CONN_MSG);
                        } else if (e instanceof EndOfStreamException) {
                            LOG.info(e.getMessage() + RETRY_CONN_MSG);
                        } else if (e instanceof RWServerFoundException) {
                            LOG.info(e.getMessage());
                        } else if (e instanceof SocketException) {
                            LOG.info("Socket error occurred: {}: {}", serverAddress, e.getMessage());
                        } else {
                            LOG.warn("Session 0x{} for server {}, unexpected error{}",
                                            Long.toHexString(getSessionId()),
                                            serverAddress,
                                            RETRY_CONN_MSG,
                                            e);
                        }
                        // At this point, there might still be new packets appended to outgoingQueue.
                        // they will be handled in next connection or cleared up if closed.
                        cleanAndNotifyState();
                    }
                }
            }
            synchronized (state) {
                // When it comes to this point, it guarantees that later queued
                // packet to outgoingQueue will be notified of death.
                cleanup();
            }
            clientCnxnSocket.close();
            if (state.isAlive()) {
                eventThread.queueEvent(new WatchedEvent(Event.EventType.None,
                        Event.KeeperState.Disconnected, null));
            }
            eventThread.queueEvent(new WatchedEvent(Event.EventType.None,
                        Event.KeeperState.Closed, null));
            ZooTrace.logTraceMessage(LOG, ZooTrace.getTextTraceLevel(),
                    "SendThread exited loop for session: 0x"
                           + Long.toHexString(getSessionId()));
        }

netty相关的类ClientCnxnSocketNetty中对doTransport的实现为:
从outgoingQueue里取出一个待发送的packet,如果不为空就调用doWrite方法去发送数据包,如果当前连接已断开,将包再次加入待发送队列

	void doTransport(int waitTimeOut,
                     List<Packet> pendingQueue,
                     ClientCnxn cnxn)
            throws IOException, InterruptedException {
        try {
            if (!firstConnect.await(waitTimeOut, TimeUnit.MILLISECONDS)) {
                return;
            }
            Packet head = null;
            if (needSasl.get()) {
                if (!waitSasl.tryAcquire(waitTimeOut, TimeUnit.MILLISECONDS)) {
                    return;
                }
            } else {
                head = outgoingQueue.poll(waitTimeOut, TimeUnit.MILLISECONDS);
            }
            // check if being waken up on closing.
            if (!sendThread.getZkState().isAlive()) {
                // adding back the packet to notify of failure in conLossPacket().
                addBack(head);
                return;
            }
            // channel disconnection happened
            if (disconnected.get()) {
                addBack(head);
                throw new EndOfStreamException("channel for sessionid 0x"
                        + Long.toHexString(sessionId)
                        + " is lost");
            }
            if (head != null) {
                doWrite(pendingQueue, head, cnxn);
            }
        } finally {
            updateNow();
        }
    }

doWrite方法会将除了outgoingQueue里的唤醒包和pin包以及链接登陆的auth包以外的全部包和调用方法带过来的packet加入事件队列
(因为pin包和auth包实际没有需要返回的数据信息response回调用方,pin包无处理,auth包只需要失败时的一个响应,收到auth包的失败信息,添加一个authFailed的事件到事件队列,源码里也是这么做的*在sendThread的readResponse方法)
并且outgoingQueue里的packet,除了唤醒包以外,pin包、auth包以及剩下的packet和调用方法带过来的packet都会被发送

@Override
	private void doWrite(List<Packet> pendingQueue, Packet p, ClientCnxn cnxn) {
        updateNow();
        boolean anyPacketsSent = false;
        while (true) {
            if (p != WakeupPacket.getInstance()) {
                if ((p.requestHeader != null) &&
                        (p.requestHeader.getType() != ZooDefs.OpCode.ping) &&
                        (p.requestHeader.getType() != ZooDefs.OpCode.auth)) {
                    p.requestHeader.setXid(cnxn.getXid());
                    synchronized (pendingQueue) {
                        pendingQueue.add(p);
                    }
                }
                sendPktOnly(p);
                anyPacketsSent = true;
            }
            if (outgoingQueue.isEmpty()) {
                break;
            }
            p = outgoingQueue.remove();
        }
        // TODO: maybe we should flush in the loop above every N packets/bytes?
        // But, how do we determine the right value for N ...
        if (anyPacketsSent) {
            channel.flush();
        }
    }

当netty的channel读到数据时,会调用sendThread的readResponse方法装载响应,最后会调用wakeupCnxn往outgoingQueue里添加唤醒包

	protected void channelRead0(ChannelHandlerContext ctx, ByteBuf buf) throws Exception {
		updateNow();
		while (buf.isReadable()) {
			if (incomingBuffer.remaining() > buf.readableBytes()) {
				int newLimit = incomingBuffer.position()
						+ buf.readableBytes();
				incomingBuffer.limit(newLimit);
			}
			buf.readBytes(incomingBuffer);
			incomingBuffer.limit(incomingBuffer.capacity());

			if (!incomingBuffer.hasRemaining()) {
				incomingBuffer.flip();
				if (incomingBuffer == lenBuffer) {
					recvCount.getAndIncrement();
					readLength();
				} else if (!initialized) {
					readConnectResult();
					lenBuffer.clear();
					incomingBuffer = lenBuffer;
					initialized = true;
					updateLastHeard();
				} else {
					sendThread.readResponse(incomingBuffer);
					lenBuffer.clear();
					incomingBuffer = lenBuffer;
					updateLastHeard();
				}
			}
		}
		wakeupCnxn();
		// Note: SimpleChannelInboundHandler releases the ByteBuf for us
		// so we don't need to do it.
    }
    private void wakeupCnxn() {
        if (needSasl.get()) {
            waitSasl.release();
        }
        outgoingQueue.add(WakeupPacket.getInstance());
    }

sendThread的readResponse方法根据响应的数据报类型做处理
如果是pin包不做处理只做日志,并返回
如果是auth包,做日志处理,判断err码,如果auth失败了,就往事件队列里增加一个type为None,stat为AuthFailed的事件,并返回
如果是主动监听的服务端通知事件,创建一个WatcherEvent反序列化response的信息,并用一个WatchedEvent包装以后,加入事件队列,并返回
如果不满足以上的,先判断如果当前正在进行SASL身份验证,构造和立即发送响应数据包,而不是排队响应其他数据包
剩下的进行普通数据报处理,从pendingQueue的首部取出一个已发送待响应的packet(因为是从头拿,所以是最早发送的带响应packet)
判断取出的packet与读到的ReplyHeader数据的xid是否相等,不相等说明出现丢包,抛出异常
xid相等则将本次数据的response信息反序列化装载到packet的resonse成员
无论是否丢包,最终会调用finishPacket(packet)对packet做善后工作

 void readResponse(ByteBuffer incomingBuffer) throws IOException {
            ByteBufferInputStream bbis = new ByteBufferInputStream(
                    incomingBuffer);
            BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
            ReplyHeader replyHdr = new ReplyHeader();

            replyHdr.deserialize(bbia, "header");
            if (replyHdr.getXid() == -2) {
                // -2 is the xid for pings
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Got ping response for sessionid: 0x"
                            + Long.toHexString(sessionId)
                            + " after "
                            + ((System.nanoTime() - lastPingSentNs) / 1000000)
                            + "ms");
                }
                return;
            }
            if (replyHdr.getXid() == -4) {
                // -4 is the xid for AuthPacket               
                if(replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
                    state = States.AUTH_FAILED;                    
                    eventThread.queueEvent( new WatchedEvent(Watcher.Event.EventType.None, 
                            Watcher.Event.KeeperState.AuthFailed, null) );
                    eventThread.queueEventOfDeath();
                }
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Got auth sessionid:0x"
                            + Long.toHexString(sessionId));
                }
                return;
            }
            if (replyHdr.getXid() == -1) {
                // -1 means notification
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Got notification sessionid:0x"
                        + Long.toHexString(sessionId));
                }
                WatcherEvent event = new WatcherEvent();
                event.deserialize(bbia, "response");

                // convert from a server path to a client path
                if (chrootPath != null) {
                    String serverPath = event.getPath();
                    if(serverPath.compareTo(chrootPath)==0)
                        event.setPath("/");
                    else if (serverPath.length() > chrootPath.length())
                        event.setPath(serverPath.substring(chrootPath.length()));
                    else {
                    	LOG.warn("Got server path " + event.getPath()
                    			+ " which is too short for chroot path "
                    			+ chrootPath);
                    }
                }

                WatchedEvent we = new WatchedEvent(event);
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Got " + we + " for sessionid 0x"
                            + Long.toHexString(sessionId));
                }

                eventThread.queueEvent( we );
                return;
            }

            // If SASL authentication is currently in progress, construct and
            // send a response packet immediately, rather than queuing a
            // response as with other packets.
            if (tunnelAuthInProgress()) {
                GetSASLRequest request = new GetSASLRequest();
                request.deserialize(bbia,"token");
                zooKeeperSaslClient.respondToServer(request.getToken(),
                  ClientCnxn.this);
                return;
            }

            Packet packet;
            synchronized (pendingQueue) {
                if (pendingQueue.size() == 0) {
                    throw new IOException("Nothing in the queue, but got "
                            + replyHdr.getXid());
                }
                packet = pendingQueue.remove();
            }
            /*
             * Since requests are processed in order, we better get a response
             * to the first request!
             */
            try {
                if (packet.requestHeader.getXid() != replyHdr.getXid()) {
                    packet.replyHeader.setErr(
                            KeeperException.Code.CONNECTIONLOSS.intValue());
                    throw new IOException("Xid out of order. Got Xid "
                            + replyHdr.getXid() + " with err " +
                            + replyHdr.getErr() +
                            " expected Xid "
                            + packet.requestHeader.getXid()
                            + " for a packet with details: "
                            + packet );
                }

                packet.replyHeader.setXid(replyHdr.getXid());
                packet.replyHeader.setErr(replyHdr.getErr());
                packet.replyHeader.setZxid(replyHdr.getZxid());
                if (replyHdr.getZxid() > 0) {
                    lastZxid = replyHdr.getZxid();
                }
                if (packet.response != null && replyHdr.getErr() == 0) {
                    packet.response.deserialize(bbia, "response");
                }

                if (LOG.isDebugEnabled()) {
                    LOG.debug("Reading reply sessionid:0x"
                            + Long.toHexString(sessionId) + ", packet:: " + packet);
                }
            } finally {
                finishPacket(packet);
            }
        }

在eventThread的queueEvent方法里,会调用watcher.materialize(event.getState(),event.getType(), event.getPath())识别事件
构建一个WatcherSetEventPair加入事件队列waitingEvents

	private final LinkedBlockingQueue<Object> waitingEvents =
            new LinkedBlockingQueue<Object>();
	public void queueEvent(WatchedEvent event) {
		queueEvent(event, null);
	}
	private void queueEvent(WatchedEvent event,
			Set<Watcher> materializedWatchers) {
		if (event.getType() == EventType.None
				&& sessionState == event.getState()) {
			return;
		}
		sessionState = event.getState();
		final Set<Watcher> watchers;
		if (materializedWatchers == null) {
			// materialize the watchers based on the event
			watchers = watcher.materialize(event.getState(),
					event.getType(), event.getPath());
		} else {
			watchers = new HashSet<Watcher>();
			watchers.addAll(materializedWatchers);
		}
		WatcherSetEventPair pair = new WatcherSetEventPair(watchers, event);
		// queue the pair (watch set & event) for later processing
		waitingEvents.add(pair);
	}

eventThread使用的watcher实际为Zookeeper类传递给ClientCnxn的watcherManager成员
也就是zookeeper本地注册watcher的对象

    protected ClientCnxn createConnection(String chrootPath,
            HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            boolean canBeReadOnly) throws IOException {
        return new ClientCnxn(chrootPath, hostProvider, sessionTimeout, this,
                watchManager, clientCnxnSocket, canBeReadOnly);
    }
    public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket, boolean canBeReadOnly)
            throws IOException {
        this(chrootPath, hostProvider, sessionTimeout, zooKeeper, watcher,
             clientCnxnSocket, 0, new byte[16], canBeReadOnly);
    }
	public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
        this.zooKeeper = zooKeeper;
        this.watcher = watcher;
        this.sessionId = sessionId;
        this.sessionPasswd = sessionPasswd;
        this.sessionTimeout = sessionTimeout;
        this.hostProvider = hostProvider;
        this.chrootPath = chrootPath;

        connectTimeout = sessionTimeout / hostProvider.size();
        readTimeout = sessionTimeout * 2 / 3;
        readOnly = canBeReadOnly;

        sendThread = new SendThread(clientCnxnSocket);
        eventThread = new EventThread();
        this.clientConfig=zooKeeper.getClientConfig();
        initRequestTimeout();
    }

ZkWatcherManager使用三个集合成员存储不同事件类型的watcher,并设置一个默认的watcher

	static class ZKWatchManager implements ClientWatchManager {
        private final Map<String, Set<Watcher>> dataWatches =
            new HashMap<String, Set<Watcher>>();
        private final Map<String, Set<Watcher>> existWatches =
            new HashMap<String, Set<Watcher>>();
        private final Map<String, Set<Watcher>> childWatches =
            new HashMap<String, Set<Watcher>>();
        private boolean disableAutoWatchReset;

        ZKWatchManager(boolean disableAutoWatchReset) {
            this.disableAutoWatchReset = disableAutoWatchReset;
        }

        protected volatile Watcher defaultWatcher;

其materialize方法根据事件类型和路径信息,匹配到对应的watcher事件并返回事件集合,可以看到使用remove方法获得事件集合。这也是为什么注册的事件只响应一次

	public Set<Watcher> materialize(Watcher.Event.KeeperState state,
                                        Watcher.Event.EventType type,
                                        String clientPath)
        {
            Set<Watcher> result = new HashSet<Watcher>();

            switch (type) {
            case None:
                result.add(defaultWatcher);
                boolean clear = disableAutoWatchReset && state != Watcher.Event.KeeperState.SyncConnected;
                synchronized(dataWatches) {
                    for(Set<Watcher> ws: dataWatches.values()) {
                        result.addAll(ws);
                    }
                    if (clear) {
                        dataWatches.clear();
                    }
                }

                synchronized(existWatches) {
                    for(Set<Watcher> ws: existWatches.values()) {
                        result.addAll(ws);
                    }
                    if (clear) {
                        existWatches.clear();
                    }
                }

                synchronized(childWatches) {
                    for(Set<Watcher> ws: childWatches.values()) {
                        result.addAll(ws);
                    }
                    if (clear) {
                        childWatches.clear();
                    }
                }

                return result;
            case NodeDataChanged:
            case NodeCreated:
                synchronized (dataWatches) {
                    addTo(dataWatches.remove(clientPath), result);
                }
                synchronized (existWatches) {
                    addTo(existWatches.remove(clientPath), result);
                }
                break;
            case NodeChildrenChanged:
                synchronized (childWatches) {
                    addTo(childWatches.remove(clientPath), result);
                }
                break;
            case NodeDeleted:
                synchronized (dataWatches) {
                    addTo(dataWatches.remove(clientPath), result);
                }
                // XXX This shouldn't be needed, but just in case
                synchronized (existWatches) {
                    Set<Watcher> list = existWatches.remove(clientPath);
                    if (list != null) {
                        addTo(list, result);
                        LOG.warn("We are triggering an exists watch for delete! Shouldn't happen!");
                    }
                }
                synchronized (childWatches) {
                    addTo(childWatches.remove(clientPath), result);
                }
                break;
            default:
                String msg = "Unhandled watch event type " + type
                    + " with state " + state + " on path " + clientPath;
                LOG.error(msg);
                throw new RuntimeException(msg);
            }

            return result;
        }

说回readResponse最后对数据报的处理,在完成将数据报对应的本地注册事件加入事件队列以后,执行finishPacket方法
在finishPacket中,设置packet的finished标志位true
如果该packet没有绑定异步回调函数,则直接调用packet的notify唤醒(因为是异步发送,前面获得packet以后会陷入阻塞wait)
否则调用eventThread.queuePacket方法对响应后的数据包做进一步处理

	protected void finishPacket(Packet p) {
        int err = p.replyHeader.getErr();
        if (p.watchRegistration != null) {
            p.watchRegistration.register(err);
        }
        // Add all the removed watch events to the event queue, so that the
        // clients will be notified with 'Data/Child WatchRemoved' event type.
        if (p.watchDeregistration != null) {
            Map<EventType, Set<Watcher>> materializedWatchers = null;
            try {
                materializedWatchers = p.watchDeregistration.unregister(err);
                for (Entry<EventType, Set<Watcher>> entry : materializedWatchers
                        .entrySet()) {
                    Set<Watcher> watchers = entry.getValue();
                    if (watchers.size() > 0) {
                        queueEvent(p.watchDeregistration.getClientPath(), err,
                                watchers, entry.getKey());
                        // ignore connectionloss when removing from local
                        // session
                        p.replyHeader.setErr(Code.OK.intValue());
                    }
                }
            } catch (KeeperException.NoWatcherException nwe) {
                p.replyHeader.setErr(nwe.code().intValue());
            } catch (KeeperException ke) {
                p.replyHeader.setErr(ke.code().intValue());
            }
        }

        if (p.cb == null) {
            synchronized (p) {
                p.finished = true;
                p.notifyAll();
            }
        } else {
            p.finished = true;
            eventThread.queuePacket(p);
        }
    }

eventThread的queuePacket方法先判断当前是否已经被要求结束,如果是,判断是否还没有停止运行,没有则将packet加入待处理队列,否则直接调用processEvent处理packet
(因为在设置wasKilled为true后,run方法里会将waitingEvents里事件处理完后才设置停止运行并结束线程)
如果wasKilled不为true,处于运行中,直接将packet加入waitingEvents


	public void queuePacket(Packet packet) {
	  if (wasKilled) {
		 synchronized (waitingEvents) {
			if (isRunning) waitingEvents.add(packet);
			else processEvent(packet);
		 }
	  } else {
		 waitingEvents.add(packet);
	  }
	}

eventThread的run方法,循环从waitingEvents里(事件队列,是个阻塞队列)取出待处理的事件
调用processEvent去处理事件,如果收到结束事件(eventOfDeath),设置wasKilled为true,在处理完waitingEvents的事件后,停止运行(isRunnig设置为false)

	public void run() {
	   try {
		  isRunning = true;
		  while (true) {
			 Object event = waitingEvents.take();
			 if (event == eventOfDeath) {
				wasKilled = true;
			 } else {
				processEvent(event);
			 }
			 if (wasKilled)
				synchronized (waitingEvents) {
				   if (waitingEvents.isEmpty()) {
					  isRunning = false;
					  break;
				   }
				}
		  }
	   } catch (InterruptedException e) {
		  LOG.error("Event thread exiting due to interruption", e);
	   }

		LOG.info("EventThread shut down for session: 0x{}",
				 Long.toHexString(getSessionId()));
	}

在查看processEvent方法前,我们先回过头来看执行各个语句的源码

create

	public String create(final String path, byte data[], List<ACL> acl,
            CreateMode createMode, Stat stat, long ttl)
            throws KeeperException, InterruptedException {
        final String clientPath = path;
        PathUtils.validatePath(clientPath, createMode.isSequential());
        EphemeralType.validateTTL(createMode, ttl);

        final String serverPath = prependChroot(clientPath);

        RequestHeader h = new RequestHeader();
        setCreateHeader(createMode, h);
        Create2Response response = new Create2Response();
        if (acl != null && acl.size() == 0) {
            throw new KeeperException.InvalidACLException();
        }
        Record record = makeCreateRecord(createMode, serverPath, data, acl, ttl);
        ReplyHeader r = cnxn.submitRequest(h, record, response, null);
        if (r.getErr() != 0) {
            throw KeeperException.create(KeeperException.Code.get(r.getErr()),
                    clientPath);
        }
        if (stat != null) {
            DataTree.copyStat(response.getStat(), stat);
        }
        if (cnxn.chrootPath == null) {
            return response.getPath();
        } else {
            return response.getPath().substring(cnxn.chrootPath.length());
        }
    }

setData(带回调)

	public void setData(final String path, byte data[], int version,
            StatCallback cb, Object ctx)
    {
        final String clientPath = path;
        PathUtils.validatePath(clientPath);

        final String serverPath = prependChroot(clientPath);

        RequestHeader h = new RequestHeader();
        h.setType(ZooDefs.OpCode.setData);
        SetDataRequest request = new SetDataRequest();
        request.setPath(serverPath);
        request.setData(data);
        request.setVersion(version);
        SetDataResponse response = new SetDataResponse();
        cnxn.queuePacket(h, new ReplyHeader(), request, response, cb,
                clientPath, serverPath, ctx, null);
    }

getData

	public byte[] getData(final String path, Watcher watcher, Stat stat)
        throws KeeperException, InterruptedException
     {
        final String clientPath = path;
        PathUtils.validatePath(clientPath);

        // the watch contains the un-chroot path
        WatchRegistration wcb = null;
        if (watcher != null) {
            wcb = new DataWatchRegistration(watcher, clientPath);
        }

        final String serverPath = prependChroot(clientPath);

        RequestHeader h = new RequestHeader();
        h.setType(ZooDefs.OpCode.getData);
        GetDataRequest request = new GetDataRequest();
        request.setPath(serverPath);
        request.setWatch(watcher != null);
        GetDataResponse response = new GetDataResponse();
        ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
        if (r.getErr() != 0) {
            throw KeeperException.create(KeeperException.Code.get(r.getErr()),
                    clientPath);
        }
        if (stat != null) {
            DataTree.copyStat(response.getStat(), stat);
        }
        return response.getData();
    }
	.....

可以看见都是先进行下路径的校验,然后生成一个RequestHeader,为请求头设置操作的类型
对于有接受stat参数的,最后会调用DataTree.copyStat(response.getStat(), stat)把最新的stat数据复制给传入的stat对象
根据类型生成不同的Record实现类,record接口提供两个方法,序列化与反序列化,一个Reqeust实现类实例以及一个Response实现类实例
当有回调函数时,将回调函数传入
最终都是调用ClientCnxn的submitRequest方法,获得ReplyHeader(实际的response数据会装载到传入的response对象里)
ReplyHeader只保存了是否出现失败,以及错误码和路径等信息,不保存实际返回的响应数据
submitRequest实际会调用queuePacket方法去发送给服务端,并获得及时响应的packet对象
因为是异步发送,后面会根据是否设置了超时时间选择调用wait阻塞等待或者等待超时设置replyHeader失败并返回

    public ReplyHeader submitRequest(RequestHeader h, Record request,
            Record response, WatchRegistration watchRegistration)
            throws InterruptedException {
        return submitRequest(h, request, response, watchRegistration, null);
    }

    public ReplyHeader submitRequest(RequestHeader h, Record request,
            Record response, WatchRegistration watchRegistration,
            WatchDeregistration watchDeregistration)
            throws InterruptedException {
        ReplyHeader r = new ReplyHeader();
        Packet packet = queuePacket(h, r, request, response, null, null, null,
                null, watchRegistration, watchDeregistration);
        synchronized (packet) {
            if (requestTimeout > 0) {
                // Wait for request completion with timeout
                waitForPacketFinish(r, packet);
            } else {
                // Wait for request completion infinitely
                while (!packet.finished) {
                    packet.wait();
                }
            }
        }
        if (r.getErr() == Code.REQUESTTIMEOUT.intValue()) {
            sendThread.cleanAndNotifyState();
        }
        return r;
    }

queuePacket实际将请求信息封装为一个packet,并将packet加入outgoingQueue队列,由sendThread去处理,返回packet对象

	private final LinkedBlockingDeque<Packet> outgoingQueue = new LinkedBlockingDeque<Packet>();
	public Packet queuePacket(RequestHeader h, ReplyHeader r, Record request,
            Record response, AsyncCallback cb, String clientPath,
            String serverPath, Object ctx, WatchRegistration watchRegistration,
            WatchDeregistration watchDeregistration) {
        Packet packet = null;

        // Note that we do not generate the Xid for the packet yet. It is
        // generated later at send-time, by an implementation of ClientCnxnSocket::doIO(),
        // where the packet is actually sent.
        packet = new Packet(h, r, request, response, watchRegistration);
        packet.cb = cb;
        packet.ctx = ctx;
        packet.clientPath = clientPath;
        packet.serverPath = serverPath;
        packet.watchDeregistration = watchDeregistration;
        // The synchronized block here is for two purpose:
        // 1. synchronize with the final cleanup() in SendThread.run() to avoid race
        // 2. synchronized against each packet. So if a closeSession packet is added,
        // later packet will be notified.
        synchronized (state) {
            if (!state.isAlive() || closing) {
                conLossPacket(packet);
            } else {
                // If the client is asking to close the session then
                // mark as closing
                if (h.getType() == OpCode.closeSession) {
                    closing = true;
                }
                outgoingQueue.add(packet);
            }
        }
        sendThread.getClientCnxnSocket().packetAdded();
        return packet;
    }

processEvent方法根据event的类型,执行不同操作
如果是本地监听事件,直接调用注册的各个watcher的process方法
如果是本地的callBack回调函数类型,则使用instanceOf判断以后,使用不同的回调函数强制转换去执行processResult(从目前的源码分析暂时没看到有使用,可能是nio的socket实现类,这里只分析了默认的实现类ClientCnxnSocketNetty)
如果是response类型,使用instanceOf判断以后,选择具体的实现类去执行processResult操作

	private void processEvent(Object event) {
	  try {
		  if (event instanceof WatcherSetEventPair) {
			  // each watcher will process the event
			  WatcherSetEventPair pair = (WatcherSetEventPair) event;
			  for (Watcher watcher : pair.watchers) {
				  try {
					  watcher.process(pair.event);
				  } catch (Throwable t) {
					  LOG.error("Error while calling watcher ", t);
				  }
			  }
			} else if (event instanceof LocalCallback) {
				LocalCallback lcb = (LocalCallback) event;
				if (lcb.cb instanceof StatCallback) {
					((StatCallback) lcb.cb).processResult(lcb.rc, lcb.path,
							lcb.ctx, null);
				} else if (lcb.cb instanceof DataCallback) {
					((DataCallback) lcb.cb).processResult(lcb.rc, lcb.path,
							lcb.ctx, null, null);
				} else if (lcb.cb instanceof ACLCallback) {
					((ACLCallback) lcb.cb).processResult(lcb.rc, lcb.path,
							lcb.ctx, null, null);
				} else if (lcb.cb instanceof ChildrenCallback) {
					((ChildrenCallback) lcb.cb).processResult(lcb.rc,
							lcb.path, lcb.ctx, null);
				} else if (lcb.cb instanceof Children2Callback) {
					((Children2Callback) lcb.cb).processResult(lcb.rc,
							lcb.path, lcb.ctx, null, null);
				} else if (lcb.cb instanceof StringCallback) {
					((StringCallback) lcb.cb).processResult(lcb.rc,
							lcb.path, lcb.ctx, null);
				} else {
					((VoidCallback) lcb.cb).processResult(lcb.rc, lcb.path,
							lcb.ctx);
				}
			} else {
			  Packet p = (Packet) event;
			  int rc = 0;
			  String clientPath = p.clientPath;
			  if (p.replyHeader.getErr() != 0) {
				  rc = p.replyHeader.getErr();
			  }
			  if (p.cb == null) {
				  LOG.warn("Somehow a null cb got to EventThread!");
			  } else if (p.response instanceof ExistsResponse
					  || p.response instanceof SetDataResponse
					  || p.response instanceof SetACLResponse) {
				  StatCallback cb = (StatCallback) p.cb;
				  if (rc == 0) {
					  if (p.response instanceof ExistsResponse) {
						  cb.processResult(rc, clientPath, p.ctx,
								  ((ExistsResponse) p.response)
										  .getStat());
					  } else if (p.response instanceof SetDataResponse) {
						  cb.processResult(rc, clientPath, p.ctx,
								  ((SetDataResponse) p.response)
										  .getStat());
					  } else if (p.response instanceof SetACLResponse) {
						  cb.processResult(rc, clientPath, p.ctx,
								  ((SetACLResponse) p.response)
										  .getStat());
					  }
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null);
				  }
			  } else if (p.response instanceof GetDataResponse) {
				  DataCallback cb = (DataCallback) p.cb;
				  GetDataResponse rsp = (GetDataResponse) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx, rsp
							  .getData(), rsp.getStat());
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null,
							  null);
				  }
			  } else if (p.response instanceof GetACLResponse) {
				  ACLCallback cb = (ACLCallback) p.cb;
				  GetACLResponse rsp = (GetACLResponse) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx, rsp
							  .getAcl(), rsp.getStat());
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null,
							  null);
				  }
			  } else if (p.response instanceof GetChildrenResponse) {
				  ChildrenCallback cb = (ChildrenCallback) p.cb;
				  GetChildrenResponse rsp = (GetChildrenResponse) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx, rsp
							  .getChildren());
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null);
				  }
			  } else if (p.response instanceof GetChildren2Response) {
				  Children2Callback cb = (Children2Callback) p.cb;
				  GetChildren2Response rsp = (GetChildren2Response) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx, rsp
							  .getChildren(), rsp.getStat());
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null, null);
				  }
			  } else if (p.response instanceof CreateResponse) {
				  StringCallback cb = (StringCallback) p.cb;
				  CreateResponse rsp = (CreateResponse) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx,
							  (chrootPath == null
									  ? rsp.getPath()
									  : rsp.getPath()
								.substring(chrootPath.length())));
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null);
				  }
			  } else if (p.response instanceof Create2Response) {
				  Create2Callback cb = (Create2Callback) p.cb;
				  Create2Response rsp = (Create2Response) p.response;
				  if (rc == 0) {
					  cb.processResult(rc, clientPath, p.ctx,
							  (chrootPath == null
									  ? rsp.getPath()
									  : rsp.getPath()
								.substring(chrootPath.length())), rsp.getStat());
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null, null);
				  }                   
			  } else if (p.response instanceof MultiResponse) {
				  MultiCallback cb = (MultiCallback) p.cb;
				  MultiResponse rsp = (MultiResponse) p.response;
				  if (rc == 0) {
					  List<OpResult> results = rsp.getResultList();
					  int newRc = rc;
					  for (OpResult result : results) {
						  if (result instanceof ErrorResult
								  && KeeperException.Code.OK.intValue() != (newRc = ((ErrorResult) result)
								  .getErr())) {
							  break;
						  }
					  }
					  cb.processResult(newRc, clientPath, p.ctx, results);
				  } else {
					  cb.processResult(rc, clientPath, p.ctx, null);
				  }
			  }  else if (p.cb instanceof VoidCallback) {
				  VoidCallback cb = (VoidCallback) p.cb;
				  cb.processResult(rc, clientPath, p.ctx);
			  }
		  }
	  } catch (Throwable t) {
		  LOG.error("Caught unexpected throwable", t);
	  }
   }

当在finishPacket执行完notify后,就会唤醒前面的对应函数,像是getData的,会判断返回头信息是否需要抛出异常
如果没有异常,则装载stat并调用response的getData获得服务端返回的数据并返回

	ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
	if (r.getErr() != 0) {
		throw KeeperException.create(KeeperException.Code.get(r.getErr()),
				clientPath);
	}
	if (stat != null) {
		DataTree.copyStat(response.getStat(), stat);
	}
	return response.getData();

吐槽一下,zk里面使用netty做异步io,源码跳来跳去的,这博客1137行(–),前面一篇选举算法700多行

更多精彩内容

请移步

个人主页: yangyitao.top

发布了35 篇原创文章 · 获赞 12 · 访问量 5026

猜你喜欢

转载自blog.csdn.net/weixin_44627989/article/details/97995152