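ZooKeeper ----- System model

Data model

ZooKeeper's data model closely resembles a file system. The one difference is that every node (ZNode) can store data, whether it is a parent or a child.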

Transaction ID

This is the ZXID mentioned earlier. ZooKeeper assigns a ZXID to every transaction request, which guarantees a global order of operations.
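To make the ordering guarantee concrete, here is a minimal sketch of how a ZXID can be compared. It relies on the ZXID being a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. The class and method names are illustrative only, not ZooKeeper's API.

```java
final class ZxidSketch {
    /** Packs an epoch and a per-epoch counter into one 64-bit id. */
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /** High 32 bits: the leader epoch. */
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    /** Low 32 bits: the transaction counter within the epoch. */
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long a = makeZxid(1, 5);
        long b = makeZxid(2, 0);
        System.out.println(Long.toHexString(a)); // 100000005
        System.out.println(Long.toHexString(b)); // 200000000
        // A later epoch always sorts after an earlier one, whatever the counters are.
        System.out.println(a < b);               // true
    }
}
```

Because the epoch sits in the high bits, comparing two ZXIDs as plain long values is enough to order the transactions.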

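Node types

  1. Persistent node: exists from creation until it is explicitly deleted
  2. Ephemeral node: disappears when the session that created it ends or times out
  3. Sequential node: a monotonically increasing numeric suffix is appended to the given node name; the upper bound of the suffix is the maximum value of an int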
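The sequential suffix can be sketched as follows. The sketch assumes the suffix is a 10-digit, zero-padded decimal counter, which is how ZooKeeper formats it. The class name, method name, and paths are hypothetical.

```java
import java.util.Locale;

final class SequentialNameSketch {
    /** Appends a zero-padded 10-digit counter to the requested node name. */
    static String sequentialName(String prefix, int counter) {
        return prefix + String.format(Locale.ENGLISH, "%010d", counter);
    }

    public static void main(String[] args) {
        System.out.println(sequentialName("/locks/lock-", 7));
        // prints /locks/lock-0000000007
        System.out.println(sequentialName("/locks/lock-", Integer.MAX_VALUE));
        // prints /locks/lock-2147483647
    }
}
```

Zero padding keeps lexicographic order identical to numeric order, which is why clients can sort child names as strings to find the lowest sequence number.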

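Node state

Node state information is defined by the Stat class. Its basic attributes include the three version counters version, cversion, and aversion.

Version numbers ----- guaranteeing atomic operations on distributed data

Of the node state attributes above, version (data changes), cversion (child changes), and aversion (ACL changes) are the ones ZooKeeper uses, through an optimistic-locking mechanism, to guarantee atomic operations.
In the server's PrepRequestProcessor class, each data update request (SetDataRequest) is handled as follows: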

            zks.sessionTracker.checkSession(request.sessionId, request.getOwner());
            SetDataRequest setDataRequest = (SetDataRequest)record;
            if(deserialize)
                ByteBufferInputStream.byteBuffer2Record(request.request, setDataRequest);
            path = setDataRequest.getPath();
            validatePath(path, request.sessionId);
            nodeRecord = getRecordForPath(path);
            checkACL(zks, nodeRecord.acl, ZooDefs.Perms.WRITE,
                    request.authInfo);
            // Optimistic-lock check: the expected version must match the current one (-1 skips the check)
            version = setDataRequest.getVersion();
            int currentVersion = nodeRecord.stat.getVersion();
            if (version != -1 && version != currentVersion) {
                throw new KeeperException.BadVersionException(path);
            }
            version = currentVersion + 1;
            request.txn = new SetDataTxn(path, setDataRequest.getData(), version);
            nodeRecord = nodeRecord.duplicate(request.hdr.getZxid());
            nodeRecord.stat.setVersion(version);
            addChangeRecord(nodeRecord);
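The version check above is a compare-and-set. The following is a self-contained sketch of the same idea, not ZooKeeper code: VersionedNodeSketch and BadVersion are hypothetical stand-ins for a ZNode and for KeeperException.BadVersionException.

```java
import java.nio.charset.StandardCharsets;

final class VersionedNodeSketch {
    /** Stand-in for KeeperException.BadVersionException. */
    static final class BadVersion extends RuntimeException {
        BadVersion(String message) { super(message); }
    }

    private byte[] data = new byte[0];
    private int version = 0;

    /** Writes only if expectedVersion matches; -1 means "write unconditionally". */
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new BadVersion("expected " + expectedVersion + ", node is at " + version);
        }
        data = newData.clone();
        version = version + 1;
        return version;
    }

    synchronized byte[] getData() { return data.clone(); }

    synchronized int getVersion() { return version; }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch();
        byte[] value = "a".getBytes(StandardCharsets.UTF_8);
        node.setData(value, 0);          // matches version 0, node moves to version 1
        try {
            node.setData(value, 0);      // stale: another writer already bumped the version
        } catch (BadVersion e) {
            System.out.println("rejected: " + e.getMessage());
        }
        node.setData(value, -1);         // unconditional write
        System.out.println("version = " + node.getVersion()); // version = 2
    }
}
```

A client that loses the race typically re-reads the node, takes the new version from its Stat, and retries the update.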

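ACL ----- keeping data secure

Permission scheme (Scheme):

  1. IP: "ip:192.168.0.12" applies access control to that single IP address; "ip:192.168.0.1/24" applies it to the whole 192.168.0.* subnet
  2. Digest: identified by "username:password"; ZooKeeper encodes it twice, first with SHA-1 and then with BASE64
  3. World: open to all users
  4. Super: the super administrator, who can operate on any data; configured at startup with -Dzookeeper.DigestAuthenticationProvider.superDigest=super:password, where password must be given in its encoded form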
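The double encoding used by the Digest and Super schemes can be sketched with the JDK alone. This is a hedged illustration rather than ZooKeeper's own class: it assumes the digest is BASE64(SHA-1("username:password")) prefixed with the user name. DigestSketch is a hypothetical name and "super:password" is a placeholder credential.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class DigestSketch {
    /** Turns "user:password" into "user:" + BASE64(SHA-1("user:password")). */
    static String generateDigest(String idPassword) {
        int colon = idPassword.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected user:password");
        }
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(idPassword.getBytes(StandardCharsets.UTF_8));
            return idPassword.substring(0, colon) + ":"
                    + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The printed value is what would follow superDigest= on the command line.
        System.out.println(generateDigest("super:password"));
    }
}
```

The hash covers the whole "user:password" string, not just the password, so the same password yields different digests for different users.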

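Authorization object (ID):

Permission (Permission):

  1. CREATE: permission to create child nodes
  2. DELETE: permission to delete child nodes
  3. READ: permission to read the node's data and its list of children
  4. WRITE: permission to update the node's data
  5. ADMIN: permission to change the node's ACL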
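The five permissions combine as bit flags. The sketch below mirrors the bit values defined in ZooDefs.Perms (READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16); PermsSketch and allows are illustrative names.

```java
final class PermsSketch {
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    /** True when the granted mask contains the required permission bit. */
    static boolean allows(int granted, int required) {
        return (granted & required) != 0;
    }

    public static void main(String[] args) {
        int readWrite = READ | WRITE;
        System.out.println(allows(readWrite, WRITE));  // true
        System.out.println(allows(readWrite, DELETE)); // false
        System.out.println(ALL);                       // 31
    }
}
```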

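Watcher mechanism

In outline: the client registers a watcher, the server processes the watcher, and the client invokes the watcher callback.

1. Client registers the watcher

Taking getData as an example:
1. Mark the request and wrap the watcher in a WatchRegistration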

public byte[] getData(String path, Watcher watcher, Stat stat) throws KeeperException, InterruptedException {
    ....
    ZooKeeper.WatchRegistration wcb = null;
    if (watcher != null) {
        wcb = new ZooKeeper.DataWatchRegistration(watcher, path);
    }
    ....
    request.setWatch(watcher != null);
    GetDataResponse response = new GetDataResponse();
    ReplyHeader r = this.cnxn.submitRequest(h, request, response, wcb);
    ....
}

2. Wrap the request in a Packet (the smallest unit of communication), put it on the outgoing queue, and wait for the server's response

public ReplyHeader submitRequest(RequestHeader h, Record request, Record response, WatchRegistration watchRegistration, WatchDeregistration watchDeregistration) throws InterruptedException {
    ReplyHeader r = new ReplyHeader();
    ClientCnxn.Packet packet = this.queuePacket(h, r, request, response, (AsyncCallback)null, (String)null, (String)null, (Object)null, watchRegistration, watchDeregistration);
    synchronized(packet) {
        while(!packet.finished) {
            packet.wait();
        }

        return r;
    }
}
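The blocking wait in submitRequest is a classic monitor handshake: the caller waits on the packet until another thread marks it finished and calls notifyAll. A minimal sketch of that pattern follows. PacketSketch is a hypothetical class, and the reply is a plain String instead of a ReplyHeader.

```java
final class PacketSketch {
    private boolean finished = false;
    private String reply;

    /** Called by the I/O thread once the response has arrived. */
    synchronized void finish(String reply) {
        this.reply = reply;
        this.finished = true;
        notifyAll();
    }

    /** Called by the requesting thread; blocks until finish() has run. */
    synchronized String awaitReply() {
        try {
            while (!finished) {   // loop guards against spurious wakeups
                wait();
            }
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reply", e);
        }
    }

    public static void main(String[] args) {
        PacketSketch packet = new PacketSketch();
        new Thread(() -> packet.finish("reply from server")).start();
        System.out.println(packet.awaitReply());
    }
}
```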

3. The client's SendThread.readResponse() receives the response, and finishPacket registers the watcher with ZKWatchManager

private void finishPacket(ClientCnxn.Packet p) {
    int err = p.replyHeader.getErr();
    if (p.watchRegistration != null) {
        p.watchRegistration.register(err);
    }
    ......
}

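2. Server processes the watcher

Server-side processing has two parts: storing the ServerCnxn (the connection to the client) and triggering the watcher.

2.1 Storing the ServerCnxn

1. FinalRequestProcessor.processRequest decides whether a watcher needs to be registered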

        case OpCode.getData: {
            lastOp = "GETD";
            GetDataRequest getDataRequest = new GetDataRequest();
            ByteBufferInputStream.byteBuffer2Record(request.request,
                    getDataRequest);
            DataNode n = zks.getZKDatabase().getNode(getDataRequest.getPath());
            if (n == null) {
                throw new KeeperException.NoNodeException();
            }
            PrepRequestProcessor.checkACL(zks, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ,
                    request.authInfo);
            Stat stat = new Stat();
            byte b[] = zks.getZKDatabase().getData(getDataRequest.getPath(), stat,
                    getDataRequest.getWatch() ? cnxn : null);
            rsp = new GetDataResponse(b, stat);
            break;
        }

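2. If getDataRequest.getWatch() is true, the ServerCnxn (which itself implements Watcher) is stored in the WatchManager
WatchManager manages watchers on the ZooKeeper server and indexes them along two dimensions:

  1. watchTable: indexed by data node path
  2. watch2Paths: indexed by watcher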
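The value of keeping two indexes shows up when a connection closes: the watcher-side index lets the server find every path that watcher was registered on without scanning the whole path table. Below is a minimal sketch of that bookkeeping, under the assumption that both maps hold plain hash sets. WatchIndexSketch and its methods are illustrative, not the real WatchManager.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class WatchIndexSketch<W> {
    private final Map<String, Set<W>> watchTable = new HashMap<>();   // path -> watchers
    private final Map<W, Set<String>> watch2Paths = new HashMap<>();  // watcher -> paths

    /** Registers the watcher in both indexes. */
    synchronized void addWatch(String path, W watcher) {
        watchTable.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
        watch2Paths.computeIfAbsent(watcher, w -> new HashSet<>()).add(path);
    }

    /** Drops every watch held by one watcher, for example when its connection closes. */
    synchronized void removeWatcher(W watcher) {
        Set<String> paths = watch2Paths.remove(watcher);
        if (paths == null) {
            return;
        }
        for (String path : paths) {
            Set<W> watchers = watchTable.get(path);
            if (watchers != null) {
                watchers.remove(watcher);
                if (watchers.isEmpty()) {
                    watchTable.remove(path);
                }
            }
        }
    }

    /** Returns a copy of the watchers currently registered on a path. */
    synchronized Set<W> watchersOf(String path) {
        Set<W> watchers = watchTable.get(path);
        return watchers == null ? new HashSet<>() : new HashSet<>(watchers);
    }

    public static void main(String[] args) {
        WatchIndexSketch<String> index = new WatchIndexSketch<>();
        index.addWatch("/a", "cnxn1");
        index.addWatch("/b", "cnxn1");
        index.addWatch("/a", "cnxn2");
        index.removeWatcher("cnxn1");
        System.out.println(index.watchersOf("/a")); // [cnxn2]
        System.out.println(index.watchersOf("/b")); // []
    }
}
```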

2.2 Triggering the watcher

When a node's data changes, WatchManager.triggerWatch is called to send a notification to the client.

public Set<Watcher> triggerWatch(String path, EventType type, Set<Watcher> supress) {
    // 1. Build the WatchedEvent
    WatchedEvent e = new WatchedEvent(type,
            KeeperState.SyncConnected, path);
    HashSet<Watcher> watchers;
    // 2. Look up the watchers for this path and remove them from both indexes
    synchronized (this) {
        watchers = watchTable.remove(path);
        if (watchers == null || watchers.isEmpty()) {
            if (LOG.isTraceEnabled()) {
                ZooTrace.logTraceMessage(LOG,
                        ZooTrace.EVENT_DELIVERY_TRACE_MASK,
                        "No watchers for " + path);
            }
            return null;
        }
        for (Watcher w : watchers) {
            HashSet<String> paths = watch2Paths.get(w);
            if (paths != null) {
                paths.remove(path);
            }
        }
    }
    for (Watcher w : watchers) {
        if (supress != null && supress.contains(w)) {
            continue;
        }
        // 3. w is the ServerCnxn; process() sends the notification to the client
        w.process(e);
    }
    return watchers;
}

3. Client invokes the watcher callback

1. SendThread receives the notification

        else if (replyHdr.getXid() == -1) { // an xid of -1 marks a notification
            if (ClientCnxn.LOG.isDebugEnabled()) {
                ClientCnxn.LOG.debug("Got notification sessionid:0x" + Long.toHexString(ClientCnxn.this.sessionId));
            }
            // 1. Deserialize
            WatcherEvent event = new WatcherEvent();
            event.deserialize(bbia, "response");
            // 2. Strip the chroot prefix to get the client-relative path
            if (ClientCnxn.this.chrootPath != null) {
                String serverPath = event.getPath();
                if (serverPath.compareTo(ClientCnxn.this.chrootPath) == 0) {
                    event.setPath("/");
                } else if (serverPath.length() > ClientCnxn.this.chrootPath.length()) {
                    event.setPath(serverPath.substring(ClientCnxn.this.chrootPath.length()));
                } else {
                    ClientCnxn.LOG.warn("Got server path " + event.getPath() + " which is too short for chroot path " + ClientCnxn.this.chrootPath);
                }
            }
            // 3. Rebuild the WatchedEvent
            WatchedEvent we = new WatchedEvent(event);
            if (ClientCnxn.LOG.isDebugEnabled()) {
                ClientCnxn.LOG.debug("Got " + we + " for sessionid 0x" + Long.toHexString(ClientCnxn.this.sessionId));
            }
            // 4. Hand off to the EventThread, which invokes the watcher callback
            ClientCnxn.this.eventThread.queueEvent(we);
        }
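The chroot handling in step 2 can be isolated into a small pure function, which makes the three cases easier to see. This sketch follows the branch structure of the excerpt above; ChrootSketch and toClientPath are hypothetical names.

```java
final class ChrootSketch {
    /** Maps a server-side path back to the path the client sees under its chroot. */
    static String toClientPath(String serverPath, String chrootPath) {
        if (chrootPath == null) {
            return serverPath;                                   // no chroot configured
        }
        if (serverPath.equals(chrootPath)) {
            return "/";                                          // the chroot itself is the client's root
        }
        if (serverPath.length() > chrootPath.length()) {
            return serverPath.substring(chrootPath.length());    // drop the chroot prefix
        }
        // Shorter than the chroot: the original code logs a warning and leaves the path unchanged.
        return serverPath;
    }

    public static void main(String[] args) {
        System.out.println(toClientPath("/app/config", "/app")); // /config
        System.out.println(toClientPath("/app", "/app"));        // /
        System.out.println(toClientPath("/config", null));       // /config
    }
}
```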

2. EventThread.queueEvent looks up the matching watchers in ZKWatchManager and enqueues them

    private void queueEvent(WatchedEvent event, Set<Watcher> materializedWatchers) {
        if (event.getType() != EventType.None || this.sessionState != event.getState()) {
            this.sessionState = event.getState();
            Object watchers;
            if (materializedWatchers == null) {
                // Look up the watchers in ZKWatchManager
                watchers = ClientCnxn.this.watcher.materialize(event.getState(), event.getType(), event.getPath());
            } else {
                watchers = new HashSet();
                ((Set)watchers).addAll(materializedWatchers);
            }

            ClientCnxn.WatcherSetEventPair pair = new ClientCnxn.WatcherSetEventPair((Set)watchers, event);
            // Enqueue; run() picks it up later
            this.waitingEvents.add(pair);
        }
    }

3. EventThread.run serially calls process on every watcher attached to the queued events

public void run() {
        try {
            this.isRunning = true;

            while(true) {
                Object event = this.waitingEvents.take();
                if (event == ClientCnxn.this.eventOfDeath) {
                    this.wasKilled = true;
                } else {
                    this.processEvent(event);
                }

                if (this.wasKilled) {
                    LinkedBlockingQueue var2 = this.waitingEvents;
                    synchronized(this.waitingEvents) {
                        if (this.waitingEvents.isEmpty()) {
                            this.isRunning = false;
                            break;
                        }
                    }
                }
            }
        } catch (InterruptedException var5) {
            ClientCnxn.LOG.error("Event thread exiting due to interruption", var5);
        }

        ClientCnxn.LOG.info("EventThread shut down for session: 0x{}", Long.toHexString(ClientCnxn.this.getSessionId()));
    }

    private void processEvent(Object event) {
        try {
            if (event instanceof ClientCnxn.WatcherSetEventPair) {
                ClientCnxn.WatcherSetEventPair pair = (ClientCnxn.WatcherSetEventPair)event;
                Iterator i$ = pair.watchers.iterator();

                while(i$.hasNext()) {
                    Watcher watcher = (Watcher)i$.next();

                    try {
                        watcher.process(pair.event);
                    } catch (Throwable var11) {
                        ClientCnxn.LOG.error("Error while calling watcher ", var11);
                    }
                }
            } 
            ......

    }
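The pattern in run() is a single consumer thread draining a blocking queue, with a sentinel object (eventOfDeath) to stop it. The following is a simplified sketch of that pattern, not the EventThread itself: SerialDispatchSketch is a hypothetical class, events are plain Runnables, and the loop exits as soon as it sees the sentinel.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

final class SerialDispatchSketch {
    private static final Runnable EVENT_OF_DEATH = () -> { };   // sentinel, never executed
    private final LinkedBlockingQueue<Runnable> waitingEvents = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::run, "event-thread-sketch");

    void start() { worker.start(); }

    void queueEvent(Runnable callback) { waitingEvents.add(callback); }

    /** Enqueues the sentinel and waits for the worker to finish everything queued before it. */
    void shutdown() {
        waitingEvents.add(EVENT_OF_DEATH);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void run() {
        try {
            while (true) {
                Runnable event = waitingEvents.take();
                if (event == EVENT_OF_DEATH) {
                    break;
                }
                try {
                    event.run();                     // one callback at a time, in queue order
                } catch (Throwable t) {
                    System.err.println("Error while calling watcher: " + t);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SerialDispatchSketch dispatcher = new SerialDispatchSketch();
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        dispatcher.start();
        dispatcher.queueEvent(() -> seen.add("first"));
        dispatcher.queueEvent(() -> seen.add("second"));
        dispatcher.shutdown();
        System.out.println(seen); // [first, second]
    }
}
```

Since one thread runs every callback, a slow watcher delays all the watchers queued behind it.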

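4. Watcher characteristics

  1. One-shot: once a watcher fires, both the client and the server remove it
  2. Serial execution on the client
  3. Lightweight: the notification only says which event occurred, not what the changed data is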
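The one-shot and lightweight properties can be demonstrated with a toy registry. This is a sketch under simplifying assumptions (single-threaded, in-memory, callbacks receive only the path). OneShotWatchSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class OneShotWatchSketch {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Registers a callback for the next change on the path. */
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    /** Simulates a data change: fires and removes the watches; returns how many ran. */
    int change(String path) {
        List<Consumer<String>> fired = watches.remove(path);   // removal makes the watch one-shot
        if (fired == null) {
            return 0;
        }
        for (Consumer<String> callback : fired) {
            callback.accept(path);                              // only the path is delivered, not the data
        }
        return fired.size();
    }

    public static void main(String[] args) {
        OneShotWatchSketch registry = new OneShotWatchSketch();
        registry.watch("/cfg", p -> System.out.println("changed: " + p));
        System.out.println(registry.change("/cfg")); // 1
        System.out.println(registry.change("/cfg")); // 0, the watch was consumed
    }
}
```

A client that wants continuous notifications has to register a new watch inside the callback, usually by reading the node again with the watcher attached.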

References

From Paxos to ZooKeeper: Distributed Consistency Principles and Practice (从Paxos到Zookeeper:分布式一致性原理与实践)

Reposted from www.cnblogs.com/wuweishuo/p/10680776.html