[Soul Source Code Reading] 13. The soul-admin and soul-bootstrap Sync Mechanism: HTTP Long Polling Explained (Part 2)

Contents

1. Recap

2. The soul-bootstrap long-polling task

3. The /configs/listener endpoint in soul-admin

3.1 Checking whether the data has changed

3.2 Blocking and listening for changes


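1. Recap

Picking up where we left off: in section 4.2 of yesterday's post, [Soul Source Code Reading] 12. The soul-admin and soul-bootstrap Sync Mechanism: HTTP Long Polling Explained (Part 1), I ran into a pitfall (the gateway would not start unless zk was enabled). I could not figure it out and eventually gave up: I shut down both soul-admin and soul-bootstrap, dropped the soul database, and restarted soul-admin and then soul-bootstrap, after which the project started normally. A full reset really does work wonders. I did back up that database, though, and will come back to it once I understand the different scenarios better. For now, let's continue the source-reading journey.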

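2. The soul-bootstrap long-polling task

This continues from section 3.2.2 of yesterday's post, which only covered fetching all the config data. After that step, a separate thread is created for each soul-admin instance to run an HttpLongPollingTask.

// Start HTTP long polling: one thread per soul-admin listens for changes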
this.serverList.forEach(server -> this.executor.execute(new HttpLongPollingTask(server)));

Let's see what this HTTP long-polling task actually does. Without further ado, here is the code:

// HttpSyncDataService.java
    class HttpLongPollingTask implements Runnable {

        private String server;
        
        // Number of attempts, 3 by default
        private final int retryTimes = 3;

        HttpLongPollingTask(final String server) {
            this.server = server;
        }

        @Override
        public void run() {
            while (RUNNING.get()) {
                for (int time = 1; time <= retryTimes; time++) {
                    try {
                        // The real logic is wrapped in doLongPolling
                        doLongPolling(server);
                    } catch (Exception e) {
                        // print warning log.
                        if (time < retryTimes) {
                            log.warn("Long polling failed, tried {} times, {} times left, will be suspended for a while! {}",
                                    time, retryTimes - time, e.getMessage());
                            ThreadUtils.sleep(TimeUnit.SECONDS, 5);
                            continue;
                        }
                        // print error, then suspended for a while.
                        log.error("Long polling failed, try again after 5 minutes!", e);
                        ThreadUtils.sleep(TimeUnit.MINUTES, 5);
                    }
                }
            }
            log.warn("Stop http long polling.");
        }
    }

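The core method is run. Its first line is a while loop conditioned on RUNNING.get(). I tracked down where the flag is defined and where its state changes: on close, the flag is cleared so the task stops, and the thread pool is shut down as well. Note that a successful doLongPolling call does not break out of the inner for loop, so the for loop always runs all retryTimes iterations before the while condition is checked again: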

public class HttpSyncDataService implements SyncDataService, AutoCloseable {

    private static final AtomicBoolean RUNNING = new AtomicBoolean(false);

    @Override
    public void close() throws Exception {
        RUNNING.set(false);
        if (executor != null) {
            executor.shutdownNow();
            // help gc
            executor = null;
        }
    }

...

}
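
The stop mechanics can be tried in isolation. The following is a minimal sketch, not Soul's code: StoppablePollerSketch and its methods are hypothetical names, and a short sleep stands in for the blocking doLongPolling call. It assumes, as in HttpSyncDataService, that an AtomicBoolean gates the loop and that close() both clears the flag and calls shutdownNow() so that a worker which is currently blocked gets interrupted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for HttpSyncDataService; only the stop mechanics are modeled.
final class StoppablePollerSketch implements AutoCloseable {

    private final AtomicBoolean running = new AtomicBoolean(false);
    private final CountDownLatch firstPoll = new CountDownLatch(1);
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    void start() {
        if (!running.compareAndSet(false, true)) {
            return; // already started
        }
        executor.execute(() -> {
            while (running.get()) {
                firstPoll.countDown();
                try {
                    // Placeholder for the blocking long-poll request.
                    TimeUnit.MILLISECONDS.sleep(20);
                } catch (InterruptedException e) {
                    // shutdownNow() interrupts the sleep; restore the flag and end the task.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
    }

    /** Waits until the worker has entered its loop at least once. */
    boolean awaitFirstPoll(final long millis) {
        try {
            return firstPoll.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Waits for the worker thread to finish after close(). */
    boolean awaitStopped(final long millis) {
        try {
            return executor.awaitTermination(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    @Override
    public void close() {
        running.set(false);
        executor.shutdownNow();
    }

    public static void main(final String[] args) {
        StoppablePollerSketch poller = new StoppablePollerSketch();
        poller.start();
        poller.awaitFirstPoll(1000);
        poller.close();
        System.out.println("stopped: " + poller.awaitStopped(1000));
    }
}
```

Clearing the flag alone would only take effect once the current request returns; the interrupt from shutdownNow() is what cuts a pending wait short in this sketch.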

The method that actually performs the long poll:

// HttpSyncDataService.java
    @SuppressWarnings("unchecked")
    private void doLongPolling(final String server) {
        MultiValueMap<String, String> params = new LinkedMultiValueMap<>(8);
        for (ConfigGroupEnum group : ConfigGroupEnum.values()) {
            // Fetch the cached config data for this group type
            ConfigData<?> cacheConfig = factory.cacheConfigData(group);
            // Join the MD5 and the last-modified time with a comma
            String value = String.join(",", cacheConfig.getMd5(), String.valueOf(cacheConfig.getLastModifyTime()));
            params.put(group.name(), Lists.newArrayList(value));
        }
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_FORM_URLENCODED);
        // Use params as the request body
        HttpEntity httpEntity = new HttpEntity(params, headers);
        // Build the listener URL, e.g. http://localhost:9095/configs/listener
        String listenerUrl = server + "/configs/listener";
        log.debug("request listener configs: [{}]", listenerUrl);
        JsonArray groupJson = null;
        try {
            // Send a POST request to the endpoint via RestTemplate
            String json = this.httpClient.postForEntity(listenerUrl, httpEntity, String.class).getBody();
            log.debug("listener result: [{}]", json);
            groupJson = GSON.fromJson(json, JsonObject.class).getAsJsonArray("data");
        } catch (RestClientException e) {
            String message = String.format("listener configs fail, server:[%s], %s", server, e.getMessage());
            throw new SoulException(message, e);
        }
        if (groupJson != null) {
            // fetch group configuration async.
            ConfigGroupEnum[] changedGroups = GSON.fromJson(groupJson, ConfigGroupEnum[].class);
            if (ArrayUtils.isNotEmpty(changedGroups)) {
                log.info("Group config changed: {}", Arrays.toString(changedGroups));
                // If the response lists changed groups, actively pull them
                this.doFetchGroupConfig(server, changedGroups);
            }
        }
    }

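Once the for loop has finished, params holds one entry per ConfigGroupEnum value, keyed by the group name, whose value is the cached MD5 and last-modified time joined by a comma.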
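
To make the wire format concrete, here is a small sketch of the "md5,lastModifyTime" value, together with the reverse split that the admin side performs later in compareChangedGroup. GroupDigestSketch, encode and decode are hypothetical names introduced for illustration, the sample MD5 and timestamp are made up, and plain String.split is used instead of the StringUtils.split call in the Soul source.

```java
// Hypothetical helper; mirrors the value format used for each config group.
final class GroupDigestSketch {

    /** Client side: join the MD5 and the last-modified time with a comma. */
    static String encode(final String md5, final long lastModifyTime) {
        return String.join(",", md5, String.valueOf(lastModifyTime));
    }

    /** Server side: split back into { md5, lastModifyTime } and reject malformed input. */
    static String[] decode(final String value) {
        String[] parts = value == null ? null : value.split(",");
        if (parts == null || parts.length != 2) {
            throw new IllegalArgumentException("group param invalid: " + value);
        }
        return parts;
    }

    public static void main(final String[] args) {
        // Made-up sample values, not taken from a real soul-admin instance.
        String value = encode("d41d8cd98f00b204e9800998ecf8427e", 1611800000000L);
        String[] parts = decode(value);
        System.out.println(value);
        System.out.println(parts[0] + " / " + Long.parseLong(parts[1]));
    }
}
```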

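After the POST request was sent to soul-admin, the breakpoint did not come back for a long time, and only after quite a while did the following success message arrive. Something about this endpoint looks suspicious; let's finish reading this method first and then dig into it.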

{"code":200,"message":"success","data":[]}

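If the response contains changed groups, the client actively pulls only those groups through the earlier /configs/fetch endpoint, rather than all 5 types: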

// HttpSyncDataService.java
    private void doFetchGroupConfig(final String server, final ConfigGroupEnum... groups) {
        StringBuilder params = new StringBuilder();
        for (ConfigGroupEnum groupKey : groups) {
            params.append("groupKeys").append("=").append(groupKey.name()).append("&");
        }
        String url = server + "/configs/fetch?" + StringUtils.removeEnd(params.toString(), "&");
        log.info("request configs: [{}]", url);
        String json = null;
        try {
            json = this.httpClient.getForObject(url, String.class);
        } catch (RestClientException e) {
            String message = String.format("fetch config fail from server[%s], %s", url, e.getMessage());
            log.warn(message);
            throw new SoulException(message, e);
        }
        // update local cache
        // This method was analyzed yesterday; it eventually calls the template method dataRefresh.refresh(data)
        boolean updated = this.updateCacheWithJson(json);
        if (updated) {
            log.info("get latest configs: [{}]", json);
            return;
        }
        // not updated. it is likely that the current config server has not been updated yet. wait a moment.
        log.info("The config of the server[{}] has not been updated or is out of date. Wait for 30s to listen for changes again.", server);
        // Sleep for 30 seconds
        ThreadUtils.sleep(TimeUnit.SECONDS, 30);
    }
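
The URL assembled in doFetchGroupConfig repeats the groupKeys parameter once per changed group. A compact sketch of the same string building follows; FetchUrlSketch and fetchUrl are hypothetical names, the group names PLUGIN and RULE are only illustrative inputs, and the server address is the one shown in the listener comment above.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper; reproduces the query-string shape, not Soul's implementation.
final class FetchUrlSketch {

    static String fetchUrl(final String server, final String... groups) {
        // One "groupKeys=<name>" pair per changed group, joined with '&'.
        String query = Arrays.stream(groups)
                .map(group -> "groupKeys=" + group)
                .collect(Collectors.joining("&"));
        return server + "/configs/fetch?" + query;
    }

    public static void main(final String[] args) {
        System.out.println(fetchUrl("http://localhost:9095", "PLUGIN", "RULE"));
    }
}
```

Joining with a delimiter avoids the trailing "&" that the original code has to strip with StringUtils.removeEnd.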

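That wraps up the long-polling task on the soul-bootstrap side.

There was a seemingly blocking call a moment ago, so let's go to the soul-admin side and see what this endpoint is up to.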

3. The /configs/listener endpoint in soul-admin

Searching for "/listener" turns up ConfigController:

@ConditionalOnBean(HttpLongPollingDataChangedListener.class)
@RestController
@RequestMapping("/configs")
@Slf4j
public class ConfigController {
...

    /**
     * Listener.
     *
     * @param request  the request
     * @param response the response
     */
    @PostMapping(value = "/listener")
    public void listener(final HttpServletRequest request, final HttpServletResponse response) {
        longPollingListener.doLongPolling(request, response);
    }

...
}

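The Javadoc on the method below makes two points clear:

1. If the config data has changed, the changed group information is returned immediately.
2. Otherwise, the client's request thread is blocked until some data changes or the specified timeout elapses.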

// HttpLongPollingDataChangedListener.java
/**
     * If the configuration data changes, the group information for the change is immediately responded.
     * Otherwise, the client's request thread is blocked until any data changes or the specified timeout is reached.
     *
     * @param request  the request
     * @param response the response
     */
    public void doLongPolling(final HttpServletRequest request, final HttpServletResponse response) {

        // compare group md5
        List<ConfigGroupEnum> changedGroup = compareChangedGroup(request);
        String clientIp = getRemoteIp(request);

        // response immediately.
        // The data has changed, so respond immediately
        if (CollectionUtils.isNotEmpty(changedGroup)) {
            this.generateResponse(response, changedGroup);
            log.info("send response with the changed group, ip={}, group={}", clientIp, changedGroup);
            return;
        }

        // listen for configuration changed.
        final AsyncContext asyncContext = request.startAsync();

        // AsyncContext.settimeout() does not timeout properly, so you have to control it yourself
        asyncContext.setTimeout(0L);

        // block client's thread.
        scheduler.execute(new LongPollingClient(asyncContext, clientIp, HttpConstants.SERVER_MAX_HOLD_TIMEOUT));
    }

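3.1 Checking whether the data has changed

Here the data sent by the client is compared with the data currently in the admin cache to see whether anything has changed. The logic is as follows: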

// HttpLongPollingDataChangedListener.java
    private List<ConfigGroupEnum> compareChangedGroup(final HttpServletRequest request) {
        List<ConfigGroupEnum> changedGroup = new ArrayList<>(ConfigGroupEnum.values().length);
        for (ConfigGroupEnum group : ConfigGroupEnum.values()) {
            // md5,lastModifyTime
            // Parse the value sent in the request body
            String[] params = StringUtils.split(request.getParameter(group.name()), ',');
            if (params == null || params.length != 2) {
                throw new SoulException("group param invalid:" + request.getParameter(group.name()));
            }
            String clientMd5 = params[0];
            long clientModifyTime = NumberUtils.toLong(params[1]);
            ConfigDataCache serverCache = CACHE.get(group.name());
            // do check.
            if (this.checkCacheDelayAndUpdate(serverCache, clientMd5, clientModifyTime)) {
                changedGroup.add(group);
            }
        }
        return changedGroup;
    }

/**
     * check whether the client needs to update the cache.
     * @param serverCache the admin local cache
     * @param clientMd5 the client md5 value
     * @param clientModifyTime the client last modify time
     * @return true: the client needs to be updated, false: not need.
     */
    private boolean checkCacheDelayAndUpdate(final ConfigDataCache serverCache, final String clientMd5, final long clientModifyTime) {
        // is the same, doesn't need to be updated
        // Same MD5: nothing has changed, no update needed
        if (StringUtils.equals(clientMd5, serverCache.getMd5())) {
            return false;
        }
        // if the md5 value is different, it is necessary to compare lastModifyTime.
        // Past this point the MD5 values differ, i.e. something changed; lastModifyTime decides what happens next
        long lastModifyTime = serverCache.getLastModifyTime();
        if (lastModifyTime >= clientModifyTime) {
            // the client's config is out of date.
            return true;
        }
        // the lastModifyTime before client, then the local cache needs to be updated.
        // Considering the concurrency problem, admin must lock,
        // otherwise it may cause the request from soul-web to update the cache concurrently, causing excessive db pressure
        boolean locked = false;
        try {
            locked = LOCK.tryLock(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return true;
        }
        if (locked) {
            try {
                ConfigDataCache latest = CACHE.get(serverCache.getGroup());
                if (latest != serverCache) {
                    // the cache of admin was updated. if the md5 value is the same, there's no need to update.
                    return !StringUtils.equals(clientMd5, latest.getMd5());
                }
                // load cache from db.
                // Load data from the database to refresh the local cache
                this.refreshLocalCache();
                latest = CACHE.get(serverCache.getGroup());
                return !StringUtils.equals(clientMd5, latest.getMd5());
            } finally {
                LOCK.unlock();
            }
        }
        // not locked, the client need to be updated.
        return true;
    }
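
Stripped of locking, the decision in checkCacheDelayAndUpdate comes down to three cases. The sketch below is a simplification under stated assumptions: ChangeCheckSketch and clientNeedsUpdate are hypothetical names, the database reload is replaced by a Supplier that returns the MD5 after refreshing, and the tryLock guard and the "cache already replaced" shortcut from the original are left out.

```java
import java.util.function.Supplier;

// Hypothetical, lock-free simplification of the admin-side comparison.
final class ChangeCheckSketch {

    static boolean clientNeedsUpdate(final String clientMd5, final long clientModifyTime,
                                     final String serverMd5, final long serverModifyTime,
                                     final Supplier<String> reloadServerMd5) {
        // 1. Same digest: the client is up to date.
        if (clientMd5.equals(serverMd5)) {
            return false;
        }
        // 2. Digests differ and the server copy is at least as new: the client is stale.
        if (serverModifyTime >= clientModifyTime) {
            return true;
        }
        // 3. Digests differ but the server copy looks older: refresh it, then compare again.
        return !clientMd5.equals(reloadServerMd5.get());
    }

    public static void main(final String[] args) {
        System.out.println(clientNeedsUpdate("a", 10L, "a", 10L, () -> "a"));
        System.out.println(clientNeedsUpdate("a", 10L, "b", 20L, () -> "b"));
        System.out.println(clientNeedsUpdate("a", 30L, "b", 20L, () -> "a"));
    }
}
```

The third case is the one the lock protects in the real code: without it, several gateway requests arriving together could each trigger a reload from the database.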

3.2 Blocking and listening for changes

// HttpLongPollingDataChangedListener.java
/**
     * If you exceed {@link HttpConstants#SERVER_MAX_HOLD_TIMEOUT} and still have no data change,
     * empty data is returned. If the data changes within this time frame, the DataChangeTask
     * cancels the timed task and responds to the changed group data.
     */
    class LongPollingClient implements Runnable {

        /**
         * The Async context.
         */
        private final AsyncContext asyncContext;

        /**
         * The Ip.
         */
        private final String ip;

        /**
         * The Timeout time.
         */
        private final long timeoutTime;

        /**
         * The Async timeout future.
         */
        private Future<?> asyncTimeoutFuture;

        /**
         * Instantiates a new Long polling client.
         *
         * @param ac          the ac
         * @param ip          the ip
         * @param timeoutTime the timeout time
         */
        LongPollingClient(final AsyncContext ac, final String ip, final long timeoutTime) {
            this.asyncContext = ac;
            this.ip = ip;
            this.timeoutTime = timeoutTime;
        }

        @Override
        public void run() {
            this.asyncTimeoutFuture = scheduler.schedule(() -> {
                clients.remove(LongPollingClient.this);
                List<ConfigGroupEnum> changedGroups = compareChangedGroup((HttpServletRequest) asyncContext.getRequest());
                sendResponse(changedGroups);
            }, timeoutTime, TimeUnit.MILLISECONDS);
            clients.add(this);
        }

        /**
         * Send response.
         *
         * @param changedGroups the changed groups
         */
        void sendResponse(final List<ConfigGroupEnum> changedGroups) {
            // cancel scheduler
            if (null != asyncTimeoutFuture) {
                asyncTimeoutFuture.cancel(false);
            }
            generateResponse((HttpServletResponse) asyncContext.getResponse(), changedGroups);
            asyncContext.complete();
        }
    }



    /**
     * Send response datagram.
     *
     * @param response      the response
     * @param changedGroups the changed groups
     */
    private void generateResponse(final HttpServletResponse response, final List<ConfigGroupEnum> changedGroups) {
        try {
            response.setHeader("Pragma", "no-cache");
            response.setDateHeader("Expires", 0);
            response.setHeader("Cache-Control", "no-cache,no-store");
            response.setContentType(MediaType.APPLICATION_JSON_VALUE);
            response.setStatus(HttpServletResponse.SC_OK);
            response.getWriter().println(GsonUtils.getInstance().toJson(SoulAdminResult.success(SoulResultMessage.SUCCESS, changedGroups)));
        } catch (IOException ex) {
            log.error("Sending response failed.", ex);
        }
    }

I'm still a bit fuzzy on how exactly the request gets held here. I'll study the experts' analyses and fill this part in later.
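
As a starting point, the code above suggests that no thread is parked while the request waits: request.startAsync() detaches the response from the servlet thread, and LongPollingClient.run() only schedules a timeout and adds itself to clients. The sketch below models that hold-then-respond pattern with plain JDK classes. It is an assumption-laden illustration, not Soul's code: HoldRequestSketch, Waiter, hold and onDataChanged are hypothetical names, a CompletableFuture stands in for the AsyncContext, and onDataChanged plays the role that the Javadoc attributes to DataChangeTask.

```java
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical model of "hold the request until data changes or the timeout fires".
final class HoldRequestSketch {

    private static final class Waiter {
        final CompletableFuture<List<String>> response = new CompletableFuture<>();
        volatile ScheduledFuture<?> timeout;
    }

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Queue<Waiter> clients = new ConcurrentLinkedQueue<>();

    /** Registers a held request; the future completes on change or on timeout. */
    CompletableFuture<List<String>> hold(final long timeoutMillis) {
        Waiter waiter = new Waiter();
        // Timeout path: drop the waiter and answer with an empty change list.
        waiter.timeout = scheduler.schedule(() -> {
            clients.remove(waiter);
            waiter.response.complete(Collections.emptyList());
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        clients.add(waiter);
        return waiter.response;
    }

    /** Change path: cancel each pending timeout and answer with the changed group. */
    void onDataChanged(final String group) {
        Waiter waiter;
        while ((waiter = clients.poll()) != null) {
            waiter.timeout.cancel(false);
            waiter.response.complete(Collections.singletonList(group));
        }
    }

    void shutdown() {
        scheduler.shutdownNow();
    }

    public static void main(final String[] args) {
        HoldRequestSketch sketch = new HoldRequestSketch();
        CompletableFuture<List<String>> held = sketch.hold(60_000);
        System.out.println("done before change: " + held.isDone());
        sketch.onDataChanged("PLUGIN");
        System.out.println("response: " + held.join());
        sketch.shutdown();
    }
}
```

Seen this way, the long delay observed on the bootstrap side is simply the timeout path: nothing changed, so the response was only written once the scheduled task fired.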

Reposted from blog.csdn.net/hellboy0621/article/details/113287551