ShardingSphere Source Code Analysis: The Execution Engine (Part 1)

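Starting today, we move on to a brand-new module: ShardingSphere's execution engine (ExecuteEngine). The execution engine takes the SQL produced by the routing and rewrite engines and executes it against the actual databases. It is one of ShardingSphere's core modules, so we will devote several articles to covering it thoroughly.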

1. Overall structure of the execution engine

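Before looking at specific code, let us start once more from the two classes introduced in "ShardingSphere Source Code Analysis: The Routing Engine (Part 7)", PreparedQueryShardingEngine and SimpleQueryShardingEngine, and see where they are referenced. In the ShardingPreparedStatement class we find the following shard method, where shardingEngine is a PreparedQueryShardingEngine: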

private void shard() {
    sqlRouteResult = shardingEngine.shard(sql, getParameters());
}

In ShardingStatement we find a corresponding method, which clearly uses SimpleQueryShardingEngine:

private void shard(final String sql) {
    ShardingRuntimeContext runtimeContext = connection.getRuntimeContext();
    SimpleQueryShardingEngine shardingEngine = new SimpleQueryShardingEngine(runtimeContext.getRule(), runtimeContext.getProps(), runtimeContext.getMetaData(), runtimeContext.getParseEngine());
    sqlRouteResult = shardingEngine.shard(sql, Collections.emptyList());
}

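In design-pattern terms, ShardingStatement and ShardingPreparedStatement are typical Facade classes: they bring together the entry points for both SQL routing and SQL execution. We can therefore look inside these two facades for the entry classes of the execution engine. Reading the source, it is easy to see that ShardingStatement holds a StatementExecutor, while ShardingPreparedStatement holds a PreparedStatementExecutor and a BatchPreparedStatementExecutor. All of these class names end in Executor, and they are clearly the entry classes of the SQL execution engine we are looking for.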

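All three Executors live in the sharding-jdbc-core project. There is, however, also a sharding-core-execute project that sits alongside sharding-core-route and sharding-core-rewrite, and judging by its name it should be related to the execution engine as well. Sure enough, in that project we find the ShardingExecuteEngine class, which is the entry class of the sharding execution engine. We also find the SQLExecuteTemplate and SQLExecutePrepareTemplate classes, which are typical SQL execution template classes.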

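Given what we know so far about ShardingSphere's coding style, we can guess the layering: ShardingExecuteEngine is the low-level object, and SQLExecuteTemplate should depend on it. StatementExecutor, PreparedStatementExecutor and BatchPreparedStatementExecutor are upper-level objects and should depend on SQLExecuteTemplate. A quick read of the references between these core classes confirms this guess.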

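Based on this analysis, we can draw the overall structure of the SQL execution engine as shown in the following diagram. The part above the line lives in the sharding-core-execute project and contains the low-level components; the part below the line lives in sharding-jdbc-core and contains the upper-level components: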

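The diagram also includes SQLExecuteCallback and SQLExecutePrepareCallback. As their names suggest, they handle callbacks during SQL execution, which is a very typical way of providing extensibility.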

2. ShardingExecuteEngine

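As usual, we start from the bottom layer, with ShardingExecuteEngine. Unlike the routing and rewrite engines, ShardingExecuteEngine is the only execution engine in ShardingSphere, so it is designed directly as a class rather than as an interface. The class contains the following fields and constructor: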

private final ShardingExecutorService shardingExecutorService;

private ListeningExecutorService executorService;

public ShardingExecuteEngine(final int executorSize) {
    shardingExecutorService = new ShardingExecutorService(executorSize);
    executorService = shardingExecutorService.getExecutorService();
}

There are two fields here whose types end in ExecutorService. As the names suggest, both are executor services, similar to java.util.concurrent.ExecutorService in the JDK. ListeningExecutorService comes from Google's Guava library, while ShardingExecutorService is a custom class that encapsulates how the ListeningExecutorService is built. Let us look at ShardingExecutorService first.

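ShardingExecutorService starts from a plain JDK ExecutorService. It is created as follows, in the usual JDK way: a cached thread pool when executorSize is 0, and a fixed-size thread pool otherwise: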

private ExecutorService getExecutorService(final int executorSize, final String nameFormat) {
    ThreadFactory shardingThreadFactory = ShardingThreadFactoryBuilder.build(nameFormat);
    return 0 == executorSize ? Executors.newCachedThreadPool(shardingThreadFactory) : Executors.newFixedThreadPool(executorSize, shardingThreadFactory);
}
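The pool selection above can be tried out with the JDK alone. The following is a minimal sketch, not ShardingSphere code: PoolSelectionSketch and namedFactory are hypothetical names, and namedFactory only assumes that ShardingThreadFactoryBuilder produces threads named from the given format (its actual implementation is not shown in this article).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the pool creation shown above; not ShardingSphere code.
final class PoolSelectionSketch {

    // Assumption: threads are named from a format such as "demo-%d" and run as daemons.
    static ThreadFactory namedFactory(final String nameFormat) {
        final AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, String.format(nameFormat, counter.getAndIncrement()));
            thread.setDaemon(true);
            return thread;
        };
    }

    // 0 selects a cached pool that grows on demand; any other value selects a fixed-size pool.
    static ExecutorService create(final int executorSize, final String nameFormat) {
        ThreadFactory factory = namedFactory(nameFormat);
        return 0 == executorSize ? Executors.newCachedThreadPool(factory) : Executors.newFixedThreadPool(executorSize, factory);
    }

    public static void main(final String[] args) throws Exception {
        ExecutorService pool = create(2, "demo-%d");
        Callable<String> task = () -> Thread.currentThread().getName();
        // Prints the name of the worker thread that ran the task.
        System.out.println(pool.submit(task).get());
        pool.shutdown();
    }
}
```

Naming the threads is mostly a diagnostic aid: thread dumps and logs then show which pool a worker belongs to.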

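The Future returned by an ordinary JDK thread pool is fairly limited, so Guava provides ListeningExecutorService to decorate it. Wrapping an ExecutorService in a ListeningExecutorService makes it return ListenableFuture instances. ListenableFuture extends Future and adds an addListener method, so that a registered listener is called back automatically once the task completes. The ListeningExecutorService is built as follows: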

public ShardingExecutorService(final int executorSize, final String nameFormat) {
    executorService = MoreExecutors.listeningDecorator(getExecutorService(executorSize, nameFormat));
    MoreExecutors.addDelayedShutdownHook(executorService, 60, TimeUnit.SECONDS);
}
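To make the idea of "being called back on completion" concrete without adding Guava to the classpath, here is a small sketch that uses the JDK's CompletableFuture as an analogue. It is not the Guava API used by ShardingSphere: thenAccept merely plays the role that addListener plays on a ListenableFuture, and CompletionCallbackSketch is a hypothetical class name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// JDK-only analogue of a completion listener; not Guava and not ShardingSphere code.
final class CompletionCallbackSketch {

    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            AtomicReference<String> observed = new AtomicReference<>();
            CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> "rows=3", pool);
            // The callback runs once the task has finished; the caller never polls for the result.
            CompletableFuture<Void> listener = future.thenAccept(observed::set);
            // Waiting here only makes the demo deterministic.
            listener.join();
            return observed.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) {
        System.out.println(demo());
    }
}
```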

With the ExecutorService clarified, we return to the ShardingExecuteEngine class. Its entry point is the groupExecute method shown below (it takes quite a few parameters, so each one is described):

/**
 * @param inputGroups input groups
 * @param firstCallback callback for the first group
 * @param callback callback for the remaining groups
 * @param serial whether to execute serially (true) or with multiple threads (false)
 * @param <I> type of the input values
 * @param <O> type of the return values
 * @return execution results
 * @throws SQLException if execution fails
 */
public <I, O> List<O> groupExecute(
    final Collection<ShardingExecuteGroup<I>> inputGroups, final ShardingGroupExecuteCallback<I, O> firstCallback, final ShardingGroupExecuteCallback<I, O> callback, final boolean serial)
    throws SQLException {
    if (inputGroups.isEmpty()) {
        return Collections.emptyList();
    }
    return serial ? serialExecute(inputGroups, firstCallback, callback) : parallelExecute(inputGroups, firstCallback, callback);
}

The ShardingExecuteGroup object here is really just a wrapper around a list of inputs, defined as follows:

public final class ShardingExecuteGroup<T> {

    private final List<T> inputs;
}

The input of groupExecute is a collection of ShardingExecuteGroup objects. Depending on the serial parameter, the flow branches into either serialExecute or parallelExecute.

Let us look at serialExecute first. As the name implies, it handles serial execution and is defined as follows:

private <I, O> List<O> serialExecute(final Collection<ShardingExecuteGroup<I>> inputGroups, final ShardingGroupExecuteCallback<I, O> firstCallback,
                                     final ShardingGroupExecuteCallback<I, O> callback) throws SQLException {
    Iterator<ShardingExecuteGroup<I>> inputGroupsIterator = inputGroups.iterator();
    ShardingExecuteGroup<I> firstInputs = inputGroupsIterator.next();
    List<O> result = new LinkedList<>(syncGroupExecute(firstInputs, null == firstCallback ? callback : firstCallback));
    for (ShardingExecuteGroup<I> each : Lists.newArrayList(inputGroupsIterator)) {
        result.addAll(syncGroupExecute(each, callback));
    }
    return result;
}

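The basic flow is to take the first ShardingExecuteGroup and execute it synchronously through syncGroupExecute, using firstCallback (or callback when firstCallback is null). The remaining ShardingExecuteGroup objects are then executed one by one through syncGroupExecute with callback. The syncGroupExecute method looks like this: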

private <I, O> Collection<O> syncGroupExecute(final ShardingExecuteGroup<I> executeGroup, final ShardingGroupExecuteCallback<I, O> callback) throws SQLException {
    return callback.execute(executeGroup.getInputs(), true, ShardingExecuteDataMap.getDataMap());
}

As we can see, synchronous execution is simply delegated to ShardingGroupExecuteCallback, an interface defined as follows:

public interface ShardingGroupExecuteCallback<I, O> {

    Collection<O> execute(Collection<I> inputs, boolean isTrunkThread, Map<String, Object> shardingExecuteDataMap) throws SQLException;
}

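ShardingExecuteDataMap acts as a data dictionary for SQL execution. The dictionary is kept in a ThreadLocal, so each thread works with its own copy and thread safety is preserved. The DataMap object is looked up according to the currently executing thread.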

That covers the serial flow. Next comes parallelExecute, which handles parallel execution:

private <I, O> List<O> parallelExecute(final Collection<ShardingExecuteGroup<I>> inputGroups, final ShardingGroupExecuteCallback<I, O> firstCallback, final ShardingGroupExecuteCallback<I, O> callback) throws SQLException {
    Iterator<ShardingExecuteGroup<I>> inputGroupsIterator = inputGroups.iterator();
    ShardingExecuteGroup<I> firstInputs = inputGroupsIterator.next();
    Collection<ListenableFuture<Collection<O>>> restResultFutures = asyncGroupExecute(Lists.newArrayList(inputGroupsIterator), callback);
    return getGroupResults(syncGroupExecute(firstInputs, null == firstCallback ? callback : firstCallback), restResultFutures);
}

Note the asynchronous method asyncGroupExecute, which takes a List of ShardingExecuteGroup objects:

private <I, O> Collection<ListenableFuture<Collection<O>>> asyncGroupExecute(final List<ShardingExecuteGroup<I>> inputGroups, final ShardingGroupExecuteCallback<I, O> callback) {
    Collection<ListenableFuture<Collection<O>>> result = new LinkedList<>();
    for (ShardingExecuteGroup<I> each : inputGroups) {
        result.add(asyncGroupExecute(each, callback));
    }
    return result;
}

For each ShardingExecuteGroup passed in, it then calls an overload of asyncGroupExecute with the same name:

private <I, O> ListenableFuture<Collection<O>> asyncGroupExecute(final ShardingExecuteGroup<I> inputGroup, final ShardingGroupExecuteCallback<I, O> callback) {
    final Map<String, Object> dataMap = ShardingExecuteDataMap.getDataMap();
    return executorService.submit(new Callable<Collection<O>>() {

        @Override
        public Collection<O> call() throws SQLException {
            return callback.execute(inputGroup.getInputs(), false, dataMap);
        }
    });
}

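As an asynchronous method, it uses the ListeningExecutorService to submit a task and gets back a ListenableFuture; the task itself is simply the callback. Note that dataMap is read on the submitting thread before the task is created and then passed into the callback, because a ThreadLocal lookup inside the worker thread would not see the caller's data.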
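The reason for capturing the data map before submitting the task can be shown with a small JDK-only sketch. DataMapHandOffSketch and its DATA_MAP field are hypothetical stand-ins for ShardingExecuteDataMap, whose internals are not shown in this article; the only assumption is that the map is stored per thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a ThreadLocal-backed data map; not ShardingSphere code.
final class DataMapHandOffSketch {

    private static final ThreadLocal<Map<String, Object>> DATA_MAP = ThreadLocal.withInitial(HashMap::new);

    // Reads "traceId" on a worker thread, either from the captured map or from the worker's own ThreadLocal.
    static Object readOnWorker(final boolean capture) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            DATA_MAP.get().put("traceId", "abc");
            // Captured on the submitting thread, in the same spirit as dataMap in asyncGroupExecute.
            final Map<String, Object> captured = DATA_MAP.get();
            Callable<Object> task = () -> (capture ? captured : DATA_MAP.get()).get("traceId");
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(readOnWorker(true));  // the captured map carries the caller's value
        System.out.println(readOnWorker(false)); // the worker's own map is empty, so this prints null
    }
}
```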

Finally, the last statement of parallelExecute calls getGroupResults to collect the execution results:

private <O> List<O> getGroupResults(final Collection<O> firstResults, final Collection<ListenableFuture<Collection<O>>> restFutures) throws SQLException {
    List<O> result = new LinkedList<>(firstResults);
    for (ListenableFuture<Collection<O>> each : restFutures) {
        try {
            result.addAll(each.get());
        } catch (final InterruptedException | ExecutionException ex) {
            return throwException(ex);
        }
    }
    return result;
}

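Readers familiar with Future will recognize this code. We iterate over the ListenableFuture objects and call get on each one, blocking until its result is available. Once all results have arrived, they are assembled into a single list and returned. This is a very common way of working with Future.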
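The collection step can be sketched with plain JDK Future objects. The article does not show what throwException does, so the error handling below is an assumption made for illustration: a SQLException thrown by a worker is rethrown as-is, and anything else is wrapped. GatherResultsSketch is a hypothetical class name.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// JDK-only sketch of gathering results from futures; not ShardingSphere code.
final class GatherResultsSketch {

    static <O> List<O> gather(final Collection<O> firstResults, final Collection<Future<Collection<O>>> restFutures) throws SQLException {
        List<O> result = new ArrayList<>(firstResults);
        for (Future<Collection<O>> each : restFutures) {
            try {
                // Blocks until this task has finished, then appends its results in submission order.
                result.addAll(each.get());
            } catch (final InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new SQLException("Interrupted while waiting for a result", ex);
            } catch (final ExecutionException ex) {
                // Assumed policy: surface the worker's SQLException unchanged, wrap any other failure.
                if (ex.getCause() instanceof SQLException) {
                    throw (SQLException) ex.getCause();
                }
                throw new SQLException(ex.getCause());
            }
        }
        return result;
    }

    static List<Integer> demo() throws SQLException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Collection<Integer>> task = () -> Arrays.asList(2, 3);
            List<Future<Collection<Integer>>> rest = new ArrayList<>();
            rest.add(pool.submit(task));
            return gather(Arrays.asList(1), rest);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) throws SQLException {
        System.out.println(demo()); // prints [1, 2, 3]
    }
}
```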

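Looking back, both serialExecute and parallelExecute take the first element, firstInputs, out of the collection of ShardingExecuteGroup objects and execute it first; the rest are then executed synchronously or asynchronously. In the parallel case the remaining groups are submitted to the thread pool before the first group is executed, so the first task runs on the current thread while the others are already in flight. The point of this design is that the calling thread does useful work instead of idly waiting, so no thread is wasted. It is a handy trick in multithreaded programming.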
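The "first task on the caller thread" technique can be reproduced in a few lines with a plain ExecutorService. This is a simplified sketch under stated assumptions, not the ShardingSphere implementation: FirstOnCallerThreadSketch and GroupCallback are hypothetical names, each input is a single value rather than a group, and there is no separate firstCallback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// JDK-only sketch of running the first task on the caller thread; not ShardingSphere code.
final class FirstOnCallerThreadSketch {

    // Hypothetical callback, loosely modelled on ShardingGroupExecuteCallback.
    interface GroupCallback<I, O> {
        O execute(I input, boolean isTrunkThread) throws Exception;
    }

    static <I, O> List<O> parallelExecute(final ExecutorService pool, final List<I> inputs, final GroupCallback<I, O> callback) throws Exception {
        List<O> result = new ArrayList<>();
        if (inputs.isEmpty()) {
            return result;
        }
        Iterator<I> iterator = inputs.iterator();
        I first = iterator.next();
        // Submit the remaining inputs first so that they start running in the background.
        List<Future<O>> restFutures = new ArrayList<>();
        while (iterator.hasNext()) {
            final I each = iterator.next();
            Callable<O> task = () -> callback.execute(each, false);
            restFutures.add(pool.submit(task));
        }
        // The caller thread handles the first input itself instead of idling.
        result.add(callback.execute(first, true));
        for (Future<O> each : restFutures) {
            result.add(each.get());
        }
        return result;
    }

    static List<String> demo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return parallelExecute(pool, Arrays.asList("ds_0", "ds_1", "ds_2"),
                (input, isTrunkThread) -> input + (isTrunkThread ? ":caller" : ":pool"));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(demo()); // prints [ds_0:caller, ds_1:pool, ds_2:pool]
    }
}
```

Results are collected in submission order, so the output order matches the input order even though the pooled tasks may finish in any order.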

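This concludes our look at the ShardingExecuteEngine class. As an execution engine, all it does is provide a multithreaded execution environment. In essence, ShardingExecuteEngine contains no business logic at all: it only supplies the threads and runs the callbacks passed to it, which is a very neat design. The same technique can be seen in many places in other open-source frameworks such as Spring.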

For more content, follow my WeChat official account: 程序员向架构师转型.
