A search runs in three phases: query, fetch, and expand. The first two are the important ones. The previous article covered the query phase; this one covers the fetch phase, whose job is to resolve the results produced by the query phase.
// Suppose there are 5 shards: only once every shard has finished querying does the next phase begin
private void successfulShardExecution(SearchShardIterator shardsIt) {
    final int remainingOpsOnIterator;
    if (shardsIt.skip()) {
        remainingOpsOnIterator = shardsIt.remaining();
    } else {
        remainingOpsOnIterator = shardsIt.remaining() + 1;
    }
    final int xTotalOps = totalOps.addAndGet(remainingOpsOnIterator);
    if (xTotalOps == expectedTotalOps) {
        // this is where the transition to the next phase happens
        onPhaseDone();
    }
}
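The transition logic above boils down to an atomic running total: each completed shard iterator contributes its remaining op count (plus one for itself unless it was skipped), and the phase transition fires exactly once, when the total reaches the expected total. A minimal self-contained sketch of that counting (the class and method names here are hypothetical, not the Elasticsearch ones):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the op counting that drives the query -> fetch transition.
class PhaseTransitionCounter {
    private final AtomicInteger totalOps = new AtomicInteger();
    private final int expectedTotalOps;
    private boolean phaseDone = false; // single-threaded here; the real code fires onPhaseDone()

    PhaseTransitionCounter(int expectedTotalOps) {
        this.expectedTotalOps = expectedTotalOps;
    }

    // Called once per shard iterator: a skipped shard contributes only its
    // remaining ops, an executed one contributes remaining + 1 (itself).
    void onShardDone(int remaining, boolean skipped) {
        int contributed = skipped ? remaining : remaining + 1;
        if (totalOps.addAndGet(contributed) == expectedTotalOps) {
            phaseDone = true; // corresponds to onPhaseDone() in Elasticsearch
        }
    }

    boolean isPhaseDone() {
        return phaseDone;
    }
}
```

With 5 shards, each executed shard contributes 1 op when none remain on its iterator, so the flag flips exactly on the fifth call.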
// here "this" is the query phase; this method formally moves the search from query to fetch
final void onPhaseDone() { // as a tribute to @kimchy aka. finishHim()
    executeNextPhase(this, getNextPhase(results, this));
}
// the next phase is the FetchSearchPhase created here
@Override
protected SearchPhase getNextPhase(final SearchPhaseResults<SearchPhaseResult> results, final SearchPhaseContext context) {
    return new FetchSearchPhase(results, searchPhaseController, context, clusterState());
}
Since the query phase touched 5 shards, there are also 5 results:
[2020-09-14T22:05:10,997][TRACE][o.e.a.s.TransportSearchAction] [] [query] Moving to next phase: [fetch], based on results from: [zTVfnC6sRReXE9mbtqd6Aw][90][0],[zTVfnC6sRReXE9mbtqd6Aw][90][1],[zTVfnC6sRReXE9mbtqd6Aw][90][2],[zTVfnC6sRReXE9mbtqd6Aw][90][3],[zTVfnC6sRReXE9mbtqd6Aw][90][4] (cluster state version: 138)
The log line above confirms the transition.
In the fetch phase, execution also runs on the thread pool named search.
Like the query phase, the fetch phase builds request objects and sends them out:
ShardFetchSearchRequest fetchSearchRequest = createFetchRequest(queryResult.queryResult().getContextId(), i, entry,
lastEmittedDocPerShard, searchShardTarget.getOriginalIndices());
executeFetch(i, searchShardTarget, counter, fetchSearchRequest, queryResult.queryResult(),
connection);
At this point the action name is
indices:data/read/search[phase/fetch/id]
I originally assumed a fetch request would be sent only once, but that is not the case: how many are sent depends on the parsed results of the query phase. The relevant code:
private void innerRun() throws Exception {
    // iterate over the per-shard results of the query phase
    for (int i = 0; i < docIdsToLoad.length; i++) {
        IntArrayList entry = docIdsToLoad[i];
        SearchPhaseResult queryResult = queryResults.get(i);
        // whenever entry is non-null, the else branch is taken and a fetch request is sent
        if (entry == null) { // no results for this shard ID
            if (queryResult != null) {
                // if we got some hits from this shard we have to release the context there
                // we do this as we go since it will free up resources and passing on the request on the
                // transport layer is cheap.
                releaseIrrelevantSearchContext(queryResult.queryResult());
                progressListener.notifyFetchResult(i);
            }
            // in any case we count down this result since we don't talk to this shard anymore
            counter.countDown();
        } else {
            // so fetch requests may be sent multiple times, one per shard that has hits
            SearchShardTarget searchShardTarget = queryResult.getSearchShardTarget();
            Transport.Connection connection = context.getConnection(searchShardTarget.getClusterAlias(),
                searchShardTarget.getNodeId());
            ShardFetchSearchRequest fetchSearchRequest = createFetchRequest(queryResult.queryResult().getContextId(), i, entry,
                lastEmittedDocPerShard, searchShardTarget.getOriginalIndices());
            executeFetch(i, searchShardTarget, counter, fetchSearchRequest, queryResult.queryResult(),
                connection);
        }
    }
}
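The dispatch decision in that loop can be sketched on its own: one slot per shard, and only a non-null doc-id list triggers a fetch request, while empty slots count down toward phase completion immediately. A minimal sketch under those assumptions (a plain int[][] stands in for the IntArrayList-based docIdsToLoad; the names are hypothetical, not Elasticsearch's):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the fetch dispatch loop.
class FetchDispatcher {
    // Returns the shard indices that would receive a fetch request. Shards
    // with no doc ids to load are counted down right away; in Elasticsearch
    // the fetch responses count the remaining shards down later.
    static List<Integer> shardsToFetch(int[][] docIdsToLoad, AtomicInteger counter) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < docIdsToLoad.length; i++) {
            if (docIdsToLoad[i] == null) {
                counter.decrementAndGet(); // nothing to fetch from this shard
            } else {
                targets.add(i); // a fetch request would be sent to shard i
            }
        }
        return targets;
    }
}
```

So with 5 shards where only 3 slots hold doc ids, exactly 3 fetch requests go out.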
ArraySearchPhaseResults has a dedicated field that stores the per-shard results in an array:
final AtomicArray<Result> results;
The array length equals the number of shards, but not every slot leads to a fetch request.
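The idea behind that field can be sketched with the JDK's AtomicReferenceArray: a fixed-size slot per shard, filled concurrently as responses arrive (ShardResults here is a hypothetical stand-in for ArraySearchPhaseResults, not the real class):

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical stand-in for ArraySearchPhaseResults: one slot per shard.
class ShardResults<R> {
    private final AtomicReferenceArray<R> results;

    ShardResults(int numShards) {
        this.results = new AtomicReferenceArray<>(numShards);
    }

    // store the result for one shard as its response arrives
    void consumeResult(int shardIndex, R result) {
        results.set(shardIndex, result);
    }

    R get(int shardIndex) {
        return results.get(shardIndex);
    }
}
```

A shard that produced no hits simply leaves its slot null, which is why not every slot turns into a fetch request.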
Since there are multiple fetch results, they need to be merged. Merging here can be understood as taking the result sets, ordering them by score (or by sort fields), and placing them into a single collection; the response is then built from the merged result.
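As a rough illustration of that merge, the sketch below combines per-shard hit lists into one list ordered by descending score and truncates it to the requested size (Hit and HitMerger are hypothetical stand-ins, not Elasticsearch types):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class HitMerger {
    // minimal stand-in for a search hit: a document id and its relevance score
    record Hit(String id, float score) {}

    // combine per-shard hit lists into one list sorted by descending score,
    // truncated to the requested size (the "merge" step in spirit)
    static List<Hit> merge(List<List<Hit>> perShardHits, int size) {
        List<Hit> all = new ArrayList<>();
        perShardHits.forEach(all::addAll);
        all.sort(Comparator.comparingDouble((Hit h) -> h.score()).reversed());
        return all.subList(0, Math.min(size, all.size()));
    }
}
```

Sorting by field instead of score would just swap in a different Comparator; the structure of the merge stays the same.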
After the fetch phase completes comes the expand phase. The run method of ExpandSearchPhase:
@Override
public void run() {
    // the first argument is the result returned by the fetch phase
    context.sendSearchResponse(searchResponse, scrollId);
}
The main work of this phase is field collapsing, which only happens when the request asks for it; if it does not, the phase passes straight through to the next step.
For query-then-fetch, the search is then complete.