How Solr decides whether a request is distributed: HttpShardHandler

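In SolrCloud, most queries we issue are sent to several shards. Sometimes, though, we already know which shard holds the data we want, so how do we send the request to that one shard only? This post looks at how SolrCloud decides whether a request is distributed. The code lives in HttpShardHandler, in the checkDistributed method: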

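  /**
   * Decides whether this request is distributed, based on whether ZooKeeper is in use.
   * If it is, finds the shards the Router selects for the request
   * and stores the replica URLs of each shard, separated by |, in rb.shards and rb.slices.
   */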
  @Override
  public void checkDistributed(ResponseBuilder rb) {
    
    SolrQueryRequest req = rb.req;
    SolrParams params = req.getParams();
    
    // Check the distrib parameter first; if it is absent, default to whether ZooKeeper is in use.
    rb.isDistrib = params.getBool("distrib", req.getCore().getCoreDescriptor().getCoreContainer().isZooKeeperAware());
    String shards = params.get(ShardParams.SHARDS); // the shards parameter given in the request, if any
    
    // for back compat, a shards param with URLs like localhost:8983/solr will mean that this
    // search is distributed.
    boolean hasShardURL = shards != null && shards.indexOf('/') > 0;
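    // Three things decide whether the request is distributed (i.e. forwarded to several shards):
    // the distrib parameter, whether ZooKeeper is in use, and whether shards contains a URL.
    rb.isDistrib = hasShardURL | rb.isDistrib;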
    
    if (rb.isDistrib) { // the request is distributed
      
      // since the cost of grabbing cloud state is still up in the air, we grab it only if we need it.
      ClusterState clusterState = null;
      Map<String,Slice> slices = null;
      CoreDescriptor coreDescriptor = req.getCore().getCoreDescriptor();
      CloudDescriptor cloudDescriptor = coreDescriptor.getCloudDescriptor();
      ZkController zkController = coreDescriptor.getCoreContainer().getZkController();
      
      if (shards != null) { // shards was given in the request parameters, so use it as given
        List<String> lst = StrUtils.splitSmart(shards, ",", true); // several shards may be listed, separated by commas
        rb.shards = lst.toArray(new String[lst.size()]);
        rb.slices = new String[rb.shards.length];
        
        if (zkController != null) {
          // figure out which shards are slices
          for (int i = 0; i < rb.shards.length; i++) {
            if (rb.shards[i].indexOf('/') < 0) {
              // this is a logical shard
              rb.slices[i] = rb.shards[i];
              rb.shards[i] = null;
            }
          }
        }
      } else if (zkController != null) { // shards was not given and ZooKeeper is in use
        
        // we weren't provided with an explicit list of slices to query via "shards", so use the cluster state
        clusterState = zkController.getClusterState();
        // shardKeys is the _route_ request parameter. It names the shard(s) to route to and works with
        // any Router (the implicit Router, for example, can also use a field name to pick the shard).
        String shardKeys = params.get(ShardParams._ROUTE_);
        
        // This will be the complete list of slices we need to query for this request.
        slices = new HashMap<>();
        
        // we need to find out what collections this request is for.
        
        // A comma-separated list of specified collections.
        // Eg: "collection1,collection2,collection3"
        String collections = params.get("collection"); // may name several collections, separated by commas
        if (collections != null) {
          // If there were one or more collections specified in the query, split
          // each parameter and store as a separate member of a List.
          List<String> collectionList = StrUtils.splitSmart(collections, ",", true);
          // In turn, retrieve the slices that cover each collection from the
          // cloud state and add them to the Map 'slices'.
          for (String collectionName : collectionList) { // assume there is only one collection
            // The original code produced <collection-name>_<shard-name> when the collections
            // parameter was specified (see ClientUtils.appendMap)
            // Is this necessary if only one collection is specified?
            // i.e. should we change multiCollection to collectionList.size() > 1?
            // Find every shard to query from this collection's routing strategy and the request parameters.
            // The implementation involves DocRouter; see http://suichangkele.iteye.com/blog/2363305.
            addSlices(slices, clusterState, params, collectionName, shardKeys, true);
          }
        } else {
          // just this collection
          String collectionName = cloudDescriptor.getCollectionName();
          addSlices(slices, clusterState, params, collectionName, shardKeys, false);
        }
        
        // Store the logical slices in the ResponseBuilder and create a new
        // String array to hold the physical shards (which will be mapped
        // later).
        rb.slices = slices.keySet().toArray(new String[slices.size()]);
        rb.shards = new String[rb.slices.length];
      }

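Reading this code makes SolrCloud's routing rule for distributed requests clear. If the request specifies shards, exactly those shards are queried. If it does not, the collection's DocRouter uses the _route_ parameter to decide which shards to route to. So to query a single known shard, name it in the shards parameter. How DocRouter works is covered in http://suichangkele.iteye.com/blog/2363305.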
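
To make the two decisions above concrete, here is a minimal standalone sketch of the same logic, with no Solr dependency. The class DistribCheckSketch and its methods isDistributed and logicalSlices are hypothetical names invented for illustration, and String.split stands in for StrUtils.splitSmart, so this only approximates what the real method does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the decisions in checkDistributed; not Solr code. */
class DistribCheckSketch {

  /**
   * distribParam: value of the "distrib" parameter, or null if absent.
   * zkAware:      whether the node runs with ZooKeeper (the default for distrib).
   * shards:       raw value of the "shards" parameter, or null if absent.
   */
  static boolean isDistributed(Boolean distribParam, boolean zkAware, String shards) {
    boolean distrib = (distribParam != null) ? distribParam : zkAware;
    // Back-compat rule from the source: a shards value containing a URL forces distribution.
    boolean hasShardUrl = shards != null && shards.indexOf('/') > 0;
    return hasShardUrl || distrib;
  }

  /** Entries of "shards" without a '/' are logical slice names; the rest are URLs. */
  static List<String> logicalSlices(String shards) {
    List<String> slices = new ArrayList<>();
    for (String entry : shards.split(",")) { // simplification of StrUtils.splitSmart
      String trimmed = entry.trim();
      if (!trimmed.isEmpty() && trimmed.indexOf('/') < 0) {
        slices.add(trimmed);
      }
    }
    return slices;
  }

  public static void main(String[] args) {
    System.out.println(isDistributed(null, true, null));                    // true
    System.out.println(isDistributed(false, true, "shard1"));               // false
    System.out.println(isDistributed(false, false, "localhost:8983/solr")); // true
    System.out.println(logicalSlices("shard1,localhost:8983/solr,shard2")); // [shard1, shard2]
  }
}
```

The second call shows a detail that is easy to miss: distrib=false combined with a plain shard name leaves the request non-distributed, because only a shards value containing a URL overrides the distrib flag.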

