ES Learning Notes: ClusterState

When we studied the ES get API earlier, the whole point was to use it as a reference for writing ES plugins. The focus then was on understanding the overall process, mainly the internal call logic of the shardOperation() method, while the shards() method was glossed over. In fact, shards() plays a big role in understanding ES at the structural level. So let's start again from the get API and work through shards().

First, look at how the get API is used:

 Add a document to ES:
 curl -XPUT 'http://localhost:9200/test1/type1/1' -d '{"name":"hello"}'

 Read a document by its id:
 curl -XGET 'http://localhost:9200/test1/type1/1' 

Usage is very simple, but the distributed logic behind it is not. Suppose the ES cluster has three nodes, and the index has three shards with one replica each. The index settings would be:

{
  "test1" : {
    "settings" : {
      "index" : {
        "number_of_replicas" : "1",
        "number_of_shards" : "3"
      }
    }
  }
}

So which shard does the document with id 1 end up on? A thorough answer deserves a detailed post of its own; here we just give the short conclusion:

By default, ES computes a hash of the document id using Murmur3HashFunction, then takes that hash modulo the number of shards. The implementation is MathUtils.mod(hash, indexMetaData.getNumberOfShards()); the result is the id of the shard the document lives on, which is why ES shard numbering starts from 0.
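As a loose, self-contained illustration of that conclusion (String.hashCode here stands in for ES's Murmur3HashFunction, and Math.floorMod behaves like MathUtils.mod for a positive shard count):

```java
public class ShardRouter {
    // Stand-in hash: ES actually hashes the routing value (by default the
    // document id) with Murmur3HashFunction. String.hashCode() is used here
    // only so the sketch is self-contained.
    static int hash(String routing) {
        return routing.hashCode();
    }

    // Equivalent to ES's MathUtils.mod(hash, numberOfShards) for a positive
    // modulus: the result is always a valid shard id in [0, numberOfShards).
    static int shardId(String docId, int numberOfShards) {
        return Math.floorMod(hash(docId), numberOfShards);
    }

    public static void main(String[] args) {
        // With 3 shards, every document id maps deterministically to shard 0, 1 or 2.
        System.out.println(shardId("1", 3));
    }
}
```

The essential property is that the result always falls in [0, number_of_shards), so every document id maps to exactly one shard.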

After all, if you don't know where a document was stored, you cannot know where to fetch it from.

Next, let's sort out the core process of reading data:

s1: Locate the shard holding the document by its id. Since an index can have multiple replicas, one shard maps to multiple nodes.

s2: Using the shard-to-node mapping, pick one node and fetch the data from it. The key point here is how the node is chosen: in short, we need load balancing, otherwise configuring replicas would be pointless.
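Tying the two steps together, a minimal sketch (the shard-to-node map, the node names, and the String.hashCode stand-in for Murmur3 are all illustrative, not ES's real implementation):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class GetRouting {
    // Hypothetical shard -> candidate-node mapping; in ES this is derived
    // from the routing_table in ClusterState.
    static final Map<Integer, List<String>> SHARD_TO_NODES = Map.of(
            0, List.of("nodeA", "nodeB"),
            1, List.of("nodeB", "nodeC"),
            2, List.of("nodeC", "nodeA"));

    static final AtomicInteger seed = new AtomicInteger(0);

    // s1: document id -> shard id (String.hashCode stands in for Murmur3).
    static int shardId(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    // s2: pick one node among the shard's copies, rotating per request
    // so load is spread across the copies.
    static String pickNode(int shardId) {
        List<String> nodes = SHARD_TO_NODES.get(shardId);
        return nodes.get(Math.floorMod(seed.getAndIncrement(), nodes.size()));
    }

    public static void main(String[] args) {
        int shard = shardId("1", 3);
        System.out.println(pickNode(shard)); // first request
        System.out.println(pickNode(shard)); // next request hits the other copy
    }
}
```

Real node selection in ES's OperationRouting is more involved (it also filters by shard state and honors the preference parameter), but the shape is the same: route to a shard, then pick one of its copies.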

Both steps are associated with one core data structure, ClusterState, which we can view with _cluster/state?pretty:

# http://localhost:9200/_cluster/state?pretty

{
  "cluster_name" : "elasticsearch",
  "version" : 4,
  "state_uuid" : "b6B739p5SbanNLyKxTMHfQ",
  "master_node" : "KnEE25tzRjaXblFJq5jqRA",
  "blocks" : { },
  "nodes" : {
    "KnEE25tzRjaXblFJq5jqRA" : {
      "name" : "Mysterio",
      "transport_address" : "127.0.0.1:9300",
      "attributes" : { }
    }
  },
  "metadata" : {
    "cluster_uuid" : "ZIl7g86YRiGv8Dqz4DCoAQ",
    "templates" : { },
    "indices" : {
      "test1" : {
        "state" : "open",
        "settings" : {
          "index" : {
            "creation_date" : "1553995485603",
            "uuid" : "U7v5t_T7RG6rNU3JlGCCBQ",
            "number_of_replicas" : "1",
            "number_of_shards" : "1",
            "version" : {
              "created" : "2040599"
            }
          }
        },
        "mappings" : { },
        "aliases" : [ ]
      }
    }
  },
  "routing_table" : {
    "indices" : {
      "test1" : {
        "shards" : {
          "0" : [ {
            "state" : "STARTED",
            "primary" : true,
            "node" : "KnEE25tzRjaXblFJq5jqRA",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "test1",
            "version" : 2,
            "allocation_id" : {
              "id" : "lcSHbfWDRyOKOhXAf3HXLA"
            }
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "test1",
            "version" : 2,
            "unassigned_info" : {
              "reason" : "INDEX_CREATED",
              "at" : "2019-03-31T01:24:45.845Z"
            }
          } ]
        }
      }
    }
  },
  "routing_nodes" : {
    "unassigned" : [ {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "test1",
      "version" : 2,
      "unassigned_info" : {
        "reason" : "INDEX_CREATED",
        "at" : "2019-03-31T01:24:45.845Z"
      }
    } ],
    "nodes" : {
      "KnEE25tzRjaXblFJq5jqRA" : [ {
        "state" : "STARTED",
        "primary" : true,
        "node" : "KnEE25tzRjaXblFJq5jqRA",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "test1",
        "version" : 2,
        "allocation_id" : {
          "id" : "lcSHbfWDRyOKOhXAf3HXLA"
        }
      } ]
    }
  }
}

The whole structure is fairly complex, so let's dismantle it slowly and break it down piece by piece. The dismantling follows the usage scenarios above.

  1. IndexMetaData
    metadata has the following format:
    "metadata" : {
    "cluster_uuid" : "ZIl7g86YRiGv8Dqz4DCoAQ",
    "templates" : { },
    "indices" : {
      "test1" : {
        "state" : "open",
        "settings" : {
          "index" : {
            "creation_date" : "1553995485603",
            "uuid" : "U7v5t_T7RG6rNU3JlGCCBQ",
            "number_of_replicas" : "1",
            "number_of_shards" : "1",
            "version" : {
              "created" : "2040599"
            }
          }
        },
        "mappings" : { },
        "aliases" : [ ]
      }
    }
    }

In other words, metadata stores, for every index in the cluster, the number of shards and replicas, the index settings, the mappings, and the aliases. It provides the ability to look up index metadata by index name, as used below:

# OperationRouting.generateShardId()

        IndexMetaData indexMetaData = clusterState.metaData().index(index);
        if (indexMetaData == null) {
            throw new IndexNotFoundException(index);
        }
        final Version createdVersion = indexMetaData.getCreationVersion();
        final HashFunction hashFunction = indexMetaData.getRoutingHashFunction();
        final boolean useType = indexMetaData.getRoutingUseType();

What we care about here is the line clusterState.metaData().index(index), which implements the lookup of index metadata by index name. Combining the shard count from that metadata with the document id, we can locate the shard where the document resides. This capability is needed by all three of the Delete, Index, and Get APIs. It also explains why ES does not allow the number of shards of an index to be changed: if it changed, the hash function could no longer locate the shard where existing data lives.
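A toy model of that lookup, with a plain map playing the role of clusterState.metaData().indices() (the IndexMeta record is illustrative, not ES's IndexMetaData class):

```java
import java.util.Map;

public class MetaDataLookup {
    // Illustrative stand-in for ES's IndexMetaData: just the fields we need.
    record IndexMeta(int numberOfShards, int numberOfReplicas) {}

    // metadata.indices: index name -> index metadata, as in ClusterState.
    static final Map<String, IndexMeta> INDICES =
            Map.of("test1", new IndexMeta(3, 1));

    static IndexMeta index(String name) {
        IndexMeta meta = INDICES.get(name);
        if (meta == null) {
            // ES throws IndexNotFoundException here.
            throw new IllegalArgumentException("no such index: " + name);
        }
        return meta;
    }

    public static void main(String[] args) {
        System.out.println(index("test1").numberOfShards()); // prints 3
    }
}
```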

  1. IndexRoutingTable
"routing_table" : {
    "indices" : {
      "test1" : {
        "shards" : {
          "0" : [ {
            "state" : "STARTED",
            "primary" : true,
            "node" : "KnEE25tzRjaXblFJq5jqRA",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "test1",
            "version" : 2,
            "allocation_id" : {
              "id" : "lcSHbfWDRyOKOhXAf3HXLA"
            }
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "test1",
            "version" : 2,
            "unassigned_info" : {
              "reason" : "INDEX_CREATED",
              "at" : "2019-03-31T01:24:45.845Z"
            }
          } ]
        }
      }
    }
  }

routing_table stores the shard allocation information for every index. From this structure we can clearly learn the following:

1. How the shards of an index are distributed across the nodes
2. Whether each shard copy is a primary
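What routing_table encodes can be modeled with a small sketch (the ShardRouting record here is an illustrative stand-in for ES's class of the same name), mirroring the _cluster/state output above:

```java
import java.util.List;

public class RoutingTableDemo {
    // Illustrative stand-in for ES's ShardRouting: one entry per shard copy.
    record ShardRouting(String index, int shard, boolean primary, String node, String state) {}

    // Shard 0 of test1, mirroring the _cluster/state output above: a STARTED
    // primary on node KnEE25tzRjaXblFJq5jqRA and an UNASSIGNED replica.
    static final List<ShardRouting> SHARD0 = List.of(
            new ShardRouting("test1", 0, true, "KnEE25tzRjaXblFJq5jqRA", "STARTED"),
            new ShardRouting("test1", 0, false, null, "UNASSIGNED"));

    public static void main(String[] args) {
        // 1. how the copies of the shard are distributed across nodes
        SHARD0.forEach(s -> System.out.println("shard " + s.shard() + " -> " + s.node()));
        // 2. which copy is the primary
        System.out.println("primaries: " + SHARD0.stream().filter(ShardRouting::primary).count());
    }
}
```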

If a shard has two replicas, each allocated on a different node, the get API has three candidate copies to choose from. Which one should it pick? (We leave the preference parameter aside here.)
To let every node be chosen fairly and achieve load balancing, a random number is used as a seed. See RotationShardShuffler:

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

import org.elasticsearch.common.util.CollectionUtils;

/**
 * Basic {@link ShardShuffler} implementation that uses an {@link AtomicInteger} to generate seeds and uses a rotation to permute shards.
 */
public class RotationShardShuffler extends ShardShuffler {

    private final AtomicInteger seed;

    public RotationShardShuffler(int seed) {
        this.seed = new AtomicInteger(seed);
    }

    @Override
    public int nextSeed() {
        return seed.getAndIncrement();
    }

    @Override
    public List<ShardRouting> shuffle(List<ShardRouting> shards, int seed) {
        return CollectionUtils.rotate(shards, seed);
    }

}

That is, ThreadLocalRandom.current().nextInt() generates a random number as the initial seed, and each subsequent access rotates the shard list one more position.
The effect of CollectionUtils.rotate() can be demonstrated with this code:

    // Lists is Guava's; CollectionUtils is ES's (org.elasticsearch.common.util),
    // whose rotate() returns a rotated view instead of mutating the list like
    // java.util.Collections.rotate() does.
    public static void main(String[] args) {

        List<String> list = Lists.newArrayList("a", "b", "c");
        int a = ThreadLocalRandom.current().nextInt();
        List<String> l2 = CollectionUtils.rotate(list, a);
        List<String> l3 = CollectionUtils.rotate(list, a + 1);
        System.out.println(l2);
        System.out.println(l3);

    }

-----
[b, c, a]
[c, a, b]

So request A gets the node list [b, c, a], while request B gets [c, a, b]. Since each request tries nodes in list order, successive requests land on different nodes, achieving load balancing.
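Because CollectionUtils.rotate above is ES's own helper rather than java.util.Collections.rotate, a self-contained equivalent that reproduces the behavior shown (consecutive seeds shift the list left by one position) might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RotateDemo {
    // Rotate without mutating: element i of the result is list[(i + distance) mod size].
    // This matches the output shown above, where seed and seed + 1 yield
    // [b, c, a] and [c, a, b] for the list [a, b, c].
    static <T> List<T> rotate(List<T> list, int distance) {
        int size = list.size();
        int d = Math.floorMod(distance, size);
        List<T> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            out.add(list.get((i + d) % size));
        }
        return out;
    }

    public static void main(String[] args) {
        AtomicInteger seed = new AtomicInteger(7); // in ES the initial seed is random
        List<String> nodes = List.of("a", "b", "c");
        System.out.println(rotate(nodes, seed.getAndIncrement())); // [b, c, a]
        System.out.println(rotate(nodes, seed.getAndIncrement())); // [c, a, b]
    }
}
```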

  1. DiscoveryNodes
    The routing_table stores only node ids, so to send a request to the target node we also need to know its ip and port. That information is stored in nodes:
  "nodes" : {
    "KnEE25tzRjaXblFJq5jqRA" : {
      "name" : "Mysterio",
      "transport_address" : "127.0.0.1:9300",
      "attributes" : { }
    }
  }

Using the node information obtained from nodes, the request can be sent; all communication between ES nodes goes through transportService.sendRequest().
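The id-to-address lookup is simple to model (the Node record is an illustrative stand-in for ES's DiscoveryNode):

```java
import java.util.Map;

public class NodesLookup {
    // Illustrative stand-in for a DiscoveryNode entry: node id -> (name, transport address).
    record Node(String name, String transportAddress) {}

    // The nodes section of ClusterState shown above.
    static final Map<String, Node> NODES = Map.of(
            "KnEE25tzRjaXblFJq5jqRA", new Node("Mysterio", "127.0.0.1:9300"));

    public static void main(String[] args) {
        // routing_table tells us the node id; nodes tells us where to send the request.
        Node target = NODES.get("KnEE25tzRjaXblFJq5jqRA");
        System.out.println(target.transportAddress());
    }
}
```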

To sum up, starting from the get API this post sorted out a few of the core structures inside ClusterState: metadata, nodes, and routing_table. There is one more, routing_nodes, that was not used here; its usage scenarios will be recorded clearly in a later post.


Origin blog.csdn.net/ywl470812087/article/details/104875127