How does Elasticsearch guarantee document uniqueness?

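A common claim is that the combination of the following three fields is globally unique within an Elasticsearch instance or cluster:

index + type + document _id

In practice, however, the combination that is actually unique is:

index + type + shard + document _id

The following test verifies this: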

In Elasticsearch 7, create an index with 10 shards:

PUT student
{
  "mappings" : {
    "properties" : {
      "uid": {
        "type" : "integer"
      },
      "name" : {
        "type" : "keyword"
      },
      "age" : {
        "type" : "integer"
      }
    }
  },
  "settings" : {
    "index" : {
      "number_of_shards" : 10,
      "number_of_replicas" : 1
    }
  }
}

Add record 1:

POST student/_doc/1?routing=1
{
  "uid": 1,
  "name": "张三",
  "age": 10
}

Setting explain to true in the search request makes the response show which shard each document belongs to:

# Request
GET student/_search
{
  "query": {
    "match": {
      "uid": 1
    }
  },
  "explain": true
}

# Response
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_shard" : "[student][8]",
        "_node" : "wFhSfuLwR3OX21eldbRIHg",
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_routing" : "1",
        "_source" : {
          "uid" : 1,
          "name" : "张三",
          "age" : 10
        },
        "_explanation" : {
          "value" : 1.0,
          "description" : "uid:[1 TO 1]",
          "details" : [ ]
        }
      }
    ]
  }
}

Add record 2:

POST student/_doc/1?routing=2
{
  "uid": 1,
  "name": "张三",
  "age": 10
}

Note that, compared with record 1, nothing has changed except the routing value.
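Why the second request does not simply overwrite the first can be sketched with a toy model. Elasticsearch picks the target shard from the routing value (the _id when no custom routing is given), roughly as hash(_routing) % number_of_primary_shards, and each shard tracks _id values on its own. The sketch below is only an illustration under stated assumptions: ToyIndex and pick_shard are hypothetical names, and zlib.crc32 stands in for the Murmur3 hash that Elasticsearch uses, so the shard numbers it produces will not match a real cluster.

```python
import zlib
from typing import Optional

NUM_PRIMARY_SHARDS = 10  # same value as number_of_shards in the student index


def pick_shard(routing: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a routing value to a shard number.

    Assumption: zlib.crc32 is a stand-in for Elasticsearch's Murmur3 hash,
    so these shard numbers differ from what a real cluster would choose.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


class ToyIndex:
    """Hypothetical model: every shard keeps its own _id -> document map."""

    def __init__(self, num_shards: int = NUM_PRIMARY_SHARDS):
        self.shards = [{} for _ in range(num_shards)]

    def index(self, doc_id: str, source: dict, routing: Optional[str] = None) -> int:
        # Without a custom routing value, the _id itself is used for routing.
        key = routing if routing is not None else doc_id
        shard = pick_shard(key, len(self.shards))
        # An existing document is overwritten only inside the chosen shard.
        self.shards[shard][doc_id] = source
        return shard

    def count(self, doc_id: str) -> int:
        # A search fans out to all shards, so it sees every copy of the _id.
        return sum(1 for shard in self.shards if doc_id in shard)


if __name__ == "__main__":
    idx = ToyIndex()
    doc = {"uid": 1, "name": "张三", "age": 10}
    first = idx.index("1", doc, routing="1")
    second = idx.index("1", doc, routing="2")
    print("shards:", first, second, "copies of _id 1:", idx.count("1"))
```

In this model, two writes with the same _id collide only when their routing values hash to the same shard; otherwise both copies survive, which is the behavior the test in this post exercises.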

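Running the same query again now returns two documents, both with _id 1; the only differences between them are the _shard and _routing values:

# Request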
GET student/_search
{
  "query": {
    "match": {
      "uid": 1
    }
  },
  "explain": true
}

# Response
{
  "took" : 565,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_shard" : "[student][7]",
        "_node" : "wFhSfuLwR3OX21eldbRIHg",
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_routing" : "2",
        "_source" : {
          "uid" : 1,
          "name" : "张三",
          "age" : 10
        },
        "_explanation" : {
          "value" : 1.0,
          "description" : "uid:[1 TO 1]",
          "details" : [ ]
        }
      },
      {
        "_shard" : "[student][8]",
        "_node" : "wFhSfuLwR3OX21eldbRIHg",
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_routing" : "1",
        "_source" : {
          "uid" : 1,
          "name" : "张三",
          "age" : 10
        },
        "_explanation" : {
          "value" : 1.0,
          "description" : "uid:[1 TO 1]",
          "details" : [ ]
        }
      }
    ]
  }
}
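Spotting this situation by eye gets tedious with larger result sets. As a minimal sketch, the helper below scans a search response (already parsed into a Python dict) and reports every _id that appears on more than one shard. The function name ids_on_multiple_shards and the trimmed sample_response are made up for illustration; the field names come from the response above, and the _shard field is only present when the search was sent with explain set to true.

```python
from collections import defaultdict


def ids_on_multiple_shards(response: dict) -> dict:
    """Return {(_index, _id): sorted shard labels} for ids found on several shards.

    Expects a search response produced with "explain": true, because that
    is what adds the _shard field to each hit.
    """
    seen = defaultdict(set)
    for hit in response["hits"]["hits"]:
        seen[(hit["_index"], hit["_id"])].add(hit["_shard"])
    return {key: sorted(shards) for key, shards in seen.items() if len(shards) > 1}


# Trimmed copy of the second response shown above (only the fields used here).
sample_response = {
    "hits": {
        "hits": [
            {"_shard": "[student][7]", "_index": "student", "_id": "1", "_routing": "2"},
            {"_shard": "[student][8]", "_index": "student", "_id": "1", "_routing": "1"},
        ]
    }
}

print(ids_on_multiple_shards(sample_response))
# prints {('student', '1'): ['[student][7]', '[student][8]']}
```

An empty result means every _id in the response lives on a single shard.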

Reposted from www.cnblogs.com/letiantian/p/12431679.html