[Elasticsearch Tutorial 9] ignore_above for keyword fields in a mapping

First, index an arbitrary document into ES:

PUT my_index/_doc/1
{
  "name": "李星云"
}

Look at the mapping that ES generated automatically: name is a text field with a keyword sub-field under it, and that sub-field has "ignore_above" : 256.

GET /my_index/_mapping

The name field is defined as follows:
"properties" : {
  "name" : {
    "type" : "text",
    "fields" : {
      "keyword" : {
        "type" : "keyword",
        "ignore_above" : 256
      }
    }
  }
}

For a keyword field, you can set ignore_above to limit the string length. A string longer than ignore_above is still kept in the document's _source, but it is not added to the inverted index. For example, with ignore_above=4, "abc", "abcd", and "abcde" can all be saved to ES, but the document cannot be retrieved by searching for "abcde".
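The rule above can be modeled in a few lines of Python. This is only a toy sketch, not Elasticsearch code: the helper names index_documents and term_query are made up for illustration, and the sketch assumes the length check counts characters, which is what Python's len() does for str.

```python
from collections import defaultdict


def index_documents(docs, ignore_above):
    """Toy model of a keyword field with ignore_above (hypothetical helper).

    Returns (source, inverted_index). Every value is kept in `source`,
    but only values whose length is <= ignore_above become terms.
    """
    source = {}
    inverted_index = defaultdict(list)
    for doc_id, value in docs.items():
        source[doc_id] = value  # the original document is always kept
        if len(value) <= ignore_above:  # longer strings are skipped, not truncated
            inverted_index[value].append(doc_id)
    return source, dict(inverted_index)


def term_query(inverted_index, term):
    """Exact-match lookup against the toy inverted index."""
    return inverted_index.get(term, [])


source, inverted_index = index_documents(
    {1: "abc", 2: "abcd", 3: "abcde"}, ignore_above=4
)
print(source[3])                            # the document itself is still there
print(term_query(inverted_index, "abcd"))   # length 4 is within the limit
print(term_query(inverted_index, "abcde"))  # length 5 exceeds the limit
```

Note that the over-long value is skipped as a whole rather than truncated to 4 characters, so "abcde" does not turn into a term "abcd".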

[1] Create a keyword field with ignore_above=4

PUT test_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "message": {
          "type": "keyword",
          "ignore_above": 4
        }
      }
    }
  }
}

[2] Index 3 documents:

PUT /test_index/_doc/1
{
  "message": "abc"
}

PUT /test_index/_doc/2
{
  "message": "abcd"
}

PUT /test_index/_doc/3
{
  "message": "abcde"
}

At this point the ES inverted index looks like this:

term    document ID
abc     1
abcd    2

[3] Run a terms aggregation on message:

GET /test_index/_search
{
  "aggs": {
    "term_message": {
      "terms": {
        "field": "message",
        "size": 10
      }
    }
  }
}

The response:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "message" : "abcd"
        }
      },
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "message" : "abc"
        }
      },
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "message" : "abcde"
        }
      }
    ]
  },
  "aggregations" : {
    "term_message" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [   # note: there is no "abcde" bucket here
        {
          "key" : "abc",
          "doc_count" : 1
        },
        {
          "key" : "abcd",
          "doc_count" : 1
        }
      ]
    }
  }
}
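Why the "abcde" bucket is missing can be sketched in Python. This is a rough model under stated assumptions, not the real aggregation code: terms_aggregation is a hypothetical helper, and it assumes buckets are built only from values the keyword field actually indexed, ordered by count (descending) and then by key.

```python
from collections import Counter


def terms_aggregation(values, ignore_above, size=10):
    """Toy terms aggregation (hypothetical helper): count only the values
    that a keyword field with this ignore_above would have indexed."""
    counts = Counter(v for v in values if len(v) <= ignore_above)
    # Sort by doc_count descending, then by key, and keep the top `size`.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": k, "doc_count": c} for k, c in ordered[:size]]


buckets = terms_aggregation(["abc", "abcd", "abcde"], ignore_above=4)
print(buckets)
```

In this model the value "abcde" never reaches the counter, so the aggregation only sees two distinct keys even though three documents matched the search.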

[4] Run an exact term query for "abcde"; the result is empty:

GET /test_index/_search
{
  "query": {
    "term": {
      "message": "abcde"
    }
  }
}

The result:
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }

These results show that "abcde" has been saved in ES and appears in the search hits, but no term "abcde" exists in the inverted index, so the document cannot be retrieved by a term query for "abcde".
The ignore_above setting of an existing keyword field can be changed, but the new value only applies to data indexed after the change.
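The "only new data" point can be illustrated with a small Python model. It is a sketch under assumptions, not how Elasticsearch is implemented: KeywordField is a hypothetical class, and assigning to its ignore_above attribute stands in for a mapping update on a real index.

```python
class KeywordField:
    """Toy keyword field whose ignore_above can be changed (hypothetical class)."""

    def __init__(self, ignore_above):
        self.ignore_above = ignore_above
        self.terms = {}  # term -> list of document IDs

    def index(self, doc_id, value):
        # The limit in force at indexing time decides whether a term is created.
        if len(value) <= self.ignore_above:
            self.terms.setdefault(value, []).append(doc_id)

    def term_query(self, term):
        return self.terms.get(term, [])


field = KeywordField(ignore_above=4)
field.index(3, "abcde")        # skipped: 5 > 4
field.ignore_above = 8         # stands in for a mapping change; old terms are not rebuilt
field.index(4, "abcdef")       # indexed: 6 <= 8

print(field.term_query("abcde"))   # document 3 is still not searchable
print(field.term_query("abcdef"))  # document 4 is
```

Under this model, document 3 would become searchable by term only after it is indexed again with the new limit in place.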

Origin blog.csdn.net/winterking3/article/details/126607648