Official blog post: https://www.elastic.co/cn/blog/strings-are-dead-long-live-strings
Reposted from: https://segmentfault.com/a/1190000008897731
Text vs. keyword
Elasticsearch 5.0 brings one of the headline changes of the release: the removal of the string type. The root cause of this change is that the string type is confusing, because Elasticsearch has two completely different ways of searching strings. You can match against the whole value, which is keyword search, or against individual tokens (words), which is full-text search. Anyone familiar with Elasticsearch knows that the former is called a not_analyzed string and the latter an analyzed string.
Using the same type for two such different use cases causes problems, because some options only make sense for one of them. For example, position_increment_gap has little meaning for a not_analyzed string, and for an analyzed string it is hard to tell whether ignore_above applies to the whole value or to each individual token (it applies to the whole value only; limits on individual tokens can be set with the limit token filter).
To avoid these problems, the string field has been split into two new data types: text, for full-text search, and keyword, for keyword search.
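The difference between the two kinds of search can be sketched in plain Python. This is only a rough model, not Elasticsearch code: analyze_text is a hypothetical stand-in for an analyzer that lowercases a value and splits it into word tokens, and all helper names are made up for illustration.

```python
import re

def analyze_text(value):
    """Rough stand-in for a full-text analyzer: lowercase, then split into word tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())

def index_as_keyword(value):
    """A keyword field keeps the whole value as one single term."""
    return [value]

def matches(indexed_terms, query_term):
    """A query term matches only if it is exactly one of the indexed terms."""
    return query_term in indexed_terms

text_terms = analyze_text("New York")         # ['new', 'york']
keyword_terms = index_as_keyword("New York")  # ['New York']

print(matches(text_terms, "york"))         # True: full-text search finds a single word
print(matches(keyword_terms, "york"))      # False: keyword search needs the whole value
print(matches(keyword_terms, "New York"))  # True
```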
The new default type
Along with this split, the default dynamic mapping for string fields was changed as well. A common frustration when getting started with Elasticsearch is that you have to reindex your data before you can aggregate on whole field values. Suppose you are indexing documents that contain a city field. Aggregating on this field gives separate counts for new and york instead of the single count for New York that you would expect. Frustratingly, getting the expected result means reindexing the field.
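The city example can be imitated with a small Python sketch. It only simulates the effect and does not call Elasticsearch: the sample documents are invented, and tokens is a hypothetical simplification of an analyzer (lowercase plus word splitting).

```python
import re
from collections import Counter

# Invented sample values of the city field, one per document.
cities = ["New York", "New York", "San Francisco"]

def tokens(value):
    """Simplified analysis: lowercase and split into word tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())

# Aggregating on an analyzed field: each document is counted once per token.
analyzed_counts = Counter(t for city in cities for t in set(tokens(city)))

# Aggregating on whole values, which is what a keyword field allows.
whole_value_counts = Counter(cities)

print(analyzed_counts["new"], analyzed_counts["york"])  # 2 2
print(whole_value_counts["New York"])                   # 2
```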
To make this less painful, Elasticsearch borrowed an idea from Logstash: strings are now mapped as both text and keyword by default. For example, after indexing the following document:
{ "foo": "bar" }
Elasticsearch creates the following dynamic mapping for you:
{ "foo": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } } }
With this mapping you can run full-text search on foo, and use the foo.keyword field for keyword search and aggregations.
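As an illustration, a single search request could use both fields at once. The sketch below only builds the request body as a Python dict and prints it as JSON; it would be sent to the _search endpoint of an index. The aggregation name foo_values is a made-up label, not something from the original post.

```python
import json

search_body = {
    # Full-text search runs against the analyzed text field.
    "query": {"match": {"foo": "bar"}},
    # Aggregations run against the keyword sub-field, which keeps whole values.
    "aggs": {
        "foo_values": {  # hypothetical aggregation name
            "terms": {"field": "foo.keyword"}
        }
    },
}

print(json.dumps(search_body, indent=2))
```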
Disabling this feature is also easy: either map string fields explicitly, or use a dynamic template that matches all string fields. For example, the following dynamic template restores the dynamic mapping behavior of Elasticsearch 2.x:
{ "match_mapping_type": "string", "mapping": { "type": "text" } }
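The template above is only the inner definition. In a mapping, dynamic templates are listed under the dynamic_templates key as named entries, so a fuller fragment might look like the sketch below. The name strings_as_text is hypothetical, and where this fragment sits inside the index mappings depends on the Elasticsearch version, so treat the surrounding structure as an assumption.

```python
import json

# The template from the article, unchanged.
string_template = {"match_mapping_type": "string", "mapping": {"type": "text"}}

# Each entry of dynamic_templates is an object with one key: the template name.
mapping_fragment = {
    "dynamic_templates": [
        {"strings_as_text": string_template}  # hypothetical template name
    ]
}

print(json.dumps(mapping_fragment, indent=2))
```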
How to migrate to the new version
Migration is usually straightforward. A field previously mapped as an analyzed string:
{ "foo": { "type": "string", "index": "analyzed" } }
now only needs to be mapped as text:
{ "foo": { "type": "text", "index": true } }
Likewise, a field previously defined as a not_analyzed string:
{ "foo": { "type": "string", "index": "not_analyzed" } }
only needs to be defined as keyword:
{ "foo": { "type": "keyword", "index": true } }
As shown above, the string field has been split into text and keyword fields. As a result, the index property no longer needs three states (analyzed, not_analyzed and no, which existed only because of string fields); it is now a simple boolean that tells Elasticsearch whether the field should be searchable.
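The two rewrites above are mechanical enough to script. The following is a minimal sketch of such a helper, not an official migration tool. migrate_string_field is a hypothetical name, and two choices in it are assumptions: a string with "index": "no" is turned into a keyword with "index": false, and any other options are copied over unchanged even though some may not be valid on the new type.

```python
def migrate_string_field(old):
    """Convert a 2.x string field definition into a text or keyword definition."""
    if old.get("type") != "string":
        return dict(old)  # not a string field, nothing to migrate

    # In 2.x a string field without an explicit index setting was analyzed.
    index = old.get("index", "analyzed")
    if index == "analyzed":
        new_type, searchable = "text", True
    elif index == "not_analyzed":
        new_type, searchable = "keyword", True
    elif index == "no":
        # Assumption: an unindexed string becomes a non-searchable keyword.
        new_type, searchable = "keyword", False
    else:
        raise ValueError("unexpected index value: %r" % (index,))

    # Assumption: remaining options are carried over as-is.
    migrated = {k: v for k, v in old.items() if k not in ("type", "index")}
    migrated.update({"type": new_type, "index": searchable})
    return migrated

print(migrate_string_field({"type": "string", "index": "analyzed"}))
print(migrate_string_field({"type": "string", "index": "not_analyzed"}))
```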
Backward compatibility
Because major version upgrades are challenging enough on their own, we did our best not to make upgrading all your mappings a requirement for upgrading to Elasticsearch 5.0. First, string fields continue to work on indices created in 2.x. And when you create a new index, Elasticsearch automatically converts string fields to the equivalent text or keyword mapping. This is especially useful if you have index templates that define string fields, because those templates keep working unchanged in Elasticsearch 5.x. That said, you should still upgrade these templates, because this backward-compatibility logic may be removed in Elasticsearch 6.0.
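To see which templates still need upgrading, one option is to scan their field definitions for leftover string types. The sketch below walks a properties dict recursively. The function name and the sample template are invented for illustration, and loading the real template from the cluster is left out.

```python
def find_string_fields(properties, prefix=""):
    """Return the dotted paths of all fields still mapped as the old string type."""
    found = []
    for name, definition in properties.items():
        path = prefix + name
        if definition.get("type") == "string":
            found.append(path)
        # Object fields nest under "properties", multi-fields under "fields".
        for key in ("properties", "fields"):
            found.extend(find_string_fields(definition.get(key, {}), path + "."))
    return found

# Invented example of the properties section of an index template.
template_properties = {
    "title": {"type": "string", "index": "analyzed"},
    "tags": {"type": "keyword"},
    "author": {"properties": {"name": {"type": "string", "index": "not_analyzed"}}},
}

print(find_string_fields(template_properties))  # ['title', 'author.name']
```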