Often we need Chinese full-text search in Elasticsearch; this article gives a brief overview of how to set it up. First, install a Chinese word-segmentation plugin. Here we use IK, but you may also consider other plugins (such as smartcn).
$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.2.0/elasticsearch-analysis-ik-7.2.0.zip
The command above installs version 7.2.0 of the plugin, for use with Elastic 7.2.0; the plugin version must match your Elastic version.
PS: other plugin-management commands can be listed with elasticsearch-plugin help
Then restart Elastic; it will automatically load the newly installed plugin.
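To confirm that the plugin was loaded, you can list the installed plugins; assuming a default installation, the ik plugin should appear in the output as analysis-ik.
$ ./bin/elasticsearch-plugin list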
Next, create a new Index and specify which fields need word segmentation. This step depends on your data structure; the command below applies only to this article. Basically, every field that needs Chinese search must be configured individually.
PUT /accounts?include_type_name=true
{
  "mappings": {
    "person": {
      "properties": {
        "user": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "title": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "desc": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        }
      }
    }
  }
}
(Since Elastic 7.0, custom Types are deprecated, so the include_type_name=true parameter is required to create a mapping with a Type.)
In the code above, we first create a new Index named accounts, which contains a Type named person. person has three fields.
- user
- title
- desc
All three fields hold Chinese, and their type is full text (text), so a Chinese analyzer must be specified; the default English analyzer cannot be used.
Elastic calls its word segmenter an Analyzer. We specify an analyzer for each field.
"user": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
In the code above, analyzer is the analyzer used when indexing the field's text, and search_analyzer is the analyzer applied to the search terms. ik_max_word is provided by the ik plugin; it performs the finest-grained segmentation, extracting as many words as possible from the text.
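To see how ik_max_word segments a sentence, you can call Elastic's _analyze API directly; the sample text below is just an illustration.
GET /_analyze
{
  "analyzer": "ik_max_word",
  "text": "中华人民共和国国歌"
}
The response lists every word the analyzer can extract, from the full phrase down to single characters. The plugin also provides ik_smart, which performs a coarser segmentation and returns fewer, longer terms.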