ElasticSearch Introduction (3): Chinese Word Segmentation

We often need to enable Chinese word segmentation in Elasticsearch; this article briefly describes how. First, install a Chinese word-segmentation plug-in. We use IK here, but you may also consider other plug-ins (such as smartcn).

$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.2.0/elasticsearch-analysis-ik-7.2.0.zip

The command above installs version 7.2.0 of the plug-in, for use with Elastic 7.2.0; the plug-in version must match the Elastic version.

PS: for other plug-in management commands, run elasticsearch-plugin help.
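Besides install, the elasticsearch-plugin tool has a few other sub-commands; for example, to verify or undo the installation (assuming the plug-in registered itself under the name analysis-ik, which is how the IK release names itself):

$ ./bin/elasticsearch-plugin list
$ ./bin/elasticsearch-plugin remove analysis-ik

list prints the names of all installed plug-ins; remove deletes one by name.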

Then restart Elastic; it will automatically load the newly installed plug-in.
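A quick way to confirm that the plug-in was loaded is to call the _analyze API with one of the analyzers it provides; the sample sentence below is arbitrary:

POST /_analyze
{
    "analyzer": "ik_max_word",
    "text": "中华人民共和国"
}

If IK is active, the response lists the segmented tokens; otherwise Elastic reports that the analyzer ik_max_word does not exist.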

Next, create a new Index and specify which fields need word segmentation. This step depends on your data structure; the command below is specific to this article. Basically, every field that will be searched in Chinese must be configured individually.

PUT /accounts?include_type_name=true
{
    "mappings": {
        "person": {
            "properties": {
                "user": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                },
                "title": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                },
                "desc": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                }
            }
        }
    }
}

The command above first creates an Index named accounts, which contains a Type named person (custom Types are deprecated in Elastic 7.x, hence the include_type_name parameter). person has three fields:

  • user
  • title
  • desc
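For concreteness, a document for this mapping could then be indexed as usual; the field values below are invented for illustration:

PUT /accounts/person/1
{
    "user": "张三",
    "title": "工程师",
    "desc": "数据库管理"
}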

All three fields hold Chinese content and have type text, so a Chinese analyzer must be specified for them; the default analyzer cannot segment Chinese properly (it breaks Chinese text into individual characters).

Elastic's word segmenter is called an Analyzer. We specify an analyzer for each field.

"user": {
    "type": "text",
    "analyzer": "ik_max_word",
    "search_analyzer": "ik_max_word"
}

In the code above, analyzer is the analyzer applied when the text field is indexed, and search_analyzer is the analyzer applied to search queries. ik_max_word is provided by the IK plug-in; it segments the text at the finest granularity, producing the maximum number of words.
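To see what "maximum number of words" means, you can compare ik_max_word with the coarser ik_smart analyzer that the same plug-in provides (the sample text is arbitrary):

POST /_analyze
{
    "analyzer": "ik_max_word",
    "text": "中华人民共和国"
}

POST /_analyze
{
    "analyzer": "ik_smart",
    "text": "中华人民共和国"
}

Typically, ik_max_word returns many overlapping fine-grained tokens for this phrase, while ik_smart returns only the coarsest split.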


Origin: www.cnblogs.com/TianFang/p/11330241.html