Extending the Elasticsearch IK custom dictionary (1): Manually adding hot word files

The open source IK tokenizer is a very capable word segmenter, and it fully covers the segmentation needs of most general scenarios. When it comes to technical and industry-specific terms, though, ik is a little less smart (it is still great!). Below I show how to manually add a hot word file to extend ik's vocabulary, so that these terms are segmented and indexed as whole words.

Step 1: Collect the industry terms we need and put them in a dic file.

First look in the config directory of the ik plugin; it already contains many dic files. These are ik's own built-in dictionaries.

I added a file of my own called hwtest.dic.

I put two words in it. Note that each hot word must be on its own line. After restarting es, I first check whether these two words are still being split.

For now, ik still does not treat either of them as a complete word.
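As a rough illustration of the one-word-per-line format, here is a small Python sketch that writes such a file. The helper name `write_hot_words` and the second sample word are made up for this example; the file is written as UTF-8, which is the encoding the IK plugin expects for its dictionaries.

```python
def write_hot_words(path, words):
    """Write one hot word per line (UTF-8), skipping blanks and duplicates."""
    unique = []
    for word in words:
        word = word.strip()
        if word and word not in unique:
            unique.append(word)
    # newline="\n" keeps the line endings the same on every platform
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in unique:
            f.write(word + "\n")
    return unique


# Example call (the second word is a placeholder, not from the article):
# write_hot_words("hwtest.dic", ["Shanghai arthur", "another hot word"])
```

Keeping the file free of blank lines and duplicates is just a tidy habit in this sketch, not something the article states the plugin requires.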
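One way to run this kind of check without a GUI is the Elasticsearch `_analyze` API. The sketch below is only an illustration: it assumes an unsecured node at `http://localhost:9200` and the `ik_max_word` analyzer provided by the IK plugin, and the helper names are made up here.

```python
import json
import urllib.request


def tokens_from(analyze_response):
    """Extract the token strings from an _analyze response body."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


def is_single_token(tokens, word):
    """True if the whole word appears as one token (case-insensitive)."""
    return word.lower() in [t.lower() for t in tokens]


def analyze(text, analyzer="ik_max_word", host="http://localhost:9200"):
    """POST the text to _analyze and return the resulting tokens.

    host and analyzer are assumptions; adjust them to your cluster.
    """
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    request = urllib.request.Request(
        host + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return tokens_from(json.load(response))


# Example call against a running node:
# print(is_single_token(analyze("Shanghai arthur"), "Shanghai arthur"))
```

The comparison ignores case because the analyzer may lowercase Latin text. The same check is handy again later, once the dictionary has been configured.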

Next, register the dic file in ik/config/IKAnalyzer.cfg.xml.

Take a look at the contents of this file.

I put hwtest.dic inside <entry key="ext_dict"></entry>, then restart es.
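For reference, here is a sketch of what the edited file might look like, plus a small read-only check of which dictionaries are registered. The `SAMPLE_CONFIG` text is an assumed reconstruction based on the plugin's default file layout, not a copy of the author's file, and `configured_dictionaries` is a made-up helper. Multiple dictionaries in one entry are separated by semicolons.

```python
import xml.etree.ElementTree as ET

# Assumed shape of IKAnalyzer.cfg.xml after the edit (not the author's exact file).
SAMPLE_CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- extension dictionaries, relative to the IK config directory -->
    <entry key="ext_dict">hwtest.dic</entry>
    <!-- extension stop-word dictionaries -->
    <entry key="ext_stopwords"></entry>
</properties>
"""


def configured_dictionaries(xml_text, key="ext_dict"):
    """Return the dictionary paths listed under the given entry key."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    for entry in root.iter("entry"):
        if entry.get("key") == key:
            value = entry.text or ""
            # several dictionaries are separated by semicolons
            return [part.strip() for part in value.split(";") if part.strip()]
    return []


print(configured_dictionaries(SAMPLE_CONFIG))
```

The check only reads the file. Editing it by hand, as described above, avoids any risk of a tool rewriting the XML header.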

Take a look at the startup log:

hwtest.dic has been loaded.

Checking again, "Shanghai arthur" is now treated as a complete word. From now on, when a document containing a hot word is stored, the hot word is indexed as a single term.

Manually adding an extended dictionary is now complete!

