Docker: Installing and configuring the IK tokenizer for Elasticsearch

1. Background:

  After installing and configuring Elasticsearch and Kibana, word segmentation did not produce the expected results, so this post walks through installing and configuring Elasticsearch's IK tokenizer, including a custom dictionary.

2. Solution:

1: First, look at the result without the IK tokenizer. The built-in standard analyzer splits Chinese text into single characters, which is why the output below is unusable for search.

POST _analyze
{
  "analyzer": "standard",
  "text": "我是中国人"
}

2: Download the IK plugin package from:

https://github.com/medcl/elasticsearch-analysis-ik

3: Choose the release that matches your Elasticsearch version.

4: Upload the downloaded archive and extract it into the plugins directory of our Elasticsearch installation.

5: Restart Elasticsearch and test.

docker restart <Elasticsearch container name or ID>


POST _analyze
{
  "analyzer": "ik_smart",
  "text": "我是炎黄子孙"
}

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "我是中国人"
}
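The same _analyze checks can be run without Kibana Dev Tools; a minimal curl sketch, assuming Elasticsearch is reachable at 192.168.56.10:9200 (the host/port are an assumption, adjust for your own setup):

```shell
# Request body for the _analyze API; swap "ik_smart" for "ik_max_word"
# to compare the two IK modes.
BODY='{"analyzer":"ik_smart","text":"我是炎黄子孙"}'
echo "$BODY"
# Uncomment to query a live cluster:
# curl -s -H 'Content-Type: application/json' \
#      -X POST http://192.168.56.10:9200/_analyze -d "$BODY"
```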

6: The IK tokenizer is now installed, but the result is still not what we want: "炎黄子孙" (descendants of Yan and Huang) is split into the two tokens "炎黄" and "子孙". To keep it as a single word we need a custom dictionary served over HTTP, so create an nginx folder first.

mkdir nginx

Pull the nginx image and start a temporary container:

docker run -p 80:80 --name nginx -d nginx:1.10

7: Copy the configuration file in the container to the current directory.

docker container cp nginx:/etc/nginx . 

Note: don't forget the trailing dot.

8: Rename the copied directory:

mv nginx conf

9: Move conf under the nginx directory, so the configuration ends up at /mydata/nginx/conf.

mv conf nginx/
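Steps 7–9 can be rehearsed safely in a throwaway directory before touching /mydata; a sketch, where the copied directory is simulated rather than taken from a real container:

```shell
# Rehearse the directory shuffle from steps 7-9 in a temp dir.
ROOT=$(mktemp -d)
cd "$ROOT"
# Stand-in for `docker container cp nginx:/etc/nginx .`, which drops
# a directory named nginx into the current directory:
mkdir nginx
touch nginx/nginx.conf
mv nginx conf        # step 8: rename it, freeing up the name "nginx"
mkdir nginx          # recreate nginx as the mount root
mv conf nginx/       # step 9: final layout is $ROOT/nginx/conf
ls "$ROOT/nginx/conf"
```

The rename matters because the copied directory and the target mount directory would otherwise collide on the name nginx.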

10: Stop the temporary container.

docker stop nginx

11: Delete the original container (substitute its ID or name):

docker rm $ContainerId

12: Create a new nginx container with the host directories mounted:

docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10

13: Create an index.html under nginx's html folder.

14: Create an es folder under the html folder.

mkdir es

15: Enter es and create the fenci.txt file:

vi fenci.txt
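Instead of editing interactively with vi, the dictionary can be written in one command; a sketch using a temp dir in place of /mydata/nginx/html, with 炎黄子孙 as the word we want kept whole:

```shell
# Write the custom dictionary: one word per line, UTF-8 encoded.
HTML=$(mktemp -d)            # stand-in for /mydata/nginx/html
mkdir -p "$HTML/es"
printf '%s\n' '炎黄子孙' > "$HTML/es/fenci.txt"
cat "$HTML/es/fenci.txt"
```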


16: Add our custom words to fenci.txt, one word per line.

Then verify that nginx serves the file at: http://192.168.56.10/es/fenci.txt

17: Modify IKAnalyzer.cfg.xml in /usr/share/elasticsearch/plugins/ik/config/ to point remote_ext_dict at that URL:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Users can configure their own extension dictionary here -->
<entry key="ext_dict"></entry>
<!-- Users can configure their own extension stopword dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- Users can configure a remote extension dictionary here -->
<entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>
<!-- Users can configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

18: Restart Elasticsearch so the new dictionary takes effect.

19: Verify the effect: run the ik_smart _analyze request from step 5 again; "炎黄子孙" should now come back as a single token.

3. Summary:

Helping people makes a person happy.


Origin blog.csdn.net/weixin_42188778/article/details/126500500