Integrating the Jieba word segmenter with Elasticsearch

A related project is available on GitHub, https://github.com/sing1ee/elasticsearch-jieba-plugin , which supports the 5.x versions of ES.

The ES instance deployed on host 165 is version 5.2.2, so download the matching version of elasticsearch-jieba-plugin.

The plugin provides two analyzers:

jieba_index: used when indexing, produces fine-grained segmentation
jieba_search: used when querying, produces coarser-grained segmentation
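
A minimal sketch of how the two analyzers are usually paired in a field mapping: `analyzer` is applied at index time and `search_analyzer` at query time. The type name `doc` and the field name `content` are hypothetical, and the layout assumes the ES 5.x mapping format with a single mapping type.

```python
import json


def build_index_body(type_name="doc", field_name="content"):
    """Build an index-creation body that pairs the two jieba analyzers.

    type_name and field_name are placeholders, not names from the article.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field_name: {
                        "type": "text",
                        # fine-grained segmentation when documents are indexed
                        "analyzer": "jieba_index",
                        # coarser segmentation when the field is queried
                        "search_analyzer": "jieba_search",
                    }
                }
            }
        }
    }


def to_json(body):
    """Serialize the body, keeping non-ASCII characters readable."""
    return json.dumps(body, ensure_ascii=False, indent=2)
```

The resulting JSON would be sent as the body of a PUT request that creates the index.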

1. Build and deploy

Install Gradle

The project is built with Gradle, so Gradle must be installed on the build machine first.

Download the Gradle distribution: https://downloads.gradle.org/distributions/gradle-4.3-all.zip

Extract it to a directory of your choice and configure the environment variables. To check that the installation succeeded, run gradle -v
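
The same check can be scripted. The sketch below looks the executable up on PATH and, if it is found, runs it with `-v`. The helper name `tool_version` is made up for illustration.

```python
import shutil
import subprocess


def tool_version(executable="gradle", flag="-v"):
    """Return the version output of a command-line tool, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        # Not installed, or the environment variables are not configured yet
        return None
    result = subprocess.run(
        [path, flag], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()
```

A return value of `None` from `tool_version()` suggests that the Gradle `bin` directory has not been added to PATH.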

Build the project

gradle pz

Deploy the plugin

After a successful build, copy build/distributions/elasticsearch-jieba-plugin-5.1.2.zip to the plugins directory of ES.

Extract the archive there, then delete it:

unzip elasticsearch-jieba-plugin-5.1.2.zip
rm elasticsearch-jieba-plugin-5.1.2.zip
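
For hosts without `unzip`, the extract-and-delete step could be sketched in Python as follows. The function name `install_plugin` is hypothetical, and the archive is assumed to have been copied into the plugins directory already.

```python
import zipfile
from pathlib import Path


def install_plugin(archive, plugins_dir):
    """Extract the plugin archive into plugins_dir, then delete the archive.

    Returns the list of entry names that were in the archive.
    """
    archive = Path(archive)
    plugins_dir = Path(plugins_dir)
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        zf.extractall(plugins_dir)
    # Equivalent of the rm step: remove the archive after extraction
    archive.unlink()
    return names
```

Both arguments are paths on the ES host, for example the copied elasticsearch-jieba-plugin-5.1.2.zip and the plugins directory that contains it.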

Finally, restart ES for the plugin to take effect.

2. Test the jieba tokenizer

Use Postman to call the REST analyze API and check the segmentation results.

jieba_search

POST  192.168.1.165:9200/jieba_index/_analyze

{
"analyzer":"jieba_search",
"text":"近日,国外几名网友整理了一份自然语言处理的免费/公开数据集(包含文本数据)清单,为防止大家错过这个消息,论智暂且把清单内容搬运如下。有需要的读者可直接收藏本文,或去github点个星星以示感谢"
}

jieba_index

POST  192.168.1.165:9200/jieba_index/_analyze

{
"analyzer":"jieba_index",
"text":"近日,国外几名网友整理了一份自然语言处理的免费/公开数据集(包含文本数据)清单,为防止大家错过这个消息,论智暂且把清单内容搬运如下。有需要的读者可直接收藏本文,或去github点个星星以示感谢"
}
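
The two requests above can also be issued without Postman. The sketch below uses only the Python standard library. The host and the `jieba_index` path segment (the index name) are taken from the requests above, the helper names are made up, and the index is assumed to exist already.

```python
import json
import urllib.request

ES_HOST = "http://192.168.1.165:9200"  # host used in the requests above


def build_analyze_request(analyzer, text, index="jieba_index", host=ES_HOST):
    """Build a POST request for the _analyze endpoint of the given index."""
    payload = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index}/_analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer, text):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_analyze_request(analyzer, text)) as resp:
        return json.load(resp)
```

Calling `analyze("jieba_search", text)` and `analyze("jieba_index", text)` with the same text makes it easy to compare the coarse and fine segmentation side by side.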

3. Test a custom dictionary

A custom dictionary is a plain text file whose name ends in .dict. Put it in the plugins/jieba/dic directory and restart ES for it to take effect.
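
As a rough sketch, such a file could be generated as follows. The article does not show the file contents, so the layout here is an assumption: one entry per line in the form `word frequency`, the usual convention for jieba user dictionaries. Check the plugin's README before relying on it. The helper name and the sample entries in the usage note are hypothetical.

```python
from pathlib import Path


def write_user_dict(dic_dir, name, entries):
    """Write (word, frequency) pairs to <dic_dir>/<name>.dict, one per line.

    ASSUMPTION: the 'word frequency' line format is not confirmed by the article.
    """
    if not name.endswith(".dict"):
        name += ".dict"
    path = Path(dic_dir) / name
    lines = [f"{word} {freq}" for word, freq in entries]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

For example, `write_user_dict("plugins/jieba/dic", "user", [("论智", 100)])` would add the name 论智 from the sample text as a single word. The frequency 100 is an arbitrary illustrative value.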

jieba_search

POST  192.168.1.165:9200/jieba_index/_analyze

{
"analyzer":"jieba_search",
"text":"近日,国外几名网友整理了一份自然语言处理的免费/公开数据集(包含文本数据)清单,为防止大家错过这个消息,论智暂且把清单内容搬运如下。有需要的读者可直接收藏本文,或去github点个星星以示感谢"
}

jieba_index

POST  192.168.1.165:9200/jieba_index/_analyze

{
"analyzer":"jieba_index",
"text":"近日,国外几名网友整理了一份自然语言处理的免费/公开数据集(包含文本数据)清单,为防止大家错过这个消息,论智暂且把清单内容搬运如下。有需要的读者可直接收藏本文,或去github点个星星以示感谢"
}
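
To see what the custom dictionary changed, the responses saved before and after the restart can be compared. The sketch below relies on the `tokens` list that the analyze API returns, where each item carries a `token` string. The function names are made up for illustration.

```python
def token_strings(response):
    """Extract the token strings from a parsed _analyze response."""
    return [item["token"] for item in response.get("tokens", [])]


def compare_runs(before, after):
    """Return (added, removed) tokens between two _analyze responses.

    added:   tokens that appear only after the dictionary was loaded
    removed: tokens that appeared only before it was loaded
    """
    old = token_strings(before)
    new = token_strings(after)
    added = [t for t in new if t not in old]
    removed = [t for t in old if t not in new]
    return added, removed
```

A word from the custom dictionary should show up in `added`, and the fragments it used to be split into should show up in `removed`.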

 


Origin blog.csdn.net/zwahut/article/details/90635621