Using the IK Analyzer with Elasticsearch

The Elasticsearch version is 2.4.6.

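1. The ES cluster setup was covered in the previous chapter, so here we integrate IK directly into ES. The architecture is two ES nodes (master and slave) with nginx in front for load balancing.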

2. Integrate IK: https://github.com/medcl/elasticsearch-analysis-ik

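Note: check the table on GitHub that maps ES versions to IK versions, and pick the IK release that matches Elasticsearch 2.4.6.

The installation steps are on GitHub. The one thing you must change is the IK version. This is important enough to say three times.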

cd /usr/local/app/elk/elasticsearch-2.4.6/

Online installation:

./bin/plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v1.10.6/elasticsearch-analysis-ik-1.10.6.zip

After installation, the new IK entry appears under plugins; you can also check under config.

At this point, the installation is complete.

Offline installation: download the zip in advance.
./bin/plugin install file:xxx.zip
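
One way to confirm the install from the shell is to look for the plugin directory. The sketch below assumes the plugin lands in a directory under plugins whose name contains "ik" (the exact name may differ between IK versions); has_ik_plugin is a hypothetical helper, not part of Elasticsearch.

```shell
# Hypothetical helper: succeeds if a directory whose name contains "ik"
# exists under <es_home>/plugins.
has_ik_plugin() {
  local es_home="$1"
  local d
  for d in "$es_home"/plugins/*ik*; do
    # If the glob matches nothing it stays literal, and -d is false.
    [ -d "$d" ] && return 0
  done
  return 1
}
```

For example: has_ik_plugin /usr/local/app/elk/elasticsearch-2.4.6 && echo "IK found". ES 2.x also provides ./bin/plugin list, which prints the installed plugins.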

3. Create the index

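Pay attention to the number of shards. As a rule of thumb, one shard holds roughly 100 GB of data and one machine roughly 500 GB, so choose the shard count according to your own data volume. Do not set too many replicas: in ES a replica is a backup whose data is identical to its shard, and one is usually enough. Create an index named test-es.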
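
The sizing rule above can be sketched as a small calculation followed by the create-index request. The 800 GB total and the address localhost:9200 are assumptions for illustration; create_index is a hypothetical wrapper around curl.

```shell
# Assumed for illustration: 800 GB of data, about 100 GB per shard.
total_gb=800
gb_per_shard=100

# Ceiling division, so a partial shard's worth of data still gets a shard.
shards=$(( (total_gb + gb_per_shard - 1) / gb_per_shard ))
replicas=1

index_settings="{\"settings\":{\"number_of_shards\":${shards},\"number_of_replicas\":${replicas}}}"

# Hypothetical wrapper; ES_URL is an assumed address.
create_index() {
  curl -s -XPUT "${ES_URL:-http://localhost:9200}/test-es" -d "$index_settings"
}

echo "$index_settings"
```

Calling create_index sends the settings to the cluster. The shard count cannot be changed after the index is created, so it is worth estimating data growth first.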

4. Create the mapping: similar to creating a table in MySQL

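An index contains types, and one index can hold multiple types. However, a field that appears in more than one type must have the same datatype (for example long) in all of them. Look for the mapping file in the configuration and customize it for your own project.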

Mapping script. ik_max_word produces the finest-grained segmentation and ik_smart the coarsest:

{
  "properties": {
    "praise_rate": {
      "type": "double"
    },
    "discription": {
      "analyzer": "ik_max_word",
      "type": "string"
    },
    "create_time": {
      "format": "strict_date_optional_time||epoch_millis",
      "type": "date"
    },
    "level": {
      "type": "long"
    },
    "author": {
      "analyzer": "ik_smart",
      "type": "string"
    },
    "isbn": {
      "type": "string"
    },
    "language": {
      "type": "long"
    },
    "type": {
      "type": "long"
    },
    "is_delete": {
      "type": "long"
    },
    "reader_number": {
      "type": "long"
    },
    "imgurl": {
      "type": "string"
    },
    "recommend_number": {
      "type": "long"
    },
    "score": {
      "type": "long"
    },
    "update_time": {
      "format": "strict_date_optional_time||epoch_millis",
      "type": "date"
    },
    "word_count": {
      "type": "long"
    },
    "price": {
      "type": "double"
    },
    "name": {
      "analyzer": "ik_max_word",
      "type": "string"
    },
    "publishdate": {
      "format": "strict_date_optional_time||epoch_millis",
      "type": "date"
    },
    "publisher": {
      "type": "string"
    },
    "id": {
      "type": "long"
    },
    "state": {
      "type": "long"
    },
    "page_count": {
      "type": "long"
    }
  }
}
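
A sketch of how this mapping could be submitted with the put-mapping API of ES 2.x. The type name book, the file name mapping.json, and the address localhost:9200 are assumptions for illustration; only the index name test-es comes from the text.

```shell
ES_URL="http://localhost:9200"   # assumed address of one ES node
INDEX="test-es"
TYPE_NAME="book"                 # hypothetical type name
MAPPING_URL="$ES_URL/$INDEX/_mapping/$TYPE_NAME"

# Hypothetical wrapper: $1 is the path to a file holding the mapping JSON.
put_mapping() {
  curl -s -XPUT "$MAPPING_URL" --data-binary "@$1"
}

echo "$MAPPING_URL"
```

With the script above saved as mapping.json, put_mapping mapping.json would send it to the cluster.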

Request body for verifying the tokenization:

{
"field":"discription",
"text":"湖北省市长"
}

As you can see, the text has been segmented successfully.
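
The verification request can be sent to the index's _analyze endpoint. This is a minimal sketch assuming ES listens on localhost:9200 and the index is test-es; analyze_text is a hypothetical wrapper around curl.

```shell
ES_URL="http://localhost:9200"   # assumed address of one ES node
ANALYZE_BODY='{"field":"discription","text":"湖北省市长"}'

# Hypothetical wrapper: analyze the text with the analyzer mapped to the field.
analyze_text() {
  curl -s -XPOST "$ES_URL/test-es/_analyze?pretty" -d "$ANALYZE_BODY"
}
```

Calling analyze_text prints the list of tokens. Because discription is mapped with ik_max_word, the response should contain several overlapping tokens rather than one token per character.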

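Note: the request may fail with an error.

Solution: check the backend log. If the error is ClassNotFound, the analyzer was probably not loaded because the node was not restarted after installation; restart it and the problem should go away. If it is some other error, the IK and ES versions most likely do not match.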
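
The check above can be sketched as a small log scan, meant to be run after a failed request. diagnose_log is a hypothetical helper, and the log file location depends on your setup.

```shell
# Hypothetical helper: $1 is the path to an ES log file.
diagnose_log() {
  if grep -q "ClassNotFound" "$1"; then
    echo "ClassNotFound in log: restart the node so the analyzer is loaded"
  else
    echo "no ClassNotFound: check that the IK and ES versions match"
  fi
}
```

For example: diagnose_log logs/elasticsearch.log, where the file name is an assumption that depends on the cluster name.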
