Part 2: Installing the Chinese Analyzer (IK Analysis) on ElasticSearch 6

The previous post got ElasticSearch 6.2.4 installed and running. This post continues by installing the IK analyzer.

I. Installation

    Version compatibility table (from the project's GitHub page):

IK version    ES version
master        6.x -> master
6.2.4         6.2.4
6.1.3         6.1.3
5.6.8         5.6.8
5.5.3         5.5.3
5.4.3         5.4.3
5.3.3         5.3.3
5.2.2         5.2.2
5.1.2         5.1.2
1.10.6        2.4.6
1.9.5         2.3.5
1.8.1         2.2.1
1.7.0         2.1.1
1.5.0         2.0.0
1.2.6         1.0.0
1.2.5         0.90.x
1.1.3         0.20.x
1.0.0         0.16.2 -> 0.19.0
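
For the 5.x and 6.x rows of the table, the IK version is identical to the ES version, so the download address can be derived from the ES version number. A minimal sketch, assuming other releases follow the same URL pattern as the v6.2.4 link used in this post; the function name ik_release_url is made up for illustration:

```shell
# Hypothetical helper: build the IK release URL for a given ES version.
# Assumption: the release uses the same URL pattern as the v6.2.4 link in this post.
ik_release_url() {
  local es_version="$1"
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' \
    "$es_version" "$es_version"
}
```

For example, `ik_release_url 6.2.4` prints the address used in the steps below. The version of a running node is reported in the version.number field of the JSON returned by a plain GET on port 9200.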

  1. Offline installation:

   (1) Download the package from the address below (check that the version number matches your ES version):

https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip

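   (2) Unzip it under the plugins directory of the ES installation. The IK README unzips into a dedicated subdirectory such as plugins/ik. Whichever layout you use, do not leave the zip file itself inside plugins/, since ES may refuse to start when it finds a file there that is not a plugin directory.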

[payment@localhost elasticsearch-6.2.4]$ cd plugins/
[payment@localhost plugins]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4/plugins
[payment@localhost plugins]$ unzip elasticsearch-analysis-ik-6.2.4.zip

   2. Online installation (recommended):

[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
[=================================================] 100%   
-> Installed analysis-ik
[payment@gameServer elasticsearch-6.2.4]$ 
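
Either installation method can be double-checked before restarting. The `elasticsearch-plugin` tool has a `list` subcommand that prints the installed plugin names. A small sketch, assuming one plugin name per output line; the helper name has_plugin is hypothetical:

```shell
# Hypothetical helper: succeed if the plugin name given as $1 appears
# as a whole line on standard input.
has_plugin() {
  grep -qxF -- "$1"
}

# Usage, run from the ES home directory:
#   ./bin/elasticsearch-plugin list | has_plugin analysis-ik && echo "IK is installed"
```

Matching the whole line with -x avoids a false positive from a plugin whose name merely contains the one being searched for.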

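 II. Restart the ElasticSearch service

    1. Stop the service (the transcript uses kill -9; a plain kill, which sends SIGTERM, gives ES a chance to shut down cleanly):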

[payment@gameServer elasticsearch-6.2.4]$ ps -ef|grep elasticsearch
payment  27352     1  0 10:50 pts/0    00:00:39 /usr/local/java/jdk1.8.0_161//bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch.oFTj99LA -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4 -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config -cp /home/payment/elasticSearch/elasticsearch-6.2.4/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
payment  29017 26594  0 13:10 pts/0    00:00:00 grep elasticsearch
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ kill -9 27352
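
A gentler way to stop the node is to send SIGTERM first and only escalate if the process does not exit. The following is a sketch of that idea, not the author's procedure; the function name stop_gracefully and the 30-second default are assumptions:

```shell
# Hypothetical helper: send SIGTERM to process $1 and wait up to $2 seconds
# (default 30) for it to exit. Returns 0 once the process is gone, 1 otherwise.
stop_gracefully() {
  local pid="$1"
  local timeout="${2:-30}"
  kill "$pid" 2>/dev/null || return 0      # nothing to stop: already gone
  while [ "$timeout" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0 # process no longer exists
    sleep 1
    timeout=$((timeout - 1))
  done
  return 1  # still running; the caller decides whether to fall back to kill -9
}
```

With the PID from the ps output above, this would be `stop_gracefully 27352 || kill -9 27352`. Starting ES with `./bin/elasticsearch -d -p pid` writes the PID to a file, which saves the ps and grep step next time.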

    2. Start ElasticSearch:

[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch -d && tail -f logs/elasticsearch.log
[2018-06-06T13:12:28,029][INFO ][o.e.d.DiscoveryModule    ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node               ] initialized
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node               ] [SdEluaQ] starting ...
[2018-06-06T13:12:28,711][INFO ][o.e.t.TransportService   ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:12:28,721][INFO ][o.e.b.BootstrapChecks    ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:12:31,765][INFO ][o.e.c.s.MasterService    ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:12:31,769][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:12:31,782][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:12:31,782][INFO ][o.e.n.Node               ] [SdEluaQ] started
[2018-06-06T13:12:31,921][INFO ][o.e.g.GatewayService     ] [SdEluaQ] recovered [0] indices into cluster_state
[2018-06-06T13:13:42,980][INFO ][o.e.n.Node               ] [] initializing ...
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment    ] [SdEluaQ] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [402.8gb], net total_space [442.7gb], types [rootfs]
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment    ] [SdEluaQ] heap size [990.7mb], compressed ordinary object pointers [true]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] node name [SdEluaQ] derived from node ID [SdEluaQkTfi1p-yRtlxHSA]; set [node.name] to override
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] version[6.2.4], pid[29196], build[ccec39f/2018-04-12T20:37:28.497551Z], OS[Linux/2.6.32-696.28.1.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_161/25.161-b12]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.vXQsyXAG, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4, -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [aggs-matrix-stats]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [analysis-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [ingest-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-expression]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-mustache]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-painless]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [mapper-extras]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [parent-join]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [percolator]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [rank-eval]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [reindex]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [repository-url]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [transport-netty4]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [tribe]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded plugin [analysis-ik]
[2018-06-06T13:13:46,137][INFO ][o.e.d.DiscoveryModule    ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node               ] initialized
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node               ] [SdEluaQ] starting ...
[2018-06-06T13:13:46,770][INFO ][o.e.t.TransportService   ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:13:46,778][INFO ][o.e.b.BootstrapChecks    ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:13:49,828][INFO ][o.e.c.s.MasterService    ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:13:49,835][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:13:49,853][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:13:49,861][INFO ][o.e.n.Node               ] [SdEluaQ] started
[2018-06-06T13:13:49,973][INFO ][o.e.g.GatewayService     ] [SdEluaQ] recovered [0] indices into cluster_state
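The command above starts ES in the background and follows the startup log.
   The log shows that the analyzer plugin was loaded: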
loaded plugin [analysis-ik]
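
Instead of reading the log by eye, the same check can be scripted. A minimal sketch that searches the log file for the line shown above; the helper name plugin_loaded_in_log is hypothetical, and the log path is the one used in this post:

```shell
# Hypothetical helper: succeed if log file $1 contains a
# "loaded plugin [<name>]" entry for plugin $2.
plugin_loaded_in_log() {
  local log_file="$1"
  local plugin="$2"
  grep -qF "loaded plugin [$plugin]" "$log_file"
}

# Usage, run from the ES home directory:
#   plugin_loaded_in_log logs/elasticsearch.log analysis-ik && echo "IK loaded"
```

Note that the log file also keeps entries from earlier startups, so a match only proves the plugin was loaded at some point. Once the node is up, `curl 'http://172.17.63.15:9200/_cat/plugins?v'` lists the plugins the running node has actually loaded.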

III. Check the analyzer

   Test the tokenization:

[root@gameServer ~]# curl -XGET 'http://172.17.63.15:9200/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "analyzer": "ik_smart",
  "text": "听说看这篇博客的哥们最帅、姑娘最美"
}'
{
  "tokens" : [
    {
      "token" : "听说",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "看",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "这篇",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "博客",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "的",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "哥们",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "最",
      "start_offset" : 10,
      "end_offset" : 11,
      "type" : "CN_CHAR",
      "position" : 6
    },
    {
      "token" : "帅",
      "start_offset" : 11,
      "end_offset" : 12,
      "type" : "CN_CHAR",
      "position" : 7
    },
    {
      "token" : "姑娘",
      "start_offset" : 13,
      "end_offset" : 15,
      "type" : "CN_WORD",
      "position" : 8
    },
    {
      "token" : "最美",
      "start_offset" : 15,
      "end_offset" : 17,
      "type" : "CN_WORD",
      "position" : 9
    }
  ]
}
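Explanation (source: GitHub):

What is the difference between ik_max_word and ik_smart?
ik_max_word: splits the text at the finest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,中华人民,中华,华人,人民共和国,人民,人,民,共和国,共和,和,国国,国歌”, exhausting every possible combination.
ik_smart: splits the text at the coarsest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,国歌”.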
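
To see the difference on your own node, the same text can be sent to _analyze with each analyzer in turn. A sketch under stated assumptions: the node address is the one used in this post, and the helper names analyze_body and analyze are made up for illustration:

```shell
# Assumed node address, taken from the curl call earlier in this post.
ES_URL="${ES_URL:-http://172.17.63.15:9200}"

# Build the _analyze request body for analyzer $1 and text $2.
# Sketch only: no JSON escaping is done, so the text must not contain
# double quotes or backslashes.
analyze_body() {
  printf '{"analyzer": "%s", "text": "%s"}\n' "$1" "$2"
}

# Send the text to the node and print the token list it returns.
analyze() {
  curl -s -XGET "$ES_URL/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}
```

Running `analyze ik_max_word '中华人民共和国国歌'` followed by `analyze ik_smart '中华人民共和国国歌'` lets you compare the two token lists side by side.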

Reposted from blog.csdn.net/superviser3000/article/details/80845304