Elasticsearch (10) - Using tokenizers

1. Install smartcn

Run the command:

sh elasticsearch-plugin install analysis-smartcn

Output of a successful installation:

[es@localhost bin]$ sh elasticsearch-plugin install analysis-smartcn
-> Downloading analysis-smartcn from elastic
[=================================================] 100%
-> Installed analysis-smartcn

After installation, an analysis-smartcn folder appears under the plugins folder.
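The folder check described above can also be scripted. The following is a minimal sketch using only the JDK; the class name, the method name, and the default install path /usr/local/elasticsearch are assumptions made for illustration, so pass the real Elasticsearch home directory as the first argument.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PluginDirCheck {
    // Returns true if <esHome>/plugins/<pluginName> exists and is a directory.
    static boolean isPluginInstalled(Path esHome, String pluginName) {
        return Files.isDirectory(esHome.resolve("plugins").resolve(pluginName));
    }

    public static void main(String[] args) {
        // Hypothetical default location; the article does not state the install path.
        Path esHome = Paths.get(args.length > 0 ? args[0] : "/usr/local/elasticsearch");
        System.out.println("analysis-smartcn installed: "
                + isPluginInstalled(esHome, "analysis-smartcn"));
    }
}
```

In a cluster, the same check would have to be run on every node, since each node needs its own copy of the plugin.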

Restart the Elasticsearch service. In a cluster, the plugin must be installed on every node, and then all nodes must be restarted.

Verify the installation by sending a request from a REST client to: http://192.168.15.38:9200/_analyze/

Request body:

{
  "analyzer": "smartcn",
  "text": "我是中国人"
}

Request method: POST
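The same request can be issued from Java without any Elasticsearch dependency. This is a minimal sketch assuming Java 11 or later (for java.net.http); only the URL, the analyzer name, and the sample text come from this article, while the class and method names are illustrative. The JSON escaping is deliberately minimal.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class AnalyzeRequestSketch {
    // Minimal escaping: handles backslashes and double quotes only, not control characters.
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the request body shown above: {"analyzer": ..., "text": ...}
    static String buildBody(String analyzer, String text) {
        return "{\"analyzer\":" + jsonString(analyzer) + ",\"text\":" + jsonString(text) + "}";
    }

    // Builds a POST request to <baseUrl>/_analyze with a JSON body.
    static HttpRequest buildRequest(String baseUrl, String analyzer, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(analyzer, text)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest("http://192.168.15.38:9200", "smartcn", "我是中国人");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```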

Result:

{
  "tokens": [
    {
      "token": "我",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "是",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "中国",
      "start_offset": 2,
      "end_offset": 4,
      "type": "word",
      "position": 2
    },
    {
      "token": "人",
      "start_offset": 4,
      "end_offset": 5,
      "type": "word",
      "position": 3
    }
  ]
}
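To read the offsets in this response: start_offset is inclusive and end_offset is exclusive, so each pair cuts the token out of the original text. The sketch below checks this against the four tokens above. It assumes offsets count Java chars, which holds for this sample because every character fits in a single char; the class and method names are illustrative.

```java
final class TokenOffsets {
    // Returns the slice of the original text covered by a token's offsets.
    // startOffset is inclusive, endOffset is exclusive.
    static String slice(String text, int startOffset, int endOffset) {
        return text.substring(startOffset, endOffset);
    }

    public static void main(String[] args) {
        String text = "我是中国人";
        // Offset pairs copied from the response above.
        int[][] offsets = {{0, 1}, {1, 2}, {2, 4}, {4, 5}};
        for (int[] o : offsets) {
            System.out.println(o[0] + "-" + o[1] + " -> " + slice(text, o[0], o[1]));
        }
    }
}
```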

2. Querying with the smartcn tokenizer from the Java API

To specify the analyzer for a query, call the analyzer("smartcn") method on the query builder:

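import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;

// `client` is an existing org.elasticsearch.client.Client instance (for example a TransportClient).
SearchRequestBuilder srb = client.prepareSearch("film2").setTypes("dongzuo");
// Match query on the desc field; the query text is analyzed with smartcn.
QueryBuilder requestBuilder = QueryBuilders.matchQuery("desc", "非洲迷宫").analyzer("smartcn");
// Return only the title and desc fields of each hit.
SearchResponse response = srb.setQuery(requestBuilder).setFetchSource(new String[]{"title", "desc"}, null).execute().actionGet();
SearchHits hits = response.getHits();

for (SearchHit hit : hits) {
    System.out.println(hit.getSourceAsString());
}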

Results:

smartcn segments the query text "非洲迷宫" into "Africa" and "maze", and the query then returns documents whose desc field matches either of those terms:
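The behaviour described above can be pictured with a small plain-Java simulation. This is a simplified sketch, not how Elasticsearch works internally: it assumes the two terms are 非洲 and 迷宫, and it uses substring containment in place of a lookup against analyzed tokens in the index. The desc fragments are taken from the results in this article; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;

final class MatchSketch {
    // True if the field text contains at least one of the query terms (OR semantics).
    static boolean matchesAny(String fieldText, List<String> terms) {
        for (String term : terms) {
            if (fieldText.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed segmentation of "非洲迷宫".
        List<String> terms = Arrays.asList("非洲", "迷宫");
        // Fragments of desc values from the search results.
        String[] descs = {
            "在经历了迷宫逃脱和末日丧尸的生死考验后",
            "故事发生在非洲附近的大海上",
            "讲述第一军团全面侵袭之下"
        };
        for (String desc : descs) {
            System.out.println(matchesAny(desc, terms) + " <- " + desc);
        }
    }
}
```

A document only needs one of the terms to qualify, which is why a film about a maze and a film set in Africa are both returned.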

{"title":"移动迷宫3:死亡解药","desc":" 《移动迷宫3》作为系列最终章,沿袭系列一贯以来的劲爆动作戏和快节奏跑酷风,主要讲述迪伦·奥布莱恩饰演的托马斯率领的好莱坞“跑男团”在经历了迷宫逃脱和末日丧尸的生死考验后,终于迎来最后的正邪较量。 "}

{"title":"战狼2","desc":"故事发生在非洲附近的大海上,主人公冷锋(吴京 饰)遭遇人生滑铁卢,被“开除军籍”,本想漂泊一生的他,正当他打算这么做的时候,一场突如其来的意外打破了他的计划,突然被卷入了一场非洲国家叛乱,本可

To query several fields at once, for example to return documents whose title or desc matches the query text, use multiMatchQuery:

// `client` is an existing org.elasticsearch.client.Client instance (for example a TransportClient).
SearchRequestBuilder srb = client.prepareSearch("film2").setTypes("dongzuo");
// Previous single-field query, kept for comparison:
// QueryBuilder requestBuilder = QueryBuilders.matchQuery("desc", "非洲迷宫").analyzer("smartcn");

// Multi-field query on title and desc; the query text is analyzed with smartcn.
QueryBuilder requestBuilder = QueryBuilders.multiMatchQuery("星球大战非洲迷宫", "title", "desc").analyzer("smartcn");
SearchResponse response = srb.setQuery(requestBuilder).setFetchSource(new String[]{"title", "desc"}, null).execute().actionGet();
SearchHits hits = response.getHits();

for (SearchHit hit : hits) {
    System.out.println(hit.getSourceAsString());
}

The query text "星球大战非洲迷宫" is segmented into "Star Wars", "Africa", and "maze", and each term is matched against both fields.
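To see which field is responsible for a hit, the multi-field case can be simulated in plain Java. This is again a simplified sketch rather than Elasticsearch's actual matching or scoring: it assumes the terms are 星球大战, 非洲, and 迷宫, and it uses substring containment on the raw field text. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MultiFieldMatchSketch {
    // Returns the names of the fields whose text contains at least one query term.
    static List<String> matchedFields(Map<String, String> doc, List<String> fields, List<String> terms) {
        List<String> hits = new ArrayList<>();
        for (String field : fields) {
            String value = doc.getOrDefault(field, "");
            for (String term : terms) {
                if (value.contains(term)) {
                    hits.add(field);
                    break; // one matching term is enough for this field
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed segmentation of "星球大战非洲迷宫".
        List<String> terms = Arrays.asList("星球大战", "非洲", "迷宫");
        List<String> fields = Arrays.asList("title", "desc");

        // Title and a desc fragment taken from the search results.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "战狼2");
        doc.put("desc", "故事发生在非洲附近的大海上");

        System.out.println(doc.get("title") + " matched on: " + matchedFields(doc, fields, terms));
    }
}
```

A document is returned as soon as any one of the listed fields matches, so a film can qualify through its title alone even if its desc contains none of the terms.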

Search results:

{"title":"移动迷宫3:死亡解药","desc":" 《移动迷宫3》作为系列最终章,沿袭系列一贯以来的劲爆动作戏和快节奏跑酷风,主要讲述迪伦·奥布莱恩饰演的托马斯率领的好莱坞“跑男团”在经历了迷宫逃脱和末日丧尸的生死考验后,终于迎来最后的正邪较量。 "}

{"title":"战狼2","desc":"故事发生在非洲附近的大海上,主人公冷锋(吴京 饰)遭遇人生滑铁卢,被“开除军籍”,本想漂泊一生的他,正当他打算这么做的时候,一场突如其来的意外打破了他的计划,突然被卷入了一场非洲国家叛乱,本可以安全撤离,却因无法忘记曾经为军人的使命,孤身犯险冲回沦陷区,带领身陷屠杀中的同胞和难民,展开生死逃亡。随着斗争的持续,体内的狼性逐渐复苏,最终孤身闯入战乱区域,为同胞而战斗。"}

{"title":"星球大战8:最后的绝地武士","desc":"《星球大战:最后的绝地武士》承接前作《星球大战:原力觉醒》的剧情,讲述第一军团全面侵袭之下,蕾伊(黛西·雷德利 Daisy Ridley 饰)、芬恩(约翰·博耶加 John Boyega 饰)、波·

