Elasticsearch: fixing the problem that Chinese search can only match single characters

Problem Description

I wrote a small search project and wanted to use Elasticsearch to implement highlighted search, but I suddenly found that my search could only find English terms such as "java", "dior", "vue", and so on. Chinese terms such as 药学 (medicine) or 高等数学 (advanced mathematics) could not be found at all!

Solutions

At first, I thought the request I was sending from the frontend was garbling the Chinese characters:

    new Vue({
        el: '#app',
        data: {
            keyword: '', // the search keyword
            results: []  // the search results
        },
        methods: {
            searchKey() {
                var keyword = this.keyword;
                console.log(keyword);
                // call the backend API
                axios.get('search/' + keyword + "/1/20").then(response => {
                    console.log(response);
                    // bind the data
                    this.results = response.data;
                })
            }
        }
    })
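When ruling out the frontend, one thing worth checking is whether the keyword is URL-encoded when it is put into the request path, so that Chinese characters survive the trip as UTF-8 percent-escapes. This is a minimal sketch, not the original code; `buildSearchUrl` is a hypothetical helper wrapping the URL construction from the snippet above:

```javascript
// Hypothetical helper: explicitly percent-encode the keyword before
// building the URL, so Chinese characters are sent as UTF-8 escapes.
function buildSearchUrl(keyword, pageNo, pageSize) {
  return 'search/' + encodeURIComponent(keyword) + '/' + pageNo + '/' + pageSize;
}

console.log(buildSearchUrl('药学', 1, 20));
// '药学' becomes '%E8%8D%AF%E5%AD%A6' (its UTF-8 bytes, percent-encoded)
```

If the backend prints the keyword correctly, as it did here, encoding is not the culprit, but encoding explicitly keeps the behavior unambiguous across browsers and servers.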

I suspected that the keyword received by my backend was garbled, but when I printed it on the server side, there was no garbling at all!

Looking further for a solution

Looking things up later, it seemed to be a tokenizer problem. That made sense.
I tried searching for a single character, 药 (medicine) or 学 (learning), and found that single characters could be found. That was a lead: did the tokenizer need to be changed? At first I naively thought the dictionary simply had no word like 药学. But the dictionary is far too large for that; it is impossible for it to be missing a word like "medicine". After asking a more experienced colleague and reading some blogs, I found a similar case.
Original blog post: https://blog.csdn.net/qq_44961149/article/details/107300665 if you are interested.
The reason is here:
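A quick way to see what the tokenizer actually does with a word is the `_analyze` API. Assuming the ik plugin is installed and the field uses `ik_max_word` (a common setup; the original post does not show its mapping), a request like this shows how 药学 is segmented:

```
POST /jd_goods/_analyze
{
  "analyzer": "ik_max_word",
  "text": "药学"
}
```

If ik knows the word, the response contains a single token 药学 rather than the two single-character tokens 药 and 学, which is exactly why a query that bypasses analysis can fail to match.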

    // Search implementation (comments translated; two small fixes noted inline)
    public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws IOException {
        if (pageNo < 1) {
            pageNo = 1;
        }
        // Search against the "jd_goods" index
        SearchRequest searchRequest = new SearchRequest("jd_goods");
        // Build the search conditions
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Pagination: "from" is a document offset, not a page number
        searchSourceBuilder.from((pageNo - 1) * pageSize);
        searchSourceBuilder.size(pageSize);
        // Exact match
//        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery("title", keyword);

        searchSourceBuilder.query(matchPhraseQueryBuilder);
        // 60 s timeout
        searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        // Highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("title");
        // More fields can be configured here
        // Do not require the highlighted field to match the query field
        highlightBuilder.requireFieldMatch(false);
        highlightBuilder.preTags("<span style='color:red'>");
        highlightBuilder.postTags("</span>");
        searchSourceBuilder.highlighter(highlightBuilder);
        // Execute the search
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // Parse the results
        ArrayList<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit documentFields : searchResponse.getHits().getHits()) {
            // Extract the highlighted fragment for "title"
            Map<String, HighlightField> highlightFields = documentFields.getHighlightFields();
            HighlightField title = highlightFields.get("title");
            // The original source document
            Map<String, Object> sourceAsMap = documentFields.getSourceAsMap();
            // Note: check the highlight field, not the hit itself --
            // "title" is null when there is nothing to highlight
            if (title != null) {
                Text[] fragments = title.fragments();
                StringBuilder nTitle = new StringBuilder();
                // Replace the original field with the highlighted version
                for (Text text : fragments) {
                    nTitle.append(text);
                }
                sourceAsMap.put("title", nTitle.toString());
            }
            list.add(sourceAsMap);
        }
        return list;
    }
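For the phrase query to match multi-character Chinese words, the title field has to be analyzed by ik at index time. The original post does not show the index mapping, so the following is only an assumed setup, using the common `ik_max_word` analyzer:

```
PUT /jd_goods
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}
```

With this mapping, both the indexed titles and the search keyword are segmented by ik, so 药学 is indexed and searched as a word instead of two unrelated characters.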
At this point I remembered something from the video course I first learned from: a term query does not segment the search keyword, and a keyword-type field is not segmented either. But I had loaded the ik tokenizer, and ik segments the indexed text into words by default, so TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword); can no longer match. It has to be changed to MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery("title", keyword); which analyzes the keyword and matches it as an exact phrase!
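For reference, here is roughly the query DSL that the two Java builders generate (a sketch, with 药学 as the example keyword). A `term` query compares the raw, unanalyzed keyword against indexed tokens, so it finds nothing when ik indexed the text as different tokens; `match_phrase` runs the keyword through the field's analyzer first and requires the resulting tokens to appear in order:

```
GET /jd_goods/_search
{
  "query": {
    "match_phrase": { "title": "药学" }
  }
}
```

Replacing `match_phrase` with `term` in this request reproduces the original symptom: single characters match, whole words do not.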
Lesson learned, and a much deeper understanding gained!

Origin blog.csdn.net/qq_22155255/article/details/111735304