Search Keyword Highlighting with ElasticSearch

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/HcJsJqJSSM/article/details/84066018

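I. Requirement: search for a book by its title and show the matched keywords highlighted in bold (run a full-text search on the text in title, display the results, measure the search time, and return the number of records).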

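     1. Technologies used

            SpringBoot 2.0.5 RELEASE, ElasticSearch 6.4.1, Bootstrap, Thymeleaf 3.0.9 RELEASE, Maven 3.3.9, lombok, and hot deployment in IDEA. ES is loaded with test data and runs as a simple cluster of one master and two replicas. Cluster name: elasticsearch.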

     application.yml

server:
  port: 8082

     2. Maven dependencies

	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>2.0.5.RELEASE</version>
		<relativePath/> <!-- lookup parent from repository -->
	</parent>
	<properties>
		<!-- set thymeleaf version: Spring Boot 1.x defaults to Thymeleaf 2.x; Spring Boot 2.0.x already manages Thymeleaf 3.0.x, so this override is likely redundant -->
		<thymeleaf.version>3.0.9.RELEASE</thymeleaf.version>
		<thymeleaf-layout-dialect.version>2.1.1</thymeleaf-layout-dialect.version>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
		<java.version>1.8</java.version>
	</properties>
	<dependencies>
        <!-- ElasticSearch -->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
		</dependency>
        <dependency>
            <groupId>org.elasticsearch.plugin</groupId>
            <artifactId>transport-netty3-client</artifactId>
            <version>5.6.10</version>
        </dependency>
		<!-- Spring Boot Web -->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
        <!-- Thymeleaf-->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-thymeleaf</artifactId>
		</dependency>
		<!-- lombok -->
		<dependency>
			<groupId>org.projectlombok</groupId>
			<artifactId>lombok</artifactId>
			<version>1.16.18</version>
		</dependency>
		<!-- Hot deployment -->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-devtools</artifactId>
			<optional>true</optional>
		</dependency>
		<!-- Spring Boot test -->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
	</dependencies>

       3. Configure ElasticSearch (using Java annotation-based configuration)

          The ElasticSearchConfig class is as follows.

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.net.InetAddress;

@Configuration
public class ElasticSearchConfig {
    private static final Logger logger = LoggerFactory.getLogger(ElasticSearchConfig.class);
    /** ES connection settings */
    private static final String CLUSTER_NAME="elasticsearch";
    private static final String HOST_NAME="localhost";
    private static final Integer PORT=9300;

    /** Returns the TransportClient */
   @Bean
    public TransportClient client() {
           logger.info("Initializing TransportClient...");
           TransportClient client = null;
           try {
               TransportAddress transportAddress = new InetSocketTransportAddress(InetAddress.getByName(HOST_NAME),PORT);
               // Cluster settings
               Settings esSetting = Settings.builder()
                       .put("cluster.name", CLUSTER_NAME)
                       .build();
               // Build the client with the custom Settings
               client= new PreBuiltTransportClient(esSetting);
               client.addTransportAddresses(transportAddress);
           } catch (Exception e) {
               logger.error("elasticsearch TransportClient create error!!!", e);
           }
           return client;
   }
}

4. Novel entity class

import lombok.Getter;
import lombok.Setter;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import java.util.Date;

@Document(indexName = "book",type="novel")
@Setter
@Getter
public class Novel {
    @Id
    private String id;
    private String title;
    private String author;
    private Integer word_count;
    private Date publish_data;

    public Novel(){
      super();
    }
    

    @Override
    public String toString() {
        return "Novel{" +
                "id=" + id +
                ", title='" + title + '\'' +
                ", author='" + author + '\'' +
                ", word_count=" + word_count +
                ", publish_data=" + publish_data +
                '}';
    }
}

5. NovelController

import com.lx.search.elastic.entity.Novel;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.index.query.Operator;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.SearchQuery;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.servlet.ModelAndView;
import java.time.Duration;
import java.time.Instant;
import java.util.*;

@Controller
@RequestMapping("/novel")
public class NovelController {
    private static final String PATH="searchtitle";

    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;

    @Autowired
    private TransportClient client;

    /**
     * Search by title and highlight the matches
     */
    @RequestMapping("/search/title/{keyword}")
    public ModelAndView searchTitle(@PathVariable String keyword) {
        ModelAndView modelAndView=new ModelAndView();
        // Start timing the query here
        Instant start = Instant.now();
        // Build the query, using the standard analyzer.
        QueryBuilder matchQuery = QueryBuilders.matchQuery("title", keyword).analyzer("standard").operator(Operator.OR);
        // Configure highlighting, using the default highlighter
        HighlightBuilder highlightBuilder = new HighlightBuilder()
                .field("title")
                .preTags("<span style=\"color:red;font-weight:bold;font-size:15px;\">")
                .postTags("</span>");
        // Run the search against the book index
        SearchResponse response = client.prepareSearch("book")
                .setQuery(matchQuery)
                .highlighter(highlightBuilder)
                // Maximum number of documents returned per request
                .setSize(10)
                .get();
        // Search results
        SearchHits hits = response.getHits();
        // Stop timing the search
        Instant end = Instant.now();
        // Report milliseconds; integer division by 1000 would truncate sub-second searches to 0
        System.out.println("Total hits: " + hits.getTotalHits() + ", elapsed: " + Duration.between(start, end).toMillis() + " ms");
        List<Map<Object, Object>> novel=new ArrayList<>();
        // Iterate only over the hits actually returned (at most 10 here); getTotalHits() can exceed that count
        for (SearchHit hit : hits) {
            // Collect each result in a LinkedHashMap, which keeps insertion order
            Map<Object,Object> map=new LinkedHashMap<>();
            map.put("Source As String", hit.getSourceAsString());
            // Document source as a String
            System.out.println("Source As String:" + hit.getSourceAsString());
            map.put("Source As Map", hit.getSource());
            // Document source as a Map
            System.out.println("Source As Map:" + hit.getSource());
            // Index that holds the document
            map.put("Index", hit.getIndex());
            System.out.println("Index:" + hit.getIndex());
            // Type of the document
            map.put("Type", hit.getType());
            System.out.println("Type:" + hit.getType());
            // ID of the document
            map.put("Id", hit.getId());
            System.out.println("Id:" + hit.getId());
            // Content of a specific field, here the full title
            map.put("Title", hit.getSource().get("title"));
            System.out.println("title: " + hit.getSource().get("title"));
            // Score of the document (stored under the key "Scope", which the view reads)
            map.put("Scope", hit.getScore());
            System.out.println("Scope:" + hit.getScore());
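            // Highlighted fragments of the title field (null if this hit has no highlight for it)
            Text[] text = hit.getHighlightFields().containsKey("title")
                    ? hit.getHighlightFields().get("title").getFragments()
                    : null;
            StringBuilder hight = new StringBuilder();
            if (text != null) {
                for (Text str : text) {
                    hight.append(str);
                    System.out.println(str.toString());
                }
            }
            map.put("Highlight", hight.toString());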
            novel.add(map);
        }
        modelAndView.addObject("resultlist", novel);
        modelAndView.addObject("count", "Found: "+"<span style=\"color:red;font-weight:bold;font-size:18px;\">"+hits.getTotalHits()+"</span>"+" records");
        modelAndView.addObject("time", ", elapsed: "+"<span style=\"color:red;font-weight:bold;font-size:18px;\">"+Duration.between(start, end).toMillis() + "</span>"+ " ms");
        modelAndView.setViewName(PATH);
        return modelAndView;
    }
}

6. View template

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="UTF-8">
    <title>[Novel Title Keyword Search]</title>
    <link rel="stylesheet" th:href="@{/css/bootstrap.min.css}"  media="all">
    <link rel="stylesheet" th:href="@{/css/index.css}" />
</head>
<body>
      <div style="width:100%;height:60px;" align="center">
           <h2 style="color:#985f0d;">Book Title Keyword Search</h2>
      </div>
      <br/>
      <div align="center">
          <span style="font-size: 18px;" th:utext="${count}"></span>
          <span style="font-size: 18px;" th:utext="${time}"></span>
      </div>
      <br/>
      <br/>
      <div class="bs-example" data-example-id="striped-table">
      <table class="table table-bordered table-hover">
             <thead>
                 <tr>
                     <th style="text-align:center;" scope="row">No.</th>
                     <th style="text-align:center;">Index</th>
                     <th style="text-align:center;">Type</th>
                     <th style="text-align:center;">ID</th>
                     <th style="text-align:center;">Title</th>
                     <th style="text-align:center;">Score</th>
                 </tr>
             </thead>
             <tbody>
                 <tr th:each="novel,stat:${resultlist}">
                     <th style="text-align:center;" th:text="${stat.count}"></th>
                     <th style="text-align:center;" th:text="${novel['Index']}"></th>
                     <th style="text-align:center;" th:text="${novel['Type']}"></th>
                     <th style="text-align:center;" th:text="${novel['Id']}"></th>
                     <th style="text-align:center;" th:utext="${novel['Highlight']}"></th>
                     <th style="text-align:center;" th:text="${novel['Scope']}"></th>
                </tr>
             </tbody>
      </table>
      </div>
</body>
</html>

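 Notes:
      : If a model attribute value contains HTML tags, render it with th:utext.

      : To get the row number while iterating a list, option 1: ${stat.count} starts at 1; option 2: ${stat.index} starts at 0, so use ${stat.index+1} to start from 1.

      : To read a Map entry in Thymeleaf while iterating, use ${novel['key']}, where key is the name under which the object was put into the map.

      : Keyword highlighting combines the ES highlighter, HTML tags, and Thymeleaf's th:utext, which renders a value containing HTML tags without escaping it.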

7. Search results

  

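8. Analysis and summary of the search results.

    8.1 Analyzer.

    The standard analyzer is used, which by default splits Chinese text one character at a time. The search is a full-text search, for which an analyzer can be specified.

   If ik_max_word or ik_smart were used here, neither would fit this scenario.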

// Build the query
QueryBuilder matchQuery = QueryBuilders.matchQuery("title", keyword).analyzer("standard").operator(Operator.OR);

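     matchQuery analyzes the query string into terms. With OR, a document is returned as soon as any one of those terms matches; to return only documents that match every term, join them with AND instead. (This matters for multi-word queries.)

     A good explanation is the article "How the match query uses the bool query".

     With Operator.OR (a document is found if it matches either of the two terms below).

    Search keywords: mybatis 云飞

     With Operator.AND (a document is found only if it matches both of the terms below).

     Search keywords: mybatis 云飞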
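
The OR/AND behaviour described in 8.1 can be illustrated without a running cluster. The following is only a toy model written for this article, not Elasticsearch code: the class `MatchOperatorSketch`, its `tokenize` method (a rough stand-in for the standard analyzer, emitting one token per CJK ideograph and one lowercased token per run of other letters or digits) and the sample title are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of match-query semantics; hypothetical, not part of Elasticsearch. */
class MatchOperatorSketch {

    /** Approximates the standard analyzer: each CJK ideograph becomes its own token,
     *  every other run of letters or digits becomes one lowercased token. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isIdeographic(cp)) {
                flush(word, tokens);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
            } else {
                flush(word, tokens); // whitespace or punctuation ends the current word
            }
        }
        flush(word, tokens);
        return tokens;
    }

    private static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    /** requireAll = true models Operator.AND, false models Operator.OR. */
    static boolean matches(String query, String title, boolean requireAll) {
        Set<String> titleTerms = new HashSet<>(tokenize(title));
        List<String> queryTerms = tokenize(query);
        if (queryTerms.isEmpty()) {
            return false;
        }
        return requireAll
                ? titleTerms.containsAll(queryTerms)
                : queryTerms.stream().anyMatch(titleTerms::contains);
    }

    public static void main(String[] args) {
        String title = "MyBatis从入门到精通"; // made-up sample title
        System.out.println(tokenize("mybatis 云飞"));                // [mybatis, 云, 飞]
        System.out.println(matches("mybatis 云飞", title, false));   // true: "mybatis" matches
        System.out.println(matches("mybatis 云飞", title, true));    // false: 云 and 飞 are missing
    }
}
```

Because the query is split into the three terms mybatis, 云 and 飞, the OR form finds any title containing just one of them, which is why single-character tokenization tends to return many loosely related results.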

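    8.2 Highlighters.

    ES (as of 5.x) provides three highlighters: the default highlighter (plain), the postings-highlighter, and the fast-vector-highlighter. The default plain highlighter re-analyzes the stored original document, so it is the slowest, but it needs no extra storage. Note that in ES 6.x the postings highlighter was removed and the unified highlighter became the default.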
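
To make the "re-analyzes the stored original document" idea concrete, here is a heavily simplified sketch. It is not the Elasticsearch implementation: `PlainHighlightSketch` and its methods are hypothetical names, the tokenization is the same rough approximation of the standard analyzer (one token per CJK ideograph, one token per run of other letters or digits), and the real highlighter additionally handles fragmenting and scoring. The sketch also HTML-escapes the text, which is worth considering whenever the result is rendered with th:utext.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Simplified illustration of plain highlighting; hypothetical, not Elasticsearch code. */
class PlainHighlightSketch {

    /** Re-tokenizes the stored text and wraps every token found in queryTerms
     *  (expected in lower case) with preTag/postTag. */
    static String highlight(String text, Set<String> queryTerms, String preTag, String postTag) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int start = i;
            int cp = text.codePointAt(i);
            if (Character.isIdeographic(cp)) {
                i += Character.charCount(cp);            // one ideograph = one token
            } else if (Character.isLetterOrDigit(cp)) {
                while (i < text.length()) {              // consume a run of letters/digits
                    int next = text.codePointAt(i);
                    if (Character.isIdeographic(next) || !Character.isLetterOrDigit(next)) {
                        break;
                    }
                    i += Character.charCount(next);
                }
            } else {
                i += Character.charCount(cp);            // separator: copy through, escaped
                out.append(escape(text.substring(start, i)));
                continue;
            }
            String token = text.substring(start, i);
            boolean hit = queryTerms.contains(token.toLowerCase(Locale.ROOT));
            if (hit) out.append(preTag);
            out.append(escape(token));
            if (hit) out.append(postTag);
        }
        return out.toString();
    }

    /** Minimal HTML escaping so that stored text cannot inject markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("mybatis", "入"));
        System.out.println(highlight("MyBatis从入门到精通", terms, "<b>", "</b>"));
    }
}
```

Since the whole text is analyzed again at query time, nothing extra has to be stored in the index, which matches the trade-off described above: no additional storage, but more work per request.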

  

That completes a simple ES full-text search with the matched keywords shown highlighted in bold.
