Spring Boot integration with Elasticsearch: installation and usage, plus the IK analyzer (detailed walkthrough)

Table of contents

1. Version correspondence
2. Elasticsearch installation steps
3. Integrating Elasticsearch into a Spring Boot project
    1. Required pom dependencies
    2. application configuration file
    3. Entity mapping
    4. Persistence layer interface
    5. Business logic layer implementation class
    6. Custom query methods
    7. Possible errors
        - Problem: version conflicts between Elasticsearch, Spring Data, Lucene, etc.
        - Solution: pin an explicit Lucene version instead of the one Elasticsearch pulls in by default
        - Entity mapping @Document annotation rules
        - Deleting an Elasticsearch index
4. Elasticsearch and the IK analyzer: usage
    1. Adding the IK analyzer to Elasticsearch
    2. Testing segmentation directly from a Java project
    3. Keeping custom words from being segmented


1. Version correspondence

(For guidance only; version matching is fairly tolerant. Both pairings below have been verified in practice.)

- Elasticsearch 7.6.2 with Spring Boot 2.5.6
- Elasticsearch 7.17.3 with Spring Boot 2.7.5

2. Elasticsearch installation steps

Reference: https://blog.csdn.net/weixin_42633131/article/details/82902812

3. Integrating Elasticsearch into a Spring Boot project

1. Required pom dependencies

pom.xml

<!-- ES core dependency -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

<!-- Dependency needed by the client; add it only if it is not already present -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<!-- If Tomcat is not needed, depending on spring-web alone is enough -->
<!--<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-web</artifactId>
</dependency>-->

2. application configuration file

application.yml

# Elasticsearch client settings via yml (only a limited set of options is configurable here)
spring:
  elasticsearch:
    uris: http://127.0.0.1:9200    # Elasticsearch address
    #username: elastic             # user name
    #password: 123456              # password
    connection-timeout: 10s        # connection timeout (default 1s)
    socket-timeout: 30s            # read timeout (default 30s)
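The yml keys above cover only the basics (as the first comment in the file notes, the configurable options are limited). For finer control, Spring Data Elasticsearch 4.x (the version behind the Spring Boot 2.5/2.7 starters used here) lets you define the client yourself. A minimal sketch with the same address and timeouts; the class and package names are placeholders:

package com.nengyy.rest_server.config; // hypothetical package

import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;

import java.time.Duration;

// Programmatic alternative to the yml settings above (Spring Data Elasticsearch 4.x API;
// later versions deprecate this class in favor of ElasticsearchConfiguration).
@Configuration
public class EsClientConfig extends AbstractElasticsearchConfiguration {

    @Override
    public RestHighLevelClient elasticsearchClient() {
        ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo("127.0.0.1:9200")              // same host as spring.elasticsearch.uris
                .withConnectTimeout(Duration.ofSeconds(10)) // connection-timeout
                .withSocketTimeout(Duration.ofSeconds(30))  // socket-timeout
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}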

3. Entity mapping

package com.nengyy.rest_server.elasticsearch.entity;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.Accessors;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.io.Serializable;
import java.util.Objects;

@Data
@Accessors(chain = true)     // enables chained setters
@AllArgsConstructor          // generates an all-args constructor
@NoArgsConstructor           // generates a no-args constructor
// @Document is the Spring Data ES annotation that marks an entity class.
// indexName is the associated index; if the "items" index does not exist at
// runtime, Spring Data creates it automatically.
@Document(indexName = "items")
public class ItemEfm implements Serializable {
    // Marks this property as the ES document id. The value must be numeric-like
    // (e.g. a Long rendered as a String); letters or Chinese characters in the id
    // break the built-in CRUD methods (no workaround investigated yet).
    @Id
    private Long id;            // must be Long to use the built-in CRUD methods
    // Keyword is a non-analyzed string type.
    @Field(type = FieldType.Keyword)
    private String GoodsId;
    // Text is analyzed: ik_max_word at index time, ik_smart at search time.
    @Field(type = FieldType.Text,
           analyzer = "ik_max_word",     // finest-grained split (tokens may overlap); used when indexing
           searchAnalyzer = "ik_smart")  // coarsest, "smart" split (no overlapping tokens); used on the search terms
    private String title;       // product description
    @Field(type = FieldType.Keyword)
    private String category;    // product category
    @Field(type = FieldType.Keyword)
    private String brand;       // brand (copy of the description; used for exact full-string
                                // queries to complement the analyzed queries)
    @Field(type = FieldType.Keyword)
    private String price;       // price
    @Field(type = FieldType.Keyword)
    private String origPrice;   // original price
    // The image path is never a search condition, so index = false:
    // ES skips indexing it, which saves space.
    @Field(type = FieldType.Keyword, index = false)
    private String imgPath;     // image path, e.g. images/1a123s-as4td-asdsa-jasbdjff.png
    @Field(type = FieldType.Keyword)
    private String intertestNum; // number of followers

    // equals/hashCode overridden so contains() works; only needed for that.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        ItemEfm other = (ItemEfm) obj;
        return Objects.equals(GoodsId, other.GoodsId) &&
                Objects.equals(title, other.title) &&
                Objects.equals(price, other.price) &&
                Objects.equals(origPrice, other.origPrice) &&
                Objects.equals(imgPath, other.imgPath) &&
                Objects.equals(intertestNum, other.intertestNum);
    }

    @Override
    public int hashCode() {
        return Objects.hash(GoodsId, title, price, origPrice, imgPath, intertestNum);
    }
}
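A quick illustration of what @Accessors(chain = true) buys: every generated setter returns this, so an item can be populated in a single expression (all values below are made up):

ItemEfm demo = new ItemEfm()
        .setId(1L)
        .setGoodsId("G0001")
        .setTitle("Green Ice Bracelet")
        .setCategory("jewelry")
        .setBrand("Green Ice Bracelet")  // same text as title: the Keyword copy used for exact full-string queries
        .setPrice("199")
        .setOrigPrice("299")
        .setImgPath("images/demo.png")
        .setIntertestNum("12");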

4. Persistence layer interface

Once the ItemRepository interface extends ElasticsearchRepository, it provides the same kind of ready-made CRUD capability as MyBatis-Plus.

package com.nengyy.rest_server.elasticsearch.repository;

import com.nengyy.rest_server.elasticsearch.entity.ItemEfm;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;
import java.util.List;

// The Spring naming convention for the persistence layer is "repository".
@Repository
public interface ItemRepository extends ElasticsearchRepository<ItemEfm, Long> {
    // Extending ElasticsearchRepository<entity class, id type> automatically
    // provides basic CRUD methods for the entity.

    // Spring Data derived queries: follow the naming convention and the query
    // is generated for you.
    // query:      marks the method as a query, like SELECT in SQL
    // Item/Items: the entity to query; singular returns one object, plural a collection
    // By:         starts the conditions, like WHERE in SQL
    // Title:      the field to query
    // Matches:    analyzed (tokenized) string matching, similar to LIKE in SQL
    Iterable<ItemEfm> queryItemsByTitleMatches(String title);

    // 4. Analyzed multi-condition query (AND).
    // Conditions are joined with And/Or; arguments are bound by their position in
    // the method name, so the order of title and brand matters.
    Iterable<ItemEfm> queryItemsByTitleMatchesAndBrandMatches(String title, String brand);

    // 4. Analyzed multi-condition query with sorting (OR).
    // query, By, Matches, Or, OrderBy, Desc are all keywords.
    // Items: the mapped entity (and the name of its ES index); Title/Brand: the fields
    // to query; Price with OrderBy...Desc: sort by price, descending.
    Iterable<ItemEfm> queryItemsByTitleMatchesOrBrandMatchesOrderByPriceDesc(String title, String brand);

    // 1. Full-string query (an extra FieldType.Keyword field holds the whole,
    // non-analyzed string, complementing the analyzed queries).
    ItemEfm findByBrand(String brand);

    // 2. Strict analyzed query (full string + analyzed + fuzzy in one): every token of
    // the search term must match the indexed tokens for a document to be returned.
    org.springframework.data.domain.Page<ItemEfm> findByTitleOrderByIntertestNumDesc(String title, Pageable pageable);

    // 3. Paged + analyzed query: both the search term and the index are tokenized and
    // matched token by token; one matching token is enough, and results are ranked by
    // the built-in scoring (a custom sort can also be set, as here).
    // Paging: declare a Pageable as the last parameter and return Page, which holds the
    // data plus the paging info (current page, total pages, total hits, whether there is
    // a previous/next page, and so on).
    Page<ItemEfm> queryItemsByTitleMatchesOrBrandMatchesOrderByIntertestNumDesc(
            String title, String brand, Pageable pageable);

    List<ItemEfm> queryItemsByTitleMatchesOrBrandMatchesOrderByOrigPriceDesc(String title, String brand);
}
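A minimal usage sketch for the two simplest derived queries above, callable from any Spring bean (the demo class and the search terms are illustrative, not part of the original project):

package com.nengyy.rest_server.elasticsearch; // hypothetical demo class

import com.nengyy.rest_server.elasticsearch.entity.ItemEfm;
import com.nengyy.rest_server.elasticsearch.repository.ItemRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class ItemQueryDemo {
    private final ItemRepository itemRepository;

    @Autowired
    public ItemQueryDemo(ItemRepository itemRepository) {
        this.itemRepository = itemRepository;
    }

    public void demo() {
        // Analyzed match on title: the search term is tokenized, similar to LIKE per token
        Iterable<ItemEfm> byTitle = itemRepository.queryItemsByTitleMatches("bracelet");
        byTitle.forEach(item -> System.out.println(item.getTitle()));

        // Exact match on the non-analyzed Keyword field brand (full-string query)
        ItemEfm exact = itemRepository.findByBrand("Green Ice Bracelet");
        System.out.println(exact == null ? "no exact hit" : exact.getGoodsId());
    }
}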

5. Business logic layer implementation class

package com.nengyy.rest_server.service.impl.mall;

import com.baomidou.mybatisplus.core.metadata.IPage;
import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.nengyy.dao_mysql.common.utils.PageUtil;
import com.nengyy.dao_mysql.mapper.primary.mall.CommodityListMapper;
import com.nengyy.dem_common.utils.ListUtils;
import com.nengyy.dem_common.utils.StringUtils;
import com.nengyy.dto.mall.*;
import com.nengyy.rest_server.elasticsearch.entity.ItemEfm;
import com.nengyy.rest_server.elasticsearch.repository.ItemRepository;
import com.nengyy.rest_server.service.impl.consignment.DatacenterIdGenerator;
import com.nengyy.rest_server.service.mall.ItemRepositoryService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.stereotype.Service;
import java.util.*;

/**
 * @Author: cyz
 * @Date: 2023/04/24
 */
@Service
public class ItemRepositoryServiceImpl implements ItemRepositoryService {
    private final ItemRepository itemRepository;
    private final CommodityListMapper commodityListMapper;

    @Autowired
    public ItemRepositoryServiceImpl(ItemRepository itemRepository, CommodityListMapper commodityListMapper) {
        this.itemRepository = itemRepository;
        this.commodityListMapper = commodityListMapper;
    }

    /** Loads all qualifying products from the database; runs on a daily schedule to rebuild the ES index. */
    @Override
    public String goodsList() {
        List<ESCommodityListDto> data = commodityListMapper.getGoodsList();
        List<ItemEfm> list = new ArrayList<>();
        DatacenterIdGenerator snowId1 = new DatacenterIdGenerator(17, 18); // snowflake-style id generator
        ItemEfm item;
        if (data != null) {
            for (ESCommodityListDto listDto : data) {
                item = new ItemEfm();
                item.setId(snowId1.nextId());
                item.setGoodsId(listDto.getGoodsId());
                // Goods from source "2" use the description as the title; all others use the name.
                if ("2".equals(listDto.getGoodsSource())) {
                    item.setTitle(listDto.getGoodsDesc());
                    item.setBrand(listDto.getGoodsDesc());  // full-string field, complements the analyzed queries
                } else {
                    item.setTitle(listDto.getGoodsName());
                    item.setBrand(listDto.getGoodsName());  // full-string field, complements the analyzed queries
                }
                item.setCategory(listDto.getGoodsType());
                item.setPrice(listDto.getGoodsFee());
                item.setOrigPrice(listDto.getReservFee1());
                item.setImgPath(listDto.getFilePath());
                item.setIntertestNum(listDto.getIntertestNum());
                list.add(item);
            }
            itemRepository.deleteAll();   // bulk-delete the previously indexed documents
            itemRepository.saveAll(list); // bulk-index the fresh ones
        }
        return "success";
    }

    /**
     * ES product search (paged product list).
     */
    @Override
    public Page<CommodityListDto> searchCommodityList(SiftEsCommodityListDto in) {
        // 1. Full-string query (non-analyzed)
        ItemEfm oneGoods = itemRepository.findByBrand(in.getGoodsDesc());

        // 2. Strict analyzed query (full string + analyzed + fuzzy in one): every token of
        //    the search term must match the index for a document to be returned.
        org.springframework.data.domain.Page<ItemEfm> pageGoodsList =
                itemRepository.findByTitleOrderByIntertestNumDesc(in.getGoodsDesc(), PageRequest.of(0, 49));

        // 3. Paged, loose analyzed query: one matching token is enough.
        //    (pageSizeExtra is a field defined elsewhere in the original source.)
        int size = in.getPageSize() - pageSizeExtra;
        int size1 = size > 0 ? size : 1; // PageRequest requires a size of at least 1
        org.springframework.data.domain.Page<ItemEfm> page = itemRepository
                .queryItemsByTitleMatchesOrBrandMatchesOrderByIntertestNumDesc(
                        // query by description and brand (brand intentionally empty), sorted by follower count
                        in.getGoodsDesc(), "", PageRequest.of(in.getPageNo() - 1, size1));

        // ... truncated in the original post: merging oneGoods / pageGoodsList / page into
        // the returned Page<CommodityListDto> is omitted.
        return null; // placeholder so the snippet compiles
    }
}

6. Custom query methods

Spring Data derives the implementation from the method name. Two prefixes are used here: query... for the loose analyzed queries and find... for the exact/strict ones.


    // 4. Analyzed multi-condition query with sorting (OR)
    // query, By, Matches, Or, OrderBy, Desc (descending) are all keywords;
    // Items is the mapped entity (and the name of its ES index);
    // Title and Brand are the fields to query;
    // Price together with OrderBy...Desc means "sort by price, descending".
    Iterable<ItemEfm> queryItemsByTitleMatchesOrBrandMatchesOrderByPriceDesc(String title, String brand);

    // 2. Strict analyzed query (full string + analyzed + fuzzy in one): every token of
    // the search term must match the indexed tokens for a document to be returned.
    org.springframework.data.domain.Page<ItemEfm> findByTitleOrderByIntertestNumDesc(String title, Pageable pageable);

    // 3. Paged + analyzed query: both the search term and the index are tokenized and
    // matched token by token; one matching token is enough, and results are ranked by
    // the built-in scoring (a custom sort can also be set, as here).
    // Paging: declare a Pageable as the last parameter and return Page, which holds the
    // data plus the paging info (current page, total pages, total hits, whether there is
    // a previous/next page, and so on).
    Page<ItemEfm> queryItemsByTitleMatchesOrBrandMatchesOrderByIntertestNumDesc(
            String title, String brand, Pageable pageable);
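A sketch of calling the paged query and reading the paging metadata that Page computes. It assumes it runs inside a bean where itemRepository is injected; the search term and page size are examples:

import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;

// PageRequest.of(page, size) is zero-based: page 0 is the first page.
Page<ItemEfm> page = itemRepository
        .queryItemsByTitleMatchesOrBrandMatchesOrderByIntertestNumDesc(
                "bracelet", "", PageRequest.of(0, 20));

System.out.println("total hits:  " + page.getTotalElements());
System.out.println("total pages: " + page.getTotalPages());
System.out.println("has next:    " + page.hasNext());
page.getContent().forEach(item -> System.out.println(item.getTitle()));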

7. Possible errors

1. Problem: version conflicts between Elasticsearch, Spring Data, Lucene, etc.

2. Solution: pin an explicit Lucene version instead of relying on the one Elasticsearch pulls in transitively. (The Lucene version matching each Elasticsearch release can be looked up, e.g. by asking ChatGPT.)

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>8.11.1</version>
</dependency>

3. Entity mapping @Document annotation rules

@Document(indexName = "itemsapp1")
// itemsapp1 is the custom index name; one ES instance holds many indexes,
// and the project creates one index per mapped entity.

4. Deleting an Elasticsearch index

Note: after deleting the index, restart the project and call the index-refresh job (goodsList above) to rebuild it.

Options:

1. Use an ES GUI tool to delete the index.

2. Elasticvue, a free and open-source Elasticsearch GUI that runs in the browser. Installation steps: https://blog.csdn.net/UbuntuTouch/article/details/125777834

3. Connect Elasticvue to Elasticsearch and delete the index from there.
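If a GUI is not at hand, the index can also be dropped programmatically. A minimal sketch using Spring Data's IndexOperations (the ElasticsearchRestTemplate is auto-configured by the starter; the plain REST equivalent is DELETE http://localhost:9200/items):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.IndexOperations;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;

// Inside any Spring bean:
@Autowired
private ElasticsearchRestTemplate elasticsearchRestTemplate;

public void dropItemsIndex() {
    // "items" is the indexName declared on the @Document annotation above
    IndexOperations ops = elasticsearchRestTemplate.indexOps(IndexCoordinates.of("items"));
    boolean deleted = ops.delete(); // true if the index existed and was removed
    System.out.println("index deleted: " + deleted);
}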

4. Elasticsearch and the IK analyzer: usage

1. Adding the IK analyzer to Elasticsearch

- IK download: https://github.com/medcl/elasticsearch-analysis-ik/releases

- Detailed IK reference: "ElasticSearch - Download and use of the IK word segmenter" (CSDN blog)

- Unzip the analyzer, rename the folder to ik, and place it in the plugins directory. Check which IK version matches your ES version before downloading; the matching is fairly tolerant.

2. Testing segmentation directly from a Java project

1. Example requests (an HTTP-client scratch file, e.g. IntelliJ IDEA's):

GET http://localhost:9200

### three #'s act as both a separator and a comment; two requests must be separated this way or the file will not run

POST http://localhost:9200/_analyze
Content-Type: application/json

{
  "text": "Green Ice Bracelet",
  "analyzer": "ik_smart"
}

2. Create a demo case like the one above.

3. Once ES is running, click Run to test.
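The same _analyze call can also be made from Java. A sketch using the Elasticsearch 7.x high-level REST client (this assumes elasticsearch-rest-high-level-client is on the classpath, which the 7.x data starter pulls in transitively; the sample phrase is arbitrary Chinese text, which is what IK targets):

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.AnalyzeRequest;
import org.elasticsearch.client.indices.AnalyzeResponse;

public class IkAnalyzeDemo {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // Equivalent of POST /_analyze with {"analyzer": "ik_smart", "text": ...}
            AnalyzeRequest request = AnalyzeRequest.withGlobalAnalyzer("ik_smart", "中华人民共和国国歌");
            AnalyzeResponse response = client.indices().analyze(request, RequestOptions.DEFAULT);
            for (AnalyzeResponse.AnalyzeToken token : response.getTokens()) {
                System.out.println(token.getTerm()); // one line per token ik_smart produced
            }
        }
    }
}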

3. Keeping custom words from being segmented

1. Open main.dic in a text editor and add the words you do not want split (a sketch of the cleaner extension-dictionary route follows below).
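Editing main.dic works, but it is overwritten when the plugin is upgraded. IK also reads a separate extension dictionary declared in its config file; a sketch, assuming the plugin directory is plugins/ik and custom.dic is a file you create next to the config (one word per line):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<!-- plugins/ik/config/IKAnalyzer.cfg.xml -->
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- words listed in custom.dic are kept whole instead of being segmented -->
    <entry key="ext_dict">custom.dic</entry>
    <!-- optional extension stop-word dictionary -->
    <entry key="ext_stopwords"></entry>
</properties>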

2. ik_smart and ik_max_word segment text differently:

ik_smart     // smart segmentation: coarsest-grained (characters are not reused across tokens)

ik_max_word  // finest-grained segmentation (characters may appear in several tokens)
Suggested entity mapping:

    @Field(type = FieldType.Text,         // ik_max_word when indexing, ik_smart when searching (segments the terms sent from the front end)
           analyzer = "ik_max_word",      // finest-grained split (tokens may overlap); used when generating the index
           searchAnalyzer = "ik_smart")   // coarsest, smart split (no overlapping tokens); used on the search terms
    private String title;       // product description

Source: blog.csdn.net/m0_71202849/article/details/134008764