Integrating Elasticsearch with Spring Boot in Practice

1. Introduction

I need Elasticsearch in a recent project, so I took a quick look at how to use it. I cannot demonstrate the more advanced features yet, and what I cover here is fairly limited, but it is enough to meet basic needs. I wrote this article to organize what I learned, and I welcome corrections for anything I got wrong.

2. Install and configure Elasticsearch

Docker deployment
Standard deployment

3. Integrate Spring Boot and Elasticsearch

1. Add dependencies and configuration files

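pom.xml:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

application.yml:

spring:
  elasticsearch:
    rest:
      uris: 127.0.0.1:9200 # multiple nodes can be listed, separated by commas, e.g. ip,ip
      connection-timeout: 1s # state the unit; a bare number is read as milliseconds
      read-timeout: 30s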

2. Create Elasticsearch data model

import com.fasterxml.jackson.annotation.JsonFormat;
import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.DateFormat;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.util.Date;

/**
 * @BelongsProject: spring-elas
 * @BelongsPackage: com.example.springelas.elas.entity
 * @Author: gepengjun
 * @CreateTime: 2023-09-07  09:16
 * @Description: TODO
 * @Version: 1.0
 */
@Data
@Document(indexName = "book", createIndex = true)
public class Book {

    // Document ID; Keyword stores it as a single exact-match term instead of analyzed text
    @Id
    @Field(type = FieldType.Keyword)
    private String id;
    // ik_max_word comes from the IK analysis plugin, which must be installed on the ES nodes
    @Field(analyzer = "ik_max_word")
    private String title;
    @Field(analyzer = "ik_max_word")
    private String author;
    @Field(type = FieldType.Double)
    private Double price;
    @JsonFormat(pattern = "yyyy-MM-dd'T'HH:mm:ss")
    @Field(type = FieldType.Date, format = DateFormat.custom, pattern = "8uuuu-MM-dd'T'HH:mm:ss")
    private Date createTime;
    // Note: DateFormat.time is a time-of-day format (HH:mm:ss.SSSZ) with no date part
    @Field(type = FieldType.Date, format = DateFormat.time)
    private Date updateTime;
    /**
     * 1. Jackson date-time serialization problem:
     * Cannot deserialize value of type `java.time.LocalDateTime` from String "2020-06-04 15:07:54": Failed to deserialize java.time.LocalDateTime: (java.time.format.DateTimeParseException) Text '2020-06-04 15:07:54' could not be parsed at index 10
     * Fix: @JsonFormat(pattern = "yyyy-MM-dd HH:mm:ss")
     * 2. Dates are stored in ES as long values
     * Fix: add format = DateFormat.custom
     * 3. java.time.DateTimeException: Unable to obtain LocalDate from TemporalAccessor: {DayOfMonth=5, YearOfEra=2020, MonthOfYear=6},ISO of type java.time.format.Parsed
     * Fix: pattern = "uuuu-MM-dd HH:mm:ss", i.e. change yyyy to uuuu, or use 8uuuu: pattern = "8uuuu-MM-dd HH:mm:ss"
     * Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/migrate-to-java-time.html#java-time-migration-incompatible-date-formats
     */
}
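
The date pitfalls listed in the comment above can be reproduced with plain java.time, without Elasticsearch. The following is a minimal sketch, assuming strict resolution (the behavior the Elasticsearch migration guide describes); the class DatePatternDemo and its method names are made up for this illustration. It shows why a yyyy pattern fails where uuuu works, and why a space instead of 'T' fails at index 10.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical helper class, for illustration only
class DatePatternDemo {

    /** Returns true if the text parses as a LocalDate under the pattern with strict resolution. */
    static boolean parsesStrictly(String pattern, String text) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern)
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            // With yyyy (year-of-era) and no era, strict mode cannot resolve a date
            return false;
        }
    }

    /** Returns the index where ISO parsing fails, or -1 if the text parses. */
    static int isoErrorIndex(String text) {
        try {
            LocalDateTime.parse(text); // ISO_LOCAL_DATE_TIME expects 'T' between date and time
            return -1;
        } catch (DateTimeParseException e) {
            return e.getErrorIndex();
        }
    }
}
```

Calling parsesStrictly with "uuuu-MM-dd" succeeds for 2020-06-05 while "yyyy-MM-dd" does not, which is the same symptom as item 3 in the comment.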

3. Define the Elasticsearch repository interface

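import org.springframework.data.elasticsearch.annotations.Highlight;
import org.springframework.data.elasticsearch.annotations.HighlightField;
import org.springframework.data.elasticsearch.annotations.Query;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import java.util.List;

public interface ESBookRepository extends ElasticsearchRepository<Book, String> {

    // Derived query: the method name is parsed into a title-or-author query
    List<Book> findByTitleOrAuthor(String title, String author);

    // Full-text match on title (?0 is the first method argument), with highlighting
    @Highlight(fields = {
            @HighlightField(name = "title"),
            @HighlightField(name = "author")
    })
    @Query("{\"match\":{\"title\":\"?0\"}}")
    SearchHits<Book> find(String keyword);
}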
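
To make the ?0 placeholder in the @Query annotation more concrete: it is replaced by the first method argument before the JSON is sent to Elasticsearch. The sketch below is a simplified, plain-Java illustration of that substitution, not Spring Data's actual implementation; QueryTemplateSketch and render are hypothetical names. It also escapes backslashes and double quotes so that an argument cannot break out of the JSON string, which is an assumption about sensible behavior that is worth verifying against your Spring Data version.

```java
// Hypothetical helper class, for illustration only
class QueryTemplateSketch {

    /** Replaces ?0, ?1, ... in the template with the JSON-escaped arguments. */
    static String render(String template, String... args) {
        String result = template;
        // Go from the highest index down so that ?1 is not mistaken for the start of ?10
        for (int i = args.length - 1; i >= 0; i--) {
            String escaped = args[i]
                    .replace("\\", "\\\\")  // escape backslashes first
                    .replace("\"", "\\\""); // then escape double quotes
            result = result.replace("?" + i, escaped);
        }
        return result;
    }
}
```

With the template from the repository above, render(template, "spring boot") yields a match query on title for the text spring boot.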

4. Implement Elasticsearch data operations

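import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ESBookImpl {

    @Autowired
    ESBookRepository esBookRepository;

    // save() indexes the document; an existing document with the same id is overwritten
    public void insertBook(Book book) {
        Book saved = esBookRepository.save(book);
        System.out.println(saved);
    }

    // Look up a book by id; returns null when no such document exists
    public Book queryBook(String keyWord) {
        return esBookRepository.findById(keyWord).orElse(null);
    }
}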

4. Basic query and index operations

1. Insert and update data

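Inserts and updates both go through the insertBook method above, which calls the repository's save method: saving a Book whose ID already exists overwrites the stored document.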

2. Delete data and indexes

    /**
     * @description: Delete by ID
     * @author: gepengjun
     * @date: 2023/9/7 10:35
     * @param: [keyWord]
     * @return: void
     **/
    public void deleteBook(String keyWord){
        esBookRepository.deleteById(keyWord);
//        esBookRepository.delete(book);  // deleting by entity also works
    }

First, fetch all the data with the findAll method that Spring Data provides.
Then call the delete method with an ID.
Fetching all the data again shows that the document with ID 1 is gone.

3. Conditional query and paging query

Define a paging query method in the Elasticsearch repository

 Page<Book> findByTitle(String title, Pageable pageable);

Call this method in the service class

    public Page<Book> pageBook(String title){
        // Page index is zero-based: first page, three results per page
        Pageable pageable = PageRequest.of(0, 3);
        return esBookRepository.findByTitle(title, pageable);
    }

Finally, call it from the controller to see the result.
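
As a rough picture of what PageRequest.of(page, size) selects, the following plain-Java sketch slices an in-memory list the same way: the page index is zero-based, so page 0 with size 3 covers the first three hits. PagingSketch and its methods are hypothetical names for this illustration; Elasticsearch performs the real paging on the server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, for illustration only
class PagingSketch {

    /** Returns the items that a zero-based (page, size) request would select. */
    static <T> List<T> slice(List<T> items, int page, int size) {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size must be >= 1");
        }
        long from = (long) page * size; // offset of the first item on this page
        if (from >= items.size()) {
            return Collections.emptyList(); // page lies beyond the last item
        }
        int to = (int) Math.min(from + size, items.size());
        return new ArrayList<>(items.subList((int) from, to));
    }

    /** Number of pages needed for totalElements items (ceiling division). */
    static int totalPages(int totalElements, int size) {
        return (totalElements + size - 1) / size;
    }
}
```

For five matching books and a page size of 3, page 0 holds three books, page 1 holds the remaining two, and totalPages is 2.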

4. Sorting and aggregation queries

Sorting

This example sorts the results of a find-all query. To sort a conditional query, set the sort on the Pageable, following the paging example above.

    public Object findDESCBook(){
        // Define the sort: descending order on a single field
        Sort sort = Sort.by(Sort.Direction.DESC, "price");
        return esBookRepository.findAll(sort);
    }

The results come back sorted by the price field in descending order.

Aggregation query

This aggregation query still has some minor flaws; for one, the title parameter is accepted but never used.


    @Autowired
    private ElasticsearchOperations elasticsearchOperations;
    /**
     * @description: Aggregation query
     * @author: gepengjun
     * @date: 2023/9/7 11:37
     * @param: []
     * @return: java.lang.Object
     **/
    public Object findAggregationBOOK(String title){
        Pageable pageable = PageRequest.of(0, 3);
        // Terms aggregation: one bucket per distinct value of title.keyword
        TermsAggregationBuilder builder1 = AggregationBuilders.terms("taxonomy").field("title.keyword");

        // Build the query
        NativeSearchQuery build = new NativeSearchQueryBuilder()
                .addAggregation(builder1)
                .withPageable(pageable)
                .build();

        SearchHits<Book> search = elasticsearchOperations.search(build, Book.class);
        for (SearchHit<Book> bookSearchHit : search) {
            System.out.println(bookSearchHit.getContent());
        }
        Aggregations aggregations = search.getAggregations();
        Map<String, Aggregation> asMap = aggregations.getAsMap();
        return asMap;
    }


Application scenarios

Aggregation queries are an important feature of Elasticsearch for extracting meaningful summaries and statistics from large amounts of data. Common application scenarios include:

  1. Data analysis and statistics: Aggregation queries can compute statistics over large amounts of data, such as averages, sums, maximum values, and minimum values. They can be used to generate reports and plots, or to perform complex data analysis tasks.

  2. Group statistics: Aggregation queries enable us to group data based on specified fields and calculate statistics for each group. For example, in e-commerce, sales data can be grouped according to product categories to obtain sales or sales volume for each category.

  3. Nested aggregation: Elasticsearch supports nesting multiple aggregation operations together to achieve more complex statistical and analytical needs. By building multiple levels of nested aggregations, you can drill down into the relationships between your data and gain more detailed insights.

  4. Time analysis: Aggregation queries are very useful for time series data. They can bucket data by a specified time interval and then compute statistics within each bucket. For example, access log data can be analyzed by hour, day, week, or month.

  5. Bucket analysis: Bucket aggregation is an aggregation method that divides data into different buckets or intervals. Bucket conditions can be defined through range, term matching, or scripts, and statistical analysis can be performed on each bucket.

  6. Cardinality and deduplication counting: Aggregation queries also support cardinality statistics and deduplication counting. You can find the number of unique values in a field, or count duplicate values in it.

  7. Multi-field statistics: Elasticsearch allows statistics of multiple fields in one aggregation operation. This is useful for analyzing multiple metrics or dimensions simultaneously.
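
To make a few of these scenarios concrete, here is a plain-Java analogue built on streams rather than on Elasticsearch itself. It is only a sketch of what the aggregations compute: grouping by a field corresponds to a terms (bucket) aggregation, summing within each group corresponds to a nested metric, and counting distinct values corresponds to a cardinality aggregation. The Sale class, its fields, and the method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only
class AggregationSketch {

    /** Hypothetical stand-in for one indexed sales document. */
    static final class Sale {
        final String category;
        final double amount;

        Sale(String category, double amount) {
            this.category = category;
            this.amount = amount;
        }
    }

    /** Analogue of a terms aggregation: one bucket per category with its document count. */
    static Map<String, Long> countByCategory(List<Sale> sales) {
        return sales.stream().collect(
                Collectors.groupingBy(s -> s.category, TreeMap::new, Collectors.counting()));
    }

    /** Analogue of a terms aggregation with a nested sum metric. */
    static Map<String, Double> totalByCategory(List<Sale> sales) {
        return sales.stream().collect(
                Collectors.groupingBy(s -> s.category, TreeMap::new,
                        Collectors.summingDouble(s -> s.amount)));
    }

    /** Analogue of a cardinality aggregation (exact here; Elasticsearch's is approximate). */
    static long distinctCategories(List<Sale> sales) {
        return sales.stream().map(s -> s.category).distinct().count();
    }
}
```

The difference in practice is that Elasticsearch computes these results on the cluster across all shards, so the documents never have to be loaded into application memory.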

5. Advanced query and full-text search

1. Multi-field matching and fuzzy query

    /**
     * @description: Query that matches on either of two fields
     * @author: gepengjun
     * @date: 2023/9/7 15:40
     * @param: [author, price]
     * @return: java.util.List<com.example.springelas.elas.entity.Book>
     **/
    List<Book> findByAuthorOrPrice(String author, Double price);

    /**
     * @description: Wildcard-style (LIKE) query on a single field
     * @author: gepengjun
     * @date: 2023/9/7 15:40
     * @param: [pattern]
     * @return: java.util.List<com.example.springelas.elas.entity.Book>
     **/
    List<Book> findByAuthorLike(String pattern);

2. Range query and regular expression query

    /**
     * @description: Query a single field with a regular expression
     * @author: gepengjun
     * @date: 2023/9/7 15:41
     * @param: [regexPattern]
     * @return: java.util.List<com.example.springelas.elas.entity.Book>
     **/
    List<Book> findByAuthorRegex(String regexPattern);

// To use it, pass a regular expression directly
List<Book> entityList = esBookRepository.findByAuthorRegex("^abc.*");

3. Full text search and highlighting

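The repository method below runs a full-text match on the title field and asks Elasticsearch to highlight the matched terms in the title and author fields.

    @Highlight(fields = {
            @HighlightField(name = "title"),
            @HighlightField(name = "author")
    })
    @Query("{\"match\":{\"title\":\"?0\"}}")
    SearchHits<Book> find(String keyword);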
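
Each SearchHit in the result exposes its highlights through getHighlightFields(), a map from field name to a list of highlighted fragments. A common follow-up step is to show the highlighted text when there is one and fall back to the stored value otherwise. The sketch below shows that step on a plain map so it can run without a cluster; HighlightSketch and displayValue are hypothetical names, and the " ... " separator is an arbitrary choice.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only
class HighlightSketch {

    /**
     * Returns the highlighted fragments for the field joined into one string,
     * or the original value when the field has no highlight.
     */
    static String displayValue(Map<String, List<String>> highlightFields,
                               String field, String original) {
        List<String> fragments = highlightFields.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return original; // no match in this field, keep the stored text
        }
        return String.join(" ... ", fragments);
    }
}
```

In the service class this could be called for each hit as displayValue(hit.getHighlightFields(), "title", hit.getContent().getTitle()); by default Elasticsearch wraps matched terms in em tags.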

6. Summary

Using Elasticsearch through Spring feels much like using an ORM framework, which is why the package Spring provides for interacting with Elasticsearch lives under Spring Data.

Origin blog.csdn.net/pengjun_ge/article/details/132729939