Easy-Es in practice: testing notes on an ORM framework based on ElasticSearch

Introduction

Easy-Es (EE for short) is an ORM development framework built on RestHighLevelClient, the official client provided by ElasticSearch (Es for short). It only enhances RestHighLevelClient and changes nothing in it, with the aim of simplifying development and improving efficiency. EE can be seen as the Es counterpart of Mybatis-Plus (MP). In some respects it is even simpler than MP, and it also incorporates features unique to Es to help you quickly implement a variety of scenarios.

(1) Types of Elasticsearch java clients

Elasticsearch officially provides several Java clients, including but not limited to:
[1] Transport client
[2] Java REST client
[3] Low Level REST client
[4] High Level REST client
[5] Java API Client
Unofficial Java clients include, but are not limited to:
[1] Jest client
[2] BBoss client
[3] Spring Data Elasticsearch client
[4] easy-es client

(2) Analysis of advantages and characteristics

[1] Fully automatic index hosting.
According to the project, this is the world's first open source index hosting model. Developers do not need to care about tedious steps such as index creation, index updates and data migration. The framework can manage the entire life cycle of the index and completes these steps automatically, with zero downtime and without users noticing. After the index name, index configuration or index structure is modified, the framework updates the index and migrates the data automatically, which reduces operation and maintenance costs.

[2] Hides language differences.
It offers a usage style similar to Mybatis-Plus, which is much more convenient than RestHighLevelClient and goes a step further than Springdata-ElasticSearch. It is closer to the habits of Chinese developers.

[3] Zero magic values and very little code.
This mainly addresses the verbosity of RestHighLevelClient code and makes the client easier to use.

[4] Powerful CRUD operations.
The built-in general Mapper handles most CRUD operations with only a small amount of configuration, and a powerful condition constructor covers a wide range of needs.

[5] Supports Lambda-style calls.
Lambda expressions make it convenient to write query conditions, with no need to worry about mistyping field names.

[6] Built-in pagination plug-in.
Paging is physical paging based on RestHighLevelClient. Developers do not need to care about the specific operations or configure additional plug-ins. Writing a paged query is the same as writing an ordinary List query, and the returned paging fields are the same as those of the PageHelper plug-in, so there is no naming impact to worry about.

[7] Supports advanced ES syntax.
Advanced syntax such as highlighted search, word segmentation (match) queries, weighted queries, Geo location queries, IP queries and aggregation queries is supported.

[8] Good extensibility and support for mixed use.
The bottom layer still uses RestHighLevelClient, which preserves its extensibility. Developers can keep using RestHighLevelClient functions alongside EE.

(3) Performance, Security, Expansion, Community

Easy-Es has a dedicated document on performance and security:
https://www.easy-es.cn/pages/6e2197
According to that document, overall performance is very good. On security, the project is integrated with the OSCS Murphy security scan, which reports no security risks. Unit test coverage exceeds 95%, every released function is covered by test cases, and the framework has been verified by a large number of users in production and in the open source community.

The bottom layer of EE uses the RestHighLevelClient officially provided by Es. EE only enhances RestHighLevelClient and does not change, reduce or weaken its original functions. In a project you can use the EE framework, use RestHighLevelClient directly, or mix the two as needed.

The framework has joined the dromara open source community. The community is currently active and releases many versions every year to keep improving the user experience.

[Figure: Gitee repository statistics]
[Figure: GitHub repository statistics]
[Figure: repository statistics of the comparable project spring-data-elasticsearch]

(4) ES version and SpringBoot version description

The bottom layer of Easy-Es uses the official RestHighLevelClient of ES, so there is a version requirement: the ES and RestHighLevelClient JAR dependencies must both be version 7.14.0. As for the es client, any 7.X version is compatible.

Note that because of Springdata-ElasticSearch, SpringBoot has built-in dependencies on ES and RestHighLevelClient. As a result, different SpringBoot versions pull in different versions of ES and RestHighLevelClient, and compatibility between different versions of these two official ES dependencies is very poor. This prevents many users from using Easy-Es normally. It is really a dependency conflict.

Easy-Es checks the dependencies when the project starts. If the console shows a log at Error level with the content "Easy-Es supported elasticsearch and restHighLevelClient jar version is:7.14.0 , Please resolve the dependency conflict!", there is a dependency conflict to resolve. The solution is simple: configure a maven exclusion as follows to remove the ES and RestHighLevelClient dependencies declared by SpringBoot or Easy-Es, then declare them again with the version set to 7.14.0.

        <dependency>
            <groupId>cn.easy-es</groupId>
            <artifactId>easy-es-boot-starter</artifactId>
            <version>1.1.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.elasticsearch.client</groupId>
                    <artifactId>elasticsearch-rest-high-level-client</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.elasticsearch</groupId>
                    <artifactId>elasticsearch</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.14.0</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.14.0</version>
        </dependency>

A blunt alternative is simply to set the SpringBoot version to 2.5.5. Nothing else needs to be adjusted, and the framework then works well enough.

Index processing

(1) Index alias strategy

To understand and use the index update mechanism, it helps to first understand how ES index aliases work.
An index alias is another name defined for one or more indexes, which establishes a logical relationship between the alias and those indexes.
An alias can express a containment relationship between the alias and indexes. Suppose there are several date-based log indexes, such as log_index_01, log_index_02, log_index_03... We can define an alias named log_index for unified retrieval and send search requests to log_index. The data of all these indexes can then be queried through log_index instead of naming each index one by one.
Note that, by default, a write request can target an alias only when the alias points to a single index. If the alias points to multiple indexes, write requests cannot target it.
An alias can also express a replacement relationship between indexes. Some parameters of an index, such as the number of primary shards, cannot be changed after creation. As the business grows and the data in the index increases, those parameters need to change for optimization, and the change has to be smooth: the index settings must change while the name used by clients stays the same. An index alias solves this.
Suppose the search alias for hotels is hotel. The index hotel_1 is initially created with 5 primary shards, and its alias is set to hotel. The client searches through the alias hotel, and requests are forwarded to hotel_1. Now suppose the data grows sharply and the index needs to be expanded to 10 shards. Because the number of primary shards cannot be changed after creation, the expansion can only be done by replacing the index. Create an index hotel_2 whose settings are the same as hotel_1 except that the number of primary shards is 10. When the data in hotel_2 is ready, remove the alias hotel from hotel_1 and set it on hotel_2. The client needs no change: when it keeps searching through hotel, requests are forwarded to hotel_2. Once the service is stable, delete hotel_1. This completes an index replacement with the help of the alias. In the figure below, the left side shows the alias hotel pointing to hotel_1 while the data in hotel_2 is being prepared; the right side shows the alias pointing to hotel_2 after the expansion switch is complete.
[Figure: the alias hotel switching from hotel_1 to hotel_2]
Source: "Elasticsearch search engine construction introduction and actual combat".
With an index alias, when the index has to be changed, the search side completes the switch without any modification, which is very convenient in a production environment.
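
The hotel switch described above can be sketched as a single request to Elasticsearch's POST /_aliases endpoint, with a remove action and an add action sent together so that the alias never points at nothing. The class and method names below are made up for illustration, and the JSON is assembled by hand only to keep the sketch free of dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: builds the JSON body for Elasticsearch's POST /_aliases endpoint.
// AliasSwapSketch, action and swapBody are hypothetical names, not Easy-Es APIs.
class AliasSwapSketch {

    // One "remove" or "add" entry of the "actions" array.
    static String action(String type, String index, String alias) {
        return String.format("{\"%s\":{\"index\":\"%s\",\"alias\":\"%s\"}}", type, index, alias);
    }

    // Moves the alias from oldIndex to newIndex; both actions travel in one request.
    static String swapBody(String alias, String oldIndex, String newIndex) {
        List<String> actions = new ArrayList<>();
        actions.add(action("remove", oldIndex, alias));
        actions.add(action("add", newIndex, alias));
        return "{\"actions\":[" + String.join(",", actions) + "]}";
    }

    public static void main(String[] args) {
        // Body to send as: POST /_aliases
        System.out.println(swapBody("hotel", "hotel_1", "hotel_2"));
    }
}
```

After this request succeeds, searches sent to hotel reach hotel_2, and hotel_1 can be deleted once the service is stable.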

(2) Smooth mode of easy-es automatic index hosting in practice

(1) Introduction

Tip: If you find that the index is not updated during use, first check your versions against the section "ES version and SpringBoot version description".

The smooth mode of automatic hosting (nicknamed "automatic transmission, snow mode") is enabled by default.
In this mode, the entire index life cycle, including index creation, updates and data migration, is completed without any action from the user. The process has zero downtime and is invisible to users, so it allows a smooth transition in production. It is like the snow mode of a car with automatic transmission: stable and comfortable.

Note that in automatic hosting mode the system automatically creates an index named ee-distribute-lock. It is used internally by the framework and users can ignore it. If a deadlock occurs, which is very unlikely, because of a power failure or a similar event, you can delete this index.

In addition, after an index change the original index name may get the suffix _s0 or _s1. This is expected: it is the only way to achieve a fully automatic, smooth migration with zero downtime. The suffix does not affect usage, and the framework automatically activates the new index. The _s0 and _s1 suffixes are unavoidable in this mode, because the original index must be kept while its data is migrated and two indexes cannot have the same name at the same time. If you do not accept this trade-off, you can use other methods.

(2) Practice Test

[1] Create an entity class, bind the index name to document, and add the relevant field attributes. At this point the index has not been created yet.
[Screenshot: entity class definition]
[2] Set the log level to debug so that the DSL statements are visible. Start the project and you can see that the framework has created the index automatically.
[Screenshot: startup log with the index creation DSL]
[3] By default the index is created with 1 primary shard and 1 replica.
[Screenshot: index settings]
[4] Now add some test data directly through the API provided by the framework. Three records have been created.
[Screenshot: inserted test data]
[5] Next, change the number of primary shards and the number of replicas on the entity class, add fields, and restart the project.
[Screenshot: modified entity class]
[6] Next, observe the DSL statements in the output and analyze how the migration works.
1: First, a new index is created, named: original index name_s0
[Screenshot: DSL creating the new index]
2: Then the reindex command migrates the data of the original index to the new index. The DSL statement is as follows.
[Screenshot: reindex DSL]
3: The alias is modified: the old index is removed from the original alias and the newly created index is added, so that data can still be queried and processed normally through the original index alias.
[Screenshot: alias update DSL]
4: After the document migration above is complete, the old index is deleted directly.
[Screenshot: DSL deleting the old index]
5: The execution process above also includes creating the ee-distribute-lock index, which is used internally by the framework and can be ignored. Querying through the query interface afterwards returns the historical data normally, and the index's _settings and _mapping have been updated.
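
The four observed steps can be summarized as an ordered list of REST calls. This is a sketch of the sequence seen in the logs, not Easy-Es source code: the class name, the plan helper, the index name document and the alias name document_alias are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the request order observed in the debug log; not the framework's implementation.
class SmoothMigrationSketch {

    // Hypothetical helper: the REST calls, in order, that move data from oldIndex to newIndex.
    static List<String> plan(String alias, String oldIndex, String newIndex) {
        return Arrays.asList(
            // 1. create the new index with the updated settings and mapping
            "PUT /" + newIndex,
            // 2. copy all documents from the old index to the new one
            "POST /_reindex (source: " + oldIndex + ", dest: " + newIndex + ")",
            // 3. repoint the alias so readers and writers reach the new index
            "POST /_aliases (remove " + oldIndex + ", add " + newIndex + ", alias " + alias + ")",
            // 4. drop the old index once the alias no longer points at it
            "DELETE /" + oldIndex
        );
    }

    public static void main(String[] args) {
        // "document" and "document_alias" are placeholder names for this example.
        for (String call : plan("document_alias", "document", "document_s0")) {
            System.out.println(call);
        }
    }
}
```

The order matters: the old index is deleted only after the alias has been repointed, which is why callers that go through the alias see no interruption.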

Creating, deleting, updating and querying index documents

Insert records

// Insert one record
Integer insert(T entity);

// Insert multiple records in a batch
Integer insertBatch(Collection<T> entityList);

Update records

// Update by ID
Integer updateById(T entity);

// Batch update by ID
Integer updateBatchByIds(Collection<T> entityList);

// Update records by dynamic conditions
Integer update(T entity, LambdaEsUpdateWrapper<T> updateWrapper);

Delete records

// Delete by ID
Integer deleteById(Serializable id);

// Delete records that match the given conditions
Integer delete(LambdaEsQueryWrapper<T> wrapper);

// Delete in a batch by ID
Integer deleteBatchIds(Collection<? extends Serializable> idList);

Exact query on keyword fields

When a field needs exact matching, left fuzzy, right fuzzy or full fuzzy matching, sorting, or aggregation, its index type must be keyword. Otherwise the query does not return the expected result, or even reports an error. For example, the commonly used EE APIs eq(), like() and distinct() all require the field type to be keyword.

    @GetMapping("/search")
    public List<UserInfo> search(String userName) {
        LambdaEsQueryWrapper<UserInfo> wrapper = new LambdaEsQueryWrapper<>();
        wrapper.eq(UserInfo::getUserName, userName);
        return userInfoMapper.selectList(wrapper);
    }


Fuzzy query on keyword fields

        LambdaEsQueryWrapper<UserInfo> wrapper = new LambdaEsQueryWrapper<>();
        wrapper.like(UserInfo::getUserName, userName);


Word segmentation query on text fields

When a field needs a word segmentation query, its type must be text and an analyzer should be specified (if none is specified, the ES default analyzer is used, and the result is usually not ideal). For example, the commonly used EE API match() requires the field type to be text. When a match query does not return the expected result, check the index type first and then the analyzer: if the analyzer does not split out a given word, that word cannot be found.
For Chinese text, a Chinese analyzer must be installed in ES in advance.

    /**
     * Word segmentation (match) test
     */
    @GetMapping("/match")
    public EsPageInfo<UserInfo> match(String word) {
        LambdaEsQueryWrapper<UserInfo> wrapper = new LambdaEsQueryWrapper<>();
        wrapper.match(UserInfo::getContent, word);
        return EsPageInfo.of(userInfoMapper.selectList(wrapper));
    }
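
Why the analyzer decides whether match finds anything can be shown with a toy model. The sketch below is not how Elasticsearch or Easy-Es is implemented: a lowercase, whitespace-splitting function stands in for a real analyzer, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of text vs keyword fields; a simplification for illustration only.
class AnalyzerSketch {

    // Stand-in for an analyzer: lowercases the text and splits it on whitespace.
    static Set<String> analyze(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    // text field: stored value and query are both analyzed; any shared term is a hit.
    static boolean matchOnText(String stored, String query) {
        Set<String> shared = analyze(stored);
        shared.retainAll(analyze(query));
        return !shared.isEmpty();
    }

    // keyword field: the whole value is a single term, so only identical input is a hit.
    static boolean eqOnKeyword(String stored, String query) {
        return stored.equals(query);
    }

    public static void main(String[] args) {
        String content = "Easy-Es simplifies Elasticsearch development";
        System.out.println(matchOnText(content, "elasticsearch")); // true: the term was produced by the analyzer
        System.out.println(matchOnText(content, "elastic"));       // false: no such term was produced
        System.out.println(eqOnKeyword(content, "Elasticsearch")); // false: keyword needs the full value
        System.out.println(eqOnKeyword(content, content));         // true
    }
}
```

The second call is the situation described above: the word was never split out as its own term, so the match query cannot find it.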

Condition constructor

query_string method

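        // Suppose the query condition is: creator equals 老王 and creator matches the token "隔壁" (for example 隔壁老汉, 隔壁老王), or creator contains 猪蹄
        // The corresponding mysql syntax is (creator="老王" and creator like "%隔壁%") or creator like "%猪蹄%"; es's queryString is used below to achieve the same effect
        // This is flexible and suits front-end pages where the query fields and conditions are not fixed and "and/or" is selectable.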
        LambdaEsQueryWrapper<Document> wrapper = new LambdaEsQueryWrapper<>();
        String queryStr = QueryUtils.combine(Link.OR,
                QueryUtils.buildQueryString(Document::getCreator, "老王", Query.EQ, Link.AND),
                QueryUtils.buildQueryString(Document::getCreator, "隔壁", Query.MATCH))
                + QueryUtils.buildQueryString(Document::getCreator, "*猪蹄*", Query.EQ);
        wrapper.queryStringQuery(queryStr);
        List<Document> documents = documentMapper.selectList(wrapper);
        System.out.println(documents);

[Screenshot: corresponding DSL statement and query result]
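
For orientation, the kind of string that an ES query_string query accepts can be assembled by hand. The sketch below uses standard Lucene query syntax (field:value, quoted phrases, wildcards, AND/OR, parentheses). It is not the output of QueryUtils, whose exact format is not shown here; the helper names are hypothetical and ASCII sample values replace the Chinese ones.

```java
// Sketch of Lucene query-string syntax; helper names are hypothetical, not Easy-Es APIs.
class QueryStringSketch {

    // Quoted value: the phrase must appear as given.
    static String exact(String field, String value) {
        return field + ":\"" + value + "\"";
    }

    // Unquoted value: a single term, which may contain * wildcards.
    static String term(String field, String value) {
        return field + ":" + value;
    }

    static String and(String left, String right) {
        return "(" + left + " AND " + right + ")";
    }

    static String or(String left, String right) {
        return "(" + left + " OR " + right + ")";
    }

    public static void main(String[] args) {
        // (creator equals "wang" AND creator matches "neighbor") OR creator contains "trotter"
        String query = or(
                and(exact("creator", "wang"), term("creator", "neighbor")),
                term("creator", "*trotter*"));
        System.out.println(query);
    }
}
```

Explicit parentheses keep the AND/OR grouping unambiguous when conditions are assembled dynamically from a front-end form.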

Paging query

Easy-es supports the three paging modes of ES. Refer to the comparison table and choose according to your needs.
[Table: comparison of the three paging modes (image not available)]

    // Physical paging
    EsPageInfo<T> pageQuery(LambdaEsQueryWrapper<T> wrapper, Integer pageNum, Integer pageSize);

Document link:
https://www.easy-es.cn/pages/0cf11e/#shallow pagination
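
The arithmetic behind shallow (from + size) paging can be sketched as follows. This assumes that pageNum starts at 1 and uses the Elasticsearch default index.max_result_window of 10000. The class and method names are hypothetical, and this is not the Easy-Es implementation of pageQuery.

```java
// Sketch of from/size paging arithmetic; names are illustrative, not Easy-Es APIs.
class ShallowPagingSketch {

    // Elasticsearch default for index.max_result_window.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Offset of the first hit on a 1-based page.
    static int from(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    // Number of pages needed to show 'total' hits, rounding up.
    static long totalPages(long total, int pageSize) {
        return (total + pageSize - 1) / pageSize;
    }

    // from + size may not exceed the result window, or ES rejects the request.
    static boolean withinDefaultWindow(int from, int size) {
        return (long) from + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(from(3, 10));                 // 20
        System.out.println(totalPages(25, 10));          // 3
        System.out.println(withinDefaultWindow(9_990, 10)); // true
        System.out.println(withinDefaultWindow(9_991, 10)); // false
    }
}
```

The window limit is the reason deep result sets are usually read with one of the other paging modes rather than with ever larger offsets.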

Precautions

[1] ElasticSearch 8.X is not yet supported. Currently only 7.X is supported; 6.X is not supported either.
[2] The smooth mode of automatic index hosting is very convenient, but note that it deletes the original index and switches to a new one. Nothing is affected inside the framework, but anything else that relies on the index directly may be affected. If you use this mode, it is recommended to adopt an alias strategy and not access the index directly.
[3] Both spring-data-elasticsearch and easy-es are more concise and easier to work with than the official RestHighLevelClient provided by Es. Because spring-data-elasticsearch is maintained as part of the spring-data project, its community is more active and it is updated more frequently; it currently supports 8.X. easy-es is comparatively more user-friendly and closer to the habits of Chinese developers.
[4] easy-es supports mixed use: when easy-es cannot meet a requirement, the native RestHighLevelClient can be used, which suits practical applications well.
[5] ES query capabilities are extensive, and time did not allow every function to be tested. Judging from the tests of commonly used functions, however, easy-es basically meets the needs of daily development, and it is simple to operate. According to feedback in the issues, easy-es is expected to launch version 2.X this year, with many optimizations and new features to further meet development needs.

References

https://www.easy-es.cn/pages/ec7460/

https://github.com/zwzhangyu/ZyCodeHub/tree/main/middleware/elasticsearch/easy-es

Origin blog.csdn.net/Octopus21/article/details/128988806