Elasticsearch: Have you stepped on these pits?

1. Introduction

This article lists some pitfalls that many people run into when using Elasticsearch, for your reference, discussion, and additions.


2. Pit 1: Is ES near-real-time?

To verify whether this pit is real, you can test it yourself:
write or update a document in ES, and as soon as the write returns success, immediately query ES to check whether the data returned is the latest.
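If you want to run this check yourself, here is a minimal sketch using curl (the local address localhost:9200 and the index name test_index are assumptions for illustration):

# 1. Write a document; ES acknowledges the write as soon as it is accepted
curl -X PUT "localhost:9200/test_index/_doc/1" -H 'Content-Type: application/json' -d'
{ "title": "hello es" }'

# 2. Search for it immediately; within the refresh interval (1s by default)
#    this search may not return the document yet
curl -X GET "localhost:9200/test_index/_search" -H 'Content-Type: application/json' -d'
{ "query": { "match": { "title": "hello" } } }'

# Note: GET /test_index/_doc/1 (a get by id) is real-time because it can read
# from the translog; use _search to observe the near-real-time behavior.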

Think about it: if the data you query back is the latest, then this is not really a pit and we can move on; but if it is not the latest, what is the reason behind it?

If you haven't run this verification yet, that's fine; let's walk through the entire process of indexing data into ES, and you may find the clue there.


|| The entire process of data indexing
The entire process of data indexing touches on ES shards and the relationship between a Lucene Index, Segments, and Documents.

The relationship between a Lucene Index, Segments, and Documents is shown in the figure below:
[Figure: relationship between Lucene Index, Segment, and Document]
They relate to each other as follows:

  • A Lucene Index can contain multiple Segments;
  • A Segment can contain multiple Documents.


An ES shard is a Lucene Index, and each Lucene Index is made up of multiple Segments; in other words, a Segment is a subset of a Lucene Index:
[Figure: an ES shard is a Lucene Index composed of multiple Segments]


|| Detailed explanation of data indexing process

  1. When a new Document is created, it is written into a new Segment. When a Document is updated, the old version is only marked as deleted in its original Segment, and the new version is written into a new Segment.

  2. When a Shard receives a write request, the request is first recorded in the Translog, and then the Document is placed in the in-memory buffer (note: data in the memory buffer cannot be searched yet). The Translog thus keeps a record of every modification.
    [Figure: a write is recorded in the Translog and buffered in memory]

  3. Every 1 second (the default refresh interval), a refresh is executed: the data in the memory buffer is written into a new Segment that lives in the filesystem cache, and from this moment the new data becomes searchable.
    [Figure: refresh writes the memory buffer into a new Segment in the filesystem cache]
    From the indexing process described above we can draw a conclusion: ES is not real-time; by default there is a delay of up to 1 second before newly written data becomes searchable.
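That 1 second is simply the index's default refresh_interval, which can be tuned per index. A hedged sketch using curl (the index name my_index is an assumption for illustration):

# Lengthen the refresh interval to 30s to reduce refresh overhead during heavy indexing;
# new documents then take up to 30s to become searchable
curl -X PUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{ "index": { "refresh_interval": "30s" } }'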

You may ask: how should we deal with this delay in a real application?
The simplest approach: tell the user that the queried data may lag slightly behind and that retrying the query a moment later is enough.
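If a particular write really must be searchable as soon as the call returns, Elasticsearch also lets the writer wait for (or force) a refresh, at the cost of write throughput. A minimal sketch, again with hypothetical index and document names:

# Option A: block the index request until a refresh has made this document searchable
curl -X PUT "localhost:9200/test_index/_doc/2?refresh=wait_for" -H 'Content-Type: application/json' -d'
{ "title": "searchable once this call returns" }'

# Option B: force an immediate refresh of the whole index (expensive, avoid in hot paths)
curl -X POST "localhost:9200/test_index/_refresh"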


3. Pit 2: After ES crashes and recovers, data is lost

In Pit 1 above we mentioned that every 1 second (by default) the data in the memory buffer is written into a Segment and becomes searchable. However, that data has not been persisted yet, so if the system goes down, it will be lost.

[Figure: refreshed Segments (gray) in the filesystem cache are searchable but not yet persisted]
The data in the gray buckets in the figure above can be searched, but it has not been persisted; once ES goes down, this part of the data is lost.


So how do we keep this data from being lost?
The first part of the answer is Lucene's commit operation.

What commit does: it merges the Segments and persists them to disk, turning the gray buckets in the figure into green (persisted) ones.
Drawback of commit: it is expensive in I/O, so it cannot run on every write, and ES can still go down between two commits. If the system goes down before the Translog has been fsynced, that data is lost outright.
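In Elasticsearch this Lucene commit surfaces as the flush operation: it fsyncs the segments sitting in the filesystem cache to disk and trims the Translog entries they covered. ES triggers flushes automatically, but one can also be requested by hand; a minimal sketch (index name is hypothetical):

# Trigger a flush (a Lucene commit) manually
curl -X POST "localhost:9200/my_index/_flush"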

This leads to a new question: how do we guarantee the integrity of the data?
The Translog is used to solve it. Note that Translog writes are not durable on disk immediately either; they only survive a crash once they have been fsynced. There are two ways to configure the Translog:

  • The first option: set index.translog.durability to request (the default), so the Translog is fsynced on every request. If performance is acceptable with this setting, keep it.
  • The second option: set index.translog.durability to async and accept that the last few seconds of writes may be lost in a crash; after each restart following a crash, compare the primary data store with the data in ES and backfill whatever ES is missing.

One knowledge point worth emphasizing here: when is the Translog fsynced?

When index.translog.durability is set to request, every request is fsynced, which hurts ES write performance. In that case we can set index.translog.durability to async, and the Translog is then fsynced in the background once every index.translog.sync_interval instead of on every request.
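Both behaviors map to per-index settings. A hedged sketch (index names are hypothetical; index.translog.sync_interval defaults to 5s):

# Default: fsync the translog on every request (safest, slower writes)
curl -X PUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{ "index.translog.durability": "request" }'

# Looser: fsync the translog in the background every sync_interval; up to that much
# acknowledged data can be lost in a crash, so reconcile against the primary store afterwards.
# (sync_interval is set here at index creation time.)
curl -X PUT "localhost:9200/my_async_index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.translog.durability": "async",
    "index.translog.sync_interval": "5s"
  }
}'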


4. Pit 3: The deeper the paging, the slower the query

The ES paging pit is closely tied to how ES processes a read request, so it is worth analyzing that processing flow in depth, as shown in the figure below:
[Figure: ES read request processing flow - Query Phase and Fetch Phase]

The read operation process of ES is mainly divided into two phases: Query Phase and Fetch Phase.


|| Query Phase
In the Query Phase, the coordinating node first distributes the request to all shards. Each shard then runs the query locally and builds a result queue holding the Document ids and their sort values/scores, and returns that queue to the coordinating node. Finally, the coordinating node builds a global queue, merges all the result sets it received, and performs a global sort.

One thing to emphasize about the Query Phase: if the search carries from and size parameters, the cluster has to return shards_number * (from + size) entries to the coordinating node, sort them on that single node, and finally return only size documents to the client. For example, if the client requests 10 documents from an index with 3 shards, each shard returns 10 entries, the coordinating node merges 30 of them, and only 10 are returned to the client in the end.


|| Fetch Phase
In the Fetch Phase, the coordinating node fetches the complete Documents from the shards according to the Document ids collected in the Query Phase; the shards return the complete Documents, and the coordinating node finally returns the result to the client.

So across the whole read path, the cluster really does ship shards_number * (from + size) entries to the coordinating node, sorts them on that single node, and finally returns only size documents to the client.

For example, suppose there are 5 shards and we want the results ranked 10000 to 10010 (from=10000, size=10). How many entries does each shard return to the coordinating node? Not 10, but 10010. In other words, the coordinating node has to sort 10010 * 5 = 50050 entries in memory, which is why the deeper a user pages, the slower the query becomes.
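As a concrete illustration of this arithmetic, such a deep from/size request would look like the sketch below (the index name orders is hypothetical; total_amount is the same sort field used in the example later in this section). Each of the 5 shards would have to hand back 10010 candidates for it:

# Ask for results ranked 10000..10009; every shard returns its top 10010 entries
# and the coordinating node sorts all of them in memory.
# Note: with the default index.max_result_window of 10000 this request is rejected (see below).
curl -X GET "localhost:9200/orders/_search" -H 'Content-Type: application/json' -d'
{
  "from": 10000,
  "size": 10,
  "sort": [ { "total_amount": "asc" } ]
}'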


So how can we handle ES paging better?

To keep performance under control, ES provides the index.max_result_window setting, which defaults to 10000. When from + size > max_result_window, ES returns an error.
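max_result_window is a dynamic per-index setting, so it can be raised, but that only postpones the memory and latency cost described above; usually it is better to keep the default and steer deep pagination toward search_after. A hedged sketch:

# Raise the window for one index (use sparingly: deep from/size pages still cost
# shards_number * (from + size) entries on the coordinating node)
curl -X PUT "localhost:9200/orders/_settings" -H 'Content-Type: application/json' -d'
{ "index": { "max_result_window": 20000 } }'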

So when designing a system, we generally prevent users from paging too deeply, which is acceptable to users in most real scenarios; this is also the approach I took in a previous project. If users really do need deep pagination, ES's search_after feature can be used instead, although it cannot jump to an arbitrary page.

Suppose the query pages orders by total amount, and the total amount of the last order on the previous page is 10. The sample query for the next page is shown below (the search_after value is the sort-field value of the last result from the previous query):

{
    "query":{
        "bool":{
            "must":[
                {
                    "term":{
                        "user.user_name.keyword":"李大侠"
                    }
                }
            ],
            "must_not":[],
            "should":[]
        }
    },
    "from":0,
    "size":2,
    "search_after":[
        "10"
    ],
    "sort":[
        {
            "total_amount":"asc"
        }
    ],
    "aggs":{}
}
