Technology from major Internet companies - elasticsearch (es) - improving query efficiency when the data volume is large (billions of records)

Table of contents

1. Problem analysis

2. There is no silver bullet

3. The trump card for performance optimization (filesystem cache)

4. Data warm-up

5. Separation of hot and cold

6. Document model design

7. Paging performance optimization

8. Solution


1. Problem analysis

This question gets asked a lot, and frankly, it reveals whether you have actually used es in production. Why? Because es performance is not as good as you might think. When the data volume is large, especially at hundreds of millions of records, you may be dismayed to find that a search takes 5 to 10 seconds to run. The first search takes 5 to 10 seconds; later ones get faster, perhaps just a few hundred milliseconds.

That is confusing: every user's first access is slow, so does the system feel laggy? If you have never used es, or have only played with a demo yourself, this question will easily trip you up and expose that you don't really know es.

2. There is no silver bullet

To be honest, there is no silver bullet for es performance optimization. What does that mean? Don't expect that tweaking a single parameter will fix every slow-query scenario. In some scenarios you can change a parameter or adjust the query syntax, but certainly not in all of them.

3. The trump card for performance optimization (filesystem cache)

The data you write to es is ultimately written to disk files. When you query, the operating system automatically caches the contents of those disk files in the filesystem cache.

(Figure: es search process)

The es search engine relies heavily on the underlying filesystem cache. If you give the filesystem cache more memory, ideally enough to hold all of the index segment files, then your searches will be served almost entirely from memory, and performance will be very high.

How big can the performance gap be? In many of our earlier tests and stress tests, searches that hit the disk usually took on the order of seconds: 1 second, 5 seconds, even 10 seconds. Searches served from the filesystem cache, i.e. from pure memory, are generally an order of magnitude faster, at the millisecond level: anywhere from a few milliseconds to a few hundred milliseconds.

Here is a real case. A company's es cluster had 3 machines, each with what sounds like plenty of memory: 64G, for 64 * 3 = 192G in total. Each machine gave 32G to the es JVM heap, leaving 32G per machine for the filesystem cache, or 32 * 3 = 96G across the cluster. Meanwhile the index files across the three machines occupied 1T of disk in total, so each machine held roughly 300G of data. How does this perform? The filesystem cache has only about 100G; just one tenth of the data fits in memory, and the rest lives on disk, so most search operations hit the disk, and performance is bound to be poor.
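A back-of-envelope sketch of the sizing arithmetic above, mirroring the 3-node / 64G / 1T example (the function name and numbers are illustrative):

```python
def cache_coverage(nodes: int, ram_per_node_gb: int, jvm_heap_gb: int,
                   total_index_gb: int) -> float:
    """Fraction of the on-disk index that the OS filesystem cache can hold."""
    # memory left for the filesystem cache = total RAM minus the es JVM heap
    fs_cache_gb = nodes * (ram_per_node_gb - jvm_heap_gb)
    return fs_cache_gb / total_index_gb

coverage = cache_coverage(nodes=3, ram_per_node_gb=64, jvm_heap_gb=32,
                          total_index_gb=1024)
print(f"{coverage:.0%}")  # roughly one tenth of the index fits in memory
```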

In the final analysis, if you want es to perform well, in the best case your machines' memory can hold at least half of your total data volume.

In our own production experience, the best case is to store only a small amount of data in es: just the index you actually search on. If the memory left for the filesystem cache is 100G, then keep the index data within 100G. That way almost every search is served from memory, and performance is very high, generally under 1 second.

For example, suppose a row has 30 fields: id, name, age, and so on, but your searches only filter on the three fields id, name, and age. If you naively write all 30 fields into es, then 90% of the data is never used for searching yet still occupies filesystem cache space on the es machines, and the larger each document is, the less data the filesystem cache can hold. Instead, write into es only the few fields you retrieve by, say id, name, and age, and store the other fields in mysql/hbase. We generally recommend an architecture like es + hbase.

hbase is suited to online storage of massive data: you can write huge volumes into it, but you should not run complex searches against it, only simple lookups by id or by range. So you search by name and age in es, which might return, say, 20 doc ids; then for each doc id you fetch the complete row from hbase, assemble the results, and return them to the front end.

The data written to es should ideally be less than, or only slightly larger than, the filesystem cache capacity of the es machines. Then a retrieval from es might take 20ms, and fetching the 20 full rows from hbase by the ids es returned might take another 30ms. Compare that with the old approach of putting the full 1T into es, where every query took 5~10s; now each query might complete in about 50ms.
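A minimal sketch of the es + hbase split described above: the slim index holds only the searchable fields, and a wide store (here a plain dict standing in for hbase) holds the full rows. All names and data are illustrative:

```python
# "es": slim documents containing only the fields we actually search on
es_index = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
]

# "hbase": full wide rows keyed by id (extra fields elided as placeholders)
wide_store = {
    1: {"id": 1, "name": "alice", "age": 30, "address": "...", "phone": "..."},
    2: {"id": 2, "name": "bob", "age": 25, "address": "...", "phone": "..."},
}

def search(min_age: int) -> list:
    # step 1: search the slim index, collecting doc ids only
    ids = [doc["id"] for doc in es_index if doc["age"] >= min_age]
    # step 2: fetch the complete rows by id from the wide store
    return [wide_store[i] for i in ids]

print(search(28))  # only alice's full row comes back
```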

4. Data warm-up

Suppose that even after following the plan above, each machine in the es cluster still holds more data than its filesystem cache, say double: you write 60G of data to a machine whose filesystem cache is 30G, leaving 30G of data on disk.

In fact, you can do data preheating.

Take Weibo as an example: for the data of big V accounts that many people view, you can build a backend system that periodically searches for this hot data and pulls it into the filesystem cache. When users later view that hot data, it is served straight from memory, which is very fast.

Or for e-commerce: for the most-viewed hot products, such as the iPhone 8, you can build a background program that actively queries them every minute so they stay flushed into the filesystem cache.

For data you know is hot and frequently accessed, it is best to build a dedicated cache-preheating subsystem that touches the hot data at regular intervals so it stays in the filesystem cache. The next time someone accesses it, performance will be much better.
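The warm-up loop above can be sketched as follows. This is a stub, not a real client: `run_query` stands in for an es search call, and the hot terms and interval are illustrative assumptions.

```python
import time

HOT_QUERIES = ["iphone 8", "big-v-user-timeline"]  # illustrative hot terms

def run_query(q: str) -> None:
    # stand-in for a real es search; its only job here is to touch the hot data
    # so the segments it lives in get pulled into the filesystem cache
    print(f"warming: {q}")

def warm_up(rounds: int, interval_s: float = 0.0) -> int:
    """Replay every hot query `rounds` times; returns how many queries fired."""
    fired = 0
    for _ in range(rounds):
        for q in HOT_QUERIES:
            run_query(q)
            fired += 1
        time.sleep(interval_s)  # e.g. 60s between rounds in a real warm-up job
    return fired

warm_up(rounds=1)
```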

5. Separation of hot and cold

es supports something like mysql's horizontal splitting: write the data that is rarely accessed, the very-low-frequency cold data, into one separate index, and write the frequently accessed hot data into another. This helps ensure that once the hot data has been warmed up, it stays in the filesystem os cache and is not evicted by cold data.

Suppose you have 6 machines and 2 indexes, one for cold data and one for hot data, each with 3 shards: 3 machines host the hot-data index and the other 3 host the cold-data index. Most of the time you are hitting the hot-data index, and hot data might be only 10% of the total volume, so it is small enough to stay almost entirely in the filesystem cache, which keeps hot-data access very fast. Cold data lives in the other index, on different machines, with no interaction between the two. When someone does access cold data, much of it will be on disk and performance will be poor, but that doesn't matter much if only 10% of accesses hit cold data and 90% hit hot data.
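One simple way to implement the split is to choose the target index at write time based on how recently a document was active. A minimal sketch; the index names and the 7-day threshold are illustrative choices:

```python
from datetime import datetime, timedelta, timezone

HOT_INDEX, COLD_INDEX = "orders_hot", "orders_cold"

def route_index(last_active: datetime, now: datetime) -> str:
    """Pick the index so cold data never competes with hot data for cache."""
    return HOT_INDEX if now - last_active <= timedelta(days=7) else COLD_INDEX

now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(route_index(datetime(2024, 1, 30, tzinfo=timezone.utc), now))  # orders_hot
print(route_index(datetime(2023, 6, 1, tzinfo=timezone.utc), now))   # orders_cold
```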

6. Document model design

With MySQL we often run complex join queries. What about es? Try not to use complex joins in es; once you do, performance is generally not good.

It is best to perform the association in the Java system first and write the already-joined data into es. At search time, you then don't need es's search syntax to perform join-style associated searches.

Document model design is very important. Don't plan to perform all kinds of complex operations only at search time. es supports only so many operations; don't try to use es for things it does not handle well. If you need such an operation, try to complete it when the document model is designed and written. In addition, avoid operations that are too complex, such as join/nested/parent-child searches, wherever possible; their performance is poor.
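The denormalize-at-write-time idea above can be sketched like this: instead of storing an order with only a `user_id` and joining at query time, the application embeds the user fields you search on into the order document before indexing. Field and function names are illustrative:

```python
user = {"id": 7, "name": "alice", "city": "Shanghai"}
order = {"id": 101, "user_id": 7, "amount": 99.0}

def denormalize(order: dict, user: dict) -> dict:
    # the "join" happens exactly once, in the application, before indexing;
    # at search time the es document already contains everything needed
    doc = dict(order)
    doc["user_name"] = user["name"]
    doc["user_city"] = user["city"]
    return doc

es_doc = denormalize(order, user)
print(es_doc["user_city"])  # searchable directly, no join at query time
```

The cost is write amplification (if the user's city changes you must reindex their orders), which is usually a good trade when reads dominate.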

7. Paging performance optimization

Paging in es is quite tricky. Why? Say you have 10 items per page and you want to query page 100. Each shard actually sends its top 1,000 matching documents to a coordinating node; with 5 shards, that is 5,000 documents, which the coordinating node then merges and processes to produce the final 10 documents for page 100.

Because the data is distributed, to get the 10 documents on page 100 you cannot just fetch 2 documents from each of 5 shards and merge them at the coordinating node, right? Each shard must return its top 1,000 documents, which are then sorted and filtered as needed and paginated again to produce page 100. The deeper you page, the more data each shard returns and the longer the coordinating node takes to process it, which is painful. That is why, when paging with es, you find it gets slower the further back you go.

We ran into this before. Paging with es took tens of milliseconds for the first few pages, but by page 10 or after several dozen pages, retrieving a single page basically took 5 to 10 seconds.

8. Solution

Do not allow deep paging (default deep paging performance is poor)

Tell the product manager that the system does not allow paging that deep; by default, the deeper you page, the worse the performance.

Similar to the recommended products in the app, which are constantly pulled down page by page.

Similar to Weibo: you scroll down to load page after page of posts. You can use the scroll API for this; search online for usage details.

scroll generates a snapshot of all matching data up front; each time you scroll back a page, it moves a cursor (scroll_id) to fetch the next page. Its performance is much higher than the paging described above, basically at the millisecond level.

The one caveat is that this suits Weibo-style continuous scrolling, not jumping to arbitrary pages. In other words, you cannot go to page 10, then page 120, then back to page 58; random jumps are not possible. So many products simply don't let you jump to arbitrary pages; apps and some websites only let you scroll down, page by page.

The scroll parameter must be specified when initializing the search, telling es how long to keep the context of this search alive. Make sure users don't keep scrolling for hours, or the context may expire and the request fail with a timeout.
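A sketch of the scroll flow above, shown as the JSON bodies you would send (no live cluster here; in practice you would POST these with an es client or curl, e.g. `POST /my_index/_search?scroll=1m` for the first call and `POST /_search/scroll` afterwards). The `1m` keep-alive, index name, and scroll id are illustrative:

```python
import json

# first request: open the scroll context (sent with ?scroll=1m on the URL)
first = {"size": 100, "query": {"match_all": {}}, "sort": ["_doc"]}

def next_page(scroll_id: str) -> dict:
    # follow-up body: the keep-alive plus the id returned by the last response
    return {"scroll": "1m", "scroll_id": scroll_id}

print(json.dumps(next_page("scroll-id-from-previous-response")))
```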

Besides the scroll API, you can also use search_after. The idea of search_after is to use the results of the previous page to help retrieve the next page. Obviously this also forbids jumping to arbitrary pages; you can only page forward one page at a time. When initializing, you need a field with unique values as the sort field.
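A sketch of how search_after paging is wired: the sort values of the last hit on the current page seed the query for the next page, and the sort includes a unique tiebreaker field. The field names, query, and sample sort values are illustrative:

```python
def page_query(size: int, last_sort_values=None) -> dict:
    """Build the request body for one page of a search_after scan."""
    q = {
        "size": size,
        "query": {"match": {"name": "alice"}},
        "sort": [{"timestamp": "desc"}, {"id": "asc"}],  # id = unique tiebreaker
    }
    if last_sort_values is not None:
        # sort values copied from the last hit of the previous page
        q["search_after"] = last_sort_values
    return q

first = page_query(10)
second = page_query(10, last_sort_values=[1700000000, "doc-42"])
print("search_after" in first, "search_after" in second)  # False True
```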


Origin blog.csdn.net/philip502/article/details/131871959