Table of Contents
Considerations for deep paging in cluster systems
How to use paging
Just as SQL uses the LIMIT keyword to return only one page of results, Elasticsearch accepts from and size parameters:
size: the number of results to return, default 10
from: the number of initial results to skip, default 0
Examples
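As a minimal sketch of how from and size fit together, the following Python snippet turns a 1-based page number into the two parameters and builds a search URL from them. The helper names `page_params` and `search_url` and the host `http://localhost:9200` are assumptions made for illustration; nothing is sent over the network.

```python
from urllib.parse import urlencode


def page_params(page: int, size: int = 10) -> dict:
    """Translate a 1-based page number into from/size parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # Skip all the results that belong to the earlier pages.
    return {"size": size, "from": (page - 1) * size}


def search_url(base_url: str, page: int, size: int = 10) -> str:
    """Build a search URL for the given page (base_url is a placeholder)."""
    return f"{base_url}/_search?{urlencode(page_params(page, size))}"


if __name__ == "__main__":
    # Pages 1 to 3 with five results per page.
    for page in (1, 2, 3):
        print(search_url("http://localhost:9200", page, size=5))
```

With five results per page, the first three pages correspond to from values of 0, 5 and 10.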
Considerations for deep paging in cluster systems
Analysis
To understand why deep paging is problematic, let us assume that we are searching an index with 5 primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the requesting node, which then sorts all 50 results to select the overall top 10. Now suppose we request page 1,001 (results 10,001 to 10,010). Everything works in the same way, except that each shard must now produce its top 10,010 results. The requesting node then sorts all 50,050 results and discards 50,040 of them!
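The arithmetic above can be checked with a short Python sketch. It only models the counting described in the paragraph (each shard returns from + size results, and the requesting node keeps size of them); the function name `deep_paging_cost` is an assumption, and the sketch does not talk to a real cluster.

```python
def deep_paging_cost(shards: int, from_: int, size: int) -> dict:
    """Count the results handled for one from/size request across all shards."""
    # Each shard cannot know which of its hits make the global page,
    # so it must produce its own top (from + size) results.
    per_shard = from_ + size
    # The requesting node gathers and sorts everything the shards return.
    gathered = shards * per_shard
    # Only `size` results are sent back to the client.
    return {
        "per_shard": per_shard,
        "gathered": gathered,
        "discarded": gathered - size,
    }


if __name__ == "__main__":
    print(deep_paging_cost(shards=5, from_=0, size=10))       # first page
    print(deep_paging_cost(shards=5, from_=10000, size=10))   # results 10,001 to 10,010
```

For 5 shards, the first page gathers 50 results and discards 40, while results 10,001 to 10,010 gather 50,050 and discard 50,040.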
Processing method
As this shows, in a distributed system the cost of sorting results grows with the depth of paging: each shard must produce from + size results, and the requesting node must sort number_of_shards × (from + size) of them. This is why web search engines do not return more than 1,000 results for any query.
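One way to apply this in application code is to refuse requests that page too deep before they reach the cluster. The sketch below is a hypothetical guard, not part of Elasticsearch: the name `checked_page` and the cap of 1,000 results (taken from the web search engine figure above) are assumptions.

```python
MAX_RESULT_DEPTH = 1000  # hypothetical cap, mirroring the 1,000-result figure above


def checked_page(from_: int, size: int, max_depth: int = MAX_RESULT_DEPTH) -> dict:
    """Return from/size parameters, rejecting requests that page too deep."""
    if from_ < 0 or size < 1:
        raise ValueError("from must be >= 0 and size must be >= 1")
    # from + size is the number of results every shard would have to produce.
    if from_ + size > max_depth:
        raise ValueError(
            f"from + size = {from_ + size} exceeds the cap of {max_depth} results"
        )
    return {"from": from_, "size": size}


if __name__ == "__main__":
    print(checked_page(990, 10))      # last page allowed under the cap
    try:
        checked_page(1000, 10)        # one page too deep
    except ValueError as err:
        print("rejected:", err)
```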