Batch operations are more efficient
The entire batch request must be loaded into the memory of the node that receives it, so the larger the request, the less memory is left for other requests. There is an optimal bulk request size: beyond it, performance no longer improves and may even degrade. That optimal size is not a fixed number; it depends entirely on your hardware, the size and complexity of your documents, and the combined indexing and search load.
The sweet spot is still easy to find empirically: index typical documents in batches of increasing size, and when performance starts to drop, your batch size is too large. A good starting point is 1,000 to 5,000 documents per batch, or fewer if your documents are very large.
It is also useful to watch the physical size of each batch request: a thousand 1 KB documents is very different from a thousand 1 MB documents. A good batch is best kept between 5 and 15 MB.
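The two limits above (a document count and a physical size) can be combined into a simple batching helper. The sketch below is illustrative, not part of any Elasticsearch client; the `max_bytes` and `max_docs` defaults are just the starting points suggested above, to be tuned by measurement.

```python
import json


def chunk_bulk(docs, max_bytes=10 * 1024 * 1024, max_docs=1000):
    """Yield batches of documents bounded by both byte size and count.

    max_bytes (~10 MB, inside the 5-15 MB sweet spot) and max_docs (1000)
    are illustrative defaults -- benchmark your own cluster to tune them.
    """
    batch, batch_bytes = [], 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch before it exceeds either limit.
        if batch and (batch_bytes + doc_bytes > max_bytes or len(batch) >= max_docs):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


# 2500 small documents fall under the byte limit, so the count limit
# splits them into batches of 1000, 1000, and 500.
batches = list(chunk_bulk({"id": i} for i in range(2500)))
```

Each batch would then be sent as one `_bulk` request; starting small and raising the limits while throughput still improves finds the sweet spot described above.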
_bulk operation
In Elasticsearch, bulk insert, update, and delete operations are all performed through the _bulk API.
Insert data in bulk
Note that the request body must end with a newline character; in other words, there must be a blank line at the end, or Elasticsearch will reject the request.
{"create":{"_index":"haoke","_type":"user","_id":2001}}
{"id":2001,"name":"name1","age":20,"sex":"male"}
{"create":{"_index":"haoke","_type":"user","_id":2002}}
{"id":2002,"name":"name2","age":20,"sex":"male"}
{"create":{"_index":"haoke","_type":"user","_id":2003}}
{"id":2003,"name":"name3","age":20,"sex":"male"}
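A body like the one above can be assembled programmatically: each document contributes an action line followed by a source line, and the whole body must end with a newline. The sketch below builds that NDJSON string; the `bulk_create_body` helper and the sample documents are illustrative, not part of any official client (note also that `_type` is deprecated in recent Elasticsearch versions).

```python
import json


def bulk_create_body(index, doc_type, docs):
    """Build an NDJSON _bulk body: one "create" action line plus one
    source line per document, terminated by a trailing newline
    (required, or Elasticsearch rejects the request)."""
    lines = []
    for doc in docs:
        lines.append(json.dumps(
            {"create": {"_index": index, "_type": doc_type, "_id": doc["id"]}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"


docs = [
    {"id": 2001, "name": "name1", "age": 20, "sex": "male"},
    {"id": 2002, "name": "name2", "age": 20, "sex": "male"},
]
body = bulk_create_body("haoke", "user", docs)
# The body would then be POSTed to the cluster's /_bulk endpoint with
# Content-Type: application/x-ndjson (endpoint URL depends on your setup).
```

Sending the batch in one request instead of one request per document is exactly where the efficiency gain discussed earlier comes from.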
Batch deletion
{"delete":{"_index":"haoke","_type":"user","_id":2001}}
{"delete":{"_index":"haoke","_type":"user","_id":2002}}
{"delete":{"_index":"haoke","_type":"user","_id":2003}}
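Delete actions differ from creates in that they carry no source line: the body is just one action line per document id, again with the mandatory trailing newline. A minimal sketch, assuming the same hypothetical helper style as above:

```python
import json


def bulk_delete_body(index, doc_type, ids):
    """Build an NDJSON _bulk body of delete actions. Unlike "create",
    a "delete" action has no source line, so each id contributes
    exactly one line; the trailing newline is still required."""
    lines = [
        json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": i}})
        for i in ids
    ]
    return "\n".join(lines) + "\n"


body = bulk_delete_body("haoke", "user", [2001, 2002, 2003])
```

The resulting string matches the three-line body shown above and would be POSTed to `_bulk` the same way as the create batch.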