Explanation
By default, ElasticSearch sets the index refresh_interval to 1 second, which means newly written data may take up to 1 second before it can be searched.
Every index refresh produces a new Lucene segment, so frequent refreshes lead to frequent segment merges, which drive up CPU and I/O usage on the system.
If your product does not need near-real-time search, you can lengthen the refresh interval, for example: index.refresh_interval: 120s.
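The effect of the refresh interval on search visibility can be illustrated with a little arithmetic. The sketch below is a simplified model, not ElasticSearch code: it assumes refreshes fire at exact multiples of the interval, and the names RefreshDelayModel, visibleAt and delay are made up for this illustration.

```java
// Toy model (not the ElasticSearch API): assumes refreshes fire at exact
// multiples of the refresh interval, measured from time 0.
class RefreshDelayModel {

    // Time (ms) of the first refresh at or after the write; in this model
    // the document becomes searchable at that moment.
    static long visibleAt(long writeTimeMillis, long refreshIntervalMillis) {
        long ticks = (writeTimeMillis + refreshIntervalMillis - 1) / refreshIntervalMillis;
        return ticks * refreshIntervalMillis;
    }

    // How long a write at the given time stays invisible to search.
    static long delay(long writeTimeMillis, long refreshIntervalMillis) {
        return visibleAt(writeTimeMillis, refreshIntervalMillis) - writeTimeMillis;
    }

    public static void main(String[] args) {
        // Default 1s interval: a write at t=250ms waits 750ms.
        System.out.println(delay(250, 1_000));
        // 120s interval: the same write waits 119750ms.
        System.out.println(delay(250, 120_000));
    }
}
```

Under these assumptions the worst-case wait is roughly one full interval, which is why a longer interval saves resources but makes freshly written data slower to appear in search results.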
However, this behavior is troublesome for functional testing:
- Because real-time visibility is not guaranteed, every test has to sleep for a while after inserting test data before it can run its checks.
- Because real-time visibility is not guaranteed, even a test case that relies on sleep may occasionally fail.
To solve these problems, we need a strategy that refreshes the index immediately after data is added, deleted, or updated in ElasticSearch.
Version
ElasticSearch 5.1.1
Source
The org.elasticsearch.action.support.WriteRequestBuilder#setRefreshPolicy interface method is as follows:
/**
* Should this request trigger a refresh ({@linkplain RefreshPolicy#IMMEDIATE}), wait for a refresh (
* {@linkplain RefreshPolicy#WAIT_UNTIL}), or proceed ignore refreshes entirely ({@linkplain RefreshPolicy#NONE}, the default).
*/
@SuppressWarnings("unchecked")
default B setRefreshPolicy(RefreshPolicy refreshPolicy) {
request().setRefreshPolicy(refreshPolicy);
return (B) this;
}
The enumeration org.elasticsearch.action.support.WriteRequest.RefreshPolicy defines three strategies:
/**
* Don't refresh after this request. The default.
*/
NONE,
/**
* Force a refresh as part of this request. This refresh policy does not scale for high indexing or search throughput but is useful
* to present a consistent view to for indices with very low traffic. And it is wonderful for tests!
*/
IMMEDIATE,
/**
* Leave this request open until a refresh has made the contents of this request visible to search. This refresh policy is
* compatible with high indexing and search throughput but it causes the request to wait to reply until a refresh occurs.
*/
WAIT_UNTIL;
To summarize the three refresh policies:
RefreshPolicy#IMMEDIATE:
- The request forces a refresh, so the submitted data is searchable by the time the request returns.
- Advantages: changes are visible immediately, and the request returns quickly.
- Disadvantages: high resource consumption.
RefreshPolicy#WAIT_UNTIL:
- The request does not return until a refresh has made the submitted data visible to search.
- Advantages: changes are visible when the request returns, with low resource consumption.
- Disadvantages: high request latency.
RefreshPolicy#NONE:
- The default policy.
- The request returns as soon as the data has been submitted, regardless of whether a refresh has completed.
- Advantages: low request latency and low resource consumption.
- Disadvantages: changes are not immediately visible to search.
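The difference between the three policies can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions, not how ElasticSearch is implemented: ToyIndex, its buffer and searchable lists, and the explicit refresh() call (standing in for the periodic refresh) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Toy in-memory model of the three policies; NOT the ElasticSearch API.
class ToyIndex {
    enum Policy { NONE, IMMEDIATE, WAIT_UNTIL }

    private final List<String> buffer = new ArrayList<>();      // written, not yet searchable
    private final List<String> searchable = new ArrayList<>();  // visible to search
    private final List<CompletableFuture<Void>> waiters = new ArrayList<>();
    int refreshCount = 0;

    // Returns a future that completes when the write request "returns".
    CompletableFuture<Void> index(String doc, Policy policy) {
        buffer.add(doc);
        switch (policy) {
            case IMMEDIATE:
                refresh();                       // pay for a refresh right now
                return CompletableFuture.completedFuture(null);
            case WAIT_UNTIL:
                CompletableFuture<Void> pending = new CompletableFuture<>();
                waiters.add(pending);            // reply only after the next refresh
                return pending;
            default:                             // NONE: reply at once, doc still hidden
                return CompletableFuture.completedFuture(null);
        }
    }

    // Stands in for the periodic refresh driven by index.refresh_interval.
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
        refreshCount++;
        List<CompletableFuture<Void>> ready = new ArrayList<>(waiters);
        waiters.clear();
        ready.forEach(f -> f.complete(null));
    }

    boolean search(String doc) {
        return searchable.contains(doc);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        // NONE: the request is done, but the document is not searchable yet.
        System.out.println(idx.index("a", Policy.NONE).isDone() + " " + idx.search("a"));
        // WAIT_UNTIL: the request stays open until the next refresh.
        CompletableFuture<Void> waiting = idx.index("b", Policy.WAIT_UNTIL);
        System.out.println(waiting.isDone());
        idx.refresh();
        System.out.println(waiting.isDone() + " " + idx.search("b"));
        // IMMEDIATE: done and searchable at once, at the cost of an extra refresh.
        System.out.println(idx.index("c", Policy.IMMEDIATE).isDone() + " " + idx.search("c"));
    }
}
```

In this model only IMMEDIATE increases the number of refreshes, which mirrors why it is the expensive option, while WAIT_UNTIL trades request latency for piggybacking on a refresh that would happen anyway.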
The main classes that implement this interface are:
DeleteRequestBuilder
IndexRequestBuilder
UpdateRequestBuilder
BulkRequestBuilder
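All of these builders can share one default setRefreshPolicy method because the method returns the type parameter B, so a chained call keeps the concrete builder type. The following is a minimal sketch of that self-typed builder pattern in plain Java; ToyWriteBuilder, ToyIndexBuilder and storePolicy are hypothetical names, and the policy is a plain String to keep the example self-contained.

```java
// Toy version of the self-typed builder pattern; all names are hypothetical
// and none of this is ElasticSearch code.
interface ToyWriteBuilder<B extends ToyWriteBuilder<B>> {
    void storePolicy(String policy);

    @SuppressWarnings("unchecked")
    default B setRefreshPolicy(String policy) {
        storePolicy(policy);
        return (B) this;   // safe as long as each builder passes itself as B
    }
}

class ToyIndexBuilder implements ToyWriteBuilder<ToyIndexBuilder> {
    String policy = "NONE";
    String source;

    @Override
    public void storePolicy(String policy) {
        this.policy = policy;
    }

    ToyIndexBuilder setSource(String source) {
        this.source = source;
        return this;
    }
}

class ToyBuilderDemo {
    public static void main(String[] args) {
        // setRefreshPolicy returns ToyIndexBuilder, so setSource can still be chained after it.
        ToyIndexBuilder b = new ToyIndexBuilder()
                .setRefreshPolicy("IMMEDIATE")
                .setSource("{\"age\":1}");
        System.out.println(b.policy + " " + b.source);
    }
}
```

Without the type parameter, setRefreshPolicy would return the interface type and builder-specific methods such as setSource could no longer be chained after it.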
Examples
/**
* Sample code for refreshing immediately in ElasticSearch
*/
@Test
public void refreshImmediatelyTest() {
// Delete operation
client.prepareDelete("index", "type", "1").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
// Index operation
client.prepareIndex("index", "type", "2").setSource("{\"age\":1}").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
// Update operation
client.prepareUpdate("index", "type", "3").setDoc("{\"age\":1}").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
// Bulk operation
client.prepareBulk()
.add(client.prepareDelete("index", "type", "1"))
.add(client.prepareIndex("index", "type", "2").setSource("{\"age\":1}"))
.add(client.prepareUpdate("index", "type", "3").setDoc("{\"age\":1}"))
.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
}