Elasticsearch study notes - Delete By Query API

Notes from learning about Elasticsearch's document deletion APIs.

The Document APIs section of the official documentation introduces two of them: the Delete API and the Delete By Query API.

Delete API
Deletes a single document by specifying index -> type -> document id.

DELETE /index/type/1

Response body:

{
    "_shards": {
        "total": 2,
        "failed": 0,
        "successful": 2
    },
    "found": true,
    "_index": "index",
    "_type": "type",
    "_id": "1",
    "_version": 2,
    "result": "deleted"
}

Every document has a version. When deleting, the version can be specified to make sure that the document being deleted has not been changed by another write in the meantime. Every write operation, a delete included, changes the document's version. So deleting a document with the Delete API is not a deletion in the true sense: the version is incremented and the document is marked as deleted. Later searches filter out all documents marked as deleted. If there is a lot of such data, this has some impact on search performance, so the documents eventually have to be physically removed.
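To make the version check concrete, here is a minimal sketch in plain Java (JDK 11+ `java.net.http`, no Elasticsearch client) that only builds the HTTP request for a versioned delete. The host `http://localhost:9200` and the class and method names are hypothetical, and the `version` query parameter is assumed to behave as described in the 5.6 documentation; newer releases may handle concurrency control differently.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any Elasticsearch client library.
final class VersionedDeleteSketch {

    // Builds DELETE /{index}/{type}/{id}?version={version}.
    // Assumption: the cluster answers with a version conflict if the
    // document's current version differs from the one passed here.
    static HttpRequest buildDelete(String baseUrl, String index, String type,
                                   String id, long version) {
        URI uri = URI.create(baseUrl + "/" + index + "/" + type + "/" + id
                + "?version=" + version);
        return HttpRequest.newBuilder(uri).DELETE().build();
    }

    public static void main(String[] args) {
        // Version 1 is an example value: the version we believe the document currently has.
        HttpRequest request = buildDelete("http://localhost:9200", "index", "type", "1", 1);
        // Only prints the request line; sending it needs an HttpClient and a running cluster.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

If the delete succeeds, the response reports the next version, which matches the `"_version": 2` seen in the response body above.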

Physical deletion:
Physical deletion means removing the deleted documents' data from disk. To understand it, you also need to read about Indices Segments under Indices APIs in the official Elasticsearch documentation.

Indices Segments
This API provides low-level information about the segments that a Lucene index (shard level) is built from. It can be used to learn more about the state of a shard and an index, such as possible optimization information, data "wasted" on deletes, and so on.

Each segment has one important attribute, deleted_docs: the number of documents marked as deleted that are stored in that segment. It is perfectly fine for this number to be greater than 0; the space is reclaimed when the segment is merged.

So physical deletion requires segments to be merged. In principle Elasticsearch merges segments on its own, but we do not control which segments it picks, so there is no guarantee that the segments holding documents marked as deleted will be merged. The merge therefore has to be triggered explicitly.
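The two REST calls involved can be sketched in plain Java (JDK 11+). `GET /{index}/_segments` reports per-segment details, and `POST /{index}/_forcemerge?only_expunge_deletes=true` asks for a merge that only targets segments containing deletes. The base URL, the index name, and the class name are hypothetical, and the code only builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any Elasticsearch client library.
final class SegmentMergeSketch {

    // GET /{index}/_segments: per-segment information, including deleted-document counts.
    static HttpRequest segmentsRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_segments"))
                .GET()
                .build();
    }

    // POST /{index}/_forcemerge?only_expunge_deletes=true: merge only segments with deletes.
    static HttpRequest expungeDeletesRequest(String baseUrl, String index) {
        URI uri = URI.create(baseUrl + "/" + index + "/_forcemerge?only_expunge_deletes=true");
        return HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest inspect = segmentsRequest("http://localhost:9200", "index");
        HttpRequest merge = expungeDeletesRequest("http://localhost:9200", "index");
        System.out.println(inspect.method() + " " + inspect.uri());
        System.out.println(merge.method() + " " + merge.uri());
    }
}
```

Checking the segments before and after the merge is one way to see whether the deleted-document counts actually dropped.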

Delete By Query API
Besides deleting one specific document by id, the official documentation also describes deleting all documents that match a query.

POST twitter/_delete_by_query
{
  "query": {
    "match": {
      "message": "..."
    }
  }
}

The request body is the same as for the Search API.
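As a rough illustration of the request above, the following plain-Java sketch (JDK 11+) assembles the JSON body and the HTTP request by hand. The class and method names, the base URL, and the search text `hello` are hypothetical; the JSON escaping is deliberately simplified and handles only quotes and backslashes.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any Elasticsearch client library.
final class DeleteByQuerySketch {

    // Escapes backslashes and double quotes so a value can sit inside a JSON string.
    // Simplification: control characters are not handled.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Same shape as a Search API body: {"query":{"match":{field:text}}}
    static String matchQueryBody(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escape(field) + "\":\"" + escape(text) + "\"}}}";
    }

    // POST /{index}/_delete_by_query with a JSON body.
    static HttpRequest buildRequest(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_delete_by_query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        String body = matchQueryBody("message", "hello");
        HttpRequest request = buildRequest("http://localhost:9200", "twitter", body);
        // Only prints what would be sent; no cluster is contacted.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

In a real project a JSON library or the official client would be a safer way to build the body than string concatenation.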

Response body:

{
  "took" : 147,
  "timed_out": false,
  "deleted": 119,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "total": 119,
  "failures" : [ ]
}
Application examples:
Having studied the delete operations, it is time to apply them. The project uses Java for its Elasticsearch document operations, so an Elasticsearch client has to be chosen. TransportClient is going away sooner or later, so the Java REST Client is the clear choice (advantages: 1. it offers the functionality of TransportClient; 2. it is forward compatible with newer versions of the Elasticsearch cluster). However, before REST Client 6.5 the official documentation did not cover a Delete By Query API for it, so deleting documents by query required TransportClient. Here we can use REST Client 6.5 directly.

REST Client: Delete By Query API
Code:

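import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.forcemerge.ForceMergeRequest;
import org.elasticsearch.action.admin.indices.forcemerge.ForceMergeResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.reindex.BulkByScrollResponse;
import org.elasticsearch.index.reindex.DeleteByQueryRequest;

// Create the client (X and size below are placeholders to be set by the caller)
RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(new HttpHost("192.168.XXX.XX", 9200, "http"))
                .setMaxRetryTimeoutMillis(X * 60 * 1000) // retry timeout of X minutes
);
// Select the documents to delete by query
DeleteByQueryRequest deleteByQueryRequest = new DeleteByQueryRequest("_all");
deleteByQueryRequest.setConflicts("proceed"); // continue on version conflicts instead of aborting
deleteByQueryRequest.setQuery(new TermQueryBuilder("user", "kimchy"));
deleteByQueryRequest.setSize(size); // maximum number of documents to process
BulkByScrollResponse bulkResponse = client.deleteByQuery(deleteByQueryRequest, RequestOptions.DEFAULT);
// Merge segments so the deleted documents are physically removed
ForceMergeRequest requestAll = new ForceMergeRequest();
requestAll.maxNumSegments(1);
requestAll.onlyExpungeDeletes(true);
ForceMergeResponse forceMergeResponse = client.indices().forcemerge(requestAll, RequestOptions.DEFAULT);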
Reference:

[1] https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docs-delete.html (Elasticsearch 5.6 official documentation)
[2] https://www.elastic.co/guide/en/elasticsearch/reference/5.6/indices-forcemerge.html (Elasticsearch 5.6 official documentation)
[3] https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-delete-by-query.html (REST Client 6.5, Delete By Query)
[4] https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-force-merge.html (REST Client 6.5, segment merging)

Origin www.cnblogs.com/zhuyeshen/p/10950567.html