How to delete duplicate records in Elasticsearch

Delete specific records based on query conditions

The following request deletes every document in the hk_tm_info_v1 index that has no app_date field:

POST /hk_tm_info_v1/_delete_by_query
{
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "app_date"
          }
        }
      ]
    }
  }
}

If the duplicates are hard to express as a query condition, you can process the query results in Java code instead: add the records that need to be deleted to dataList, then delete them in batches.

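	import java.util.List;
	import java.util.Map;
	import org.elasticsearch.action.bulk.BulkProcessor;
	import org.elasticsearch.action.delete.DeleteRequest;

	private void updateDataToES(BulkProcessor bulkProcessor, List<Map<String, Object>> dataList) {
		// Queue one delete request per record; bulkProcessor sends them in batches
		for (Map<String, Object> rowMap : dataList) {
			// Each row carries the document ID of the record to delete
			String id = (String) rowMap.get("_id");

			// ESProcessor.HK_TM_INFO is the name of the target index
			DeleteRequest delRequest = new DeleteRequest(ESProcessor.HK_TM_INFO, id);

			bulkProcessor.add(delRequest);
		}
	}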
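
One way to build dataList from the query results is to keep the first record seen for each key and collect every later record with the same key. The sketch below is only an illustration under stated assumptions, not the author's code: the class DuplicateCollector and the method collectDuplicates are hypothetical names, the choice of key fields is up to you, and each row is assumed to carry its document ID under "_id".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: picks out the rows that repeat an already-seen key.
class DuplicateCollector {

    /**
     * Returns, in input order, every row whose key-field values were already
     * seen in an earlier row. The first row for each key is kept (not returned).
     * Rows that lack a key field use null for that field, so rows missing the
     * same fields are treated as duplicates of each other.
     */
    static List<Map<String, Object>> collectDuplicates(
            List<Map<String, Object>> rows, List<String> keyFields) {
        Set<List<Object>> seenKeys = new HashSet<>();
        List<Map<String, Object>> dataList = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            // Build the composite key from the chosen fields
            List<Object> key = new ArrayList<>();
            for (String field : keyFields) {
                key.add(row.get(field));
            }
            // Set.add returns false when the key was already present
            if (!seenKeys.add(key)) {
                dataList.add(row);
            }
        }
        return dataList;
    }
}
```

The list returned by collectDuplicates(rows, Arrays.asList("reg_no")), where reg_no stands in for whatever field identifies a duplicate in your data, can then be passed to updateDataToES as dataList.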

Statement: This article summarizes the author's personal experience and is an original work. Reprints are welcome; please credit the source. The original blog post is at:
https://tech.limuqiao.com/archives/43.html

Origin blog.csdn.net/mini_snow/article/details/114501229