Batch-deleting data in Elasticsearch (ES)

When an ES index accumulates a large amount of data, queries become slow and storage is wasted.
Data that is no longer needed can be removed with the Delete By Query API.

Basic operations

Examples

// Delete the documents whose version_value field is 64181
// The trailing * in log_xxx_* is a wildcard, so every matching index is targeted
POST /log_xxx_*/_delete_by_query
{
  "query": {
    "bool": {
      "must": [
        { "match": { "version_value": 64181 } }
      ]
    }
  }
}

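// In a single index, delete the documents that have a version_value between 1 and 87750 (inclusive)
POST /log_xxx_2022_10_20/_delete_by_query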
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "version_value"
          }
        },
        {
          "range": {
            "version_value": {
              "gte": 1,
              "lte": 87750
            }
          }
        }
      ]
    }
  }
}

Automating with C# code

If needed, the deletion can also be run from code on a regular schedule.

The example below uses the NEST client (NuGet Gallery | NEST 7.17.5).

For more samples and the details of the API, refer to the NEST usage documentation.

Note that the DateTime values passed to DateRangeQuery apparently must not carry milliseconds (the millisecond component needs to be 0).
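One way to satisfy the millisecond note above is to truncate the values before building the query. The following is a minimal sketch that uses only the .NET base class library; the class and method names (DateTimeTruncation, TruncateToSeconds) are hypothetical and not part of NEST.

```csharp
using System;

// Hypothetical helper, not part of NEST.
public static class DateTimeTruncation
{
    // Drops everything below whole seconds (milliseconds and smaller ticks)
    // and keeps the original DateTimeKind.
    public static DateTime TruncateToSeconds(DateTime value)
    {
        long ticks = value.Ticks - (value.Ticks % TimeSpan.TicksPerSecond);
        return new DateTime(ticks, value.Kind);
    }
}
```

The start and end values could be passed through TruncateToSeconds before they are wrapped in DateMathExpression.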

// Requires: using Nest; using System.Collections.Generic;
// Inputs: DateTime start, DateTime end, int count
var request = new DeleteByQueryRequest<SourceDocument>()
{
    Size = count, // If too much is deleted in one request, it may fail (for example, by timing out)
    Query = new BoolQuery()
    {
        Must = new List<QueryContainer>
        {
            // date_time within [start, end)
            new DateRangeQuery
            {
                Field = new Field("date_time"),
                GreaterThanOrEqualTo = new DateMathExpression(start),
                LessThan = new DateMathExpression(end),
            },
            // version_value is present
            new ExistsQuery()
            {
                Field = new Field("version_value"),
            },
            // version_value within [1, _maxVersion]
            new LongRangeQuery()
            {
                Field = new Field("version_value"),
                GreaterThanOrEqualTo = 1,
                LessThanOrEqualTo = _maxVersion
            }
        }
    },
};
var response = await _originEsClient.DeleteByQueryAsync(request);
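
Because deleting too large a range in one request may fail or time out, one option is to split the overall time range into smaller windows and issue one delete per window. The following is a rough sketch under that assumption; DeleteWindows and Split are hypothetical names, and the windows are half-open to match the GreaterThanOrEqualTo / LessThan pair used above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of NEST.
public static class DeleteWindows
{
    // Splits [start, end) into consecutive half-open windows of at most `step` each.
    // The last window is shortened so that it never extends past `end`.
    public static IEnumerable<(DateTime Start, DateTime End)> Split(
        DateTime start, DateTime end, TimeSpan step)
    {
        if (step <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(step), "step must be positive");

        var windowStart = start;
        while (windowStart < end)
        {
            var windowEnd = windowStart + step;
            if (windowEnd > end)
                windowEnd = end;

            yield return (windowStart, windowEnd);
            windowStart = windowEnd;
        }
    }
}
```

Each window returned by Split could then be used as the start and end of one request, for example in a foreach loop around the code above.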

Additional

The following repository can generate ES query bodies from JavaScript code.

danpaz/bodybuilder: An elasticsearch query body builder
Bodybuilder | An elasticsearch query body builder

Origin blog.csdn.net/lj22377/article/details/127551628