Aggregating by date on a time-type field in ES

1. Requirement

The records in ES store only a full timestamp (date and time), but the requirement is a nested aggregation by date together with other fields.

2. Method

1. The built-in date_histogram aggregation in ES

2. A script in a terms aggregation

3. Details

1. Use the date_histogram method

(1) Parameters

Set "interval" to "day" (in ES 7.2 and later this parameter is named "calendar_interval")

Set "format" to "yyyy-MM-dd"
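As a rough sketch of how these two parameters fit into a nested aggregation, the Python snippet below builds a request body that buckets by day and then by a second field. The aggregation names by_day and by_group and the helper build_daily_histogram_query are made up for illustration; the field names are borrowed from the script later in this article, and the name of the interval parameter is left configurable because it depends on the ES version.

```python
def build_daily_histogram_query(date_field, group_field, interval_param="interval"):
    """Build a search body: one bucket per day, then a terms split inside each day.

    by_day / by_group are hypothetical aggregation names chosen for this sketch.
    Pass interval_param="calendar_interval" on ES versions that require it.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "by_day": {
                "date_histogram": {
                    "field": date_field,
                    interval_param: "day",
                    "format": "yyyy-MM-dd",
                },
                "aggs": {
                    "by_group": {"terms": {"field": group_field}},
                },
            }
        },
    }


body = build_daily_histogram_query("startTime", "eventTypeCode.keyword")
```

The returned dictionary can be passed as the body of a search request in the same way as the query in the Python section of this article.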

(2) Results

key_as_string: date

key: the number of milliseconds from 00:00 on January 1, 1970 (UTC) to the date in key_as_string. Dividing this value by 1000, then 3600, then 24 gives the number of days since that date, which is the calculation the second method relies on.
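The arithmetic can be checked outside ES with a few lines of Python. This is only an illustration: the helper names key_to_days and days_to_date and the sample key are not from the original article.

```python
from datetime import date, timedelta

MS_PER_DAY = 1000 * 3600 * 24  # milliseconds in one day


def key_to_days(key_ms):
    """Whole days elapsed since 1970-01-01 for a millisecond key."""
    return key_ms // MS_PER_DAY


def days_to_date(days):
    """Calendar date that lies the given number of days after 1970-01-01."""
    return date(1970, 1, 1) + timedelta(days=days)


key = 1655251200000  # sample key chosen for this sketch
days = key_to_days(key)
print(days, days_to_date(days).isoformat())  # 19158 2022-06-15
```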

2. Use a script

(1) Code

"script": {
	"source": """
	// assumes doc['startTime'].value is a number of epoch milliseconds
	Calendar c = Calendar.getInstance();
	c.set(1970, 1 - 1, 1, 0, 0, 0); // the month is 1 - 1 because Calendar months start at 0
	c.add(Calendar.DATE, (int)(doc['startTime'].value / 1000 / 3600 / 24)); // add the elapsed days
	Date date = c.getTime();
	String format = new SimpleDateFormat('yyyy-MM-dd').format(date); // yyyy = calendar year; YYYY would be the week year
	return format + ',' + doc['content.RegistrationNo.keyword'].value + ',' + doc['content.RegistrationNoColor.keyword'].value + ',' + doc['eventTypeCode.keyword'].value;
	"""
}
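The pattern passed to SimpleDateFormat is case-sensitive: lowercase yyyy is the calendar year, while uppercase YYYY is the week year, and the two can differ for dates close to New Year. The short Python sketch below shows the same kind of mismatch using the ISO week year. It is only an analogy, since Java's week year depends on the locale and may flip on slightly different dates.

```python
from datetime import date

d = date(2019, 12, 30)  # a Monday at the very end of December

calendar_year = d.year              # what a calendar-year pattern reports
iso_week_year = d.isocalendar()[0]  # year that owns the ISO week containing d

# ISO week 1 of 2020 starts on this Monday, so the two values disagree.
print(calendar_year, iso_week_year)  # 2019 2020
```

A date bucket built from the week year would therefore place such records under the wrong year.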

(2) Kibana results

3. Application in Python

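from elasticsearch import Elasticsearch

# Assumption: the original does not show how ESconn is created; adjust the address as needed.
ESconn = Elasticsearch("http://localhost:9200")


def func(index):
    query_json = {
        "size": 0,  # return only the aggregation, not the matching documents
        "aggs": {
            "result": {
                "terms": {
                    "script": {
                        # The script returns "date,RegistrationNo,RegistrationNoColor,eventTypeCode".
                        # yyyy is the calendar year; YYYY would be the week year.
                        "source": '''
                            Calendar c = Calendar.getInstance();
                            c.set(1970, 1 - 1, 1, 0, 0, 0);
                            c.add(Calendar.DATE, (int)(doc['startTime'].value / 1000 / 3600 / 24));
                            Date date = c.getTime();
                            String format = new SimpleDateFormat('yyyy-MM-dd').format(date);
                            return format + ',' + doc['content.RegistrationNo.keyword'].value + ',' + doc['content.RegistrationNoColor.keyword'].value + ',' + doc['eventTypeCode.keyword'].value;'''
                    },
                    "size": 1000000
                }
            }
        }
    }
    query = ESconn.search(index=index, body=query_json, request_timeout=360)
    # Each bucket holds the combined key and its doc_count.
    data = query['aggregations']['result']['buckets']
    return data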
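Because the script joins four values with commas, each returned bucket key has to be split again before the fields can be used separately. A minimal sketch follows, assuming none of the values contains a comma itself. The helper split_bucket, the output field names and the sample bucket are all made up for illustration.

```python
def split_bucket(bucket):
    """Turn one terms bucket with a comma-joined key into a flat record."""
    parts = bucket["key"].split(",")
    if len(parts) != 4:
        # A value containing a comma would make the key ambiguous.
        raise ValueError("unexpected key format: %r" % bucket["key"])
    day, registration_no, color, event_type = parts
    return {
        "date": day,
        "registration_no": registration_no,
        "registration_no_color": color,
        "event_type_code": event_type,
        "count": bucket["doc_count"],
    }


# Hypothetical bucket, shaped like the entries in query['aggregations']['result']['buckets'].
sample = {"key": "2022-06-15,ABC123,blue,E01", "doc_count": 7}
record = split_bucket(sample)
print(record["date"], record["count"])
```

Applied to the list returned by func, this becomes [split_bucket(b) for b in func(index)].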

Origin blog.csdn.net/m0_49621298/article/details/125337175