Using Elasticsearch - Pipeline Aggregation

pipeline aggregation

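A pipeline aggregation works on the output of other aggregations rather than on documents, performing a second round of aggregation on their results.

Structurally, pipeline aggregations fall into two families: sibling pipeline aggregations and parent pipeline aggregations.

  • Sibling pipeline aggregations: take the output of a sibling aggregation and compute a new aggregation at the same level as that sibling. The request below only illustrates the sibling layout: count_per_day and total_bytes_of_download are declared at the same level.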
GET kibana_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "count_per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "day"
      }
    },
    "total_bytes_of_download": {
      "sum": {
        "field": "bytes"
      }
    }
  }
}
  • Parent pipeline aggregations: take the output of their parent aggregation and compute new buckets, or new aggregations that are added to the existing buckets. The request below only illustrates the nested layout: total_bytes_per_day is declared inside count_per_day.
GET kibana_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "count_per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "day"
      },
      "aggs": {
        "total_bytes_per_day": {
          "sum": {
            "field": "bytes"
          }
        }
      }
    }
  }
}

max_bucket, min_bucket, avg_bucket, sum_bucket

These sibling pipeline aggregations compute the maximum, minimum, average, or sum of a numeric metric across the buckets of a sibling multi-bucket aggregation.

  • buckets_path: (required) The path to the metric whose per-bucket values are aggregated.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • format: The output format of the value, such as # or #0.00. The default is null.
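
To make the gap_policy options more concrete, the following Python sketch mimics how a max_bucket-style aggregation might treat a bucket whose metric is missing. The function name max_bucket, the bucket keys, and the values are hypothetical, and only skip and insert_zeros are modelled; this is an illustration of the idea, not Elasticsearch's implementation.

```python
import math

def max_bucket(buckets, gap_policy="skip"):
    """Return (max value, keys of the buckets holding it) over (key, value) pairs.

    A value of None stands for a bucket whose metric is missing.
    """
    values = []
    for key, value in buckets:
        if value is None or math.isnan(value):
            if gap_policy == "skip":
                continue            # ignore the bucket entirely
            if gap_policy == "insert_zeros":
                value = 0.0         # pretend the metric was 0
            else:
                raise ValueError(f"unsupported gap_policy: {gap_policy}")
        values.append((key, value))
    if not values:
        return None, []
    best = max(v for _, v in values)
    return best, [k for k, v in values if v == best]

# Hypothetical buckets: bucket "B" has no metric value.
sample = [("A", -5.0), ("B", None), ("C", -7.5)]
skipped = max_bucket(sample)                  # "B" is ignored
zeroed = max_bucket(sample, "insert_zeros")   # "B" counts as 0.0
```

With these made-up values, skip reports bucket A as the maximum, while insert_zeros reports bucket B, because the inserted 0 is larger than the negative metrics. The choice of policy can therefore change the result, not just its precision.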

1. Compute the maximum flight time for each origin country, then find the country with the longest flight time and that flight time.

GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "OriginCountry_terms": {
      "terms": {
        "field": "OriginCountry",
        "size": 20
      },
      "aggs": {
        "FlightTimeMin_max": {
          "max": {
            "field": "FlightTimeMin"
          }
        }
      }
    },
    "pipeline-max-bucket": {
      "max_bucket": {
        "buckets_path": "OriginCountry_terms>FlightTimeMin_max"
      }
    }
  }
}

An excerpt of the aggregation result:

"pipeline-max-bucket" : {
  "value" : 1902.9019775390625,
  "keys" : [
    "AR"
  ]
}

2. Compute flight-time statistics for each origin city within each origin country, then find, for each country, the longest flight time among its cities and the name of that city.

GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "OriginCountry_terms": {
      "terms": {
        "field": "OriginCountry",
        "size": 20
      },
      "aggs": {
        "OriginCityName_terms": {
          "terms": {
            "field": "OriginCityName",
            "size": 20
          },
          "aggs": {
            "FlightTimeMin_stats": {
              "stats": {
                "field": "FlightTimeMin"
              }
            }
          }
        },
        "OriginCityName_FlightTimeMin_max": {
          "max_bucket": {
            "buckets_path": "OriginCityName_terms>FlightTimeMin_stats.max"
          }
        }
      }
    }
  }
}

An excerpt of the pipeline aggregation result in the first origin-country bucket:

"OriginCityName_FlightTimeMin_max" : {
  "value" : 1559.6236572265625,
  "keys" : [
    "Rome"
  ]
}
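
In a buckets_path, `>` separates aggregation names and `.` selects one metric of a multi-value aggregation such as stats. The Python sketch below resolves a two-level path of that shape against a response-like dictionary. The helper name resolve, the trimmed response, and its numbers are hypothetical; real paths can be deeper than this sketch handles.

```python
def resolve(aggs, path):
    """Collect (bucket key, metric value) pairs for a path like 'terms_agg>stats_agg.max'."""
    agg_name, _, metric_path = path.partition(">")
    metric_name, _, field = metric_path.partition(".")
    field = field or "value"   # single-value metrics expose their result as 'value'
    return [
        (bucket["key"], bucket[metric_name][field])
        for bucket in aggs[agg_name]["buckets"]
    ]

# Hypothetical, trimmed content of one country bucket (numbers are made up).
country_bucket = {
    "OriginCityName_terms": {
        "buckets": [
            {"key": "Rome", "FlightTimeMin_stats": {"min": 0.0, "max": 1500.0}},
            {"key": "Milan", "FlightTimeMin_stats": {"min": 0.0, "max": 1200.0}},
        ]
    }
}

pairs = resolve(country_bucket, "OriginCityName_terms>FlightTimeMin_stats.max")
best_key, best_value = max(pairs, key=lambda kv: kv[1])
```

Taking the maximum of the resolved pairs is what a max_bucket aggregation reports as value and keys.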

3. Building on example 2, find the longest flight time across all origin countries and the name of the corresponding country.

GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "OriginCountry_terms": {
      "terms": {
        "field": "OriginCountry",
        "size": 20
      },
      "aggs": {
        "OriginCityName_terms": {
          "terms": {
            "field": "OriginCityName",
            "size": 20
          },
          "aggs": {
            "FlightTimeMin_stats": {
              "stats": {
                "field": "FlightTimeMin"
              }
            }
          }
        },
        "OriginCityName_FlightTimeMin_max": {
          "max_bucket": {
            "buckets_path": "OriginCityName_terms>FlightTimeMin_stats.max"
          }
        }
      }
    },
    "OriginCountry_FlightTimeMin_max": {
      "max_bucket": {
        "buckets_path": "OriginCountry_terms>OriginCityName_FlightTimeMin_max"
      }
    }
  }
}

In addition to the output of example 2, the response contains the pipeline aggregation result at the origin-country level:

"OriginCountry_FlightTimeMin_max" : {
  "value" : 1902.9019775390625,
  "keys" : [
    "AR"
  ]
}

stats_bucket

A sibling pipeline aggregation that computes summary statistics (count, min, max, avg, sum) of a numeric metric across the buckets of a sibling multi-bucket aggregation.

  • buckets_path: (required) The path to the metric whose per-bucket values are aggregated.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • format: The output format of the value, such as # or #0.00. The default is null.

1. Compute the average flight time for each origin city within each origin country, then compute statistics over those per-city averages for each country.

GET kibana_sample_data_flights/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "OriginCountry_terms": {
      "terms": {
        "field": "OriginCountry",
        "size": 20
      },
      "aggs": {
        "OriginCityName_terms": {
          "terms": {
            "field": "OriginCityName",
            "size": 20
          },
          "aggs": {
            "FlightTimeMin_avg": {
              "avg": {
                "field": "FlightTimeMin"
              }
            }
          }
        },
        "pipeline_stats_bucket": {
          "stats_bucket": {
            "buckets_path": "OriginCityName_terms>FlightTimeMin_avg"
          }
        }
      }
    }
  }
}

An excerpt of the pipeline aggregation result:

"pipeline_stats_bucket" : {
  "count" : 15,
  "min" : 226.4979310909907,
  "max" : 472.0975369329038,
  "avg" : 378.1233526619374,
  "sum" : 5671.850289929062
}
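
The five fields in this result are ordinary summary statistics over the per-bucket values. A minimal Python sketch, assuming the per-city averages are available as a plain list (the function name stats_bucket and the list city_averages are hypothetical, and the list is not taken from the sample data set):

```python
import math

def stats_bucket(values):
    """Summarise per-bucket metric values as count, min, max, avg and sum."""
    count = len(values)
    total = math.fsum(values)
    return {
        "count": count,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "avg": total / count if count else None,
        "sum": total,
    }

# Hypothetical per-city averages.
city_averages = [200.0, 300.0, 400.0]
result = stats_bucket(city_averages)

# Consistency check on the excerpt above: avg is sum divided by count.
excerpt_avg_matches = math.isclose(5671.850289929062 / 15, 378.1233526619374, rel_tol=1e-12)
```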

extended_stats_bucket

A sibling pipeline aggregation that computes extended statistics of a numeric metric across the buckets of a sibling multi-bucket aggregation. On top of the stats_bucket output it adds the sum of squares, variance, standard deviation, and standard deviation bounds.

  • buckets_path: (required) The path to the metric whose per-bucket values are aggregated.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • format: The output format of the value, such as # or #0.00. The default is null.
  • sigma: The number of standard deviations above and below the mean used for std_deviation_bounds. The default is 2. Values outside the bounds can be treated as outliers.

1. Compute the average flight time for each origin city within each origin country, then compute extended statistics over those per-city averages for each country.

GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "OriginCountry_terms": {
      "terms": {
        "field": "OriginCountry",
        "size": 20
      },
      "aggs": {
        "OriginCityName_terms": {
          "terms": {
            "field": "OriginCityName",
            "size": 20
          },
          "aggs": {
            "FlightTimeMin_avg": {
              "avg": {
                "field": "FlightTimeMin"
              }
            }
          }
        },
        "OriginCityName_FlightTimeMin_extended_stats": {
          "extended_stats_bucket": {
            "buckets_path": "OriginCityName_terms>FlightTimeMin_avg"
          }
        }
      }
    }
  }
}

An excerpt of the pipeline aggregation result:

"OriginCityName_FlightTimeMin_extended_stats" : {
  "count" : 15,
  "min" : 226.4979310909907,
  "max" : 472.0975369329038,
  "avg" : 378.1233526619374,
  "sum" : 5671.850289929062,
  "sum_of_squares" : 2264926.8781446246,
  "variance" : 8017.855381337739,
  "variance_population" : 8017.855381337739,
  "variance_sampling" : 8590.559337147579,
  "std_deviation" : 89.54247808352045,
  "std_deviation_population" : 89.54247808352045,
  "std_deviation_sampling" : 92.68527033540755,
  "std_deviation_bounds" : {
    "upper" : 557.2083088289783,
    "lower" : 199.03839649489652,
    "upper_population" : 557.2083088289783,
    "lower_population" : 199.03839649489652,
    "upper_sampling" : 563.4938933327526,
    "lower_sampling" : 192.75281199112231
  }
}
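
The extra fields can be derived from count, avg, and sum_of_squares. The Python sketch below recomputes them from the numbers in the excerpt above, assuming the usual population and sample variance formulas and a sigma of 2 (the number of standard deviations used for the bounds); the variable names are chosen here for illustration.

```python
import math

# Values copied from the excerpt above.
count = 15
avg = 378.1233526619374
sum_of_squares = 2264926.8781446246
sigma = 2  # assumed: default number of standard deviations for the bounds

# Population variance: mean of squares minus square of mean.
variance_population = sum_of_squares / count - avg ** 2
# Sample variance rescales by n / (n - 1).
variance_sampling = variance_population * count / (count - 1)
std_deviation = math.sqrt(variance_population)
upper = avg + sigma * std_deviation
lower = avg - sigma * std_deviation
```

The recomputed values agree with variance, variance_sampling, std_deviation, and std_deviation_bounds in the excerpt up to floating-point rounding.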

cumulative_sum

cumulative [ˈkjuːmjəleɪtɪv]: increasing by successive additions.

A parent pipeline aggregation that computes the cumulative (running) sum of a numeric metric across the buckets of an enclosing histogram or date_histogram aggregation.

The min_doc_count of the enclosing histogram or date_histogram aggregation must be 0, which is the default.

  • buckets_path: (required) The path to the metric to accumulate.
  • format: The output format of the value, such as # or #0.00. The default is null.

1. Compute the total order amount for each day, together with the running total of those daily amounts.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_sum": {
          "sum": {
            "field": "taxful_total_price"
          }
        },
        "pipeline_cumulative_sum": {
          "cumulative_sum": {
            "buckets_path": "taxful_total_price_sum"
          }
        }
      }
    }
  }
}

An excerpt of the pipeline aggregation result:

"pipeline_cumulative_sum" : {
  "value" : 41455.5390625
}
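
Conceptually, each bucket's cumulative_sum value is the sum of the metric in that bucket and in all earlier buckets. A minimal Python sketch of that running total, using three consecutive daily sums that appear in excerpts elsewhere in this article (the function name cumulative_sum is chosen here for illustration):

```python
from itertools import accumulate

def cumulative_sum(values):
    """Return the running total of the per-bucket values, one entry per bucket."""
    return list(accumulate(values))

# Daily taxful_total_price sums for 2022-07-14, 2022-07-15 and 2022-07-16.
daily_totals = [10578.53125, 10448.0, 10283.484375]
running_totals = cumulative_sum(daily_totals)
```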

cumulative_cardinality

cardinality [kɑːdɪ'nælɪtɪ]: the number of distinct values in a set.

A parent pipeline aggregation that computes the cumulative cardinality across the buckets of an enclosing histogram or date_histogram aggregation. The metric it reads must be a cardinality aggregation.

The min_doc_count of the enclosing histogram or date_histogram aggregation must be 0, which is the default.

  • buckets_path: (required) The path to the cardinality aggregation to accumulate.
  • format: The output format of the value, such as # or #0.00. The default is null.

1. Count the distinct customers who placed orders each day, along with the cumulative number of distinct customers.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "customer_id_cardinality": {
          "cardinality": {
            "field": "customer_id"
          }
        },
        "pipeline_cumulative_cardinality": {
          "cumulative_cardinality": {
            "buckets_path": "customer_id_cardinality"
          }
        }
      }
    }
  }
}

An excerpt of the pipeline aggregation result:

{
  "key_as_string" : "2022-07-16",
  "key" : 1657929600000,
  "doc_count" : 143,
  "customer_id_cardinality" : {
    "value" : 45
  },
  "pipeline_cumulative_cardinality" : {
    "value" : 46
  }
},
{
  "key_as_string" : "2022-07-17",
  "key" : 1658016000000,
  "doc_count" : 140,
  "customer_id_cardinality" : {
    "value" : 42
  },
  "pipeline_cumulative_cardinality" : {
    "value" : 46
  }
}
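
The cumulative value is not the sum of the daily counts, because a customer who orders on several days is counted only once. The Python sketch below shows the idea with exact sets; the function name and the customer IDs are hypothetical, and Elasticsearch's cardinality aggregation is approximate (it is based on HyperLogLog++), so real results may deviate slightly from an exact count.

```python
def cumulative_cardinality(daily_ids):
    """Return (daily distinct count, cumulative distinct count) for each day's IDs."""
    seen = set()
    rows = []
    for ids in daily_ids:
        today = set(ids)
        seen |= today               # customers seen on any day so far
        rows.append((len(today), len(seen)))
    return rows

# Hypothetical customer IDs per day.
days = [[1, 2, 2, 3], [2, 3, 4], [1, 4]]
rows = cumulative_cardinality(days)
```

On the third hypothetical day two customers order, but both have ordered before, so the cumulative count does not grow. The same effect explains why the cumulative value stays at 46 on 2022-07-17 in the excerpt above.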

moving_avg

Moving average aggregation. A parent pipeline aggregation that slides a window across the ordered buckets of an enclosing histogram or date_histogram aggregation and computes the average of the values inside the window. Note that moving_avg is deprecated in favor of moving_fn and was removed in Elasticsearch 8.0, so the examples in this section apply only to versions that still include it.

  • buckets_path: (required) The path to the metric to average.
  • window: The size of the sliding window. The default is 5.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • model: The moving average model. The default is simple. Each model weights the values inside the window differently.
    • simple: Simple model. It sums all values inside the window and divides by the window size. There is no time-dependent weighting, so the moving average tends to lag behind the real data.
    • linear: Linear model. It assigns linearly decreasing weights, so older data points contribute less, which reduces the lag of the average.
    • ewma: Single exponential model. It assigns exponentially decreasing weights to older data points. The alpha parameter controls how fast the weights decay; it accepts a floating-point number between 0 and 1 and defaults to 0.3. A smaller value makes the weights decay slowly, giving more smoothing; a larger value makes them decay quickly, which reduces the influence of old values, so the moving average is less smooth but tracks the data more closely. This model supports minimization.
    • holt: Double exponential model. It tracks two values internally: level and trend. Because it follows the trend of the data, it can be used for prediction. The alpha parameter is the level decay value and defaults to 0.3. The beta parameter is the trend decay value and defaults to 0.1. Both accept floating-point numbers between 0 and 1. This model supports minimization.
    • holt_winters: Triple exponential model. It tracks three values internally: level, trend, and seasonality. Because it follows seasonal changes in the data, it can be used for prediction. The alpha parameter is the level decay value and defaults to 0.3. The beta parameter is the trend decay value and defaults to 0.1. The gamma parameter is the seasonality decay value and defaults to 0.3. All three accept floating-point numbers between 0 and 1. The period parameter is the length of the season and defaults to 1. The type parameter controls how seasonality is applied to the data and accepts add and mult. This model supports minimization.
  • settings: Model-specific parameters, such as alpha for ewma.
  • predict: The number of predictions to append to the end of the series. Every moving average model supports a prediction mode that extrapolates future values from the current smoothed moving average. Accuracy depends on the model and its parameters. For example: predict: 10.
  • minimize: Whether to enable minimization for the model. Minimization tunes the parameters until the predictions produced by the model closely match the actual data. For the ewma and holt models it is disabled by default and of limited use; for the holt_winters model it is enabled by default and helps improve prediction accuracy. For example: minimize: true.

Take the simple model with a window size of 3 as an example.

Bucket    Value    Moving average
1         10
2         20       10
3         30       (10 + 20) / 2
4         40       (10 + 20 + 30) / 3
5         50       (20 + 30 + 40) / 3
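
The table can be reproduced with a few lines of Python. This is a sketch of the simple model as described here, where the window covers up to `window` preceding buckets and excludes the current one; the function name simple_moving_avg is chosen for illustration.

```python
def simple_moving_avg(values, window):
    """Average of up to `window` preceding values; None when there are none."""
    result = []
    for i in range(len(values)):
        previous = values[max(0, i - window):i]   # excludes the current bucket
        result.append(sum(previous) / len(previous) if previous else None)
    return result

averages = simple_moving_avg([10, 20, 30, 40, 50], window=3)
```

The first bucket has no preceding values, so it gets no moving average, which matches the empty cell in the first row of the table.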

1. Compute the total order amount for each day, together with its moving average over a window of five days.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_sum": {
          "sum": {
            "field": "taxful_total_price"
          }
        },
        "pipeline_moving_avg": {
          "moving_avg": {
            "buckets_path": "taxful_total_price_sum",
            "window": 5,
            "model": "simple"
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

{
  "key_as_string" : "2022-07-14",
  "key" : 1657756800000,
  "doc_count" : 146,
  "taxful_total_price_sum" : {
    "value" : 10578.53125
  }
},
{
  "key_as_string" : "2022-07-15",
  "key" : 1657843200000,
  "doc_count" : 153,
  "taxful_total_price_sum" : {
    "value" : 10448.0
  },
  "pipeline_moving_avg" : {
    "value" : 10578.53125
  }
},
{
  "key_as_string" : "2022-07-16",
  "key" : 1657929600000,
  "doc_count" : 143,
  "taxful_total_price_sum" : {
    "value" : 10283.484375
  },
  "pipeline_moving_avg" : {
    "value" : 10513.265625
  }
}

Change to the ewma model to see the effect.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_sum": {
          "sum": {
            "field": "taxful_total_price"
          }
        },
        "pipeline_moving_avg": {
          "moving_avg": {
            "buckets_path": "taxful_total_price_sum",
            "window": 5,
            "model": "ewma",
            "settings": {
              "alpha": 0.5
            }
          }
        }
      }
    }
  }
}

moving_fn

Moving function aggregation. A parent pipeline aggregation that slides a window across an ordered series of buckets and runs a custom script on the values inside each window. Several common functions are built in.

  • buckets_path: (required) The path to the metric the window slides over.
  • window: (required) The size of the sliding window.
  • script: (required) The script to run on the values in each window.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • shift: How many positions the start of the window is moved to the right. The default is 0, in which case the window covers the preceding buckets and does not include the current bucket. Each increment moves the start of the window one position to the right. To include the current bucket in the window, set shift to 1.

The built-in functions are as follows:

  • MovingFunctions.max(values): Returns the maximum. null and NaN values are ignored; if the window is empty or contains only null and NaN values, NaN is returned.
  • MovingFunctions.min(values): Returns the minimum. null and NaN values are ignored; if the window is empty or contains only null and NaN values, NaN is returned.
  • MovingFunctions.sum(values): Returns the sum. null and NaN values are ignored; if the window is empty or contains only null and NaN values, 0.0 is returned.
  • MovingFunctions.stdDev(values, average): Returns the standard deviation. null and NaN values are ignored; if the window is empty or contains only null and NaN values, 0.0 is returned.
  • MovingFunctions.unweightedAvg(values): Returns the average using the simple model.
  • MovingFunctions.linearWeightedAvg(values): Returns the average using the linear model.
  • MovingFunctions.ewma(values, alpha): Returns the average using the ewma model.
  • MovingFunctions.holt(values, alpha, beta): Returns the average using the holt model.
  • MovingFunctions.holtWinters(values, alpha, beta, gamma, period, multiplicative): Returns the average using the holt_winters model. multiplicative is a boolean: true selects multiplicative seasonality, false selects additive seasonality.

1. Compute the total order amount for each day, together with the sum over a five-day window that ends on and includes the current day.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_sum": {
          "sum": {
            "field": "taxful_total_price"
          }
        },
        "pipeline-moving-fn": {
          "moving_fn": {
            "buckets_path": "taxful_total_price_sum",
            "window": 5,
            "script": "MovingFunctions.sum(values)",
            "shift": 1
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

{
  "key_as_string" : "2022-07-14",
  "key" : 1657756800000,
  "doc_count" : 146,
  "taxful_total_price_sum" : {
    "value" : 10578.53125
  },
  "pipeline-moving-fn" : {
    "value" : 10578.53125
  }
},
{
  "key_as_string" : "2022-07-15",
  "key" : 1657843200000,
  "doc_count" : 153,
  "taxful_total_price_sum" : {
    "value" : 10448.0
  },
  "pipeline-moving-fn" : {
    "value" : 21026.53125
  }
}
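
The effect of shift on the window can be sketched in Python. The sketch assumes that for bucket i the window covers positions i - window + shift up to, but not including, i + shift, clipped to the series; that reading matches the two values in the excerpt above. The function name moving_sum is chosen for illustration, and the third daily sum is taken from excerpts elsewhere in this article.

```python
def moving_sum(values, window, shift=0):
    """Sum over a sliding window, in the spirit of MovingFunctions.sum with a shift."""
    result = []
    for i in range(len(values)):
        start = max(0, i - window + shift)
        end = min(len(values), max(0, i + shift))
        result.append(sum(values[start:end], 0.0))   # an empty window sums to 0.0
    return result

daily = [10578.53125, 10448.0, 10283.484375]
including_current = moving_sum(daily, window=5, shift=1)
excluding_current = moving_sum(daily, window=5)
```

With shift 1 the first bucket already reports its own value; with the default shift 0 the first window is empty and the sum is 0.0.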

bucket_script

Bucket script aggregation. A parent pipeline aggregation that runs a script for each bucket of its parent multi-bucket aggregation, computing a new value from numeric metrics in that bucket. The script must return a numeric value.

  • buckets_path: (required) A map of script variable names to the paths of the metrics they refer to.
  • script: (required) The script to run for each bucket.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.
  • format: The output format of the value, such as # or #0.00. The default is null.

1. Compute the total order amount and the total quantity of items sold for each day, then derive the average amount per item.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_stats": {
          "stats": {
            "field": "taxful_total_price"
          }
        },
        "total_quantity_stats": {
          "stats": {
            "field": "total_quantity"
          }
        },
        "pipeline-script": {
          "bucket_script": {
            "buckets_path": {
              "total_price": "taxful_total_price_stats.sum",
              "total_quantity": "total_quantity_stats.sum"
            },
            "script": """
              params.total_price/params.total_quantity;
            """
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

{
  "key_as_string" : "2022-07-14",
  "key" : 1657756800000,
  "doc_count" : 146,
  "taxful_total_price_stats" : {
    "count" : 146,
    "min" : 18.984375,
    "max" : 230.0,
    "avg" : 72.45569349315069,
    "sum" : 10578.53125
  },
  "total_quantity_stats" : {
    "count" : 146,
    "min" : 2.0,
    "max" : 4.0,
    "avg" : 2.1780821917808217,
    "sum" : 318.0
  },
  "pipeline-script" : {
    "value" : 33.2658215408805
  }
}

bucket_selector

Bucket selector aggregation. A parent pipeline aggregation that runs a script for each bucket of its parent multi-bucket aggregation and keeps only the buckets for which the script returns true. The metrics it reads must be numeric, and the script must return a boolean. If the script language is expression, a numeric return value is allowed: 0 is treated as false and any other value as true.

  • buckets_path: (required) A map of script variable names to the paths of the metrics they refer to.
  • script: (required) The script to run for each bucket.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.

1. Compute statistics on the daily order amount, and keep only the days whose total amount is greater than 13,000.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_stats": {
          "stats": {
            "field": "taxful_total_price"
          }
        },
        "pipeline-bucket-selector": {
          "bucket_selector": {
            "buckets_path": {
              "sum": "taxful_total_price_stats.sum"
            },
            "script": """
              params.sum > 13000;
            """
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

{
  "key_as_string" : "2022-07-22",
  "key" : 1658448000000,
  "doc_count" : 163,
  "taxful_total_price_stats" : {
    "count" : 163,
    "min" : 18.984375,
    "max" : 393.0,
    "avg" : 83.1910467791411,
    "sum" : 13560.140625
  }
},
{
  "key_as_string" : "2022-08-07",
  "key" : 1659830400000,
  "doc_count" : 165,
  "taxful_total_price_stats" : {
    "count" : 165,
    "min" : 18.984375,
    "max" : 225.0,
    "avg" : 79.36732954545455,
    "sum" : 13095.609375
  }
}
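
In effect, bucket_selector is a filter over the parent's buckets. A minimal Python sketch, using daily sums that appear in excerpts in this article; the function name and the flattened bucket layout are simplifications chosen for illustration.

```python
def bucket_selector(buckets, predicate):
    """Keep only the buckets for which `predicate` returns True."""
    return [b for b in buckets if predicate(b)]

# Daily sums taken from excerpts in this article, flattened for brevity.
buckets = [
    {"key_as_string": "2022-07-14", "sum": 10578.53125},
    {"key_as_string": "2022-07-22", "sum": 13560.140625},
    {"key_as_string": "2022-08-07", "sum": 13095.609375},
]
kept = bucket_selector(buckets, lambda b: b["sum"] > 13000)
kept_days = [b["key_as_string"] for b in kept]
```

Buckets that fail the condition, such as 2022-07-14 here, are dropped from the response entirely rather than returned with an empty value.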

bucket_sort

Bucket sort aggregation. A parent pipeline aggregation that sorts the buckets of its parent multi-bucket aggregation. Zero or more sort fields can be specified. Buckets can be sorted by _key, by _count, or by their sub-aggregations. The from and size parameters can also be used to truncate the returned buckets.

  • sort: The list of fields to sort by.
  • from: The position of the first bucket to return; buckets before this position are dropped. The default is 0.
  • size: The number of buckets to return. By default all buckets are returned.
  • gap_policy: How to handle buckets in which the metric is empty or missing. The default is skip.
    • skip: Ignore empty or missing values; they take no part in the calculation.
    • insert_zeros: Treat empty or missing values as 0 and include them in the calculation.
    • keep_values: Like skip, except that a bucket is used whenever its metric provides a non-null, non-NaN value; otherwise the bucket is skipped.

1. Return the order amount statistics for the two days with the highest total order amount.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_stats": {
          "stats": {
            "field": "taxful_total_price"
          }
        },
        "pipeline-sort": {
          "bucket_sort": {
            "sort": [
              {
                "taxful_total_price_stats.sum": {
                  "order": "desc"
                }
              }
            ],
            "size": 2
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

"aggregations" : {
  "order_date_histogram" : {
    "buckets" : [
      {
        "key_as_string" : "2022-07-22",
        "key" : 1658448000000,
        "doc_count" : 163,
        "taxful_total_price_stats" : {
          "count" : 163,
          "min" : 18.984375,
          "max" : 393.0,
          "avg" : 83.1910467791411,
          "sum" : 13560.140625
        }
      },
      {
        "key_as_string" : "2022-08-07",
        "key" : 1659830400000,
        "doc_count" : 165,
        "taxful_total_price_stats" : {
          "count" : 165,
          "min" : 18.984375,
          "max" : 225.0,
          "avg" : 79.36732954545455,
          "sum" : 13095.609375
        }
      }
    ]
  }
}

2. Return the order amount statistics for the first three days. No sort is specified, so the buckets keep their date order and are only truncated.

GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "order_date_histogram": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "taxful_total_price_stats": {
          "stats": {
            "field": "taxful_total_price"
          }
        },
        "pipeline-sort": {
          "bucket_sort": {
            "from": 0,
            "size": 3
          }
        }
      }
    }
  }
}

An excerpt of the aggregation result:

"aggregations" : {
  "order_date_histogram" : {
    "buckets" : [
      {
        "key_as_string" : "2022-07-14",
        "key" : 1657756800000,
        "doc_count" : 146,
        "taxful_total_price_stats" : {
          "count" : 146,
          "min" : 18.984375,
          "max" : 230.0,
          "avg" : 72.45569349315069,
          "sum" : 10578.53125
        }
      },
      {
        "key_as_string" : "2022-07-15",
        "key" : 1657843200000,
        "doc_count" : 153,
        "taxful_total_price_stats" : {
          "count" : 153,
          "min" : 22.984375,
          "max" : 220.0,
          "avg" : 68.2875816993464,
          "sum" : 10448.0
        }
      },
      {
        "key_as_string" : "2022-07-16",
        "key" : 1657929600000,
        "doc_count" : 143,
        "taxful_total_price_stats" : {
          "count" : 143,
          "min" : 18.984375,
          "max" : 250.0,
          "avg" : 71.91247814685315,
          "sum" : 10283.484375
        }
      }
    ]
  }
}
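
Both bucket_sort examples follow the same pattern: optionally sort, then slice with from and size. A Python sketch of that pattern, using daily sums from the excerpts above; the function name, the parameter names, and the flattened bucket layout are simplifications chosen for illustration.

```python
def bucket_sort(buckets, sort_key=None, descending=False, from_=0, size=None):
    """Optionally sort buckets, then keep `size` of them starting at `from_`."""
    ordered = sorted(buckets, key=sort_key, reverse=descending) if sort_key else list(buckets)
    end = None if size is None else from_ + size
    return ordered[from_:end]

# Daily sums from the excerpts above, in date order.
buckets = [
    {"day": "2022-07-14", "sum": 10578.53125},
    {"day": "2022-07-15", "sum": 10448.0},
    {"day": "2022-07-16", "sum": 10283.484375},
    {"day": "2022-07-22", "sum": 13560.140625},
    {"day": "2022-08-07", "sum": 13095.609375},
]

# Example 1: top two days by total amount.
top_two = bucket_sort(buckets, sort_key=lambda b: b["sum"], descending=True, size=2)
# Example 2: no sort, just the first three buckets.
first_three = bucket_sort(buckets, from_=0, size=3)
```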

Origin blog.csdn.net/qq_34561892/article/details/129960768