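V. A Brief Look at Elasticsearch Internals and Grouped Aggregation Queries

Cluster node roles

Node roles are set in the ES config directory:

Master-eligible node: node.master: true
Data node: node.data: true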
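  1. Client node
      When both node.master and node.data are set to false, the node only routes requests, handles searches, and distributes indexing operations; in essence it behaves as a smart load balancer. Dedicated client nodes are very useful in larger clusters: they coordinate between master and data nodes, they obtain the cluster state when they join, and they can route requests directly based on that state.

  2. Data node
      Data nodes store the index data and carry out document create, read, update, and delete operations as well as aggregations. They place heavy demands on CPU, memory, and I/O. When tuning, monitor the state of the data nodes and add new nodes to the cluster when resources run short.

  3. Master node
      A master-eligible node is mainly responsible for cluster-level operations such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding which shards are allocated to which nodes. A stable master node is very important for cluster health. By default any node in a cluster may be elected master, while indexing and search operations consume a lot of CPU, memory, and I/O. To keep a cluster stable, it is a good idea to separate master nodes from data nodes.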

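I. How Elasticsearch computes a document's _score

1. Boolean model

Step 1: based on the user's query, first filter out the docs (documents) that contain the specified terms (keywords).
For example, for the query "hello world":

query "hello world" --> split into terms --> hello / world / hello & world

Step 2: filter according to your conditions.

bool --> must / must_not / should clauses --> filter --> must contain / must not contain / may contain

No scoring has taken place up to this point.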
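The two filtering steps above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the helper names build_inverted_index and bool_filter are hypothetical, tokenization is a bare whitespace split, and should terms only act as a filter when no must term is given.

```python
from typing import Dict, Iterable, Set


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased term to the ids of the documents that contain it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for term in text.lower().replace(",", " ").split():
            index.setdefault(term, set()).add(doc_id)
    return index


def bool_filter(
    index: Dict[str, Set[str]],
    all_ids: Iterable[str],
    must: Iterable[str] = (),
    must_not: Iterable[str] = (),
    should: Iterable[str] = (),
) -> Set[str]:
    """Keep ids that contain every `must` term and none of the `must_not` terms.

    Simplification: `should` terms only filter when there is no `must` term,
    in which case at least one of them has to match.
    """
    must, must_not, should = list(must), list(must_not), list(should)
    result = set(all_ids)
    for term in must:
        result &= index.get(term, set())
    for term in must_not:
        result -= index.get(term, set())
    if should and not must:
        matched: Set[str] = set()
        for term in should:
            matched |= index.get(term, set())
        result &= matched
    return result


docs = {
    "doc1": "hello you, and world is very good",
    "doc2": "hello, how are you",
}
index = build_inverted_index(docs)
print(sorted(bool_filter(index, docs, must=["hello", "world"])))
print(sorted(bool_filter(index, docs, must=["hello"], must_not=["world"])))
print(sorted(bool_filter(index, docs, should=["hello", "world"])))
```

The function only decides which documents stay in the candidate set; no score is produced, which mirrors the point that scoring happens in a later step.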

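2. Relevance score algorithm

This algorithm measures how closely the text in an index matches the search text.
Elasticsearch's classic scoring is based on term frequency/inverse document frequency, TF/IDF for short. The slash is only a separator, not a division: the score rises with both TF and IDF. Since version 5.0 the default similarity is BM25, which builds on the same factors.
Step 3: compute the score.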

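  1. Term frequency (TF): how many times each term of the search text occurs in the field text. The more occurrences, the more relevant the document.
    For example
    Search request: hello world
    It is split into hello and world, and each document is checked for how often these terms occur. The more occurrences, the higher the score.
doc1: hello you, and world is very good

doc2: hello, how are you
    doc1 contains both terms and doc2 only hello, so doc1 is more relevant.
  2. Inverse document frequency (IDF): how many times each term of the search text occurs across all documents in the index. The more occurrences, the less relevant the term.
    (Think of searching for very common words such as '的' or '是': they appear almost everywhere in the index, so matching them says little. IDF accounts for this.)
    For example
    Search request: hello world
doc1: hello, july is good

doc2: hi world, how are you
    If hello occurs in many more documents of the index than world, a match on world carries more weight, and doc2 is more relevant.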

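Besides TF and IDF, one more factor matters:
3. Field-length norm: the length of the field. The longer the field, the weaker the relevance.

For example
Search request: hello world

doc1: { "title": "hello july", "content": "...... 1000 words" }
doc2: { "title": "my baby", "content": "...... 1000 words, hi world" }

hello and world occur equally often across the index, but doc1 is more relevant because its match is in the title field, which is much shorter.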
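The three factors can be put together in a small sketch. It assumes the formulas of Lucene's classic TF/IDF similarity (tf = sqrt(frequency), idf = 1 + ln(numDocs / (docFreq + 1)), norm = 1 / sqrt(number of terms)) and deliberately leaves out query normalization, coordination, and boosts, so the numbers will not match what _explain prints on a real cluster, least of all one that uses BM25. The function names are hypothetical.

```python
import math
from typing import List


def tf(term: str, field_terms: List[str]) -> float:
    """Term frequency: square root of how often the term occurs in the field."""
    return math.sqrt(field_terms.count(term))


def idf(term: str, all_fields: List[List[str]]) -> float:
    """Inverse document frequency: terms found in fewer documents weigh more."""
    doc_freq = sum(1 for terms in all_fields if term in terms)
    return 1.0 + math.log(len(all_fields) / (doc_freq + 1))


def field_norm(field_terms: List[str]) -> float:
    """Field-length norm: the shorter the field, the larger the weight."""
    return 1.0 / math.sqrt(len(field_terms)) if field_terms else 0.0


def score(query_terms: List[str], field_terms: List[str],
          all_fields: List[List[str]]) -> float:
    """Sum tf * idf * norm over the query terms (simplified, see lead-in)."""
    norm = field_norm(field_terms)
    return sum(tf(t, field_terms) * idf(t, all_fields) * norm
               for t in query_terms)


doc1 = "hello you and world is very good".split()
doc2 = "hello how are you".split()
corpus = [doc1, doc2]
query = ["hello", "world"]

print(score(query, doc1, corpus), score(query, doc2, corpus))
```

With these two documents, doc1 scores higher because it matches both terms, even though it is the longer field.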

2. Analyzing how the _score of a single document is computed

A simple example using the _explain API:

GET /test_index08/_explain/3
{
  "query": {
    "match": {
      "f": "hello"
    }
  }
}

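Result
The output contains the idf, tf, and related factors described above; a quick look is enough here. The mathematics behind ES scoring is fairly involved and is not covered in detail.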


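II. How an analyzer works

1. character filter, tokenizer, token filter

  • Tokenization and normalization

Using the specified analyzer, the data to be stored in ES is split up: a sentence is broken into individual words, and each word is normalized (tense conversion, singular/plural conversion, and so on).

The workflow has roughly three steps.
Step 1: character filter: pre-processes the text before it is tokenized. The most common tasks are stripping content such as HTML tags and converting special symbols, for example & --> and.

Step 2: tokenizer: splits the text into tokens, hello you and me --> hello, you, and, me

Step 3: token filter: lowercase, stop word, synonym (for example case conversion, stop-word removal, and synonym handling).

Only the fully processed result is used to build the inverted index.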
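The three stages above can be mimicked in a few lines of Python. This is a rough sketch and not how ES implements analysis: the function names are hypothetical, the tokenizer is a simple regular expression, and the stop-word list is an assumed two-word example.

```python
import re
from typing import FrozenSet, List


def char_filter(text: str) -> str:
    """Stage 1: strip HTML tags and map '&' to 'and' before tokenizing."""
    text = re.sub(r"<[^>]+>", " ", text)
    return text.replace("&", " and ")


def tokenizer(text: str) -> List[str]:
    """Stage 2: split into runs of letters, digits, and underscores."""
    return re.findall(r"\w+", text)


def token_filter(tokens: List[str],
                 stopwords: FrozenSet[str] = frozenset({"a", "the"})) -> List[str]:
    """Stage 3: lowercase every token, then drop stop words."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in stopwords]


def analyze(text: str) -> List[str]:
    """Run the stages in the same order as an analyzer would."""
    return token_filter(tokenizer(char_filter(text)))


print(analyze("<p>Tom & Jerry in THE house</p>"))
```

The order matters: the character filter sees the raw string, the tokenizer sees the cleaned string, and the token filters only ever see individual tokens.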

2. A quick tour of the built-in analyzers

Test text: Set the shape to semi-transparent by calling set_trans(5)

  • standard analyzer
    Result: set, the, shape, to, semi, transparent, by, calling, set_trans, 5 (standard is the default analyzer)
  • simple analyzer
    Result: set, the, shape, to, semi, transparent, by, calling, set, trans
  • whitespace analyzer
    Result: Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
  • stop analyzer
    Result: removes stop words such as a, the, it

Example

POST _analyze
{
  "analyzer": "standard",
  "text": "Set the shape to semi-transparent by calling set_trans(5)"
}

Full response

{
  "tokens" : [
    {
      "token" : "set",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "the",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "shape",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "to",
      "start_offset" : 14,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "semi",
      "start_offset" : 17,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "transparent",
      "start_offset" : 22,
      "end_offset" : 33,
      "type" : "<ALPHANUM>",
      "position" : 5
    },
    {
      "token" : "by",
      "start_offset" : 34,
      "end_offset" : 36,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "calling",
      "start_offset" : 37,
      "end_offset" : 44,
      "type" : "<ALPHANUM>",
      "position" : 7
    },
    {
      "token" : "set_trans",
      "start_offset" : 45,
      "end_offset" : 54,
      "type" : "<ALPHANUM>",
      "position" : 8
    },
    {
      "token" : "5",
      "start_offset" : 55,
      "end_offset" : 56,
      "type" : "<NUM>",
      "position" : 9
    }
  ]
}
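To make the difference between the whitespace and simple analyzers listed above concrete, here is a small Python approximation. It is a sketch only: the function names are hypothetical, and the simple variant matches ASCII letters, whereas the real analyzer works on Unicode letters.

```python
import re
from typing import List

TEXT = "Set the shape to semi-transparent by calling set_trans(5)"


def whitespace_analyze(text: str) -> List[str]:
    """Split on whitespace only; case and punctuation are left untouched."""
    return text.split()


def simple_analyze(text: str) -> List[str]:
    """Split on every non-letter character, then lowercase each token."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


print(whitespace_analyze(TEXT))
print(simple_analyze(TEXT))
```

Note how set_trans(5) survives as one token in the first case and becomes set and trans in the second, with the digit dropped.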

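3. Customizing analyzers

3.1 The default analyzer: standard

standard tokenizer: splits on word boundaries

standard token filter: does nothing

lowercase token filter: converts all letters to lowercase

stop token filter (disabled by default): removes stop words such as a, the, it

3.2 Changing analyzer settings

Enable stop words for English text.
For example
Create an index named my_index, where es_std is the name of the custom analyzer and stopwords enables the English stop-word list.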

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "es_std": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  }
}

Tokenizing with the default analyzer

GET /my_index/_analyze
{
  "analyzer": "standard",
  "text": "a dog is in the house"
}

Result

{
  "tokens" : [
    {
      "token" : "a",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "is",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "in",
      "start_offset" : 9,
      "end_offset" : 11,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "the",
      "start_offset" : 12,
      "end_offset" : 15,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

Testing the output of the custom analyzer

GET /my_index/_analyze
{
  "analyzer": "es_std",
  "text": "a dog is in the house"
}

Result

{
  "tokens" : [
    {
      "token" : "dog",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "house",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 5
    }
  ]
}

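3.3 Building your own analyzer

Create an index my_index2 in which & in the content is converted to and. The name &Toand is user-defined and its type is mapping (a mapping relation); multiple mapping rules are separated by commas. A stop-word filter removes the and a from the text: the name my_stopwords is user-defined and its type is stop (stop words). my_analyzer is the name of the custom analyzer and its type is custom. html_strip is built into ES and strips HTML tags automatically, lowercase converts upper case to lower case, and "tokenizer": "standard" means the analyzer is built on top of the standard tokenizer.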

PUT /my_index2
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&Toand": {
          "type": "mapping",
          "mappings": [
            "&=> and",
            "!=> not"
          ]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": [
            "the",
            "a"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&Toand"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stopwords"
          ]
        }
      }
    }
  }
}

Test it

GET /my_index2/_analyze
{
  "text": "tom&jerry are a friend in the house, <a>, HAHA!!",
  "analyzer": "my_analyzer"
}

Result

{
  "tokens" : [
    {
      "token" : "tomandjerry",
      "start_offset" : 0,
      "end_offset" : 9,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "are",
      "start_offset" : 10,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "friend",
      "start_offset" : 16,
      "end_offset" : 22,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "in",
      "start_offset" : 23,
      "end_offset" : 25,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "house",
      "start_offset" : 30,
      "end_offset" : 35,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
      "token" : "hahanotnot",
      "start_offset" : 42,
      "end_offset" : 48,
      "type" : "<ALPHANUM>",
      "position" : 7
    }
  ]
}

3.4 The IK analyzer in detail

The IK configuration files are located in the config directory.
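What the main files do:

  1. IKAnalyzer.cfg.xml: configures custom dictionaries
  2. main.dic: IK's built-in Chinese dictionary with more than 270,000 entries; text is segmented according to the words in it, and any of these words is kept together as one token
  3. quantifier.dic: words for units of measure
  4. suffix.dic: suffixes
  5. surname.dic: Chinese surnames
  6. stopword.dic: English stop words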
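How do you add a custom dictionary to the IK analyzer?
Method 1:
Add the custom dictionary file and register its path in the configuration file.
For example, create a folder named custom under the config directory and put a custom.dic file in it.
Then edit IKAnalyzer.cfg.xml (this must be done on every node):

<entry key="ext_dict">custom/custom.dic</entry>

With this method ES must be restarted before the change takes effect.

Method 2 (IK hot update):
Host the custom.dic file at a fixed address, for example 192.168.5.5:8888/custom.dic, and point every ES node's configuration at that address. To update the dictionary, edit the file in place; ES does not need to be restarted.

Method 3 (modifying the source):
Download the source code and modify it so that the dictionary is read from MySQL.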


III. Highlighting

1. Overview

Highlighting marks the query terms in the returned content, similar to Baidu search results.
Highlighting demo
First create an index and add one document.
Specify the analyzer for individual fields:

PUT /test_highlight
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

Or set a default analyzer for the index:

PUT /test_highlight
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}

Insert a document

PUT /test_highlight/_doc/1
{
  "title": "这是july写的第一篇文章",
  "content": "大家好,这是我写的第一篇文章,特别喜欢这个文章"
}

Query with highlighting

GET /test_highlight/_search
{
  "query": {
    "match": {
      "title": "文章"
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

Result

{
  "took" : 416,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ]
        }
      }
    ]
  }
}

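Matched terms are wrapped in <em></em> tags, which a front-end page can style, for example in red. In other words, if a field listed under highlight contains the search term, the term is marked within that field's text.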
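The tagging itself is easy to picture with a small Python sketch. It is only an illustration: the function name highlight is hypothetical, and it matches plain substrings, whereas ES highlights the tokens produced by the field's analyzer. The pre_tag and post_tag arguments play the role of the pre_tags and post_tags options of the highlight request.

```python
import re
from typing import Iterable


def highlight(text: str, terms: Iterable[str],
              pre_tag: str = "<em>", post_tag: str = "</em>") -> str:
    """Wrap every occurrence of any term in pre_tag/post_tag."""
    # Longest terms first, so a longer term wins over one of its prefixes.
    ordered = sorted({t for t in terms if t}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return pattern.sub(lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)


print(highlight("这是july写的第一篇文章", ["文章"]))
print(highlight("这是july写的第一篇文章", ["文章"],
                pre_tag='<span style="color:red">', post_tag="</span>"))
```

The second call shows how a page could get red text directly from the tags instead of relying on a style sheet for em.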

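Note: only the title field, which appears in the query, is highlighted here. To highlight content as well, content must also appear in the query; by default, merely adding it under highlight has no effect (this is controlled by require_field_match, which defaults to true). See the following example.

GET /test_highlight/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "文章"
          }
        },
        {
          "match": {
            "content": "文章"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "content": {}
    }
  }
}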

Result

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.68324494,
    "hits" : [
      {
        "_index" : "test_highlight",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.68324494,
        "_source" : {
          "title" : "这是july写的第一篇文章",
          "content" : "大家好,这是我写的第一篇文章,特别喜欢这个文章"
        },
        "highlight" : {
          "title" : [
            "这是july写的第一篇<em>文章</em>"
          ],
          "content" : [
            "大家好,这是我写的第一篇<em>文章</em>,特别喜欢这个<em>文章</em>"
          ]
        }
      }
    ]
  }
}

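2. Common highlighters

  • plain highlighter: the Lucene highlighter, and the default in older versions

  • postings highlighter: requires index_options=offsets

The postings highlighter is faster than plain because it does not need to re-analyze the text being highlighted, and it consumes less disk space.
Since ES 6.0 the postings highlighter has been removed in favor of the unified highlighter, which is now the default and uses the postings offsets when they are available.

How to highlight using postings offsets
Specify the mapping as follows when creating the index.
For example, to highlight the content field, set "index_options": "offsets".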

PUT /test_highlight
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "index_options": "offsets"
      }
    }
  }
}

The query is written the same way as for default highlighting.

GET /test_highlight/_search
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {}
    }
  }
}

3. Fast vector highlighter

When index-time term vectors are enabled in the mapping, the fast vector highlighter can be used.
For large fields (larger than 1 MB) it performs better.
How to use it
For example, to highlight the content field, set "term_vector": "with_positions_offsets".

PUT /test_highlight
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "term_vector": "with_positions_offsets"
      }
    }
  }
}

The query is again written the same way.
How to force a specific highlighter type

GET /test_highlight/_search
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "type": "plain"
      }
    }
  }
}

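4. Configuring highlight fragments

Scenario: you want to highlight 'java', and the field holds more than 10,000 characters. You probably do not want the whole field back, only a small part of it, and you do not need every match at once, only the first few highlighted ones.

GET /test_highlight/_search
{
  "query": {
    "match": {
      "content": "文章"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "fragment_size": 5,
        "number_of_fragments": 3
      }
    }
  }
}

fragment_size: the length of each returned fragment in characters; the default is 100.
number_of_fragments: the highlighted text may yield several fragments; this sets how many of them are returned.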

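IV. Aggregations in depth

1. bucket and metric

In Elasticsearch, bucket and metric are two important types of aggregation. They are used to group, filter, and compute over search results.
Bucket: an aggregation that divides documents into segments, or buckets. It can be thought of as a classification step: a bucket aggregation groups the search results by some rule into several distinct buckets.

Common bucket types:

  • Terms Bucket: groups by the values of a field, similar to GROUP BY in SQL.
  • Date Histogram Bucket: groups documents by time interval, such as per day, week, or month.
  • Range Bucket: groups by numeric range, for example by price band.

Metric: an aggregation that computes over the documents in a bucket. Metrics are usually applied to already grouped data to produce summary figures.
Common metric types:

  • Sum Metric: sums the values of a field.
  • Avg Metric: averages the values of a field.
  • Max Metric: takes the maximum value of a field.
  • Min Metric: takes the minimum value of a field.
  • Cardinality Metric: counts the distinct values of a field (the count is approximate).

For example, given an index of product sales records with a "category" field for the product type, a Terms Bucket can group the records by product type, and a metric such as Sum can then compute the total sales of each type.
The following Elasticsearch query does this: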

{
  "aggs": {
    "sales_by_category": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

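The query above first uses a Terms Bucket to group all products by type and then a Sum Metric to add up the prices within each group, which yields the total sales per product type. sales_by_category is a user-chosen name for the grouping.

2. Aggregation examples

Create the index and insert data.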

PUT /cars
{
  "mappings": {
    "properties": {
      "price": {
        "type": "long"
      },
      "color": {
        "type": "keyword"
      },
      "brand": {
        "type": "keyword"
      },
      "model": {
        "type": "keyword"
      },
      "sold_date": {
        "type": "date"
      },
      "remark": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

Add data (in a bulk request every action and every document must be on a single line)

POST /cars/_bulk
{"index":{}}
{"price":258000,"color":"金色","brand":"大众","model":"大众迈腾","sold_date":"2021-10-28","remark":"大众中档车"}
{"index":{}}
{"price":123000,"color":"金色","brand":"大众","model":"大众速腾","sold_date":"2021-11-05","remark":"大众神车"}
{"index":{}}
{"price":239800,"color":"白色","brand":"标志","model":"标志508","sold_date":"2021-05-18","remark":"标志品牌全球上市车型"}
{"index":{}}
{"price":148800,"color":"白色","brand":"标志","model":"标志408","sold_date":"2021-07-02","remark":"比较大的紧凑型车"}
{"index":{}}
{"price":1998000,"color":"黑色","brand":"大众","model":"大众辉腾","sold_date":"2021-08-19","remark":"大众最让人肝疼的车"}
{"index":{}}
{"price":218000,"color":"红色","brand":"奥迪","model":"奥迪A4","sold_date":"2021-11-05","remark":"小资车型"}
{"index":{}}
{"price":489000,"color":"黑色","brand":"奥迪","model":"奥迪A6","sold_date":"2022-01-01","remark":"政府专用?"}
{"index":{}}
{"price":1899000,"color":"黑色","brand":"奥迪","model":"奥迪A 8","sold_date":"2022-02-12","remark":"很贵的大A6"}

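① Count sales by color
This only groups the documents; no further statistics are computed. The most basic aggregation in ES is terms, which is comparable to GROUP BY with COUNT in SQL.
By default ES sorts the groups by doc_count in descending order. You can sort by the group key with the _key metadata field, or by the per-group count with the _count metadata field.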

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

Result: hits contains the matching documents, and aggregations contains the aggregation output.

{
    
    
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    
    
    "group_by_color" : {
    
    
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
    
    
          "key" : "黑色",
          "doc_count" : 3
        },
        {
    
    
          "key" : "白色",
          "doc_count" : 2
        },
        {
    
    
          "key" : "金色",
          "doc_count" : 2
        },
        {
    
    
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}
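What the terms aggregation does with the color field can be reproduced in plain Python. This is a sketch under assumptions: terms_agg is a hypothetical helper, the list holds the color values of the eight documents above, and ties on doc_count are broken by ascending key, which matches the bucket order shown in the result.

```python
from collections import Counter
from typing import Dict, List, Union

# The color of each of the eight cars, in insertion order.
colors = ["金色", "金色", "白色", "白色", "黑色", "红色", "黑色", "黑色"]

Bucket = Dict[str, Union[str, int]]


def terms_agg(values: List[str], order: str = "_count",
              direction: str = "desc") -> List[Bucket]:
    """Group equal values into buckets and sort them by _count or _key."""
    buckets: List[Bucket] = [
        {"key": key, "doc_count": count}
        for key, count in Counter(values).items()
    ]
    # First sort by key; Python's sort is stable, so this ordering is kept
    # among buckets that tie on doc_count in the second sort.
    buckets.sort(key=lambda b: b["key"])
    if order == "_count":
        buckets.sort(key=lambda b: b["doc_count"],
                     reverse=(direction == "desc"))
    elif direction == "desc":  # order == "_key", descending
        buckets.reverse()
    return buckets


for bucket in terms_agg(colors):
    print(bucket)
```

Calling terms_agg(colors, order="_key", direction="asc") corresponds to replacing "_count": "desc" with "_key": "asc" in the order clause.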

If you do not want the documents returned, set size to 0.

GET /cars/_search
{
  "size": 0,
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

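② Average price per color (drill-down analysis: aggs nested inside aggs)
This example first groups by color and then computes a statistic over the documents within each group; that per-group statistic is the metric. Sorting is possible here as well: because the statistic has been given the name avg_by_price, the groups can be ordered by that name.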

GET /cars/_search
{
  "size": 0,
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price": "asc"
        }
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 190500.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 194300.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price" : {
            "value" : 218000.0
          }
        },
        {
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price" : {
            "value" : 1462000.0
          }
        }
      ]
    }
  }
}
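The bucket-then-metric computation can be checked by hand with a short Python sketch. The helper name avg_price_by_color is hypothetical; the tuples are the color and price values of the eight cars indexed earlier.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

# (color, price) for each of the eight cars.
cars: List[Tuple[str, int]] = [
    ("金色", 258000), ("金色", 123000),
    ("白色", 239800), ("白色", 148800),
    ("黑色", 1998000), ("红色", 218000),
    ("黑色", 489000), ("黑色", 1899000),
]


def avg_price_by_color(rows: List[Tuple[str, int]]) -> List[Dict[str, Union[str, int, float]]]:
    """Bucket rows by color, average the price per bucket, sort ascending."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for color, price in rows:
        groups[color].append(price)  # the bucket step
    buckets = [
        {"key": color, "doc_count": len(prices),
         "avg_by_price": sum(prices) / len(prices)}  # the metric step
        for color, prices in groups.items()
    ]
    buckets.sort(key=lambda b: b["avg_by_price"])  # "avg_by_price": "asc"
    return buckets


for bucket in avg_price_by_color(cars):
    print(bucket)
```

The averages come out as 190500.0, 194300.0, 218000.0, and 1462000.0, in the same order as the buckets in the response.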

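Setting size to 0 returns only the aggregation results and no documents, which speeds up the query. If you do need the documents, set size to whatever the situation requires.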

③ Average price per color, and per brand within each color

Query

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color",
        "order": {
          "avg_by_price_color": "asc"
        }
      },
      "aggs": {
        "avg_by_price_color": {
          "avg": {
            "field": "price"
          }
        },
        "group_by_brand": {
          "terms": {
            "field": "brand",
            "order": {
              "avg_by_price_brand": "desc"
            }
          },
          "aggs": {
            "avg_by_price_brand": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

Result

{
    
    
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    
    
    "group_by_color" : {
    
    
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
    
    
          "key" : "金色",
          "doc_count" : 2,
          "avg_by_price_color" : {
    
    
            "value" : 190500.0
          },
          "group_by_brand" : {
    
    
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
    
    
                "key" : "大众",
                "doc_count" : 2,
                "avg_by_price_brand" : {
    
    
                  "value" : 190500.0
                }
              }
            ]
          }
        },
        {
    
    
          "key" : "白色",
          "doc_count" : 2,
          "avg_by_price_color" : {
    
    
            "value" : 194300.0
          },
          "group_by_brand" : {
    
    
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
    
    
                "key" : "标志",
                "doc_count" : 2,
                "avg_by_price_brand" : {
    
    
                  "value" : 194300.0
                }
              }
            ]
          }
        },
        {
    
    
          "key" : "红色",
          "doc_count" : 1,
          "avg_by_price_color" : {
    
    
            "value" : 218000.0
          },
          "group_by_brand" : {
    
    
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
    
    
                "key" : "奥迪",
                "doc_count" : 1,
                "avg_by_price_brand" : {
    
    
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
    
    
          "key" : "黑色",
          "doc_count" : 3,
          "avg_by_price_color" : {
    
    
            "value" : 1462000.0
          },
          "group_by_brand" : {
    
    
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
    
    
                "key" : "大众",
                "doc_count" : 1,
                "avg_by_price_brand" : {
    
    
                  "value" : 1998000.0
                }
              },
              {
    
    
                "key" : "奥迪",
                "doc_count" : 2,
                "avg_by_price_brand" : {
    
    
                  "value" : 1194000.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

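The data is first grouped by color and then, within each group, grouped again by brand. This is known as drill-down analysis (that is, a nested definition).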
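The same drill-down can be sketched in Python as a grouping inside a grouping. The helper name drill_down is hypothetical and bucket ordering is ignored; the tuples are the color, brand, and price values of the eight cars indexed earlier.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (color, brand, price) for each of the eight cars.
cars: List[Tuple[str, str, int]] = [
    ("金色", "大众", 258000), ("金色", "大众", 123000),
    ("白色", "标志", 239800), ("白色", "标志", 148800),
    ("黑色", "大众", 1998000), ("红色", "奥迪", 218000),
    ("黑色", "奥迪", 489000), ("黑色", "奥迪", 1899000),
]


def drill_down(rows: List[Tuple[str, str, int]]) -> Dict[str, dict]:
    """Group by color, then by brand inside each color, averaging the price."""
    tree: Dict[str, Dict[str, List[int]]] = defaultdict(lambda: defaultdict(list))
    for color, brand, price in rows:
        tree[color][brand].append(price)

    result: Dict[str, dict] = {}
    for color, brands in tree.items():
        all_prices = [p for prices in brands.values() for p in prices]
        result[color] = {
            # metric on the outer bucket
            "avg_by_price_color": sum(all_prices) / len(all_prices),
            # inner buckets, each with its own metric
            "group_by_brand": {
                brand: sum(prices) / len(prices)
                for brand, prices in brands.items()
            },
        }
    return result


for color, stats in drill_down(cars).items():
    print(color, stats)
```

Each level of nesting in the aggs definition corresponds to one more level of grouping here, and a metric at any level is computed only over the documents that reached that bucket.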
aggs can also be defined side by side, in the following form.

GET /index_name/_search
{
  "aggs": {
    "aggregation_name_1": {},
    "aggregation_name_2": {}
  }
}

Example:

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      }
    },
    "avg_by_price_color": {
      "avg": {
        "field": "price"
      }
    }
  }
}

Result

{
    
    
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price_color" : {
      "value" : 671700.0
    },
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3
        },
        {
          "key" : "白色",
          "doc_count" : 2
        },
        {
          "key" : "金色",
          "doc_count" : 2
        },
        {
          "key" : "红色",
          "doc_count" : 1
        }
      ]
    }
  }
}
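To make the two sibling aggregations concrete, here is a minimal Python sketch that reproduces them in memory. It assumes the eight sample documents shown in the hits above (only color and price are copied); the function names are illustrative and not part of any Elasticsearch API.

```python
from collections import Counter

# Assumed sample data: (color, price) pairs copied from the eight hits above.
CARS = [
    ("金色", 258000), ("金色", 123000), ("白色", 239800), ("白色", 148800),
    ("黑色", 1998000), ("红色", 218000), ("黑色", 489000), ("黑色", 1899000),
]


def group_by_color(cars):
    """Mimic the `terms` aggregation: document count per color, largest first."""
    return Counter(color for color, _ in cars).most_common()


def avg_price(cars):
    """Mimic the sibling `avg` aggregation: mean price over ALL documents."""
    return sum(price for _, price in cars) / len(cars)


if __name__ == "__main__":
    print(group_by_color(CARS))
    print(avg_price(CARS))
```

Because the two aggregations sit at the same level, neither one sees the other's buckets: the average is taken over all eight documents. The order of colors with equal counts may differ from the Elasticsearch response.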

④ Compute the maximum, minimum, and total price for each color
Query

GET /cars/_search
{
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "color"
      },
      "aggs": {
        "max_price": {
          "max": {
            "field": "price"
          }
        },
        "min_price": {
          "min": {
            "field": "price"
          }
        },
        "sum_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
    
    
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_color" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "黑色",
          "doc_count" : 3,
          "max_price" : {
            "value" : 1998000.0
          },
          "min_price" : {
            "value" : 489000.0
          },
          "sum_price" : {
            "value" : 4386000.0
          }
        },
        {
          "key" : "白色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 239800.0
          },
          "min_price" : {
            "value" : 148800.0
          },
          "sum_price" : {
            "value" : 388600.0
          }
        },
        {
          "key" : "金色",
          "doc_count" : 2,
          "max_price" : {
            "value" : 258000.0
          },
          "min_price" : {
            "value" : 123000.0
          },
          "sum_price" : {
            "value" : 381000.0
          }
        },
        {
          "key" : "红色",
          "doc_count" : 1,
          "max_price" : {
            "value" : 218000.0
          },
          "min_price" : {
            "value" : 218000.0
          },
          "sum_price" : {
            "value" : 218000.0
          }
        }
      ]
    }
  }
}
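The per-bucket metrics above can be checked by hand. The following Python sketch groups the sample prices by color and computes the same three metrics; it assumes the eight sample documents from the hits, and the helper name price_stats_by_color is hypothetical.

```python
from collections import defaultdict

# Assumed sample data: (color, price) pairs copied from the eight hits above.
CARS = [
    ("金色", 258000), ("金色", 123000), ("白色", 239800), ("白色", 148800),
    ("黑色", 1998000), ("红色", 218000), ("黑色", 489000), ("黑色", 1899000),
]


def price_stats_by_color(cars):
    """Bucket prices by color, then compute max, min and sum inside each bucket."""
    prices = defaultdict(list)
    for color, price in cars:
        prices[color].append(price)
    return {
        color: {
            "max_price": max(values),
            "min_price": min(values),
            "sum_price": sum(values),
        }
        for color, values in prices.items()
    }


if __name__ == "__main__":
    for color, stats in price_stats_by_color(CARS).items():
        print(color, stats)
```

Each metric is evaluated only over the documents of its own bucket, which is what nesting max, min and sum under the terms aggregation expresses.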

⑤ Find the highest-priced model of each brand
Query

GET /cars/_search
{
  "size": 0,
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "top_car": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "price": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [
                "model",
                "price"
              ]
            }
          }
        }
      }
    }
  }
}

Result

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "UYR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1998000,
                    "model" : "大众辉腾"
                  },
                  "sort" : [
                    1998000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "VIR_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 1899000,
                    "model" : "奥迪A 8"
                  },
                  "sort" : [
                    1899000
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "top_car" : {
            "hits" : {
              "total" : {
                "value" : 2,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "cars",
                  "_type" : "_doc",
                  "_id" : "T4R_-4cBUF6rBrkiDpRJ",
                  "_score" : null,
                  "_source" : {
                    "price" : 239800,
                    "model" : "标志508"
                  },
                  "sort" : [
                    239800
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
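What top_hits with size 1 and a descending price sort selects can be sketched in plain Python: within each brand bucket, keep only the document with the highest price. The sketch assumes the eight sample documents (brand, model, price); top_car_by_brand is an illustrative name, not an Elasticsearch call.

```python
# Assumed sample data: (brand, model, price) copied from the sample documents.
CARS = [
    ("大众", "大众迈腾", 258000), ("大众", "大众速腾", 123000),
    ("标志", "标志508", 239800), ("标志", "标志408", 148800),
    ("大众", "大众辉腾", 1998000), ("奥迪", "奥迪A4", 218000),
    ("奥迪", "奥迪A6", 489000), ("奥迪", "奥迪A 8", 1899000),
]


def top_car_by_brand(cars):
    """For each brand bucket, keep the single document with the highest price."""
    best = {}
    for brand, model, price in cars:
        if brand not in best or price > best[brand]["price"]:
            # Only `model` and `price` are kept, like the `_source.includes` filter.
            best[brand] = {"model": model, "price": price}
    return best


if __name__ == "__main__":
    for brand, car in top_car_by_brand(CARS).items():
        print(brand, car)
```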

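2.1 Aggregations: histogram range statistics

histogram is similar to terms: it is also a bucket aggregation, but it groups documents into fixed-width numeric ranges of a field.
For example, to count the cars sold and compute the average price per 1,000,000 price range, point field at price and set the range width to 1,000,000 (interval : 1000000). ES then splits the price axis into [0, 1000000), [1000000, 2000000), [2000000, 3000000), and so on. Like terms, histogram reports the number of documents (doc_count) in each range, and nested aggs can run further aggregations on the documents inside each bucket.

Query

GET /cars/_search
{
  "aggs": {
    "histogram_by_price": {
      "histogram": {
        "field": "price",
        "interval": 1000000
      },
      "aggs": {
        "avg_by_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}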

Result

{
    
    
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_price" : {
      "buckets" : [
        {
          "key" : 0.0,
          "doc_count" : 6,
          "avg_by_price" : {
            "value" : 246100.0
          }
        },
        {
          "key" : 1000000.0,
          "doc_count" : 2,
          "avg_by_price" : {
            "value" : 1948500.0
          }
        }
      ]
    }
  }
}
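The bucket assignment can be sketched in a few lines of Python. The sketch assumes no offset, so a value lands in the bucket whose key is floor(value / interval) * interval; the prices are the eight sample values, and histogram here is a hypothetical helper, not a client-library function.

```python
import math
from collections import defaultdict

# Assumed sample data: the eight prices from the sample documents.
PRICES = [258000, 123000, 239800, 148800, 1998000, 218000, 489000, 1899000]


def histogram(prices, interval):
    """Assign each price to the bucket [key, key + interval) and average per bucket."""
    buckets = defaultdict(list)
    for price in prices:
        key = math.floor(price / interval) * interval  # lower bound of the bucket
        buckets[key].append(price)
    return {
        key: {"doc_count": len(values), "avg_by_price": sum(values) / len(values)}
        for key, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    for key, bucket in histogram(PRICES, 1_000_000).items():
        print(key, bucket)
```

The six cars priced below 1,000,000 fall into the bucket with key 0, and the two cars priced between 1,000,000 and 2,000,000 fall into the bucket with key 1000000, matching the doc_count values in the result.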

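2.2 date_histogram range grouping

date_histogram performs range bucketing on a date field, for example sales per month or sales per year.
For example, to count the cars sold and total the sales amount per month, use date_histogram with the following parameters: field names the field to bucket on; interval sets the bucket width (one of year, quarter, month, week, day, hour, minute, second); format sets the date format of the bucket keys; min_doc_count sets the minimum number of documents a bucket needs in order to be returned (the default is 0, so buckets without documents are returned as well); extended_bounds sets the start and end of the bucket range (by default the range runs from the bucket containing the smallest date to the bucket containing the largest date). Note that extended_bounds only has a visible effect when min_doc_count is 0. The example sets min_doc_count to 1, so months without sales are dropped from the response.

Example: total the sales amount per month for 2021 through 2022.
Syntax in versions before ES 7.x

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}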

Result

#! Deprecation: [interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future.
{
    
    
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}

Syntax in ES 7.x and later
Replace the interval keyword with calendar_interval.
Query

GET /cars/_search
{
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": "2021-01-01",
          "max": "2022-12-31"
        }
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
    
    
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-01",
          "key" : 1619827200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 239800.0
          }
        },
        {
          "key_as_string" : "2021-07-01",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 148800.0
          }
        },
        {
          "key_as_string" : "2021-08-01",
          "key" : 1627776000000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01",
          "key" : 1633046400000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 258000.0
          }
        },
        {
          "key_as_string" : "2021-11-01",
          "key" : 1635724800000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 341000.0
          }
        },
        {
          "key_as_string" : "2022-01-01",
          "key" : 1640995200000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 489000.0
          }
        },
        {
          "key_as_string" : "2022-02-01",
          "key" : 1643673600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1899000.0
          }
        }
      ]
    }
  }
}
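The monthly bucketing in both variants can be mimicked with a short Python sketch. It assumes the eight sample documents (sold_date, price), treats dates as UTC, and only emits months that contain at least one document, which corresponds to min_doc_count set to 1. The helper names are illustrative.

```python
from collections import defaultdict
from datetime import date, datetime, timezone

# Assumed sample data: (sold_date, price) copied from the sample documents.
SALES = [
    ("2021-10-28", 258000), ("2021-11-05", 123000), ("2021-05-18", 239800),
    ("2021-07-02", 148800), ("2021-08-19", 1998000), ("2021-11-05", 218000),
    ("2022-01-01", 489000), ("2022-02-12", 1899000),
]


def month_histogram(sales):
    """Round each date down to the first day of its month and sum prices per month."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum_by_price": 0})
    for sold_date, price in sales:
        key = date.fromisoformat(sold_date).replace(day=1).isoformat()
        buckets[key]["doc_count"] += 1
        buckets[key]["sum_by_price"] += price
    return dict(sorted(buckets.items()))


def epoch_millis(key_as_string):
    """Convert a bucket's key_as_string into its numeric key (UTC epoch milliseconds)."""
    d = date.fromisoformat(key_as_string)
    return int(datetime(d.year, d.month, d.day, tzinfo=timezone.utc).timestamp() * 1000)


if __name__ == "__main__":
    for key, bucket in month_histogram(SALES).items():
        print(key, epoch_millis(key), bucket)
```

The key field in the response is the same instant as key_as_string, expressed as milliseconds since the Unix epoch.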

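2.3 global bucket

When aggregating, you sometimes need to compare a subset of the data with the whole data set.
For example:
compare the average price of one brand with the average price of all cars. global defines a global bucket: this bucket ignores the query conditions and runs its sub-aggregations over every document in the index.
Query

GET /cars/_search
{
  "size": 0,
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "volkswagen_of_avg_price": {
      "avg": {
        "field": "price"
      }
    },
    "all_avg_price": {
      "global": {},
      "aggs": {
        "all_of_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}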

Result

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_avg_price" : {
      "doc_count" : 8,
      "all_of_price" : {
        "value" : 671700.0
      }
    },
    "volkswagen_of_avg_price" : {
      "value" : 793000.0
    }
  }
}
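The difference between the query-scoped average and the global one can be sketched in Python as follows. The sketch assumes the eight sample documents (brand, price) and uses an exact string comparison in place of the match query; brand_and_global_avg is a hypothetical helper name.

```python
# Assumed sample data: (brand, price) copied from the sample documents.
CARS = [
    ("大众", 258000), ("大众", 123000), ("标志", 239800), ("标志", 148800),
    ("大众", 1998000), ("奥迪", 218000), ("奥迪", 489000), ("奥迪", 1899000),
]


def avg(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def brand_and_global_avg(cars, brand):
    """Return (average over documents matching the query, average over all documents)."""
    matched = [price for b, price in cars if b == brand]   # scope of the query
    everything = [price for _, price in cars]              # scope of the global bucket
    return avg(matched), avg(everything)


if __name__ == "__main__":
    print(brand_and_global_avg(CARS, "大众"))
```

This mirrors the response: hits.total is 3 because the query matches three documents, while the global bucket reports doc_count 8.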

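2.4 aggs + order (aggregation with sorting)

The buckets of an aggregation can be sorted by an aggregated value.
For example:
count the cars sold and total the sales amount for each brand, and sort the brands by sales amount in descending order.
Query

GET /cars/_search
{
  "aggs": {
    "group_of_brand": {
      "terms": {
        "field": "brand",
        "order": {
          "sum_of_price": "desc"
        }
      },
      "aggs": {
        "sum_of_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}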

Result

{
    
    
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_of_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2606000.0
          }
        },
        {
          "key" : "大众",
          "doc_count" : 3,
          "sum_of_price" : {
            "value" : 2379000.0
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "sum_of_price" : {
            "value" : 388600.0
          }
        }
      ]
    }
  }
}

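With multiple levels of aggs (a drill-down aggregation), a bucket level can also be sorted by the values of the aggregation nested directly inside it. In other words, the `order` clause of the outer `terms` aggregation refers to the inner aggregation by its name.
For example:
Compute the total sales amount of each color within each brand, and sort by total sales amount in descending order. This works like sorting within groups in SQL: the ordering applies only to the buckets inside each group, never across groups.

Query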

GET /cars/_search
{
  "aggs": {
    "group_by_brand": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "group_by_color": {
          "terms": {
            "field": "color",
            "order": {
              "sum_of_price": "desc"
            }
          },
          "aggs": {
            "sum_of_price": {
              "sum": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

Result

{
    
    
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "VIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 1899000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A 8",
          "sold_date" : "2022-02-12",
          "remark" : "很贵的大A6。。。"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_brand" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "大众",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 1998000.0
                }
              },
              {
                "key" : "金色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 381000.0
                }
              }
            ]
          }
        },
        {
          "key" : "奥迪",
          "doc_count" : 3,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "黑色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 2388000.0
                }
              },
              {
                "key" : "红色",
                "doc_count" : 1,
                "sum_of_price" : {
                  "value" : 218000.0
                }
              }
            ]
          }
        },
        {
          "key" : "标志",
          "doc_count" : 2,
          "group_by_color" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "白色",
                "doc_count" : 2,
                "sum_of_price" : {
                  "value" : 388600.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}
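
To make the "sort within the group only" behaviour concrete, the nested buckets in the result above can be reproduced outside Elasticsearch. The following is only a rough Python sketch, not how Elasticsearch computes it internally; the `cars` list is copied by hand from the hits, and the function name `nested_sum_by_color` is made up for illustration.

```python
from collections import defaultdict

# (brand, color, price) triples copied from the hits above.
cars = [
    ("大众", "金色", 258000),
    ("大众", "金色", 123000),
    ("标志", "白色", 239800),
    ("标志", "白色", 148800),
    ("大众", "黑色", 1998000),
    ("奥迪", "红色", 218000),
    ("奥迪", "黑色", 489000),
    ("奥迪", "黑色", 1899000),
]


def nested_sum_by_color(rows):
    """Group by brand, then by color, summing price per (brand, color)."""
    totals = defaultdict(lambda: defaultdict(int))
    for brand, color, price in rows:
        totals[brand][color] += price
    # Sort the color buckets inside each brand by their sum, descending.
    # The brands themselves are not reordered by this sort.
    return {
        brand: sorted(colors.items(), key=lambda kv: kv[1], reverse=True)
        for brand, colors in totals.items()
    }


result = nested_sum_by_color(cars)
for brand, colors in result.items():
    print(brand, colors)
```

Each brand keeps its own independently sorted list of colors, which matches the per-brand `group_by_color` buckets in the response.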

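2.5 search + aggs (conditional query + aggregation)

An aggregation is similar to the GROUP BY clause in SQL, and search is similar to the WHERE clause. In ES the two can be combined in a single request to run more complex search statistics: the aggregation is computed only over the documents matched by the query.
For example:
Compute the number of cars sold and the total sales amount per quarter for one brand.
Query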

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "histogram_by_date": {
      "date_histogram": {
        "field": "sold_date",
        "calendar_interval": "quarter",
        "min_doc_count": 1
      },
      "aggs": {
        "sum_by_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result

{
    
    
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "histogram_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2021-07-01T00:00:00.000Z",
          "key" : 1625097600000,
          "doc_count" : 1,
          "sum_by_price" : {
            "value" : 1998000.0
          }
        },
        {
          "key_as_string" : "2021-10-01T00:00:00.000Z",
          "key" : 1633046400000,
          "doc_count" : 2,
          "sum_by_price" : {
            "value" : 381000.0
          }
        }
      ]
    }
  }
}
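
The quarterly buckets above can be checked by hand. The sketch below is an approximation in Python under a few assumptions: the `sales` list is copied from the hits shown in this article, the `match` on `brand` is treated as exact equality, and the helper names `quarter_start` and `quarterly_sales` are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# (brand, sold_date, price) copied from the documents in the cars index.
sales = [
    ("大众", date(2021, 10, 28), 258000),
    ("大众", date(2021, 11, 5), 123000),
    ("标志", date(2021, 5, 18), 239800),
    ("标志", date(2021, 7, 2), 148800),
    ("大众", date(2021, 8, 19), 1998000),
    ("奥迪", date(2021, 11, 5), 218000),
    ("奥迪", date(2022, 1, 1), 489000),
    ("奥迪", date(2022, 2, 12), 1899000),
]


def quarter_start(d):
    """Round a date down to the first day of its calendar quarter."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def quarterly_sales(rows, brand):
    """Return {quarter start: (doc_count, sum of price)} for one brand."""
    buckets = defaultdict(lambda: [0, 0])
    for row_brand, sold, price in rows:
        if row_brand != brand:  # plays the role of the query part
            continue
        bucket = buckets[quarter_start(sold)]
        bucket[0] += 1
        bucket[1] += price
    # Only non-empty quarters are ever created, which is the effect
    # that "min_doc_count": 1 has in the request.
    return {start: tuple(values) for start, values in sorted(buckets.items())}


quarters = quarterly_sales(sales, "大众")
for start, (count, total) in quarters.items():
    print(start.isoformat(), count, total)
```

The two keys, 2021-07-01 and 2021-10-01, correspond to the `key_as_string` values of the two buckets in the response.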

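2.6 filter + aggs (filter + aggregation)

A filter can also be combined with aggs to run an aggregation over filtered data. Wrapping the filter in `constant_score` skips relevance scoring, so every hit gets a `_score` of 1.0.
For example:
Compute the average price of cars priced between 100,000 and 500,000.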

GET /cars/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 100000,
            "lte": 500000
          }
        }
      }
    }
  },
  "aggs": {
    "avg_by_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

Result

{
    
    
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "T4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 239800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志508",
          "sold_date" : "2021-05-18",
          "remark" : "标志品牌全球上市车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UIR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 148800,
          "color" : "白色",
          "brand" : "标志",
          "model" : "标志408",
          "sold_date" : "2021-07-02",
          "remark" : "比较大的紧凑型车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UoR_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 218000,
          "color" : "红色",
          "brand" : "奥迪",
          "model" : "奥迪A4",
          "sold_date" : "2021-11-05",
          "remark" : "小资车型"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "U4R_-4cBUF6rBrkiDpRJ",
        "_score" : 1.0,
        "_source" : {
    
    
          "price" : 489000,
          "color" : "黑色",
          "brand" : "奥迪",
          "model" : "奥迪A6",
          "sold_date" : "2022-01-01",
          "remark" : "政府专用?"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_by_price" : {
      "value" : 246100.0
    }
  }
}
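
As a quick sanity check of the average above, the same arithmetic can be done in a few lines of Python. This is only an illustrative sketch: the `prices` list is copied from the eight documents in the cars index, and the variable names are made up.

```python
# Prices of all eight documents in the cars index, copied from the hits.
prices = [258000, 123000, 239800, 148800, 1998000, 218000, 489000, 1899000]

# Same bounds as the range filter: gte 100000, lte 500000 (both inclusive).
in_range = [p for p in prices if 100000 <= p <= 500000]

avg_by_price = sum(in_range) / len(in_range)
print(len(in_range), avg_by_price)
```

Six documents pass the filter, which agrees with `hits.total.value`, and their mean matches `avg_by_price` in the response.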

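2.7 Using filter inside an aggregation

A filter can also be used inside the aggs clause, and where the filter is placed determines what it filters.
For example: compute the total sales amount of one brand over the last year. Placing the filter inside aggs means it runs only on the documents already matched by the query, and it narrows only that aggregation; the hits are unaffected. Placing the filter outside aggs (in the query) filters the data seen by the hits and by every aggregation.

① `now-12M` means 12 months before now; appending `/M` rounds down to the start of the month.
② `now-1y` means 1 year before now; appending `/y` rounds down to the start of the year.
③ `d` means days.

Query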

GET /cars/_search
{
  "query": {
    "match": {
      "brand": "大众"
    }
  },
  "aggs": {
    "count_last_year": {
      "filter": {
        "range": {
          "sold_date": {
            "gte": "now-12M"
          }
        }
      },
      "aggs": {
        "sum_of_price_last_year": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}

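Result
The query still returns all three 大众 documents as hits, but `count_last_year` has a `doc_count` of 0: none of them (sold between 2021-08-19 and 2021-11-05) had a `sold_date` within the 12 months before the query ran.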

{
    
    
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    
    
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    
    
    "total" : {
    
    
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.9444616,
    "hits" : [
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "TYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 258000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众迈腾",
          "sold_date" : "2021-10-28",
          "remark" : "大众中档车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "ToR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 123000,
          "color" : "金色",
          "brand" : "大众",
          "model" : "大众速腾",
          "sold_date" : "2021-11-05",
          "remark" : "大众神车"
        }
      },
      {
    
    
        "_index" : "cars",
        "_type" : "_doc",
        "_id" : "UYR_-4cBUF6rBrkiDpRJ",
        "_score" : 0.9444616,
        "_source" : {
    
    
          "price" : 1998000,
          "color" : "黑色",
          "brand" : "大众",
          "model" : "大众辉腾",
          "sold_date" : "2021-08-19",
          "remark" : "大众最让人肝疼的车"
        }
      }
    ]
  },
  "aggregations" : {
    "count_last_year" : {
      "meta" : { },
      "doc_count" : 0,
      "sum_of_price_last_year" : {
        "value" : 0.0
      }
    }
  }
}
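
The difference between the query scope and the filter-aggregation scope can be sketched in Python. This is a simplified model, not Elasticsearch's implementation: it compares whole dates instead of timestamps, the two "now" values are hypothetical dates chosen for illustration, and `months_ago` and `filter_agg` are made-up helper names.

```python
import calendar
from datetime import date

# (brand, sold_date, price) copied from the documents in the cars index.
sales = [
    ("大众", date(2021, 10, 28), 258000),
    ("大众", date(2021, 11, 5), 123000),
    ("标志", date(2021, 5, 18), 239800),
    ("标志", date(2021, 7, 2), 148800),
    ("大众", date(2021, 8, 19), 1998000),
    ("奥迪", date(2021, 11, 5), 218000),
    ("奥迪", date(2022, 1, 1), 489000),
    ("奥迪", date(2022, 2, 12), 1899000),
]


def months_ago(today, months):
    """Rough stand-in for date math such as now-12M (no rounding)."""
    total = today.year * 12 + (today.month - 1) - months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(today.day, last_day))


def filter_agg(rows, brand, today):
    """Return (hit count, bucket doc_count, bucket sum) for one brand."""
    hits = [r for r in rows if r[0] == brand]        # query scope
    cutoff = months_ago(today, 12)
    bucket = [r for r in hits if r[1] >= cutoff]     # filter aggregation scope
    return len(hits), len(bucket), sum(r[2] for r in bucket)


# Hypothetical "now" values: one long after the sales, one shortly after.
late = filter_agg(sales, "大众", date(2023, 6, 25))
early = filter_agg(sales, "大众", date(2022, 6, 25))
print(late, early)
```

With the later date the hits stay at three while the bucket is empty, as in the response above; with the earlier date the same request would count all three documents in the bucket.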

Reposted from blog.csdn.net/xiaobai_july/article/details/131386527