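[Elasticsearch Tutorial 13] Mapping Field Types: nested

I. Introduction

The previous post, Mapping字段类型之object (mapping field types: object), showed that the object type is a poor fit for arrays of objects. object flattens the array: the values of each field are pooled across all objects, so the link between fields that belong to the same object is lost.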

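The nested type can hold an array of objects. Each object is indexed separately, as its own hidden document, so queries, aggregations, and sorting can be run against the individual objects of a nested field.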
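
To make the difference concrete, here is a minimal plain-Java sketch that imitates the two storage models in memory. It is not Elasticsearch code: the class NestedVsFlattenedSketch and its methods are hypothetical names chosen for illustration, and the sketch only mimics the matching behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mimics flattened (object) vs. per-object (nested) matching in memory.
class NestedVsFlattenedSketch {

    // One element of a student array (hypothetical helper type).
    static final class Student {
        final String sex;
        final int age;
        Student(String sex, int age) { this.sex = sex; this.age = age; }
    }

    // object-style matching: each field's values are pooled across all array elements,
    // so the two conditions may be satisfied by two different students.
    static boolean matchesFlattened(List<Student> students, int age, String sex) {
        List<Integer> ages = new ArrayList<>();
        List<String> sexes = new ArrayList<>();
        for (Student s : students) {
            ages.add(s.age);
            sexes.add(s.sex);
        }
        return ages.contains(age) && sexes.contains(sex);
    }

    // nested-style matching: both conditions must hold on the same student.
    static boolean matchesNested(List<Student> students, int age, String sex) {
        for (Student s : students) {
            if (s.age == age && s.sex.equals(sex)) {
                return true;
            }
        }
        return false;
    }

    // Sample data: two male students aged 20 and 30, one female student aged 18.
    static List<Student> sampleClass() {
        return List.of(new Student("男", 20), new Student("男", 30), new Student("女", 18));
    }

    public static void main(String[] args) {
        List<Student> students = sampleClass();
        // No student is both 20 years old and female.
        System.out.println("flattened: " + matchesFlattened(students, 20, "女"));
        System.out.println("nested:    " + matchesNested(students, 20, "女"));
    }
}
```

In this sketch the flattened check reports a match for age 20 and sex 女 even though no single student has both values, while the per-object check does not. That false positive is the reason to prefer nested for object arrays.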

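The nested type does come with limits, however. Elasticsearch applies the following defaults:

  • An index may define at most 50 distinct nested fields (index.mapping.nested_fields.limit)
  • A single document may contain at most 10000 nested objects across all of its nested fields (index.mapping.nested_objects.limit)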

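II. Inserting test data

1 Create the mapping for the nested field

Define a mapping for documents that store a class and its students. The request below adds the mapping to an existing index, so create pigg_test_nested first (PUT /pigg_test_nested) if it does not exist yet.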

PUT /pigg_test_nested/_mapping
{
    "properties": {
        "class": {
            "type": "keyword"
        },
        "student": {
            "type": "nested",
            "properties": {
                "name": {
                    "type": "keyword"
                },
                "sex": {
                    "type": "keyword"
                },
                "age": {
                    "type": "integer"
                }
            }
        }
    }
}

2 Insert data for two classes

PUT pigg_test_nested/_doc/1
{
    "class": "高三(1)班",
    "student": [
        {
            "name": "亚瑟王",
            "sex": "男",
            "age": 20
        },
        {
            "name": "程咬金",
            "sex": "男",
            "age": 30
        },
        {
            "name": "安其拉",
            "sex": "女",
            "age": 18
        }
    ]
}

PUT pigg_test_nested/_doc/2
{
    "class": "高三(2)班",
    "student": [
        {
            "name": "孙策",
            "sex": "男",
            "age": 20
        },
        {
            "name": "小乔",
            "sex": "女",
            "age": 16
        },
        {
            "name": "大乔",
            "sex": "女",
            "age": 18
        }
    ]
}

III. nested query

Find the classes that have a student with age = 16 and sex = '女'. This returns the document with id 2.

1 Query DSL

GET /pigg_test_nested/_search
{
  "query": {
    "nested": {                # the nested keyword marks this as a query on a nested field
      "path": "student",       # path names the nested field to query
      "query": {               # query holds the query body
        "bool": {
          "must": [
            { "term": { "student.age": 16 } },   # use the full path, not just age
            { "term": { "student.sex": "女" } }
          ]
        }
      }
    }
  }
}

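2 Java API

  • QueryBuilders.nestedQuery marks the query as a nested query; its first argument is the name of the nested field
  • In the query conditions, the first argument of termQuery must be the full path: student.age and student.sex

import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.NestedQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Both conditions must match on the same student object.
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.termQuery("student.age", 16));
boolQueryBuilder.must(QueryBuilders.termQuery("student.sex", "女"));

NestedQueryBuilder queryBuilder = QueryBuilders.nestedQuery(
        "student",
        boolQueryBuilder,
        ScoreMode.None);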

IV. nested sort

Define a mapping for the pigg_test_page index in which exams is a nested field. As before, the index must already exist.

PUT pigg_test_page/_mapping
{
    "properties": {
        "exams": {
            "type": "nested",
            "properties": {
                "course": {
                    "type": "keyword"
                },
                "score": {
                    "type": "long"
                }
            }
        },
        "name": {
            "type": "keyword"
        }
    }
}

Insert the exam scores of two students:

PUT pigg_test_page/_doc/1
{
  "name": "name1",
  "exams": [
    {
      "course": "语文",
      "score": 98
    },
    {
      "course": "数学",
      "score": 100
    }
  ]
}

PUT pigg_test_page/_doc/2
{
  "name": "name2",
  "exams": [
    {
      "course": "语文",
      "score": 88
    },
    {
      "course": "数学",
      "score": 76
    }
  ]
}

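To sort the students by their 语文 (Chinese) score from highest to lowest, filter the nested exams objects down to that course inside the sort clause: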

GET pigg_test_page/_search
{
  "sort": [
    {
      "exams.score": {
        "order": "desc",
        "nested": {
          "path": "exams",
          "filter": {
            "term": { "exams.course": "语文" }
          }
        }
      }
    }
  ]
}
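
The following plain-Java sketch mimics what the filtered nested sort above does for each document: keep only the exams that pass the filter, reduce them to a single sort value, then order the documents by that value. It is an in-memory illustration, not Elasticsearch code. The names NestedSortSketch, sortValue, and namesByCourseDesc are hypothetical, and two details are assumptions of the sketch: the sort value is taken as the highest matching score, and documents without a matching exam are placed last.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: mimics how a filtered nested sort picks one sort value per document.
class NestedSortSketch {

    // One nested exams object (hypothetical helper type).
    static final class Exam {
        final String course;
        final long score;
        Exam(String course, long score) { this.course = course; this.score = score; }
    }

    // One student document (hypothetical helper type).
    static final class StudentDoc {
        final String name;
        final List<Exam> exams;
        StudentDoc(String name, List<Exam> exams) { this.name = name; this.exams = exams; }
    }

    // Sort value of one document: the highest score among the exams that pass the filter.
    // Long.MIN_VALUE marks "no matching exam" so such documents end up last when sorting
    // in descending order (an assumption of this sketch).
    static long sortValue(StudentDoc doc, String course) {
        long best = Long.MIN_VALUE;
        for (Exam e : doc.exams) {
            if (e.course.equals(course)) {
                best = Math.max(best, e.score);
            }
        }
        return best;
    }

    // Returns the student names ordered by their score in the given course, highest first.
    static List<String> namesByCourseDesc(List<StudentDoc> docs, String course) {
        List<StudentDoc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingLong((StudentDoc d) -> sortValue(d, course)).reversed());
        List<String> names = new ArrayList<>();
        for (StudentDoc d : sorted) {
            names.add(d.name);
        }
        return names;
    }

    // The two sample documents from this section, deliberately listed in reverse order.
    static List<StudentDoc> sampleDocs() {
        return List.of(
                new StudentDoc("name2", List.of(new Exam("语文", 88), new Exam("数学", 76))),
                new StudentDoc("name1", List.of(new Exam("语文", 98), new Exam("数学", 100))));
    }

    public static void main(String[] args) {
        System.out.println(namesByCourseDesc(sampleDocs(), "语文"));
    }
}
```

With the two sample documents, name1 (语文 score 98) comes before name2 (语文 score 88). Without the filter, the 数学 scores would also take part in choosing each document's sort value.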

V. nested aggregation

Count the male and female students in each class.

1 Query DSL

GET pigg_test_nested/_search
{
  "aggs": {
    "group_by_class": {              # first group by class
      "terms": {
        "field": "class",
        "size": 10
      },
      "aggs": {
        "count_by_sex": {
          "nested": {                # nested marks this as an aggregation on a nested field
            "path": "student"        # path names the nested field, student
          },
          "aggs": {
            "group_by_sex": {        # then group by sex
              "terms": {
                "field": "student.sex",   # use the full path, not just sex
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}

The response contains the following buckets:

"buckets" : [
  {
    "key" : "高三(1)班",
    "doc_count" : 1,
    "count_by_sex" : {
      "doc_count" : 3,
      "group_by_sex" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "男",
            "doc_count" : 2
          },
          {
            "key" : "女",
            "doc_count" : 1
          }
        ]
      }
    }
  },
  {
    "key" : "高三(2)班",
    "doc_count" : 1,
    "count_by_sex" : {
      "doc_count" : 3,
      "group_by_sex" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "女",
            "doc_count" : 2
          },
          {
            "key" : "男",
            "doc_count" : 1
          }
        ]
      }
    }
  }
]
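
To read the response, it can help to reproduce the same counts by hand. The plain-Java sketch below mimics the three aggregation levels (group by class, step into the nested student objects, group by sex) on an in-memory copy of the sample data. It is illustrative only and does not call Elasticsearch. NestedAggSketch and its members are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics terms(class) -> nested(student) -> terms(student.sex) in memory.
class NestedAggSketch {

    // One class document, reduced to the fields the aggregation reads (hypothetical helper type).
    static final class ClassDoc {
        final String className;
        final List<String> studentSexes; // one entry per nested student object
        ClassDoc(String className, List<String> studentSexes) {
            this.className = className;
            this.studentSexes = studentSexes;
        }
    }

    // Returns class -> (sex -> number of nested student objects with that sex).
    static Map<String, Map<String, Integer>> countBySexPerClass(List<ClassDoc> docs) {
        Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
        for (ClassDoc doc : docs) {
            // Outer bucket: one per class value.
            Map<String, Integer> bySex = result.computeIfAbsent(doc.className, k -> new LinkedHashMap<>());
            // Inner buckets: counted per nested object, not per parent document.
            for (String sex : doc.studentSexes) {
                bySex.merge(sex, 1, Integer::sum);
            }
        }
        return result;
    }

    // In-memory copy of the two class documents inserted earlier.
    static List<ClassDoc> sampleDocs() {
        return List.of(
                new ClassDoc("高三(1)班", List.of("男", "男", "女")),
                new ClassDoc("高三(2)班", List.of("男", "女", "女")));
    }

    public static void main(String[] args) {
        System.out.println(countBySexPerClass(sampleDocs()));
    }
}
```

The inner counts line up with the doc_count values of the group_by_sex buckets, because inside the nested aggregation each student object is counted on its own. This is also why count_by_sex reports a doc_count of 3 while the surrounding class bucket reports 1.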

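2 Java API

  • With the Java API, use AggregationBuilders.nested to name the nested field
  • When aggregating on sex, use the full path student.sex

import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;

AggregationBuilder agg = AggregationBuilders.terms("group_by_class").field("class")
        .subAggregation(
                AggregationBuilders.nested("count_by_sex", "student")
                        .subAggregation(
                                AggregationBuilders.terms("group_by_sex").field("student.sex")
                        )
        );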

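VI. Applying nested

I often use the nested type for self-referencing tree structures, that is, a table in which a parentId column points to the ID of another row in the same table. A parentId alone is not flexible enough, though, so each record also stores the ids of all nodes on the path from the root down to itself. For details, see my earlier post ES 存储树形结构 整合Spring Data Elasticsearch.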

PUT /pigg_tree/_mapping
{
    "properties": {
        "id": {
            "type": "keyword"
        },
        "level": {
            "type": "keyword"
        },
        "name": {
            "type": "keyword"
        },
        "parentId": {
            "type": "keyword"
        },
        "path": {
            "type": "nested",
            "properties": {
                "id": {
                    "type": "keyword"
                },
                "level": {
                    "type": "keyword"
                }
            }
        }
    }
}
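
As a rough sketch of how the path field could be filled before indexing, the plain-Java helper below walks the parentId links from a node up to the root and returns the path entries root first. Everything here is hypothetical: the names TreePathSketch and buildPath, the in-memory parentOf map standing in for the real data source, and the assumption that level is the depth of a node counted from 1 at the root.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: builds the value of the nested path field for one tree node.
class TreePathSketch {

    // Returns the path entries (id + level) from the root down to the given node.
    // parentOf maps a node id to its parentId; a root node has no entry.
    static List<Map<String, String>> buildPath(String id, Map<String, String> parentOf) {
        List<String> ids = new ArrayList<>();
        String current = id;
        while (current != null) {
            if (ids.contains(current)) {
                throw new IllegalStateException("cycle detected at id " + current);
            }
            ids.add(current);
            current = parentOf.get(current); // null once the root is reached
        }
        Collections.reverse(ids); // root first

        List<Map<String, String>> path = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            Map<String, String> entry = new LinkedHashMap<>();
            entry.put("id", ids.get(i));
            entry.put("level", String.valueOf(i + 1)); // assumption: the root is level 1
            path.add(entry);
        }
        return path;
    }

    public static void main(String[] args) {
        // a is the root, b is a child of a, c is a child of b.
        Map<String, String> parentOf = Map.of("b", "a", "c", "b");
        System.out.println(buildPath("c", parentOf));
    }
}
```

Because every document carries its full path, a nested term query on path.id can find all nodes whose path contains a given ancestor without walking the tree at query time.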


Reposted from blog.csdn.net/winterking3/article/details/126616103