I. Retrieving a document by ID:

1. Pretty-print the document in the response:

GET http://$user:$passwd@$host:$port/$index/$type/$id?pretty

2. Return all of the document's fields in _source:

GET http://$user:$passwd@$host:$port/$index/$type/$id?_source

3. Return only the specified fields field1 and field2:

GET http://$user:$passwd@$host:$port/$index/$type/$id?_source=$field1,$field2

Each of the requests above returns status code 200 OK if a document with the given ID exists, and 404 Not Found otherwise.
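
As a rough illustration of the three request forms above, the following Python sketch assembles the URL and interprets the status code. The helper names build_get_url and document_exists, and the host and index names in the usage line, are hypothetical and not part of Elasticsearch; only the _source and pretty parameters come from the requests above.

```python
from urllib.parse import quote, urlencode


def build_get_url(base, index, doc_type, doc_id, source_fields=None, pretty=False):
    """Build the URL for fetching one document by ID (hypothetical helper)."""
    # Percent-encode each path segment so special characters in an ID are safe.
    path = "/".join(quote(str(part), safe="") for part in (index, doc_type, doc_id))
    params = {}
    if source_fields:
        params["_source"] = ",".join(source_fields)
    if pretty:
        params["pretty"] = "true"
    # Keep the comma readable; it separates the requested field names.
    query = urlencode(params, safe=",")
    return f"{base}/{path}" + (f"?{query}" if query else "")


def document_exists(status_code):
    """Map the status codes described above: 200 is found, 404 is not found."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    raise ValueError(f"unexpected status code: {status_code}")


print(build_get_url("http://localhost:9200", "blog", "article", 1, ["title", "count"]))
```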

II. Searching with _search:

1. Lightweight (query-string) search:

All parameters are passed in the URL query string, for example:

1> Query all documents:

GET http://$user:$passwd@$host:$port/$index/$type/_search

or

GET http://$user:$passwd@$host:$port/$index/_search

2> Query all documents in several indices:

GET http://$user:$passwd@$host:$port/$index1,$index2/_search

3> Query all documents of several types in several indices:

GET http://$user:$passwd@$host:$port/$index1,$index2/$type1,$type2/_search

4> Query all documents in indices whose names start with a or b:

GET http://$user:$passwd@$host:$port/a*,b*/_search

5> Query all documents of type $type1 or $type2 across all indices:

GET http://$user:$passwd@$host:$port/_all/$type1,$type2/_search

6> Paginated query:

GET http://$user:$passwd@$host:$port/$index/_search?size=5&from=0

7> Query documents that contain $keyword in any field:

GET http://$user:$passwd@$host:$port/$index/$type/_search?q=$keyword

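8> Query documents in which a specific field contains $keyword. Multiple field conditions are separated by spaces. A + prefix on a field name means the condition must match; documents that do not match it are not returned. A - prefix means the condition must not match; documents that match it are not returned. A condition with neither prefix is optional: it does not have to match, but matching it can raise the relevance score: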

GET http://$user:$passwd@$host:$port/$index/$type/_search?q=+$field1:$keyword1 -$field2:$keyword2

9> Query documents in which a field contains $keyword1 or $keyword2. To search all fields rather than one specific field, omit the field name:

GET http://$user:$passwd@$host:$port/$index/$type/_search?q=+$field:($keyword1 $keyword2) +($keyword1 $keyword2)

10> Query documents whose field $field is greater than $keyword:

GET http://$user:$passwd@$host:$port/$index/$type/_search?q=+$field:>$keyword
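
The query strings in items 8> to 10> contain +, spaces, parentheses, and >, which have their own meaning inside a URL (a raw + is commonly decoded as a space). A minimal sketch of percent-encoding the q parameter before sending it, using only the Python standard library; the helper name build_search_url and the host, index, and field names are made up for illustration:

```python
from urllib.parse import quote


def build_search_url(base, index, query_string):
    """Return a _search URL whose q parameter is fully percent-encoded."""
    # safe="" encodes every reserved character: '+' becomes %2B, ' ' becomes %20,
    # so the server receives the query string exactly as written.
    return f"{base}/{quote(index, safe='')}/_search?q={quote(query_string, safe='')}"


url = build_search_url("http://localhost:9200", "blog", "+title:demo -content:draft")
print(url)
```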

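11> Set a timeout for the query. When the timeout expires, each shard stops searching and returns only the hits it has collected so far, so the result may be incomplete or even empty: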

GET http://$user:$passwd@$host:$port/$index/$type/_search?timeout=10ms
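
A search response carries a timed_out flag, so a client can tell a complete result from a partial one. A small sketch of that check, assuming the response has already been parsed into a dict; the sample response below is hand-written for illustration and is not real server output:

```python
def is_partial_result(response):
    """Return True when the parsed search response reports a timeout."""
    return bool(response.get("timed_out", False))


# Illustrative response shape only; the values are invented.
sample = {"took": 10, "timed_out": True, "hits": {"total": 0, "hits": []}}
if is_partial_result(sample):
    print("search timed out; results may be incomplete")
```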

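2. Request-body search:

ES accepts a request body on GET requests, but because GET requests with a body are not widely supported, ES also accepts the same body on POST requests.

1> Empty query: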

GET http://$user:$passwd@$host:$port/$index/$type/_search

or

GET http://$user:$passwd@$host:$port/$index/$type/_search
{}

2> Pagination:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
  "from": 0,
  "size": 20
}
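
Clients usually think in page numbers rather than offsets. The sketch below converts a 1-based page number into the from and size values used above. The helper name page_request is hypothetical; the max_window default of 10000 reflects the default of the index.max_result_window setting, which is an assumption about your index configuration.

```python
def page_request(page, page_size, max_window=10000):
    """Translate a 1-based page number into a from/size request body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    # from + size beyond the result window is rejected by the server,
    # so fail early on the client side.
    if start + page_size > max_window:
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": page_size}


print(page_request(page=3, page_size=20))
```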

3> Custom sorting:

By default results are sorted by _score in descending order. To sort by the count field first and then by relevance:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "sort": [
        {
            "count": {
                "order": "asc"
            }
        },
        {
            "_score":{
                "order": "desc"
            }
        }
    ]
}
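
The order of the entries in sort matters: the second key is consulted only when the first key ties. The following local model applies the same two keys to a hand-made list of hits; it is only an illustration of the ordering, not how Elasticsearch sorts internally, and the hit data is invented.

```python
def sort_hits(hits):
    """Order hits by count ascending, then by _score descending."""
    return sorted(hits, key=lambda hit: (hit["count"], -hit["_score"]))


hits = [
    {"_id": "a", "count": 2, "_score": 1.5},
    {"_id": "b", "count": 1, "_score": 0.2},
    {"_id": "c", "count": 1, "_score": 0.9},
]
ordered_ids = [hit["_id"] for hit in sort_hits(hits)]
print(ordered_ids)
```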

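4> Highlighting:

Highlights the matching content in the search results, and is normally combined with a query. There are three highlighter types:

①unified: uses the Lucene Unified Highlighter and is the default highlighter in ES. It breaks the text into sentences, scores individual sentences with the BM25 algorithm, and supports highlighting for exact phrase, fuzzy, prefix, and regexp matches;

②plain: uses the standard Lucene highlighter. To reflect the query logic accurately it builds a tiny in-memory index, which makes it comparatively expensive;

③fvh: the Fast Vector Highlighter. It can only be used on fields whose mapping sets term_vector to with_positions_offsets, and it does not support highlighting for span queries.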

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "highlight": {
        "order": "score",
        "no_match_size": 150,
        "pre_tags": "<span>",
        "post_tags": "</span>",
        "fields": [
            {
                "title": {
                    "type": "fvh",
                    "number_of_fragments": 0
                }
            }
        ]
    }
}

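Highlighter settings can be global or per field; a setting on a field overrides the global setting of the same name. Commonly used settings:

①number_of_fragments: the number of highlighted fragments to return; fragments are not necessarily complete sentences. If set to 0, the whole field content is returned when the field has a highlight, which avoids truncated fragments; if the field has no highlight, no fragment is returned for it;

②fragment_size: the length of each highlighted fragment. It has no effect when number_of_fragments is 0;

③no_match_size: when the field has no highlighted fragment, return this many characters from the beginning of the field so that highlight is not empty; the returned text is not highlighted.

For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/search-request-highlighting.html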
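
To make the override rule concrete, here is a small local model that computes the settings that effectively apply to one field. The function effective_highlight_options is a hypothetical helper written for this article; it only mimics the precedence described above and does not call Elasticsearch.

```python
def effective_highlight_options(highlight, field):
    """Merge global highlight settings with the settings of one field."""
    global_opts = {k: v for k, v in highlight.items() if k != "fields"}
    fields = highlight.get("fields", {})
    # "fields" may be a list of single-key objects (as in the request above)
    # or a single object keyed by field name; normalize to one dict.
    if isinstance(fields, list):
        merged = {}
        for entry in fields:
            merged.update(entry)
        fields = merged
    # Field-level settings win over global ones with the same name.
    return {**global_opts, **fields.get(field, {})}


highlight = {
    "number_of_fragments": 3,
    "no_match_size": 150,
    "fields": [{"title": {"type": "fvh", "number_of_fragments": 0}}, {"content": {}}],
}
print(effective_highlight_options(highlight, "title"))
```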

5> Query expressions:

①match_all: matches all documents:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "match_all": {}
    }
}

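②match: if the field type is keyword, the whole field value must match exactly. If the field type is text, the query string is tokenized by the configured analyzer and the resulting terms are matched against the index. boost is the multiplier applied to the relevance score.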

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query" : {
        "match" : {
            "title" : "示例" 
        }
    }
}

or:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "match": {
            "title": {
                "query": "示例",
                "boost": 5
            }
        }
    }
}

③multi_match: like match, but runs the same match query on several fields. ^5 is the relevance-score multiplier for that field:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "multi_match": {
            "query": "示例",
            "fields": ["title^5","content"]
        }
    }
}

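④match_phrase: phrase query. slop is the allowed distance between the query terms, that is, how many other terms may sit between them while the document still matches: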

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "match_phrase": {
            "title": {
                "query": "示例成功",
                "slop": 4,
                "boost": 5
            }
        }
    }
}

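⑤term: exact match on a single value. The queried field should be a number, date, boolean, or keyword string, and matching is case-sensitive. If the queried field is analyzed (tokenized), a document may not be found even when its content is identical to the query value: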

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "term": {
            "title": {
                "value": "示例结果",
                "boost": 5
            }
        }
    }
}

⑥terms: exact match on several values. Like term, but a document matches as soon as the field contains any one of the values in the array:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "terms": {
            "title": ["百度","阿里"],
            "boost": 5
        }
    }
}

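⑦prefix: prefix match on terms; it scans the inverted index. Terms in the inverted index have usually been lowercased by the analyzer, whereas the query string is not analyzed and therefore not lowercased, so no result is returned when the case differs: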

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "prefix": {
            "title": {
                "value": "示例",
                "boost": 5
            }
        }
    }
}

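⑧wildcard: wildcard match on terms (* matches any string, ? matches any single character); it scans the inverted index for terms that fit the pattern. As with prefix and regexp, the pattern itself is not analyzed, so its case must agree with the indexed terms: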

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "wildcard": {
            "title": {
                "value": "*例",
                "boost": 5
            }
        }
    }
}

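For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-wildcard-query.html

⑨regexp: regular-expression match on terms; it scans the inverted index. Terms in the inverted index have usually been lowercased by the analyzer, whereas the query string is not analyzed and therefore not lowercased, so no result is returned when the case differs: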

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "regexp": {
            "title": {
                "value": ".*例",
                "boost": 5
            }
        }
    }
}

For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-regexp-query.html

⑩fuzzy: approximate match on terms; fuzziness sets the maximum Levenshtein edit distance:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "fuzzy": {
            "title": {
                "value": "requeab",
                "fuzziness": 2,
                "boost": 5
            }
        }
    }
}

For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-fuzzy-query.html
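
For readers unfamiliar with edit distance, the sketch below computes the plain Levenshtein distance (insertions, deletions, substitutions) between two strings. It is a textbook dynamic-programming version for illustration only; how Elasticsearch itself counts edits (for example transpositions) is not modeled here. With fuzziness set to 2, a term like "request" would be within range of the query value "requeab" under this definition.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(levenshtein("requeab", "request"))
```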

⑪range: interval match on numeric values, using gt (greater than), gte (greater than or equal to), lt (less than), and lte (less than or equal to):

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "range": {
            "count": {
                "gte": 200,
                "lt": 250
            }
        }
    }
}

⑫exists: matches documents in which the specified field has a non-null value:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "exists": {
            "field": "title",
            "boost": 5
        }
    }
}

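For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-exists-query.html

⑬Combining multiple conditions: bool combines must, must_not, should, and filter clauses, and each of these can nest further bool queries to build more complex queries. The clauses are:

❶must: these conditions must be satisfied;

❷must_not: these conditions must not be satisfied;

❸should: these conditions may or may not be satisfied. They are not required and are typically used to influence the relevance score, for example by boosting documents that satisfy them;

❹filter: filtering. It decides whether a document is included in the results but has no effect on its relevance score; in other words, it controls whether a document appears, not how high it ranks.

To control the results more precisely, bool offers settings that adjust the matching strategy, such as:

Ⅰminimum_should_match: the minimum number of should conditions that must match. Because should conditions are optional, without this setting a document can be returned without matching any of them when the bool query also contains must or filter clauses;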

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "query": {
        "bool": {
            "should": [
                {
                    "term": {
                        "title.keyword": {
                            "value": "示例",
                            "boost": 5
                        }
                    }
                },
                {
                    "match_phrase": {
                        "title": {
                            "query": "示例",
                            "boost": 10
                        }
                    }
                },
                {
                    "bool": {
                        "must": [
                            {
                                "match_phrase": {
                                    "title": {
                                        "query": "示例"
                                    }
                                }
                            },
                            {
                                "prefix": {
                                    "title": {
                                        "value": "示例",
                                        "boost": 15
                                    }
                                }
                            }
                        ]
                    }
                },
                {
                    "multi_match": {
                        "fields": [
                            "title^5",
                            "content"
                        ],
                        "query": "示例"
                    }
                }
            ],
            "filter": [
                {
                    "term": {
                        "count": "999"
                    }
                }
            ],
            "minimum_should_match": 1
        }
    }
}

For details, see: https://www.elastic.co/guide/en/elasticsearch/client/net-api/6.x/bool-query-usage.html
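
Deeply nested bool bodies are easy to get wrong by hand. One possible approach is to assemble them with a small helper, as sketched below. The function bool_query is a hypothetical convenience written for this article, not part of any Elasticsearch client; it only builds the dict that would be sent as the query.

```python
import json


def bool_query(must=None, must_not=None, should=None, filter_=None,
               minimum_should_match=None):
    """Build a bool query dict, leaving out clauses that were not supplied."""
    clauses = {"must": must, "must_not": must_not, "should": should, "filter": filter_}
    body = {name: list(items) for name, items in clauses.items() if items}
    if minimum_should_match is not None:
        body["minimum_should_match"] = minimum_should_match
    return {"bool": body}


query = bool_query(
    should=[
        {"term": {"title.keyword": {"value": "示例", "boost": 5}}},
        # A nested bool used as one of the should conditions.
        bool_query(must=[{"match_phrase": {"title": {"query": "示例"}}}]),
    ],
    filter_=[{"term": {"count": "999"}}],
    minimum_should_match=1,
)
print(json.dumps({"query": query}, ensure_ascii=False, indent=2))
```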

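6> Aggregations:

Aggregations group and summarize data, and can be combined with query expressions. There are four kinds of aggregation, depending on the purpose. They can be used side by side in one request, each under its own user-defined result name. The four kinds are:

①Metric: metric aggregations compute summaries over numeric values. They can be used as sub-aggregations but cannot contain sub-aggregations;

❶Average, avg: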

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_avg": {// user-defined name for the aggregation result
            "avg": {
                "field": "count"
            }
        }
    }
}

❷Maximum, max:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_max": {// user-defined name for the aggregation result
            "max": {
                "field": "count"
            }
        }
    }
}

❸Minimum, min:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_min": {// user-defined name for the aggregation result
            "min": {
                "field": "count"
            }
        }
    }
}

❹Sum, sum:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_sum": {// user-defined name for the aggregation result
            "sum": {
                "field": "count"
            }
        }
    }
}

❺stats: five statistics at once (document count, minimum, maximum, average, and sum):

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_stats": {// user-defined name for the aggregation result
            "stats": {
                "field": "count"
            }
        }
    }
}
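
To show what the five figures mean, the following local model computes them over a plain Python list. It assumes a non-empty list of numbers and is only an illustration of the arithmetic; the function name stats and the sample values are invented, and the real aggregation runs on the server over the count field.

```python
def stats(values):
    """Compute count, min, max, avg, and sum over a non-empty list of numbers."""
    values = list(values)
    if not values:
        raise ValueError("this sketch expects at least one value")
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


print(stats([10, 20, 30]))
```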

❻cardinality: the number of distinct values:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_cardinality": {
            "cardinality": {
                "field": "count"
            }
        }
    }
}

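②Bucketing: bucket aggregations group documents. They can be used as sub-aggregations and can also contain sub-aggregations;

❶terms: groups documents by the distinct values of the specified field; the field cannot be a text field (unless fielddata is enabled on it):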

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_group": {
            "terms": {
                "field": "count",
                "size":2 // return only the top two groups
            }
        }
    }
}
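
Conceptually, a terms aggregation counts documents per distinct value and returns the largest groups first. A local model of that idea, using collections.Counter on an in-memory list; the function name terms_buckets and the sample values are made up, and the key/doc_count layout is only loosely modeled on the response buckets:

```python
from collections import Counter


def terms_buckets(values, size=10):
    """Group values and return the `size` most frequent ones, largest first."""
    return [
        {"key": key, "doc_count": count}
        for key, count in Counter(values).most_common(size)
    ]


print(terms_buckets([1, 1, 1, 2, 2, 3], size=2))
```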

❷range/date_range: groups documents into user-defined ranges; date_range is the variant for dates:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_group": {
            "range": {
                "field": "count",
                "ranges": [
                    {
                        "from": 100,
                        "to": 300
                    },
                    {
                        "from": 200,
                        "to": 500
                    }
                ]
            }
        }
    }
}
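
Two details of the range aggregation are easy to miss: each range includes its from value and excludes its to value, and ranges may overlap, in which case one document is counted in several buckets. The local model below reproduces that counting on an in-memory list; range_buckets and the sample values are hypothetical.

```python
def range_buckets(values, ranges):
    """Count values per range; 'from' is inclusive, 'to' is exclusive."""
    buckets = []
    for spec in ranges:
        low, high = spec.get("from"), spec.get("to")
        count = sum(
            1 for value in values
            if (low is None or value >= low) and (high is None or value < high)
        )
        buckets.append({"from": low, "to": high, "doc_count": count})
    return buckets


values = [100, 250, 300, 499, 500]
ranges = [{"from": 100, "to": 300}, {"from": 200, "to": 500}]
# 250 lies in both ranges, so it is counted twice overall.
print(range_buckets(values, ranges))
```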

❸filter/filters: aggregates over the documents that pass a single filter or several filters:

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "filter_group": {
            "filter": {
                "term": {
                    "type": "3"
                }
            },
            "aggs": {
                "count_avg": {
                    "avg": {
                        "field": "count"
                    }
                }
            }
        }
    }
}

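❹histogram/date_histogram: splits the values of the specified field into consecutive fixed-width intervals and uses each interval as a group, like a histogram. date_histogram is the variant for dates: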

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "count_interval_group": {
            "histogram": {
                "field": "count",
                "interval":20
            }
        }
    }
}
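
Each value is assigned to the bucket whose key is floor(value / interval) * interval (with no offset). The sketch below applies that rule to an in-memory list. It is a simplified model: it omits empty buckets, which the real aggregation may fill in, and the function name and sample values are invented.

```python
import math


def histogram_buckets(values, interval):
    """Map each bucket key to the number of values that fall into it."""
    counts = {}
    for value in values:
        key = math.floor(value / interval) * interval
        counts[key] = counts.get(key, 0) + 1
    return dict(sorted(counts.items()))


print(histogram_buckets([5, 19, 20, 39, 45], interval=20))
```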

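③Pipeline: pipeline aggregations process the output of other aggregations. They cannot contain sub-aggregations, but some kinds can be chained. Depending on where the input comes from, there are two families:

❶parent: the input is the output of the parent aggregation. These usually do not create new buckets; they add information to the parent's buckets.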

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "sum_per_count": {
            "histogram": {
                "field": "count",
                "interval": 10
            },
            "aggs": {
                "count_sum_per_count": {
                    "sum": {
                        "field": "count"
                    }
                },
                "count_cumulative_sum_per_bucket": {// pipeline aggregation: in each bucket, the running total of count_sum_per_count up to and including that bucket
                    "cumulative_sum": {
                        "buckets_path": "count_sum_per_count"
                    }
                }
            }
        }
    }
}
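
The effect of cumulative_sum can be pictured as a running total over the per-bucket sums, in bucket order. A minimal local model with itertools.accumulate; the input list stands in for the count_sum_per_count values of consecutive buckets and is invented for illustration:

```python
from itertools import accumulate


def cumulative_sum(bucket_sums):
    """Return the running total of the per-bucket sums, one entry per bucket."""
    return list(accumulate(bucket_sums))


print(cumulative_sum([10, 25, 5]))
```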

❷sibling: the input is the output of a sibling aggregation, and the result is a new aggregation at the same level as that sibling.

GET http://$user:$passwd@$host:$port/$index/$type/_search
{
    "size": 0,
    "aggs": {
        "sum_per_count": {
            "histogram": {
                "field": "count",
                "interval": 10
            },
            "aggs": {
                "count_sum_per_count": {
                    "sum": {
                        "field": "count"
                    }
                }
            }
        },
        "count_avg_per_bucket": {// pipeline aggregation: the average of count_sum_per_count over all buckets
            "avg_bucket": {
                "buckets_path": "sum_per_count>count_sum_per_count"
            }
        }
    }
}
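
In contrast to the parent case, avg_bucket produces a single value next to the histogram rather than one value per bucket. A local model of the arithmetic, where the input list again stands in for the count_sum_per_count values of the buckets (invented data, and buckets without a value are not modeled):

```python
def avg_bucket(bucket_sums):
    """Return the mean of the per-bucket sums as one value."""
    bucket_sums = list(bucket_sums)
    if not bucket_sums:
        raise ValueError("this sketch expects at least one bucket")
    return sum(bucket_sums) / len(bucket_sums)


print(avg_bucket([10, 20, 30]))
```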

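④Matrix: matrix aggregations. This functionality is experimental and may be changed or removed completely in a future release, so it is not covered here.

Because the search parameter is a string, certain special characters in it must be filtered (escaped or removed). These characters are: \,~,*,?,|,+,(,),[,",{,&. If they are left in, they can cause the query to match everything or fail with a syntax error.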
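
One way to do this filtering is to put a backslash in front of each reserved character before the user input is placed into the query string. The sketch below is a minimal version under stated assumptions: the character set follows the reserved-character list of the query string syntax as documented by Elasticsearch, which is broader than the list above, and < and > are removed rather than escaped because they cannot be escaped. The function name escape_query_string is hypothetical; adjust the set to your needs.

```python
import re

# Reserved operators and characters; '&&' and '||' are matched as pairs.
_RESERVED = re.compile(r'(&&|\|\||[+\-=!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Escape user input so it is treated as literal text in a query string."""
    # '<' and '>' cannot be escaped, so drop them entirely.
    text = text.replace("<", "").replace(">", "")
    # Prefix every reserved token with a single backslash.
    return _RESERVED.sub(r"\\\1", text)


print(escape_query_string("(1+1)=2"))
```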

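III. Multi-get with _mget:

Retrieves several documents in one request by specifying the index, type, and ID of each.

1. The documents belong to different indices and types: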

GET http://$user:$passwd@$host:$port/_mget
{
    "docs": [
        {
            "_index": "$index1",
            "_type": "$type1",
            "_id": "$id1"
        },
        {
            "_index": "$index2",
            "_type": "$type2",
            "_id": "$id2"
        }
    ]
}

2. The documents belong to the same index and type:

GET http://$user:$passwd@$host:$port/$index/$type/_mget
{
    "docs": [
        {
            "_id": "$id1"
        },
        {
            "_id": "$id2"
        }
    ]
}

or:

GET http://$user:$passwd@$host:$port/$index/$type/_mget
{
    "ids": ["$id1","$id2"]
}

For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/docs-multi-get.html
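
The two body shapes above can be generated from ordinary Python data, which also takes care of quoting. A small sketch; the helper names mget_docs_body and mget_ids_body and the sample index, type, and ID values are hypothetical:

```python
import json


def mget_docs_body(docs):
    """Build an _mget body from (index, doc_type, doc_id) tuples."""
    return {
        "docs": [
            {"_index": index, "_type": doc_type, "_id": doc_id}
            for index, doc_type, doc_id in docs
        ]
    }


def mget_ids_body(ids):
    """Build the short form for documents in the same index and type."""
    return {"ids": list(ids)}


print(json.dumps(mget_docs_body([("blog", "article", "1"), ("shop", "item", "7")])))
print(json.dumps(mget_ids_body(["1", "2"])))
```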

Reference for aggregations: https://blog.csdn.net/zyc88888/article/details/83016513

鹤啸九天-西木

Elasticsearch: Querying Data