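Preface

Operating system: Windows 10

JDK version: 1.8

Elasticsearch version: 5.6.6

Plugins: Kibana, elasticsearch-head

Tool: Postman

This article covers advanced search commands. It is best to read the earlier article on the basic commands first.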

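V. Advanced queries

There are many kinds of advanced queries; the most commonly used ones are described below.

1. Exact query (phrase search)

This is the opposite of the fuzzy (full-text) query in section 4.3 of the basic-commands article: match_phrase only returns documents in which all the query terms appear next to each other and in the same order.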

POST /people/_search
{
  "query": {
    "match_phrase": {
      "name": "叶良辰"
    }
  }
}

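2. Multi-field match query

Note: this is also a full-text (fuzzy) query. With the default standard analyzer, "叶良辰" is split into "叶", "良", and "辰", and each token is searched for in every listed field.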

POST /people/_search
{
  "query": {
    "multi_match": {
      "query": "叶良辰",
      "fields": ["name","desc"]
    }
  }
}

3. Query string syntax

Note: in "叶良辰 AND 风", the text "叶良辰" is first split into "叶", "良", and "辰", and the document must also contain "风".

POST /people/_search
{
  "query": {
    "query_string": {
      "query": "叶良辰 AND 风"
    }
  }
}

The next example combines AND and OR with parentheses and restricts the search to the name and desc fields:

POST /people/_search
{
  "query": {
    "query_string": {
      "query": "(叶良辰 AND 火) OR (赵日天 AND 风)",
      "fields": ["name","desc"]
    }
  }
}

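4. Field query (structured query)

The following is an exact match. A term query does not analyze the search value, so it only matches documents whose index contains the token "叶良辰" exactly. On a text field analyzed with the standard analyzer the name is stored as single-character tokens, so this query is best run against a keyword (not analyzed) field.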

POST /people/_search
{
  "query": {
    "term": {
      "name": "叶良辰"
    }
  }
}
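
To see why a term query and a match query behave differently on the same field, here is a minimal Python sketch of an inverted index. It assumes the standard analyzer emits one token per Chinese character; the `analyze` helper, the function names, and the sample documents are hypothetical stand-ins, not Elasticsearch code.

```python
def analyze(text):
    """Simplified stand-in for the standard analyzer on CJK text: one token per character."""
    return [ch for ch in text if not ch.isspace()]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """term: look the value up as-is, without analyzing it."""
    return index.get(value, set())


def match_query(index, value):
    """match: analyze the input, then return documents containing any token (OR)."""
    hits = set()
    for token in analyze(value):
        hits |= index.get(token, set())
    return hits


# Hypothetical values of the name field
docs = {1: "叶良辰", 2: "赵日天", 3: "叶问"}
index = build_index(docs)
```

In this sketch `term_query(index, "叶良辰")` finds nothing because no single token equals the whole name, while `match_query(index, "叶良辰")` returns every document that shares at least one character with it.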

5. Paginated query

GET /people/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}
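
Note that from is a zero-based offset, so the request above (from 1, size 1) skips the first hit and returns only the second one. A small Python sketch of the usual page-number arithmetic follows; the helper names are hypothetical and the code only builds the request body, it does not call Elasticsearch.

```python
def page_to_from_size(page, size):
    """Translate a 1-based page number into the from/size pair used by _search."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}


def search_body(page, size):
    """Build a match_all request body for the given page."""
    body = {"query": {"match_all": {}}}
    body.update(page_to_from_size(page, size))
    return body
```

For example, `search_body(2, 1)` produces the same body as the request above, and page 1 always starts at offset 0.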

6. Range query

6.1 Numeric fields

POST /people/_search
{
  "query": {
    "range": {
      "age": {
        "gt": 16,
        "lte": 30
      }
    }
  }
}

6.2 Date fields

POST /people/_search
{
  "query": {
    "range": {
      "birthday": {
        "gte": "2013-01-01",
        "lte": "now"
      }
    }
  }
}

7. Filter query

There are two ways. The first uses constant_score:

POST /people/man/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "age": {
            "gte": 20,
            "lte": 30
          }
        }
      },
      "boost": 1.2
    }
  }
}

The second uses bool:

POST /people/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "age": 18
        }
      }
    }
  }
}

8. Score query

POST /people/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "match": {
          "name": "叶良辰"
        }
      },
      "boost": 2
    }
  }
}

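9. Boolean query

Note: each clause can take either an array [ ] of conditions or a single condition { }.

9.1 should query

Note: should works like OR, and the match queries inside it are also full-text (fuzzy) matches.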

POST /people/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "叶良辰"
          }
        },
        {
          "match": {
            "desc": "赵日天"
          }
        }
      ]
    }
  }
}

9.2 must query

Note: both conditions must be satisfied. Here too, "叶良辰" inside must is split into "叶", "良", and "辰" for the search, and "赵日天" is split into "赵", "日", and "天".

POST /people/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "叶良辰"
          }
        },
        {
          "match": {
            "desc": "赵日天"
          }
        }
      ]
    }
  }
}

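9.3 Combining must and filter

Here too, "叶良辰" inside must is split into "叶", "良", and "辰" for the search. The filter clause then restricts the hits to documents whose age is 18, without affecting the score.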

POST /people/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "叶良辰"
          }
        },
        {
          "match": {
            "desc": "赵日天"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "age": 18
          }
        }
      ]
    }
  }
}

9.4 must_not

Note: the statement below uses an exact (term) match.

POST /people/_search
{
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "name": "叶良辰"
        }
      }
    }
  }
}
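
The way the four bool clauses combine can be sketched in plain Python. This is a simplified model that ignores scoring and treats each clause as a yes/no predicate on a document; `bool_matches` and `contains` are hypothetical helpers, not part of any Elasticsearch client.

```python
def contains(field, ch):
    """Return a predicate that is true when the field contains the given character."""
    return lambda doc: ch in doc[field]


def bool_matches(doc, must=(), filters=(), should=(), must_not=()):
    """Decide whether a document passes a bool query (scoring is ignored)."""
    # must and filter: every clause has to match
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    # must_not: no clause may match
    if any(clause(doc) for clause in must_not):
        return False
    # with no must or filter clauses, at least one should clause has to match
    if should and not must and not filters:
        return any(clause(doc) for clause in should)
    return True
```

With a hypothetical document such as `{"name": "叶良辰", "desc": "风", "age": 20}`, the should form matches because one of its clauses is true, whereas the must form fails because the desc clause is not satisfied.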

10. Highlighting

10.1 Default <em> tag

GET people/_search
{
  "query": {
    "match": {
      "name": "叶良辰"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}

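The matched text in the results is wrapped in <em> tags; highlighting can be combined with any of the query types above.

10.2 Custom tags

Set the pre_tags and post_tags properties: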

GET people/_search
{
  "query": {
    "match": {
      "name": "叶良辰"
    }
  },
  "highlight": {
    "pre_tags": ["<b>"], 
    "post_tags": ["</b>"], 
    "fields": {
      "name": {}
    }
  }
}

11. Aggregation queries

11.1 Grouping by field value

GET /people/man/_search
{
  "size": 0, 
  "aggs": {
    "group_by_age": {
      "terms": {
        "field": "age"
      }
    }
  }
}

11.2 Summary statistics

POST /people/_search
{
  "aggs": {
    "grads_age": {
      "stats": {
        "field": "age"
      }
    }
  }
}

11.3 Minimum value

POST /people/_search
{
  "aggs": {
    "grads_age": {
      "min": {
        "field": "age"
      }
    }
  }
}

11.4 Group first, then compute

Group by country, then compute the average age in each group:

GET /people/man/_search
{
  "size": 0, 
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}

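Running this request returns an error.

Fix: the reason field of the error explains the cause: fielddata is disabled on text fields by default. Setting fielddata to true on the country field resolves it: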

POST /people/_mapping/man
{
  "properties": {
    "country": {
      "type": "text",
      "fielddata": true
    }
  }
}

Then run the aggregation again; it now succeeds and returns one bucket per country term, each with its average age.
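
To make the group-then-average computation concrete, here is a plain-Python sketch over a few hypothetical documents. It assumes each country value is a single token (for example one lowercase word); with fielddata on an analyzed text field the buckets are keyed by token, not by the original string. The function name and the sample data are illustrative only.

```python
from collections import defaultdict


def avg_age_by_country(docs):
    """Group documents by country and compute the average age per group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["country"]].append(doc["age"])

    buckets = []
    for country, ages in groups.items():
        buckets.append({
            "key": country,
            "doc_count": len(ages),
            "avg_age": {"value": sum(ages) / len(ages)},
        })
    # a terms aggregation orders buckets by doc_count, largest first, by default
    buckets.sort(key=lambda b: b["doc_count"], reverse=True)
    return buckets


# Hypothetical sample documents
people = [
    {"country": "china", "age": 18},
    {"country": "china", "age": 30},
    {"country": "usa", "age": 25},
]
```

The returned list mirrors the bucket layout of a terms aggregation with a nested avg aggregation: a key, a doc_count, and the named sub-aggregation value.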

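12. Sorting

Sorting on an analyzed text field often does not give the expected order: after analysis the field holds many separate tokens, so the sort key is no longer the original string.

The fix is to index the field twice: once analyzed for searching, and once not analyzed for sorting.

In the example below, title is an analyzed text field used for full-text search (its fielddata stays at the default of false), while the title.raw sub-field is of type keyword, which keeps the whole title as a single value.

Keyword fields are not analyzed and are sortable by default, so title.raw needs neither fielddata nor an explicit index setting.

PUT /blog
{
  "mappings": {
    "article": {
      "properties": {
        "auther": {
          "type": "text"
        },
        "title": {
          "type": "text",
          "fields": {
            "raw":{
              "type": "keyword"
            }
          }
        },
        "content":{
          "type": "text",
          "analyzer": "english"
        },
        "publishdate": {
          "type": "date"
        }
      }
    }
  }
}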

The corresponding search command:

GET /blog/article/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "title.raw": {
        "order": "desc"
      }
    }
  ]
}

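13. Scroll query

When a search matches a large number of documents, it is not practical to fetch and display them all in one request.

In that case, use scroll to retrieve the results in batches.

In the example below, the first request returns 3 documents together with a _scroll_id; the next request passes that _scroll_id to fetch the next batch:

Step 1: run the initial search: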

GET test_index/test_type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": "_doc", 
  "size": 3
}

Step 2: copy the _scroll_id from the response.

Step 3: paste the scroll_id into the following command and search again:

GET _search/scroll
{
  "scroll": "1m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAA6FnZPSl9sbVR4UVVDU1NLb2wxVXJlbWcAAAAAAAAAPhZ2T0pfbG1UeFFVQ1NTS29sMVVyZW1nAAAAAAAAADsWdk9KX2xtVHhRVUNTU0tvbDFVcmVtZwAAAAAAAAA8FnZPSl9sbVR4UVVDU1NLb2wxVXJlbWcAAAAAAAAAPRZ2T0pfbG1UeFFVQ1NTS29sMVVyZW1n"
}

Repeat step 3 to fetch each further batch.
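
The three steps can be sketched as a loop in Python. To stay independent of any client library, the sketch takes two hypothetical callables: `first_page()` stands for the initial `_search?scroll=1m` request and `next_page(scroll_id)` for the `_search/scroll` request, each returning the parsed JSON response. It relies on the fact that a scroll is exhausted once a response comes back with an empty hits array.

```python
def scroll_all(first_page, next_page):
    """Yield every hit by following _scroll_id until a page comes back empty.

    first_page and next_page are hypothetical callables that perform the
    HTTP requests and return the parsed JSON response as a dict.
    """
    response = first_page()
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit
        # always pass on the most recent _scroll_id; it may change between calls
        response = next_page(response["_scroll_id"])
```

Each call to the scroll endpoint also renews the search context for the duration given in the scroll parameter (1m in the example above), so the loop has to request the next batch within that window.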

14. Analysis

14.1 Analyzing an arbitrary sentence

GET _analyze
{
  "analyzer": "standard",
  "text": "this is a test for analyzer"
}

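14.2 Custom analyzer

The char_filter section defines a mapping character filter; in this example it converts the & character to and.

The filter section defines a stop token filter, which removes the words in the custom stopword list.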

PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": ["&=> and "]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": ["the","a"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip","&_to_and"],
          "tokenizer": "standard",
          "filter": ["lowercase","my_stopwords"]
        }
      }
    }
  }
}

The analyzer can then be tested with the following command:

GET my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Tom&Jerry are the best friends"
}

Result: uppercase letters are converted to lowercase, & is replaced by and, and the is removed.
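
The order in which the analyzer applies its stages (character filters, then the tokenizer, then token filters) can be imitated in a few lines of Python. This is only a rough approximation: the regular expressions below are crude stand-ins for html_strip and the standard tokenizer, and the function name is hypothetical.

```python
import re

STOPWORDS = {"the", "a"}  # same list as my_stopwords


def my_analyzer(text):
    """Approximate the my_analyzer pipeline stage by stage."""
    # char_filter html_strip: crude approximation that drops anything tag-like
    text = re.sub(r"<[^>]+>", " ", text)
    # char_filter &_to_and: map "&" to " and "
    text = text.replace("&", " and ")
    # tokenizer: rough stand-in for the standard tokenizer
    tokens = re.findall(r"\w+", text)
    # token filters: lowercase first, then remove stopwords
    tokens = [t.lower() for t in tokens]
    return [t for t in tokens if t not in STOPWORDS]
```

Because lowercase runs before the stop filter, a capitalized "The" would also be removed, which is why the order of the entries in the filter array matters.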
