Elasticsearch REST API

I. Operations commands

1. Check cluster health

curl 'localhost:9200/_cat/health?v'

2. List the nodes in the cluster

curl 'localhost:9200/_cat/nodes?v'

II. Index commands

3. List all indices

curl 'localhost:9200/_cat/indices?v'

4. Create an index

curl -XPUT 'localhost:9200/students?pretty'

5. Delete an index

curl -XDELETE 'localhost:9200/customer?pretty'

III. Document commands

6. Index and retrieve a document

curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{"name":"John Doe"}'

curl -XGET 'localhost:9200/customer/external/1?pretty'

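7. Update a document

Under the hood, Elasticsearch does not update documents in place. It deletes the old document and then indexes a new document with the changes applied.

Example 1:

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc":{"name":"David.Liu"}
}'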

Example 2:

Change name and add a new field, age:

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc":{"name":"DavidAngelfish", "age":31}
}'

Example 3:

Updates can also be performed with a simple script, for example one that adds 5 to age.
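The command for the scripted update in Example 3 is not included in the text. A minimal sketch of what it could look like, assuming the Elasticsearch 1.x request format used by the other commands here (the script text placed directly in the "script" field) and assuming dynamic scripting is enabled on the cluster. Later versions expect a different script syntax. The helper name update_age is made up for illustration, and nothing is sent until it is called.

```shell
# Request body: a script that adds 5 to the age field of the stored document.
# Assumes the 1.x-style body, where "script" holds the script text directly.
body='{"script": "ctx._source.age += 5"}'

# Hypothetical helper: sends the scripted update for document 1.
# It is only defined here; call update_age against a running cluster to execute it.
update_age() {
  curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d "$body"
}

echo "$body"
```

In the script, ctx._source refers to the current source of the document being updated, so the document needs to have an age field already (Example 2 adds one).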

8. Delete documents

a). Delete by document ID

curl -XDELETE 'localhost:9200/customer/external/AVgjuLOxqPgQLLFyGwIp?pretty'

{
  "found" : true,
  "_index" : "customer",
  "_type" : "external",
  "_id" : "AVgjuLOxqPgQLLFyGwIp",
  "_version" : 2
}

b). Delete every document that matches a query in a single request (for example, all customers whose name contains John)

curl -XDELETE 'localhost:9200/customer/external/_query?pretty' -d'
{
  "query":{"match":{"name":"John"}}
}'

{
  "_indices" : {
    "customer" : {
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      }
    }
  }
}

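9. Batch processing

Besides indexing, updating, and deleting individual documents, Elasticsearch can run these operations in batches through the _bulk API.

This is an efficient way to perform many operations at once with as few network round trips as possible.

Example 1:

Index two documents (ID 6 - John Doe and ID 7 - Jane Doe) in one bulk operation:

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d'
{"index":{"_id":"6"}}
{"name":"John Doe"}
{"index":{"_id":"7"}}
{"name":"Jane Doe"}
'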

{
  "took" : 7,
  "errors" : false,
  "items" : [ {
    "index" : {
      "_index" : "customer",
      "_type" : "external",
      "_id" : "6",
      "_version" : 1,
      "status" : 201
    }
  }, {
    "index" : {
      "_index" : "customer",
      "_type" : "external",
      "_id" : "7",
      "_version" : 1,
      "status" : 201
    }
  } ]
}

Example 2:

In one bulk operation, first update the first document (ID 6), then delete the second document (ID 7):

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d'
{"update":{"_id":"6"}}
{"doc":{"name":"John Doe becomes David.liu"}}
{"delete":{"_id":"7"}}
'

{
  "took" : 5,
  "errors" : false,
  "items" : [ {
    "update" : {
      "_index" : "customer",
      "_type" : "external",
      "_id" : "6",
      "_version" : 2,
      "status" : 200
    }
  }, {
    "delete" : {
      "_index" : "customer",
      "_type" : "external",
      "_id" : "7",
      "_version" : 2,
      "status" : 200,
      "found" : true
    }
  } ]
}

IV. Search commands

Prepare a dataset.

Download the sample dataset from https://github.com/bly2k/files/blob/master/accounts.zip?raw=true and extract accounts.json from it.

Load it into the cluster:

curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
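The load command uses --data-binary rather than -d because the bulk endpoint expects newline-delimited JSON: an action line followed by its document source line, each ending in a newline. curl sends a --data-binary file unmodified, whereas -d @file strips newlines and would break that format. The sketch below writes a tiny file in that shape. The file name sample_bulk.json, the helper name, and the two records are invented for illustration and are not taken from accounts.json.

```shell
# Write a small bulk file: one action line, then one source line,
# each ending in a newline. File name and records are made up.
write_sample_bulk() {
  cat > "$1" <<'EOF'
{"index":{"_id":"1001"}}
{"account_number":1001,"balance":25000,"age":40,"gender":"F","address":"1 Example Lane","state":"CA"}
{"index":{"_id":"1002"}}
{"account_number":1002,"balance":18000,"age":31,"gender":"M","address":"2 Sample Street","state":"ID"}
EOF
}

write_sample_bulk sample_bulk.json

# Each document takes two lines, so the document count is half the line count.
lines=$(wc -l < sample_bulk.json)
echo "documents: $((lines / 2))"

# To load it (requires a running cluster):
#   curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @sample_bulk.json
```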

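Search API

There are two ways to run a search:

(1) Send the search parameters in the REST request URI.

(2) Send the search parameters in the REST request body (more expressive, recommended).

For example, to request all documents:

URI syntax:

curl 'localhost:9200/bank/_search?q=*&pretty'

Request body syntax (the query DSL, a JSON-style domain-specific language):

curl 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{"match_all":{}}
}'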

Return only the first document:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 1
}'

Return documents 11 through 20 (useful for pagination):

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}'
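The from value is a zero-based offset, so page p (counting from 1) with n results per page starts at (p - 1) * n. A small sketch of that arithmetic; the function names page_from and page_body are made up for illustration.

```shell
# Zero-based offset of the first hit on a 1-based page.
page_from() {
  page=$1
  size=$2
  echo $(( (page - 1) * size ))
}

# Build the match_all search body for a given page number and page size.
page_body() {
  printf '{"query": {"match_all": {}}, "from": %d, "size": %d}\n' \
    "$(page_from "$1" "$2")" "$2"
}

# Page 2 with 10 results per page corresponds to documents 11 through 20.
page_body 2 10

# To run it (requires a running cluster):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' -d "$(page_body 2 10)"
```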

Running searches:

1. Return only selected fields (similar to SELECT xx, yy FROM in SQL)

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {"match_all":{}},
  "_source": ["account_number", "balance"]
}'

2. match queries

Example a). Return the document whose account number is 20:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{"match":{"account_number":20}}
}'

Example b). Return all documents whose address contains "mill" (case-insensitive):

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{"match":{"address":"mill"}}
}'

Example c). Return all documents whose address contains "mill" or "lane" (case-insensitive):

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{"match":{"address":"mill lane"}}
}'

Example d). match_phrase, a variant of match, matches the exact phrase "mill lane":

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{"match_phrase":{"address":"mill lane"}}
}'

3. bool queries

A bool query lets us combine smaller queries into larger ones.

Example a). Return all accounts whose address contains both "mill" and "lane":

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{
    "bool":{
      "must":[
        {"match": {"address":"mill"}},
        {"match": {"address":"lane"}}
      ]
    }
  }
}'

Example b). Return all accounts whose address contains "mill" or "lane":

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{
    "bool":{
      "should":[
        {"match":{"address":"mill"}},
        {"match":{"address":"lane"}}
      ]
    }
  }
}'

Example c). Return all accounts whose address contains neither "mill" nor "lane":

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{
    "bool":{
      "must_not":[
        {"match":{"address":"mill"}},
        {"match":{"address":"lane"}}
      ]
    }
  }
}'

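Example d). must, should, and must_not can be used together in a single bool query. bool queries can also be nested inside any of these clauses to model complex, multi-level boolean logic.

Return the accounts of everyone who is 40 years old and does not live in the state of ID:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{
    "bool":{
      "must":[
        {"match":{"age":"40"}}
      ],
      "must_not":[
        {"match":{"state":"ID"}}
      ]
    }
  }
}'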
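Example d mentions nesting bool queries but only combines must and must_not. A sketch of an actual nested query, under the assumption that we want 40-year-olds outside ID whose address contains "mill" or "lane". The function name nested_bool_body is made up for illustration.

```shell
# 40-year-olds outside ID whose address mentions "mill" or "lane".
# The inner bool sits inside "must"; because it has only "should" clauses,
# at least one of them has to match.
nested_bool_body() {
  cat <<'EOF'
{
  "query": {
    "bool": {
      "must": [
        {"match": {"age": "40"}},
        {"bool": {
          "should": [
            {"match": {"address": "mill"}},
            {"match": {"address": "lane"}}
          ]
        }}
      ],
      "must_not": [
        {"match": {"state": "ID"}}
      ]
    }
  }
}
EOF
}

nested_bool_body

# To run it (requires a running cluster):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' -d "$(nested_bool_body)"
```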

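4. Filter queries

In the earlier sections we skipped the details of the document score (the _score field in the search results). The score is a relative measure of how well a document matches the search query: the higher the score, the more relevant the document; the lower the score, the less relevant.

Every query in Elasticsearch triggers a relevance score computation. For cases where the score is not needed, Elasticsearch offers another kind of query capability in the form of filters. Filters are conceptually similar to queries but execute much faster, mainly for two reasons:

- Filters do not compute relevance scores, so they are cheaper to evaluate.
- Filters can be cached in memory, which makes repeated searches much faster than the equivalent queries.

To understand filters, we first introduce the filtered query, which lets you combine a query (such as match_all, match, or bool) with a filter. As an example, consider the range filter, which filters documents by a range of values. It is typically used for numeric and date filtering.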

Find the accounts with a balance between 20000 and 30000:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query":{
    "filtered":{
      "query":{
        "match_all":{}
      },
      "filter":{
        "range":{
          "balance":{
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}'

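Breaking down the example above: the filtered query contains a match_all query (the query part) and a filter (the filter part). Any other query can go in the query part, and any other filter in the filter part. In this scenario every document inside the range is equally relevant, with none more relevant than another, so a range filter is a good fit.

In general, to decide between a filter and a query, ask yourself whether you need the relevance score. If relevance does not matter, use a filter; otherwise use a query. If you come from a SQL background, queries and filters are both conceptually similar to the WHERE clause of a SELECT statement, although more so for filters than for queries.

Besides match_all, match, bool, filtered, and range, there are many other query and filter types that we do not cover here. With a basic understanding of how these work, applying that knowledge to the other types should not be difficult.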
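The filtered query belongs to the 1.x query DSL used throughout these notes. It was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same restriction is written as a bool query with a filter clause. A sketch of that equivalent for the balance range example, assuming a cluster running 2.0 or later; the function name bool_filter_body is made up for illustration.

```shell
# Same balance range restriction as the filtered query, written as bool + filter.
# Assumes Elasticsearch 2.0 or later.
bool_filter_body() {
  cat <<'EOF'
{
  "query": {
    "bool": {
      "must": {"match_all": {}},
      "filter": {
        "range": {
          "balance": {"gte": 20000, "lte": 30000}
        }
      }
    }
  }
}
EOF
}

bool_filter_body

# To run it (6.0 and later also require the Content-Type header):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' \
#     -H 'Content-Type: application/json' -d "$(bool_filter_body)"
```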

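5. Aggregations

Aggregations provide the ability to group data and compute statistics over it. The simplest way to think about them is as a rough equivalent of SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a single response can contain both the search hits and the aggregation results. You can run a query and several aggregations through one API call and get all the results back at once, which avoids extra network round trips.

To start with:

Example a). Group the accounts by state, with the groups sorted by document count in descending order (the default for a terms aggregation):

curl -XPOST 'localhost:9200/bank/_search?pretty' -d'
{
  "size":0,
  "aggs":{
    "group_by_state":{
      "terms":{
        "field":"state"
      }
    }
  }
}'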

The query above is similar to this SQL statement:

select count(*) from bank group by state order by count(*) desc

Example b). Average account balance per state:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d'
{
  "size":0,
  "aggs":{
    "group_by_state":{
      "terms":{
        "field":"state"
      },
      "aggs":{
        "average_balance":{
          "avg":{
            "field":"balance"
          }
        }
      }
    }
  }
}'

Example c). Average account balance per state, sorted by average balance in descending order:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d'
{
  "size":0,
  "aggs":{
    "group_by_state":{
      "terms":{
        "field":"state",
        "order":{
          "average_balance":"desc"
        }
      },
      "aggs":{
        "average_balance":{
          "avg":{
            "field":"balance"
          }
        }
      }
    }
  }
}'

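Example d). The following example groups by age bracket (20-29, 30-39, 40-49), then by gender, and computes the average account balance for each gender within each age bracket: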

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
},
"aggs": {
"group_by_gender": {
"terms": {
"field": "gender"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
}
}'
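The aggregation examples in this section all set size to 0, so they return no hits. As noted at the start of the section, a single response can carry both hits and aggregation results. A sketch of such a request, assuming we want the first account whose address contains "mill" together with the average balance of all matching accounts. The function name query_with_agg_body is made up for illustration.

```shell
# Hits and an aggregation in one request: the query selects accounts whose
# address mentions "mill", "size" limits the returned hits to one document,
# and the aggregation averages the balance over all matching documents.
query_with_agg_body() {
  cat <<'EOF'
{
  "query": {"match": {"address": "mill"}},
  "size": 1,
  "aggs": {
    "average_balance": {
      "avg": {"field": "balance"}
    }
  }
}
EOF
}

query_with_agg_body

# To run it (requires a running cluster):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' -d "$(query_with_agg_body)"
```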
