Elasticsearch operations

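Adding documents

The type is employee and it lives in the index megacorp. Each employee is indexed as one document that contains all of that employee's information (document-oriented). This employee's id is 1.

An index, a type, and an id are required.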

curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1 --data '{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}'

Add more employees
curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/2 --data '{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}'

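A new index gets 5 primary shards by default (in Elasticsearch 6.x and earlier; 7.0 and later default to 1). The number of primary shards is fixed when the index is created, while the number of replica shards can be changed at any time. The example below creates an index with 3 primary shards and 1 replica. When a document is put into ES, its id is hashed and the document is stored on the shard that the hash maps to.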

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp2 --data '{
>    "settings" : {
>       "number_of_shards" : 3,
>       "number_of_replicas" : 1
>    }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 68

{"acknowledged":true,"shards_acknowledged":true,"index":"megacorp2"}
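
The routing step described above can be sketched in a few lines of Python. This is only an illustration: the function name pick_shard is made up here, and zlib.crc32 is a stand-in for the Murmur3 hash that Elasticsearch uses internally, so the shard numbers will not match a real cluster. It does show why the primary shard count has to stay fixed: the count is part of the formula, so changing it would send existing ids to different shards.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (the document id by default) to a primary shard number."""
    # Stand-in hash (assumption): Elasticsearch itself uses Murmur3, not CRC32.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


if __name__ == "__main__":
    for doc_id in ["1", "2", "TfQ8ymMBtknNDl0i3mwi"]:
        # The same id always lands on the same shard while the shard count is unchanged.
        print(doc_id, "-> shard", pick_shard(doc_id, 3))
```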

When adding data to ES you can also omit the id, and one is generated automatically. This requires a POST request:

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/ --data '{
>    "first_name": "John2",
>    "last_name": "Smith2",
>    "age": 256,
>    "about": "I love to go rock climbing",
>    "interests": [
>       "sports",
>       "music"
>    ]
> }'
HTTP/1.1 201 Created
Location: /megacorp/employee/TfQ8ymMBtknNDl0i3mwi
content-type: application/json; charset=UTF-8
content-length: 179

{"_index":"megacorp","_type":"employee","_id":"TfQ8ymMBtknNDl0i3mwi","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1}

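Updating documents

An update is sent the same way as an add. The response carries a _version that differs from the previous one. On update, Elasticsearch marks the old document as deleted and adds an entirely new document. The old document is cleaned up in the background, but not immediately.

Creating documents

A 409 response means the document already exists and cannot be created. Without op_type=create the request would overwrite (update) the existing document instead. Alternatively, append /_create to the end of the URL.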
[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1?op_type=create --data '{
>    "first_name": "John",
>    "last_name": "Smith",
>    "age": 25,
>    "about": "I love to go rock climbing",
>    "interests": [
>       "sports",
>       "music"
>    ]
> }'
HTTP/1.1 409 Conflict
content-type: application/json; charset=UTF-8
content-length: 445

{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"},"status":409}

Retrieving documents

Given the index, type, and id, return a single document

[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 249

{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}

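pretty

Adding pretty to the request parameters makes the response more readable. Older documentation says that _source is not reformatted and comes back exactly as it was added; in the output below, from this version, _source is pretty-printed as well.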
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 294

{
  "_index" : "megacorp",
  "_type" : "employee",
  "_id" : "1",
  "_version" : 3,
  "found" : true,
  "_source" : {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests" : [
      "sports",
      "music"
    ]
  }
}

Returning selected fields

Return only the fields listed in the _source parameter
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1?_source=first_name,last_name
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 128

{"_index":"megacorp","_type":"employee","_id":"1","_version":3,"found":true,"_source":{"last_name":"Smith","first_name":"John"}}

Returning only the source

Return only the contents of _source
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1/_source
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 162

{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}

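Return all documents (at most ten by default):

/_search
    Search all types in all indices
/gb/_search
    Search all types in the gb index
/gb,us/_search
    Search all types in the gb and us indices
/g*,u*/_search
    Search all types in any index whose name starts with g or u
/gb/user/_search
    Search the user type in the gb index
/gb,us/user,tweet/_search
    Search the user and tweet types in the gb and us indices
/_all/user,tweet/_search
    Search the user and tweet types in all indices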
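
The path patterns above follow one rule: comma-joined indices, then comma-joined types, then _search, with _all standing in when types are given without indices. A small sketch of that rule, assuming a hypothetical helper named search_path (it is not part of any client library):

```python
def search_path(indices=(), types=()):
    """Build a search path from index names and type names."""
    index_part = ",".join(indices)
    type_part = ",".join(types)
    # Types cannot appear without an index segment, so fall back to _all.
    if type_part and not index_part:
        index_part = "_all"
    parts = [part for part in (index_part, type_part) if part]
    return "/" + "/".join(parts + ["_search"])


if __name__ == "__main__":
    print(search_path())
    print(search_path(["gb", "us"]))
    print(search_path(["gb", "us"], ["user", "tweet"]))
    print(search_path(types=["user", "tweet"]))
```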

//This example returns all documents of type employee in the index megacorp
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 611

{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}]}}

Pagination

GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10

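For a paged request, every shard sorts its own results and returns them to be merged. Paging too deep makes the cost climb steeply, because each shard has to produce from + size results before the page can be cut out.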
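
A rough way to see the cost of deep paging is to count how many sorted entries have to be collected across all shards for one page. The sketch below is a simplified model and not a measurement: the function name docs_gathered is hypothetical, and it assumes every shard returns its top from + size entries.

```python
def docs_gathered(from_: int, size: int, shards: int) -> int:
    """Number of sorted entries collected across all shards for a single page."""
    per_shard = from_ + size  # each shard must rank this many hits locally
    return per_shard * shards


if __name__ == "__main__":
    # First page of 5 on a 5-shard index versus a page that starts at offset 10000.
    print(docs_gathered(0, 5, 5))
    print(docs_gathered(10000, 5, 5))
```

Only size documents reach the caller in both cases; the rest are sorted and thrown away.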

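Full-text search

Returns the documents related to the search terms, together with a relevance score.

Form 1, in the URL. Spaces and other special characters cannot be written literally here and must be percent-encoded: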

curl -X GET -i 'http://focuson1:9200/megacorp/employee/_search?q=about:like'

A + prefix on a condition means it must match, and a - prefix means it must not match. A condition with neither prefix is optional.

http://focuson1:9200/megacorp/employee/_search?q=-about:to%20go
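
Percent-encoding by hand is easy to get wrong, in particular for +, which would otherwise be read as a space in a query string. A minimal sketch with Python's standard urllib.parse.quote; the helper name query_string_url is made up for this example:

```python
from urllib.parse import quote


def query_string_url(base: str, q: str) -> str:
    """Append a percent-encoded q= parameter to a search URL."""
    # Keep ':' readable; spaces become %20 and '+' becomes %2B.
    return f"{base}?q={quote(q, safe=':')}"


if __name__ == "__main__":
    base = "http://focuson1:9200/megacorp/employee/_search"
    print(query_string_url(base, "-about:to go"))
    print(query_string_url(base, "+last_name:smith"))
```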

Form 2, using match:

[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search  -H 'Content-Type: application/json'  --data '{
>     "query" : {
>         "match" : {
>             "about" : "rock climbing"
>         }
>     }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 629

{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}},{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}}]}}

Phrase search

match_phrase only matches documents that contain the exact phrase

[root@focuson1 ~]# curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     }
> }'
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}]}}

Highlighted search

Shows the user why a document matched. Both the JSON request and the response contain a highlight section

[root@focuson1 ~]# curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     },
>     "highlight": {
>         "fields" : {
>             "about" : {}
>         }
>     }
> }
> '
{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
},"highlight":{"about":["I love to go <em>rock</em> <em>climbing</em>"]}}]}}

Aggregation and analysis

Find employees whose last_name is Smith and whose age is greater than 30 (gt stands for "greater than")

[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search -H 'Content-Type: application/json' --data '{
>     "query" : {
>         "bool": {
>             "must": {
>                 "match" : {
>                     "last_name" : "smith" 
>                 }
>             },
>             "filter": {
>                 "range" : {
>                     "age" : { "gt" : 30 } 
>                 }
>             }
>         }
>     }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 388

{"took":153,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}}]}}

Deleting documents and indices. A deleted document is not removed immediately; it is only marked as deleted.

Delete a document
curl -X DELETE -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 160

{"_index":"megacorp","_type":"employee","_id":"1","_version":5,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":2}

Delete an index
[root@focuson1 ~]# curl -X DELETE -i http://focuson1:9200/megacorp
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 21

{"acknowledged":true}

Check cluster health

[root@focuson1 ~]# curl http://focuson1:9200/_cluster/health
{"cluster_name":"elasticsearch","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":5,"active_shards":5,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":5,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}
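
The health response is easier to read once the status and the shard counts are pulled out. The sketch below parses such a response with the standard json module; the helper name summarize_health is hypothetical. A yellow status on a single-node cluster is expected, since a replica is never placed on the same node as its primary, which matches the 5 unassigned shards above.

```python
import json

# Meaning of the three status values.
MEANINGS = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primary shards are allocated, but some replicas are not",
    "red": "at least one primary shard is not allocated",
}


def summarize_health(raw_json: str) -> dict:
    """Extract the status, its meaning, and the fraction of active shards."""
    health = json.loads(raw_json)
    active = health["active_shards"]
    total = active + health["unassigned_shards"] + health.get("initializing_shards", 0)
    return {
        "status": health["status"],
        "meaning": MEANINGS[health["status"]],
        "active_fraction": active / total if total else 1.0,
    }


if __name__ == "__main__":
    sample = '{"status":"yellow","active_shards":5,"unassigned_shards":5,"initializing_shards":0}'
    print(summarize_health(sample))
```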

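The lost-update problem

Databases offer pessimistic and optimistic locking. Pessimistic locking assumes that any update could be lost, so it takes a lock as soon as the data is read, and nobody else can operate on the data until the lock is released. Optimistic locking assumes that conflicts are rare and takes no lock on read. Instead each record carries a version number: the version is read together with the data, and the update only goes through if the version is still the same. If somebody else has changed the record in the meantime, the update is rejected, which also prevents lost updates. Because nothing is locked, optimistic locking is usually more efficient.

Elasticsearch can use optimistic locking because every document has a version number. For example, when a web page loads records from ES, each record comes with its version. An update or delete then passes that version as a condition, and it fails if the document's current version is different.

Example: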

[root@focuson1 ~]# curl -X GET  http://focuson1:9200/megacorp/employee/2
{"_index":"megacorp","_type":"employee","_id":"2","_version":2,"found":true,"_source":{  
    "first_name" :  "Jane",  
    "last_name" :   "Smith",  
    "age" :         32,  
    "about" :       "I like to collect rock albums",  
    "interests":  [ "music" ]  
}}

The document is at version 2, so the update is made conditional on version=2:

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{  
>     "first_name" :  "Jane3",  
>     "last_name" :   "Smith3",  
>     "age" :         32,  
>     "about" :       "I like to collect rock albums",  
>     "interests":  [ "music" ]  
> }' 
{"_index":"megacorp","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1}

The version is now 3. Updating again with version=2 fails with status 409:

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{  
    "first_name" :  "Jane3",  
    "last_name" :   "Smith3",  
    "age" :         32,  
    "about" :       "I like to collect rock albums",  
    "interests":  [ "music" ]  
}' 
{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"},"status":409}
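
The version check seen in the two requests above can be modelled with a small in-memory store. This is a toy model, not the Elasticsearch implementation: TinyStore and VersionConflict are names invented for the example, and only the compare-then-increment behaviour is reproduced.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored one (like HTTP 409)."""


class TinyStore:
    """Toy document store that keeps a version number per document id."""

    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, source, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # With a version given, write only if nobody changed the document since it was read.
        if version is not None and version != current:
            raise VersionConflict(
                f"current version [{current}] is different than the one provided [{version}]"
            )
        self._docs[doc_id] = (current + 1, source)
        return current + 1


if __name__ == "__main__":
    store = TinyStore()
    store.put("2", {"first_name": "Jane"})            # version 1
    store.put("2", {"first_name": "Jane"})            # version 2
    print(store.put("2", {"first_name": "Jane3"}, version=2))
    try:
        store.put("2", {"first_name": "Jane3"}, version=2)
    except VersionConflict as error:
        print("conflict:", error)
```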

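Using an external version number

When indexing a document:

If the document's current version is lower than 123, the write succeeds and the version becomes 123. If it is greater than or equal to 123, a 409 is returned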
[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2?version=123&version_type=external' --data '{  
>     "first_name" :  "Jane",  
>     "last_name" :   "Smith",  
>     "age" :         32,  
>     "about" :       "I like to collect rock albums",  
>     "interests":  [ "music" ]  
> }'  
{"_index":"megacorp","_type":"employee","_id":"2","_version":123,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1}
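
With version_type=external the comparison changes from "equal to the current version" to "strictly greater than the current version". A one-function sketch of that rule; the function name accepts_external_version is hypothetical:

```python
def accepts_external_version(current, provided):
    """Return True if a write carrying an external version would be accepted."""
    # A document that does not exist yet (current is None) accepts any version.
    return current is None or provided > current


if __name__ == "__main__":
    print(accepts_external_version(3, 123))    # stored version is lower
    print(accepts_external_version(123, 123))  # equal versions are rejected
```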

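Partial document updates

Fields placed inside doc are merged into the document: existing fields are updated, missing ones are added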

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0
>    }
> }'  
{"_index":"megacorp","_type":"employee","_id":"2","_version":125,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":6,"_primary_term":1}
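
The merge that a doc update performs can be sketched as a recursive dictionary merge: nested objects are merged key by key, while scalars and arrays are replaced. The function name merge_doc is made up for this illustration, and the sketch leaves out versioning and reindexing.

```python
def merge_doc(source: dict, doc: dict) -> dict:
    """Return a copy of source with the partial document merged into it."""
    merged = dict(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects are merged recursively.
            merged[key] = merge_doc(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = value
    return merged


if __name__ == "__main__":
    source = {"first_name": "Jane", "age": 32}
    print(merge_doc(source, {"tags": ["testing"], "views": 0}))
```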

Partially update a document with a script: increment the nimei field by 1
[root@focuson1 ~]# curl -X POST "http://focuson1:9200/megacorp/employee/2/_update" -H 'Content-Type: application/json' -d'
> {
>    "script" : "ctx._source.nimei+=1"
> }
> '
{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":8,"_primary_term":1}

[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/2?pretty
{
  "_index" : "megacorp",
  "_type" : "employee",
  "_id" : "2",
  "_version" : 127,
  "found" : true,
  "_source" : {
    "doc" : {
      "tags" : [
        "testing"
      ],
      "views" : 0
    },
    "views" : 0,
    "tags" : [
      "testing"
    ],
    "nimei" : 1234567891
  }
}

upsert: if the document to update does not exist, create it first

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0,
>       "nimei":1234567890
>    },
>    "upsert": {}
> }' 
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

//The result below shows that a missing document is created from the upsert body, but the fields in doc are not applied on that first request
[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"found":true,"_source":{}}

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0,
>       "nimei":1234567890
>    },
>    "upsert": {}
> }' 
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}

[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"found":true,"_source":{"nimei":1234567890,"views":0,"tags":["testing"]}}
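
The two-step behaviour in the transcript above (an empty document first, the doc fields only on the second call) follows from how the request is interpreted: upsert is used when the document is missing, doc only when it exists. A sketch of that decision, using a hypothetical function apply_update and a shallow merge for brevity. The doc_as_upsert flag mirrors the request option of the same name, which makes doc itself serve as the initial document.

```python
def apply_update(existing, doc, upsert=None, doc_as_upsert=False):
    """Return the new _source after an update request with doc and optional upsert."""
    if existing is None:
        if doc_as_upsert:
            # The partial document itself becomes the new document.
            return dict(doc)
        if upsert is None:
            raise KeyError("document missing")
        # The document is created from the upsert body; doc is ignored this time.
        return dict(upsert)
    # The document exists: merge doc into it (shallow merge in this sketch).
    return {**existing, **doc}


if __name__ == "__main__":
    doc = {"tags": ["testing"], "views": 0, "nimei": 1234567890}
    first = apply_update(None, doc, upsert={})
    second = apply_update(first, doc, upsert={})
    print(first)
    print(second)
```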

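Retrying updates

In application code we can use optimistic locking and pass the version on every request, so that conflicting writes are rejected instead of being silently lost. When no version is passed, an update first retrieves the document, reads its version, and then reindexes it, and a conflict can still occur between those steps. In that case the update can be retried with the retry_on_conflict parameter, which defaults to 0.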

curl -X POST "localhost:9200/website/pageviews/1/_update?retry_on_conflict=5" -H 'Content-Type: application/json' -d'
{
   "script" : "ctx._source.views+=1",
   "upsert": {
       "views": 0
   }
}
'
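
What retry_on_conflict does on the server can be pictured as a read, modify, write loop that starts over when the write is rejected. The sketch below is a client-side analogy under that assumption: update_with_retry, read, write, and change are hypothetical names, and VersionConflict stands for a 409 response.

```python
class VersionConflict(Exception):
    """Stands for a 409 response to a versioned write."""


def update_with_retry(read, write, change, retry_on_conflict=0):
    """Re-read and re-apply a change until the versioned write succeeds.

    read() returns (version, source); write(source, version) raises
    VersionConflict when the version is stale; change(source) returns the new source.
    """
    attempts = retry_on_conflict + 1
    for attempt in range(attempts):
        version, source = read()
        try:
            return write(change(source), version)
        except VersionConflict:
            if attempt == attempts - 1:
                raise  # retries exhausted, surface the conflict to the caller
```

This works well for changes like incrementing a counter, where the order of the increments does not matter.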

Retrieving multiple documents

[root@focuson1 ~]# curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/_mget' --data '{
>    "docs" : [
>       {
>          "_index" : "megacorp",
>          "_type" :  "employee",
>          "_id" :    1
>       },
>       {
>          "_index" : "megacorp",
>          "_type" :  "employee",
>          "_id" :    2,
>          "_source": "first_name"
>       }
>    ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{  
   "first_name": "John",  
   "last_name": "Smith",  
   "age": 25,  
   "about": "I love to go rock climbing",  
   "interests": [  
      "sports",  
      "music"  
   ]  
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}

If all the documents are in the same index or the same type, the index or type can be put in the URL
[root@focuson1 ~]# curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/_mget' --data '{
>    "docs" : [
>       {
>          "_id" :    1
>       },
>       {
>          "_id" :    2,
>          "_source": "first_name"
>       }
>    ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{  
   "first_name": "John",  
   "last_name": "Smith",  
   "age": 25,  
   "about": "I love to go rock climbing",  
   "interests": [  
      "sports",  
      "music"  
   ]  
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}

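Bulk operations (bulk)

The following actions are available: create (create a document), index (create a document or replace an existing one), update (update a document), delete (delete a document)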
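
A bulk request body is newline-delimited JSON: one line with the action and its metadata, followed by one line with the document for every action except delete, and a final newline at the end. The sketch below only builds that body as a string; the helper name bulk_body and the sample documents are made up for the example. The result would be sent to the _bulk endpoint with Content-Type: application/x-ndjson.

```python
import json


def bulk_body(actions):
    """Build a newline-delimited bulk body from (action, metadata, payload) tuples."""
    lines = []
    for action, meta, payload in actions:
        lines.append(json.dumps({action: meta}))
        # delete has no payload line; update expects a payload such as {"doc": {...}}.
        if action != "delete":
            lines.append(json.dumps(payload))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    meta = {"_index": "megacorp", "_type": "employee"}
    print(bulk_body([
        ("index", {**meta, "_id": "3"}, {"first_name": "Douglas"}),
        ("update", {**meta, "_id": "2"}, {"doc": {"views": 1}}),
        ("delete", {**meta, "_id": "1"}, None),
    ]), end="")
```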