Elasticsearch operations

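Adding documents

The type is employee and it lives in the index megacorp. Each employee is indexed as one document that holds all of that employee's information (document-oriented). This employee's id is 1.

An index, a type, and an id are required.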

curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1 --data '{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}'

Add more employees:
curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/2 --data '{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}'

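When an index is created it gets 5 primary shards by default (in the Elasticsearch version used here). The number of primary shards is fixed when the index is created, whereas the number of replica shards can be changed at any time. When a document is put into es, its id is hashed and the document is stored on the shard that the hash maps to. For example, the following creates an index with 3 primary shards and 1 replica: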

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp2 --data '{
>    "settings" : {
>       "number_of_shards" : 3,
>       "number_of_replicas" : 1
>    }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 68

{"acknowledged":true,"shards_acknowledged":true,"index":"megacorp2"}
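
The routing rule described above (hash the id, then pick a shard) can be sketched roughly as follows. This is only an illustration, not Elasticsearch's implementation: zlib.crc32 is a stand-in for the real hash function, and pick_shard is a hypothetical helper name. It also hints at why the number of primary shards has to stay fixed: with a different shard count, the same id would map somewhere else.

```python
import zlib


def pick_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Map a document id to a primary shard number.

    Assumption: crc32 is only a stand-in hash used for illustration.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


if __name__ == "__main__":
    for doc_id in ["1", "2", "TfQ8ymMBtknNDl0i3mwi"]:
        # The same id can land on a different shard when the shard count changes.
        print(doc_id, "->", pick_shard(doc_id, 3), "(3 shards),",
              pick_shard(doc_id, 5), "(5 shards)")
```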

When adding data to es you can also omit the id and let es generate one. This requires a POST request, as follows:

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/ --data '{
>    "first_name": "John2",
>    "last_name": "Smith2",
>    "age": 256,
>    "about": "I love to go rock climbing",
>    "interests": [
>       "sports",
>       "music"
>    ]
> }'
HTTP/1.1 201 Created
Location: /megacorp/employee/TfQ8ymMBtknNDl0i3mwi
content-type: application/json; charset=UTF-8
content-length: 179

{"_index":"megacorp","_type":"employee","_id":"TfQ8ymMBtknNDl0i3mwi","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1}

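Updating documents

An update uses the same request as adding a document. The response carries a _version that differs from the previous one. On update, elasticsearch marks the old document as deleted and adds a completely new document; the old one is cleaned up in the background, but not immediately.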

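Creating documents

With op_type=create the request only creates: a 409 response means the document already exists and cannot be created. Without op_type=create the same request would overwrite (update) the existing document. Appending /_create to the URL has the same effect.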
[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1?op_type=create --data '{
>    "first_name": "John",
>    "last_name": "Smith",
>    "age": 25,
>    "about": "I love to go rock climbing",
>    "interests": [
>       "sports",
>       "music"
>    ]
> }'
HTTP/1.1 409 Conflict
content-type: application/json; charset=UTF-8
content-length: 445

{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"},"status":409}

Retrieving documents:

  • Fetch a single document by index, type, and id
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 249

{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}
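  • pretty
Adding pretty to the request parameters pretty-prints the response so it is easier to read. The _source is meant to come back in the same form in which it was indexed, although in the output below it appears re-indented as well.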
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 294

{
  "_index" : "megacorp",
  "_type" : "employee",
  "_id" : "1",
  "_version" : 3,
  "found" : true,
  "_source" : {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests" : [
      "sports",
      "music"
    ]
  }
}
  • Return only some fields
Use the _source parameter to return only the listed fields:
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1?_source=first_name,last_name
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 128

{"_index":"megacorp","_type":"employee","_id":"1","_version":3,"found":true,"_source":{"last_name":"Smith","first_name":"John"}}
  • Return only the _source
Use the _source endpoint to return only the contents of _source:
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/1/_source
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 162

{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}

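  • Return all documents (at most ten by default):
/_search
    Search all types in all indices
/gb/_search
    Search all types in the gb index
/gb,us/_search
    Search all types in the gb and us indices
/g*,u*/_search
    Search all types in any index whose name starts with g or u
/gb/user/_search
    Search the user type in the gb index
/gb,us/user,tweet/_search
    Search the user and tweet types in the gb and us indices
/_all/user,tweet/_search
    Search the user and tweet types in all indices
// This example returns all documents of type employee in the index megacorp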
[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 611

{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}]}}
  • Pagination

GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
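For each page, every shard sorts its own results and returns its top from + size hits to the coordinating node, which then sorts all of them again. The cost therefore grows with the page depth multiplied by the number of shards, so paging too deep becomes very expensive.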
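
To get a feel for the cost of deep paging, here is a small back-of-the-envelope sketch. It only counts how many hits the coordinating node has to collect and sort for one page, assuming each shard returns its own top from + size hits; hits_to_sort is a hypothetical helper name, not an Elasticsearch API.

```python
def hits_to_sort(size: int, from_: int, shards: int) -> int:
    """Number of hits the coordinating node collects for a single page."""
    per_shard = from_ + size  # each shard returns its own top from + size hits
    return per_shard * shards


if __name__ == "__main__":
    for from_ in (0, 5, 10, 10000):
        total = hits_to_sort(size=5, from_=from_, shards=5)
        print(f"size=5 from={from_}: {total} hits collected for 5 results")
```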

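Full-text search

Returns the documents related to the search terms, together with a relevance score.

Form 1: a query string in the URL. Spaces and other special characters cannot be written literally here and have to be URL-encoded: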

curl -X GET -i 'http://focuson1:9200/megacorp/employee/_search?q=about:like'
A + prefix on a condition means it must match; a - prefix means it must not match; a condition with neither + nor - is optional.
http://focuson1:9200/megacorp/employee/_search?q=-about:to%20go
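
Since spaces and characters such as + have to be encoded in the URL form, a small sketch of building such a URL may help. It uses urllib.parse.quote from the Python standard library; search_url is a hypothetical helper name, and the host and index are the ones from the examples above.

```python
from urllib.parse import quote


def search_url(base: str, query: str) -> str:
    """Build a query-string search URL with the q parameter percent-encoded."""
    # ':' is kept readable; spaces become %20 and '+' becomes %2B.
    return f"{base}/_search?q={quote(query, safe=':')}"


if __name__ == "__main__":
    base = "http://focuson1:9200/megacorp/employee"
    print(search_url(base, "-about:to go"))
    print(search_url(base, "+about:like"))
```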

Form 2: a match query in the request body:

[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search  -H 'Content-Type: application/json'  --data '{
>     "query" : {
>         "match" : {
>             "about" : "rock climbing"
>         }
>     }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 629

{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}},{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}}]}}

Phrase search

match_phrase only returns documents that contain the terms as a phrase, that is, adjacent and in the same order.

[root@focuson1 ~]# curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     }
> }'
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
}}]}}
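
The difference between match and match_phrase in the two results above can be sketched with a greatly simplified model. Assumptions: lowercasing plus whitespace splitting stands in for the real analyzer, there is no scoring, and the function names are hypothetical.

```python
def tokens(text: str) -> list:
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_text: str, query: str) -> bool:
    """True if the field contains at least one of the query terms."""
    field = set(tokens(field_text))
    return any(term in field for term in tokens(query))


def match_phrase(field_text: str, query: str) -> bool:
    """True if the query terms appear next to each other, in order."""
    field, phrase = tokens(field_text), tokens(query)
    return any(field[i:i + len(phrase)] == phrase
               for i in range(len(field) - len(phrase) + 1))


if __name__ == "__main__":
    john = "I love to go rock climbing"
    jane = "I like to collect rock albums"
    for name, about in (("John", john), ("Jane", jane)):
        print(name, "match:", match(about, "rock climbing"),
              "match_phrase:", match_phrase(about, "rock climbing"))
```

In this model both documents satisfy match (each contains "rock"), but only John's satisfies match_phrase, which lines up with the hit counts of 2 and 1 in the responses above.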

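Highlighted search

Highlighting lets users see why a document matched. The request JSON gains a highlight section, and so does each hit in the response.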

[root@focuson1 ~]# curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     },
>     "highlight": {
>         "fields" : {
>             "about" : {}
>         }
>     }
> }
> '
{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
   "first_name": "John",
   "last_name": "Smith",
   "age": 25,
   "about": "I love to go rock climbing",
   "interests": [
      "sports",
      "music"
   ]
},"highlight":{"about":["I love to go <em>rock</em> <em>climbing</em>"]}}]}}

Combining a query with a filter

Find employees whose last_name is Smith and whose age is greater than 30 (gt stands for "greater than"):

[root@focuson1 ~]# curl -X GET -i http://focuson1:9200/megacorp/employee/_search -H 'Content-Type: application/json' --data '{
>     "query" : {
>         "bool": {
>             "must": {
>                 "match" : {
>                     "last_name" : "smith" 
>                 }
>             },
>             "filter": {
>                 "range" : {
>                     "age" : { "gt" : 30 } 
>                 }
>             }
>         }
>     }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 388

{"took":153,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}}]}}
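
When such a request is sent from a program, the body is usually assembled as a data structure rather than typed by hand. A minimal sketch, assuming only the query shape shown above; name_and_min_age_query is a hypothetical helper, and nothing is sent to a server here.

```python
import json


def name_and_min_age_query(last_name: str, min_age: int) -> dict:
    """Build a bool query: match on last_name, filtered by age > min_age."""
    return {
        "query": {
            "bool": {
                # must: full-text match, contributes to the relevance score
                "must": {"match": {"last_name": last_name}},
                # filter: yes/no condition, does not affect the score
                "filter": {"range": {"age": {"gt": min_age}}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(name_and_min_age_query("smith", 30), indent=2))
```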

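Deleting documents and indices. A deleted document is not physically removed right away; it is only marked as deleted.

Delete a document: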
curl -X DELETE -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 160

{"_index":"megacorp","_type":"employee","_id":"1","_version":5,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":2}
Delete an index:
[root@focuson1 ~]# curl -X DELETE -i http://focuson1:9200/megacorp
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 21

{"acknowledged":true}

Checking cluster health

[root@focuson1 ~]# curl http://focuson1:9200/_cluster/health
{"cluster_name":"elasticsearch","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":5,"active_shards":5,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":5,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}
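
The status of yellow above follows from the shard counts: all 5 primaries are active, but the 5 replicas are unassigned because there is only one node. A small sketch of that rule (green: every shard active, yellow: all primaries active but some replicas not, red: some primary not active); health_status is a hypothetical helper, and the numbers are taken from the response above.

```python
def health_status(active_primary: int, total_primary: int,
                  active_shards: int, total_shards: int) -> str:
    """Derive green / yellow / red from shard counts."""
    if active_primary < total_primary:
        return "red"      # at least one primary shard is missing
    if active_shards < total_shards:
        return "yellow"   # primaries are fine, some replicas are unassigned
    return "green"


if __name__ == "__main__":
    active_primary, active, unassigned = 5, 5, 5
    total = active + unassigned
    print(health_status(active_primary, 5, active, total))
    print("active_shards_percent:", 100.0 * active / total)
```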

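The lost-update problem

At the database level there are pessimistic and optimistic locks. A pessimistic lock assumes that any update may be lost, so it locks the data as soon as it is read; nobody else can operate on it until the lock is released. An optimistic lock assumes that conflicts are unlikely and does not lock on read. Instead, every record carries a version number: the version is read together with the data, and the update only succeeds if the version is still the same. If someone else has changed the record in the meantime, the update is rejected. This also prevents lost updates, and because nothing is blocked it is usually more efficient when conflicts are rare.

elasticsearch is a natural fit for optimistic locking because every document has a version number. For example, when a web page loads data from es, each item comes with its version; an update or delete then passes the version seen at load time as a condition, and fails if the current version is different.

An example: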

[root@focuson1 ~]# curl -X GET  http://focuson1:9200/megacorp/employee/2
{"_index":"megacorp","_type":"employee","_id":"2","_version":2,"found":true,"_source":{  
    "first_name" :  "Jane",  
    "last_name" :   "Smith",  
    "age" :         32,  
    "about" :       "I like to collect rock albums",  
    "interests":  [ "music" ]  
}}

The document is at version 2, so the update states that it applies on top of version=2:

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{  
>     "first_name" :  "Jane3",  
>     "last_name" :   "Smith3",  
>     "age" :         32,  
>     "about" :       "I like to collect rock albums",  
>     "interests":  [ "music" ]  
> }' 
{"_index":"megacorp","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1}

The version is now 3. Another update with version=2 fails and returns status 409:

[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{  
    "first_name" :  "Jane3",  
    "last_name" :   "Smith3",  
    "age" :         32,  
    "about" :       "I like to collect rock albums",  
    "interests":  [ "music" ]  
}' 
{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"},"status":409}
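
The version check in the two requests above can be modelled with a tiny in-memory store. This is only a sketch of the optimistic-locking idea, not how Elasticsearch stores documents; VersionedStore and VersionConflict are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when the expected version does not match (think HTTP 409)."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def put(self, doc_id, source, version=None):
        """Write a document; if version is given, it must equal the current one."""
        current = self._docs.get(doc_id, (0, None))[0]
        if version is not None and version != current:
            raise VersionConflict(
                f"current version [{current}] is different than the one provided [{version}]")
        self._docs[doc_id] = (current + 1, source)
        return current + 1  # the new version


if __name__ == "__main__":
    store = VersionedStore()
    store.put("2", {"first_name": "Jane"})                     # version 1
    store.put("2", {"first_name": "Jane2"})                    # version 2
    print(store.put("2", {"first_name": "Jane3"}, version=2))  # succeeds
    try:
        store.put("2", {"first_name": "Jane3"}, version=2)     # stale version
    except VersionConflict as exc:
        print("conflict:", exc)
```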

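Using an external version number

When indexing a document:

If the current version is lower than 123, the document is written and its version becomes 123; if the current version is 123 or higher, a 409 is returned.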
[root@focuson1 ~]# curl -X PUT -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2?version=123&version_type=external' --data '{  
>     "first_name" :  "Jane",  
>     "last_name" :   "Smith",  
>     "age" :         32,  
>     "about" :       "I like to collect rock albums",  
>     "interests":  [ "music" ]  
> }'  
{"_index":"megacorp","_type":"employee","_id":"2","_version":123,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1}
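
The rule for external versions stated above fits in a one-line check. A minimal sketch, with accepts_external_version as a hypothetical helper name; None stands for a document that does not exist yet.

```python
from typing import Optional


def accepts_external_version(current: Optional[int], provided: int) -> bool:
    """With version_type=external, a write is accepted only if the provided
    version is strictly greater than the current one (or the doc is new)."""
    return current is None or provided > current


if __name__ == "__main__":
    for current in (None, 3, 123, 124):
        verdict = "ok" if accepts_external_version(current, 123) else "409"
        print(f"current={current} provided=123 -> {verdict}")
```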

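Partial document updates

Fields placed inside doc are merged into the existing document: fields that already exist are overwritten and missing ones are added.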
[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0
>    }
> }'  
{"_index":"megacorp","_type":"employee","_id":"2","_version":125,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":6,"_primary_term":1}
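
The merge behaviour of doc can be sketched as a recursive dictionary merge: nested objects are merged key by key, everything else (including arrays) is replaced. This is a simplified model with a hypothetical helper name merge_doc, and the sample document is made up for illustration.

```python
def merge_doc(source: dict, doc: dict) -> dict:
    """Return a copy of source with the fields of doc merged in."""
    merged = dict(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_doc(merged[key], value)  # merge nested objects
        else:
            merged[key] = value  # add a new field or overwrite an existing one
    return merged


if __name__ == "__main__":
    existing = {"first_name": "Jane", "views": 3}
    print(merge_doc(existing, {"tags": ["testing"], "views": 0}))
```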
Partially update a document with a script, adding 1 to the nimei field:
[root@focuson1 ~]# curl -X POST "http://focuson1:9200/megacorp/employee/2/_update" -H 'Content-Type: application/json' -d'
> {
>    "script" : "ctx._source.nimei+=1"
> }
> '
{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":8,"_primary_term":1}
[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/2?pretty
{
  "_index" : "megacorp",
  "_type" : "employee",
  "_id" : "2",
  "_version" : 127,
  "found" : true,
  "_source" : {
    "doc" : {
      "tags" : [
        "testing"
      ],
      "views" : 0
    },
    "views" : 0,
    "tags" : [
      "testing"
    ],
    "nimei" : 1234567891
  }
}

upsert: if the document being updated does not exist, create it first

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0,
>       "nimei":1234567890
>    },
>    "upsert": {}
> }' 
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
// The result below shows that a missing document is created from upsert, but the contents of doc are not applied on that first request
[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"found":true,"_source":{}}

[root@focuson1 ~]# curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{  
>    "doc" : {
>       "tags" : [ "testing" ],
>       "views": 0,
>       "nimei":1234567890
>    },
>    "upsert": {}
> }' 
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}

[root@focuson1 ~]# curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"found":true,"_source":{"nimei":1234567890,"views":0,"tags":["testing"]}}
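
The behaviour seen in the two requests above (first call stores the empty upsert, second call applies doc) can be reproduced with a small in-memory model. The update function and the dict-based store are hypothetical; doc_as_upsert mirrors the request option of the same name, which asks for doc itself to be used when the document is missing. The merge here is shallow to keep the sketch short.

```python
def update(store: dict, doc_id: str, doc: dict,
           upsert: dict = None, doc_as_upsert: bool = False) -> str:
    """Apply a partial update with upsert semantics to an in-memory store."""
    if doc_id in store:
        store[doc_id] = {**store[doc_id], **doc}  # document exists: apply doc
        return "updated"
    if doc_as_upsert:
        store[doc_id] = dict(doc)                 # missing: doc becomes the document
        return "created"
    if upsert is None:
        raise KeyError(f"document {doc_id} is missing")
    store[doc_id] = dict(upsert)                  # missing: only upsert is stored
    return "created"


if __name__ == "__main__":
    store = {}
    doc = {"tags": ["testing"], "views": 0, "nimei": 1234567890}
    print(update(store, "100", doc, upsert={}), store["100"])
    print(update(store, "100", doc, upsert={}), store["100"])
```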

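Retrying updates

In application code we can use optimistic locking and pass the version on every write, so a conflicting write is rejected instead of silently overwriting data. When no version is passed, an update first retrieves the document (and its version) and then reindexes it; another write can land in between, which causes a conflict. The retry_on_conflict parameter sets how many times the update is retried in that case. The default is 0.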

curl -X POST "localhost:9200/website/pageviews/1/_update?retry_on_conflict=5" -H 'Content-Type: application/json' -d'
{
   "script" : "ctx._source.views+=1",
   "upsert": {
       "views": 0
   }
}
'
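
What retry_on_conflict does can be sketched as a read, modify, write loop that starts over when the version changed in between. This is a simulation only: the Counter class fakes a number of concurrent writers, and all names here are hypothetical.

```python
class VersionConflict(Exception):
    pass


class Counter:
    """A versioned page-view counter that simulates concurrent writers."""

    def __init__(self, conflicts_before_success: int):
        self.views, self.version = 0, 1
        self._pending_conflicts = conflicts_before_success

    def read(self):
        return self.version, self.views

    def write(self, expected_version: int, views: int):
        if self._pending_conflicts > 0:
            # Another client sneaks in a write between our read and our write.
            self._pending_conflicts -= 1
            self.views += 1
            self.version += 1
        if expected_version != self.version:
            raise VersionConflict()
        self.views, self.version = views, self.version + 1


def increment(counter: Counter, retry_on_conflict: int = 0) -> int:
    """Increment views, retrying on conflict; returns the number of retries used."""
    for attempt in range(retry_on_conflict + 1):
        version, views = counter.read()
        try:
            counter.write(version, views + 1)
            return attempt
        except VersionConflict:
            if attempt == retry_on_conflict:
                raise  # out of retries: surface the conflict


if __name__ == "__main__":
    counter = Counter(conflicts_before_success=2)
    print("retries used:", increment(counter, retry_on_conflict=5))
    print("views:", counter.views)
```

For an operation like incrementing a counter the order of the writes does not matter, which is why simply retrying is safe here.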

Retrieving multiple documents

[root@focuson1 ~]# curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/_mget' --data '{
>    "docs" : [
>       {
>          "_index" : "megacorp",
>          "_type" :  "employee",
>          "_id" :    1
>       },
>       {
>          "_index" : "megacorp",
>          "_type" :  "employee",
>          "_id" :    2,
>          "_source": "first_name"
>       }
>    ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{  
   "first_name": "John",  
   "last_name": "Smith",  
   "age": 25,  
   "about": "I love to go rock climbing",  
   "interests": [  
      "sports",  
      "music"  
   ]  
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}
If all the documents are in the same index, or in the same index and type, the index or type can be put in the URL instead:
[root@focuson1 ~]# curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/_mget' --data '{
>    "docs" : [
>       {
>          "_id" :    1
>       },
>       {
>          "_id" :    2,
>          "_source": "first_name"
>       }
>    ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{  
   "first_name": "John",  
   "last_name": "Smith",  
   "age": 25,  
   "about": "I love to go rock climbing",  
   "interests": [  
      "sports",  
      "music"  
   ]  
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}
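Bulk operations (bulk)

The following actions are available: create (create a document), index (create a document or replace an existing one), update (update a document), and delete (delete a document).

An example: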

[root@focuson1 ~]# curl -X POST "http://focuson1:9200/_bulk" -H 'Content-Type: application/json' -d'
> { "delete": { "_index": "megacorp", "_type": "employee", "_id": "123" }} 
> { "create": { "_index": "megacorp", "_type": "employee", "_id": "123" }}
> { "title":    "My first blog post" }
> { "index":  { "_index": "megacorp", "_type": "employee" }}
> { "title":    "My second blog post" }
> { "update": { "_index": "megacorp", "_type": "employee", "_id": "123", "_retry_on_conflict" : 3} }
> { "doc" : {"title" : "My updated blog post"} }
> '
{"took":87,"errors":false,"items":[{"delete":{"_index":"megacorp","_type":"employee","_id":"123","_version":1,"result":"not_found","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1,"status":404}},{"create":{"_index":"megacorp","_type":"employee","_id":"123","_version":2,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}},{"index":{"_index":"megacorp","_type":"employee","_id":"8iZmy2MBAdBddqEKxy1b","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":9,"_primary_term":1,"status":201}},{"update":{"_index":"megacorp","_type":"employee","_id":"123","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1,"status":200}}]}
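The drawback is that every action has to specify the index and type, which is rather verbose. They can instead be given in the URL; each action then uses the values from the URL by default, and values specified on an action itself take precedence.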
[root@focuson1 ~]# curl -X POST "http://focuson1:9200/megacorp/employee/_bulk" -H 'Content-Type: application/json' -d'
> { "delete": { "_id": "123" }} 
> { "create": { "_id": "123" }}
> { "title":    "My first blog post" }
> { "index":  {}}
> { "title":    "My second blog post" }
> { "update": {"_id": "123", "_retry_on_conflict" : 3} }
> { "doc" : {"title" : "My updated blog post"} }
> '
{"took":31,"errors":false,"items":[{"delete":{"_index":"megacorp","_type":"employee","_id":"123","_version":4,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1,"status":200}},{"create":{"_index":"megacorp","_type":"employee","_id":"123","_version":5,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1,"status":201}},{"index":{"_index":"megacorp","_type":"employee","_id":"8yZpy2MBAdBddqEKSi1M","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":10,"_primary_term":1,"status":201}},{"update":{"_index":"megacorp","_type":"employee","_id":"123","_version":6,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":1,"status":200}}]}
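
The bulk body is newline-delimited JSON: one action line, optionally followed by one source line, each on a single line, and the body ends with a newline. A minimal sketch of assembling such a body in code; bulk_body is a hypothetical helper and nothing is sent to a server.

```python
import json


def bulk_body(actions) -> str:
    """Build a bulk request body from (action, metadata, source) tuples.

    source is None for actions without a body line, such as delete.
    """
    lines = []
    for action, metadata, source in actions:
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body has to end with a newline


if __name__ == "__main__":
    print(bulk_body([
        ("delete", {"_id": "123"}, None),
        ("create", {"_id": "123"}, {"title": "My first blog post"}),
        ("index", {}, {"title": "My second blog post"}),
        ("update", {"_id": "123"}, {"doc": {"title": "My updated blog post"}}),
    ]), end="")
```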




Reposted from blog.csdn.net/focuson_/article/details/80571070