Elasticsearch --- Study Notes (2)

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission. https://blog.csdn.net/D_iRe_Wol_F/article/details/83113789

These are personal study notes only; for details, please see the official ES documentation.

9. Notes ------ SQL plugin

Once the SQL plugin is installed, there are two ways to query data:

  • Still through the URL: call the _sql endpoint and pass the SQL query statement as the request body

    curl -XPOST http://172.16.150.149:29200/_sql?pretty -d "SELECT * FROM facebook"
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "failed" : 0
      },
      "hits" : {
        "total" : 4,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "pretty",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "AWZ668ZcHFL4sAFl7IMI",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "AWZ67I_dHFL4sAFl7IMJ",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "123",
          "_score" : 1.0,
          "_source" : {
            "title" : "change version num",
            "text" : "changing...",
            "views" : 0,
            "tags" : [ "testing" ]
          }
        } ]
      }
    }
    
  • Use the SQL plugin's visual (web) interface
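For scripting the first approach, here is a minimal Python sketch that builds the same request as the curl command above. It only assumes what the notes already show: the plugin accepts the raw SQL text as the POST body on the _sql endpoint. The helper name build_sql_request is my own, and the request is built but not sent, so the snippet runs without a cluster.

```python
import urllib.request


def build_sql_request(base_url, sql):
    """Build (but do not send) a POST to <base_url>/_sql?pretty with the SQL text as body."""
    url = base_url.rstrip("/") + "/_sql?pretty"
    return urllib.request.Request(url, data=sql.encode("utf-8"), method="POST")


req = build_sql_request("http://172.16.150.149:29200", "SELECT * FROM facebook")
print(req.get_method(), req.full_url)

# To actually run the query against a live cluster (not executed here):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```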

10. Notes ------ GET multiple documents

The mget API expects a docs array as its parameter. Each element holds the metadata of one document to retrieve: _index, _type, and _id.

When all documents share the same _index and _type, put those in the URL and simply pass an ids array:

curl -i -XGET 'http://172.16.150.149:29200/facebook/blog/_mget?pretty' -d '{"ids":["123","888"]}'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 504

{
  "docs" : [ {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "123",
    "_version" : 121,
    "found" : true,
    "_source" : {
      "title" : "change version num",
      "text" : "changing...",
      "views" : 0,
      "tags" : [ "testing" ]
    }
  }, {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "888",
    "_version" : 1,
    "found" : true,
    "_source" : {
      "title" : "website",
      "text" : "new test is made",
      "date" : "2018/10/17"
    }
  } ]
}
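The two request shapes (a full docs array versus the ids shortcut) can be sketched in Python as follows. This is only an illustration of the rule described above; build_mget_body is a hypothetical helper of my own, and it just produces the URL path and JSON body without contacting a cluster.

```python
import json


def build_mget_body(docs):
    """docs is a list of (index, type, id) tuples; returns (url_path, json_body)."""
    locations = {(index, doc_type) for index, doc_type, _ in docs}
    if len(locations) == 1:
        # Every document shares one _index/_type: put them in the URL, send only the ids.
        index, doc_type = next(iter(locations))
        path = f"/{index}/{doc_type}/_mget"
        body = {"ids": [doc_id for _, _, doc_id in docs]}
    else:
        # Mixed locations: each element must carry its own metadata.
        path = "/_mget"
        body = {"docs": [{"_index": i, "_type": t, "_id": d} for i, t, d in docs]}
    return path, json.dumps(body)


same = build_mget_body([("facebook", "blog", "123"), ("facebook", "blog", "888")])
mixed = build_mget_body([("facebook", "blog", "123"), ("twitter", "newtype", "970")])
print(same)
print(mixed)
```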

11. Notes ------ bulk batch operations

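Why does the body have to be split into lines?
The reason is performance. Each line is a self-contained instruction (or the source of one), so the node can read the raw request line by line and pass each piece on directly instead of parsing the whole body into one big structure first. That keeps memory use and GC pressure in the JVM down.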

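The bulk API executes in the following order:

The client sends a bulk request to Node 1 (the master).

Node 1 builds one bulk request per shard and forwards these requests in parallel to the nodes hosting each primary shard involved.

The primary shard executes each action serially, one after another. As each action succeeds, the primary forwards the new document (or the deletion) to its replica shards in parallel, then moves on to the next action. Once all replica shards report success for all actions, the node reports success to the coordinating node, which collects the responses and returns them to the client.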

This also shows that bulk operations are not atomic: each action succeeds or fails on its own.
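Because the actions are not atomic, a client has to inspect the per-action results instead of trusting the overall HTTP status. Below is a minimal Python sketch of that check. It assumes the usual bulk response shape (a top-level errors flag and an items array with one single-key entry per action, each carrying a status code). The sample_response is made up for illustration and is not output from my cluster.

```python
def failed_actions(bulk_response):
    """Return (position, action, status) for every action that did not succeed."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a one-key dict such as {"create": {...}} or {"delete": {...}}.
        (action, result), = item.items()
        status = result.get("status", 200)
        if status >= 300:
            failures.append((position, action, status))
    return failures


# Hypothetical response: the first create succeeded, the second hit a conflict.
sample_response = {
    "took": 5,
    "errors": True,
    "items": [
        {"create": {"_index": "twitter", "_type": "newtype", "_id": "970", "status": 201}},
        {"create": {"_index": "user", "_type": "doc", "_id": "2", "status": 409}},
    ],
}
print(failed_actions(sample_response))
```

The first action stays applied even though the second one failed, which is exactly the non-atomic behaviour noted above.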

The problem I ran into: how do I enter a real newline in the request body, rather than a line continuation?

I found the fix on GitHub (tested on Ubuntu): add -H 'Content-Type: application/json' and wrap the request body in single quotes.

 curl -H 'Content-Type: application/json' -i -XPOST http://172.16.150.149:29200/_bulk -d '
{"create":{"_index":"twitter","_type":"newtype","_id":970}}
{ "create": { "_index": "user", "_type": "doc", "_id": "2" }}
'

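After that you can break lines freely. Just remember the closing ' at the end; if you forget it and press Enter, the shell only starts another line and keeps waiting for input. One caveat about the body above: a create action expects the document source on the line that follows it, so the second line is consumed as the source of the first action rather than executed as a separate create.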
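When the request is generated by a script instead of typed into a shell, the newline-delimited body can be assembled like this. It is a rough sketch: build_bulk_body is a helper name I made up, and the document {"title": "hypothetical doc"} is placeholder data. The one hard rule it encodes is that every line, including the last, must end with a newline.

```python
import json


def build_bulk_body(operations):
    """operations is a list of (action_metadata, source_or_None) pairs."""
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))      # action line, e.g. {"create": {...}}
        if source is not None:                # delete actions have no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"            # the final line must also end with \n


body = build_bulk_body([
    ({"create": {"_index": "twitter", "_type": "newtype", "_id": 970}},
     {"title": "hypothetical doc"}),
    ({"delete": {"_index": "user", "_type": "doc", "_id": "2"}}, None),
])
print(body, end="")
```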

12. Overview ------ what routing does

The documentation explains how ES decides where a document is stored; this is just a short note for understanding.

shard = hash(routing) % number_of_primary_shards

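routing is a variable value. It defaults to the document's _id, but it can also be set to a custom value. routing is passed through a hash function to produce a number, and that number is divided by number_of_primary_shards (the number of primary shards) to get the remainder. This remainder, which always falls between 0 and number_of_primary_shards-1, is the number of the shard where the document lives.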

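This explains why the number of primary shards has to be fixed when the index is created and never changed afterwards: if the number changed, all previously computed routing results would become invalid and the documents could no longer be found.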
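The formula is easy to try out in Python. In this sketch zlib.crc32 is only a stand-in for the hash function (it is not the hash ES actually uses), and shard_for is a name I made up. The point is to see how many lookups would land on a different shard if the primary shard count changed from 3 to 4.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards, with crc32 as a stand-in hash."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


ids = [str(i) for i in range(100)]
before = {doc_id: shard_for(doc_id, 3) for doc_id in ids}   # index created with 3 primaries
after = {doc_id: shard_for(doc_id, 4) for doc_id in ids}    # same ids, 4 primaries

moved = [doc_id for doc_id in ids if before[doc_id] != after[doc_id]]
print(f"{len(moved)} of {len(ids)} documents would be looked up on a different shard")
```

Every id in moved would be searched for on a shard that never stored it, which is why the primary shard count is fixed at index creation.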

13. Notes ------- empty search

A search that specifies no query:

GET /_search

  curl -XGET http://172.16.150.149:29200/facebook/_search?pretty
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "pretty",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "888",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "new test is made",
        "date" : "2018/10/17"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ668ZcHFL4sAFl7IMI",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ67I_dHFL4sAFl7IMJ",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "123",
      "_score" : 1.0,
      "_source" : {
        "title" : "change version num",
        "text" : "changing...",
        "views" : 0,
        "tags" : [ "testing" ]
      }
    } ]
  }
}

Meaning of the main fields

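took: how long the search took, in milliseconds.

timed_out: whether the search timed out. The related timeout request parameter sets how long to wait for results from the individual nodes and shards; when it expires, the results gathered so far are returned and the connection is closed.

hits: the total number of matching documents, plus the per-document information such as _index, _type, and _id.
_shards: shard information (how many shards were queried, and how many succeeded or failed).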
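To make the fields concrete, here is a small Python sketch that pulls them out of a search response. It assumes the response shape shown above, where hits.total is a plain integer. summarize_search is a hypothetical helper, and sample is the response from this section trimmed to two hits with _source left out.

```python
def summarize_search(response):
    """Collect the fields discussed above into one flat dict."""
    shards = response["_shards"]
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        # All shards answered and none failed.
        "shards_ok": shards["failed"] == 0 and shards["successful"] == shards["total"],
        "total_hits": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


sample = {
    "took": 1,
    "timed_out": False,
    "_shards": {"total": 3, "successful": 3, "failed": 0},
    "hits": {
        "total": 5,
        "max_score": 1.0,
        "hits": [
            {"_index": "facebook", "_type": "blog", "_id": "pretty", "_score": 1.0},
            {"_index": "facebook", "_type": "blog", "_id": "888", "_score": 1.0},
        ],
    },
}
print(summarize_search(sample))
```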
