ElasticSearch

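Prerequisites

I ran everything in a deepin virtual machine installed on VirtualBox.

Follow the installation guide below. Once installation finishes, Kibana runs at http://localhost:5601/ by default.

Install ELK (Elastic Stack)

After installation, Kibana reports that there is no data yet and offers to load some sample data. I chose the eCommerce sample data, which the later exercises use.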

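Simple CRUD

These are my notes on the basic CRUD operations I learned from the official documentation.

To practice, open Dev Tools (the small wrench icon) in the left navigation bar.

Documents are handled through a REST-style API with JSON bodies.

Create

Create a single document

Index a document into the customer index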

PUT /customer/_doc/1
{
  "name": "John Doe"
}

The returned JSON carries a lot of information:

{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 26,
  "_primary_term" : 4
}

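The 1 after _doc is the document's _id.
Note that the field is _id, not id.
If you do not specify one (by sending POST /customer/_doc instead), Elasticsearch generates one automatically.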
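
As a small illustration of the two ways to index a document, the sketch below only builds the request line and body; it does not contact a cluster. The helper name index_request is hypothetical and not part of any Elasticsearch client.

```python
import json
from typing import Optional, Tuple


def index_request(index: str, doc: dict, doc_id: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (method, path, body) for indexing a single document."""
    body = json.dumps(doc)
    if doc_id is None:
        # No id supplied: POST to /<index>/_doc and let Elasticsearch generate the _id.
        return "POST", f"/{index}/_doc", body
    # Explicit id: PUT to /<index>/_doc/<_id>.
    return "PUT", f"/{index}/_doc/{doc_id}", body


if __name__ == "__main__":
    print(index_request("customer", {"name": "John Doe"}, "1"))
    print(index_request("customer", {"name": "John Doe"}))
```

The first call corresponds to the PUT request shown above; the second is the variant where the _id is left to Elasticsearch.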

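Update

An update in ES is really an overwrite: internally the old document is replaced by a newly indexed version. The _update API takes care of this for us, so in practice it feels like an in-place modification.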

POST /test/_update/1
{
  "doc": {
    "name": "111Jushis",
    "sex": "MALE"
  }
}

The fields to change must be wrapped in a doc object.
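
To make the "overwrite that feels like a modification" idea concrete, here is a rough in-memory model of a partial update: the fields inside doc are merged into a copy of the stored source, and the copy replaces the old version. This is only a sketch of the observable behaviour, not Elasticsearch's implementation, and apply_partial_update is a hypothetical name.

```python
import copy


def apply_partial_update(stored: dict, partial: dict) -> dict:
    """Merge `partial` (the content of "doc") into a copy of the stored source."""
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = apply_partial_update(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    old = {"name": "John Doe"}
    new = apply_partial_update(old, {"name": "111Jushis", "sex": "MALE"})
    print(old, new)
```

The original dictionary is left untouched, which mirrors the fact that the old version is replaced rather than edited.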

Delete

Delete a single document

DELETE /test/_doc/1

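Query

The simplest query puts the id in the URL and returns that document.

Query by id

GET /customer/_doc/1

Simple query

Return all documents, sorted by customer_id

query

match_phrase matches the analyzed terms as a phrase, in the same order
match_all matches every document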

GET /kibana_sample_data_ecommerce/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "customer_id": "asc" }
  ]
}

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 636,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "kibana_sample_data_ecommerce",
        "_type" : "_doc",
        "_id" : "hfgLzG8B-tTLcXKpP4ph",
        "_score" : null,
        "_source" : {
          "category" : [
            "Women's Clothing"
          ],
          "currency" : "EUR",
          "customer_first_name" : "Mary",
          "customer_full_name" : "Mary Bailey",
          "customer_gender" : "FEMALE",
          "customer_id" : 20,
          "customer_last_name" : "Bailey",
          "customer_phone" : "",
          "day_of_week" : "Sunday",
          "day_of_week_i" : 6,
          ...
        }
      }
    ]
  }
}

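In this response:
took is the time the query took, in milliseconds
max_score is the highest score among the matching documents (null here because the results are sorted by a field rather than by score)
total is the number of documents that matched
hits holds the matching records
_score is the score of that document
_source holds the data of the matched document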
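
As a sketch of how these fields might be read in client code, the function below pulls them out of a response that has already been parsed into a Python dict. It assumes the response shape shown above, where hits.total is an object with value and relation; summarize_search_response is a hypothetical helper name.

```python
def summarize_search_response(response: dict) -> dict:
    """Extract the commonly used fields from a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],          # query time in milliseconds
        "total": hits["total"]["value"],      # number of matching documents
        "max_score": hits["max_score"],       # None when sorting by a field
        "ids": [hit["_id"] for hit in hits["hits"]],
        "sources": [hit["_source"] for hit in hits["hits"]],
    }


if __name__ == "__main__":
    sample = {
        "took": 4,
        "timed_out": False,
        "hits": {
            "total": {"value": 636, "relation": "eq"},
            "max_score": None,
            "hits": [
                {"_id": "hfgLzG8B-tTLcXKpP4ph", "_score": None,
                 "_source": {"customer_id": 20, "customer_first_name": "Mary"}},
            ],
        },
    }
    print(summarize_search_response(sample))
```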

Sorting and pagination

Sorting can also be combined with pagination through from and size:

GET /kibana_sample_data_ecommerce/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "customer_id": "asc" }
  ],
  "from": 10,
  "size": 10
}

Note that sorting directly on a text field
fails with the error "Fielddata is disabled on text fields by default."

Use the keyword sub-field instead: field_name.keyword
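
The sketch below builds the same kind of request body from a page number, and switches to the .keyword sub-field when the sort field is a text field. It assumes the text field actually has a keyword sub-field in the index mapping; build_sorted_page is a hypothetical helper, not a client API.

```python
def build_sorted_page(sort_field: str, page: int, page_size: int = 10,
                      is_text_field: bool = False, order: str = "asc") -> dict:
    """Build a match_all search body for one page of sorted results."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Text fields cannot be sorted on directly; use the keyword sub-field
    # (assumed to exist in the mapping).
    field = f"{sort_field}.keyword" if is_text_field else sort_field
    return {
        "query": {"match_all": {}},
        "sort": [{field: order}],
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


if __name__ == "__main__":
    print(build_sorted_page("customer_id", page=2))
    print(build_sorted_page("customer_first_name", page=1, is_text_field=True))
```

Page 2 with a page size of 10 yields from 10 and size 10, the values used in the request above.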

bool query

GET /kibana_sample_data_ecommerce/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "customer_id": 20
          }
        }
        
      ],
      "must_not": [
        {
          "match_phrase": {
            "customer_first_name": "Mary"
          }
        }
      ],
      "filter": {
        "range": {
          "customer_id": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}

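This query should return no data, because customer_id 20 belongs to Mary (see the result above) and must_not excludes her.
A bool query is suited to combining several query conditions.

must: the document must match, and the clause contributes to the score
must_not: the document must not match
filter: like must, the document must match, but the clause runs in filter context and does not affect the score

Any other query clause can be nested inside these.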
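
To show how the three clause types combine, here is a simplified in-memory model in which every clause is a plain Python predicate. Scoring is ignored and match_phrase is reduced to an equality check, so this is an illustration of the boolean logic only. Apart from Mary (customer_id 20), the documents are made up.

```python
from typing import Callable, Iterable, List

Predicate = Callable[[dict], bool]


def bool_search(docs: Iterable[dict],
                must: Iterable[Predicate] = (),
                must_not: Iterable[Predicate] = (),
                filters: Iterable[Predicate] = ()) -> List[dict]:
    """Keep documents that satisfy all must and filter clauses and no must_not clause."""
    must, must_not, filters = list(must), list(must_not), list(filters)
    return [
        doc for doc in docs
        if all(p(doc) for p in must)
        and all(p(doc) for p in filters)
        and not any(p(doc) for p in must_not)
    ]


# Hypothetical documents, except Mary, who appears in the result above.
DOCS = [
    {"customer_id": 20, "customer_first_name": "Mary"},
    {"customer_id": 15, "customer_first_name": "Eddie"},
    {"customer_id": 30, "customer_first_name": "Oliver"},
]

if __name__ == "__main__":
    result = bool_search(
        DOCS,
        must=[lambda d: d["customer_id"] == 20],
        must_not=[lambda d: d["customer_first_name"] == "Mary"],
        filters=[lambda d: 10 <= d["customer_id"] <= 20],
    )
    print(result)
```

With these inputs the must clause selects only Mary and the must_not clause then removes her, which is why the query above finds nothing.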

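Aggregations

Aggregations are indispensable for doing statistics with ES.
In a request, aggregations can be abbreviated as aggs.

I will write a separate article about aggregations later.

terms groups documents, much like GROUP BY in SQL,
but the usage differs.

By default a terms aggregation returns only the top 10 buckets.

In terms of syntax,
SQL: GROUP BY a, b
becomes nested aggs in ES. Remember to set size (the default is 10).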

GET /kibana_sample_data_ecommerce/_search
{
  "aggs": {
    "group_by_day": {
      "terms": {
        "field": "day_of_week",
        "size": 3
      },
      "aggs": {
        "group_by_firstName":{
          "terms": {
            "field": "customer_first_name.keyword",
            "size": 10
          }
        }
      }
    }
  }
}
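
As a rough mental model of nested terms aggregations, the sketch below groups in-memory documents by an outer field, keeps the top buckets by document count, and repeats the grouping inside each bucket. Ties are broken by key in this sketch, the sample documents are made up, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import List, Tuple


def top_terms(values: List[str], size: int = 10) -> List[Tuple[str, int]]:
    """Return the `size` most frequent values as (key, doc_count) pairs."""
    counts = Counter(values)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


def nested_terms(docs: List[dict], outer: str, inner: str,
                 outer_size: int = 10, inner_size: int = 10) -> List[dict]:
    """Group by `outer`, then by `inner` within each outer bucket."""
    inner_values = defaultdict(list)
    for doc in docs:
        inner_values[doc[outer]].append(doc[inner])
    return [
        {"key": key, "doc_count": count,
         "buckets": top_terms(inner_values[key], inner_size)}
        for key, count in top_terms([doc[outer] for doc in docs], outer_size)
    ]


if __name__ == "__main__":
    docs = [
        {"day_of_week": "Sunday", "customer_first_name": "Mary"},
        {"day_of_week": "Sunday", "customer_first_name": "Mary"},
        {"day_of_week": "Sunday", "customer_first_name": "Eddie"},
        {"day_of_week": "Monday", "customer_first_name": "Eddie"},
    ]
    print(nested_terms(docs, "day_of_week", "customer_first_name", outer_size=1))
```

Setting outer_size to 1 drops the Monday bucket entirely, which is the same effect a small size has on a terms aggregation.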

After grouping, we can compute averages, sums, and other metrics as needed.

GET /kibana_sample_data_ecommerce/_search
{
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "taxful_total_price"
      }
    }
  }
}
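
For intuition about what a sum or average nested under a grouping computes, here is a small in-memory equivalent. The prices are made-up values, and sum_by and avg_by are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, List


def sum_by(docs: List[dict], group_field: str, value_field: str) -> Dict[str, float]:
    """Sum `value_field` per distinct value of `group_field`."""
    totals: Dict[str, float] = defaultdict(float)
    for doc in docs:
        totals[doc[group_field]] += doc[value_field]
    return dict(totals)


def avg_by(docs: List[dict], group_field: str, value_field: str) -> Dict[str, float]:
    """Average `value_field` per distinct value of `group_field`."""
    counts: Dict[str, int] = defaultdict(int)
    for doc in docs:
        counts[doc[group_field]] += 1
    totals = sum_by(docs, group_field, value_field)
    return {key: totals[key] / counts[key] for key in totals}


if __name__ == "__main__":
    orders = [
        {"day_of_week": "Sunday", "taxful_total_price": 10.5},
        {"day_of_week": "Sunday", "taxful_total_price": 20.25},
        {"day_of_week": "Monday", "taxful_total_price": 5.0},
    ]
    print(sum_by(orders, "day_of_week", "taxful_total_price"))
    print(avg_by(orders, "day_of_week", "taxful_total_price"))
```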

ES offers many more aggregations for statistical analysis;
see the documentation for details.

search-aggregations

居十四

