Elasticsearch common basic syntax

I compiled this summary of basic Elasticsearch syntax from the online sources listed below.
Reference links:
https://blog.csdn.net/afeiqiang/article/details/83021144
https://www.itsvse.com/forum.php?mod=viewthread&tid=6334&extra=&ordertype=1
https://www.cnblogs.com/cxygg/p/9471372.html
https://blog.csdn.net/u011499747/article/details/78877922
https://www.cnblogs.com/heshun/articles/10658745.html

Basic Data Operation

Elasticsearch contains multiple indexes (Index). Each index can contain multiple types (Type), each type can store multiple documents (Document), and each document has multiple attributes (fields). An index is similar to a database in a traditional relational database: it is the place where related documents are stored. The plural of index is indices or indexes.

adding data

PUT /megacorp/employee/1
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}

The 1 at the end of the URI is the ID of this document; the ID can also be a string. If you do not want to choose the ID yourself, leave it out, but then you must use POST to add the document. In that case Elasticsearch generates a random string as the ID.

To update this document, send the request to the same URI again: specify the same ID and change the JSON content, and the document is updated.

retrieve data

A specific piece of data is retrieved according to the ID:

GET /megacorp/employee/1

result:

{
    "_index": "megacorp",
    "_type": "employee",
    "_id": "1",
    "_version": 1,
    "found": true,
    "_source": {
        "first_name": "John",
        "last_name": "Smith",
        "age": 25,
        "about": "I love to go rock climbing",
        "interests": [
            "sports",
            "music"
        ]
    }
}

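Here _source is the JSON document we stored; the other fields are self-explanatory.

Changing the HTTP verb from PUT to GET retrieves the document. In the same way, DELETE deletes a document and HEAD checks whether a document exists. To update an existing document, simply PUT it again. Clearly, the authors of Elasticsearch know REST well.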
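The verbs discussed so far can be summarized in a few lines of Python. This is only a sketch that builds the request parts (it does not talk to a cluster); the function name document_request and the action names are my own, not part of any Elasticsearch client.

```python
def document_request(action, index, doc_type, doc_id=None, body=None):
    """Return (verb, path, body) for a single-document operation."""
    verbs = {"index": "PUT", "get": "GET", "delete": "DELETE", "exists": "HEAD"}
    if action not in verbs:
        raise ValueError(f"unknown action: {action}")
    if doc_id is None:
        if action != "index":
            raise ValueError("an ID is required for get, delete and exists")
        # Without an ID the document is POSTed and Elasticsearch generates the ID.
        return "POST", f"/{index}/{doc_type}", body
    # Only indexing (create or update) carries a JSON body.
    return verbs[action], f"/{index}/{doc_type}/{doc_id}", body if action == "index" else None


if __name__ == "__main__":
    doc = {"first_name": "John", "last_name": "Smith", "age": 25}
    print(document_request("index", "megacorp", "employee", 1, doc))
    print(document_request("index", "megacorp", "employee", body=doc))
    print(document_request("exists", "megacorp", "employee", 1))
```

Sending the same "index" request twice with the same ID corresponds to the update described above.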

The easiest search

GET /megacorp/employee/_search

_search is the Elasticsearch search endpoint. The request above searches all documents of the given Type under the specified Index. By default only 10 hits are returned per page; you can change this with the size field, and set the offset with the from field (the default is 0, the first hit). In the response, the took field gives the time the operation took (in milliseconds), the timed_out field indicates whether the search timed out, and the hits field holds the matching records.
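Paging with from and size is simple arithmetic, sketched below in Python. The helper name page_body and the 1-based page numbering are my own assumptions; Elasticsearch itself only knows from and size.

```python
def page_body(page, page_size=10):
    """Build the from/size part of a search body for a 1-based page number."""
    if page < 1 or page_size < 0:
        raise ValueError("page must be >= 1 and page_size must be >= 0")
    return {"from": (page - 1) * page_size, "size": page_size}


if __name__ == "__main__":
    print(page_body(1))      # first page with the default page size
    print(page_body(3, 20))  # third page, 20 hits per page
```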

simple search

Search for data with last_name=Smith:

GET /megacorp/employee/_search?q=last_name:Smith

conditional search

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "last_name" : "Smith"
        }
    }
}

This query returns the same results as the previous example, but the simple URL parameter has become a JSON request body. The body is more verbose, and in return it gives us much finer-grained control over the query.
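To see the two forms side by side, here is a small Python sketch that builds both of them for the same field and value. The function names uri_search and match_body are hypothetical helpers of mine; the sketch only builds strings and dictionaries.

```python
from urllib.parse import quote


def uri_search(index, doc_type, field, value):
    """Build the path of a URI search (the q= form), percent-encoding the query."""
    return f"/{index}/{doc_type}/_search?q={quote(f'{field}:{value}')}"


def match_body(field, value):
    """Build the equivalent request body using a match query."""
    return {"query": {"match": {field: value}}}


if __name__ == "__main__":
    print(uri_search("megacorp", "employee", "last_name", "Smith"))
    print(match_body("last_name", "Smith"))
```

Note that the colon is percent-encoded as %3A here, which a server decodes back to last_name:Smith.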

complex search

Search by last_name, keeping only employees with age > 30:

GET /megacorp/employee/_search
{
    "query" : {
        "bool": {
            "must": {
                "match" : {
                    "last_name" : "smith"
                }
            },
            "filter": {
                "range" : {
                    "age" : { "gt" : 30 }
                }
            }
        }
    }
}

Several Elasticsearch keywords are involved here. One by one:
bool : a compound query that combines the must, should, must_not and filter clauses
must : conditions that must match
should : usually holds several conditions; a document satisfies should if at least one of them matches
must_not : conditions that must not match
filter : conditions that must match, but are applied as a filter and do not affect the score
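The way these clauses nest can be sketched with a small Python builder. The function bool_query and its parameter names are my own (filter_ has a trailing underscore only to avoid shadowing the Python builtin); the output is just the JSON structure shown in the request above.

```python
import json


def bool_query(must=None, should=None, must_not=None, filter_=None):
    """Assemble a bool query, leaving out the clauses that were not given."""
    clauses = {"must": must, "should": should, "must_not": must_not, "filter": filter_}
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}


if __name__ == "__main__":
    body = bool_query(
        must={"match": {"last_name": "smith"}},
        filter_={"range": {"age": {"gt": 30}}},
    )
    print(json.dumps(body, indent=4))
```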

The following query finds documents whose title field matches "how to make millions" and whose tag field is not marked as spam. Documents that are tagged "starred" or have a date of 2014-01-01 or later rank higher than the other matches:

{
    "bool": {
        "must":     { "match": { "title": "how to make millions" }},
        "must_not": { "match": { "tag":   "spam" }},
        "should": [
            { "match": { "tag": "starred" }},
            { "range": { "date": { "gte": "2014-01-01" }}}
        ]
    }
}

range finds documents in which the specified field holds a value (date, number or string) within the given range. gt means greater than, and gte means greater than or equal to.
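The meaning of the range operators can be illustrated locally in Python. This is a simplified model of the comparison only, assuming plain Python ordering (numbers, or ISO-formatted date strings that sort lexicographically); the function name in_range is my own and this is not how Elasticsearch evaluates the query internally.

```python
import operator

# The four bound names accepted by a range query and their comparisons.
_OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}


def in_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 20, "lt": 25}."""
    return all(_OPS[name](value, limit) for name, limit in bounds.items())


if __name__ == "__main__":
    print(in_range(31, {"gt": 30}))
    print(in_range(30, {"gt": 30}))
    print(in_range("2014-06-01", {"gte": "2014-01-01"}))
```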

phrase search

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "about" : "rock climbing"
        }
    }
}

The search above returns documents whose about field contains rock or climbing; that is, the default relationship between the keywords is OR. What if you want to match the exact phrase?

GET /megacorp/employee/_search
{
    "query" : {
        "match_phrase" : {
            "about" : "rock climbing"
        }
    }
}

Just use the match_phrase query.
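The difference between the two queries can be illustrated with a deliberately simplified Python model. It assumes lowercase whitespace tokenization and ignores scoring, so it is only a rough stand-in for what the analyzer and the queries really do; the function names and the second sample sentence are hypothetical.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_any(field_text, query):
    """match with the default OR behaviour: any query term may appear."""
    field = set(tokens(field_text))
    return any(term in field for term in tokens(query))


def matches_phrase(field_text, query):
    """match_phrase: all query terms must appear next to each other, in order."""
    field, phrase = tokens(field_text), tokens(query)
    n = len(phrase)
    return any(field[i:i + n] == phrase for i in range(len(field) - n + 1))


if __name__ == "__main__":
    for about in ("I love to go rock climbing", "I like to collect rock albums"):
        print(about, matches_any(about, "rock climbing"), matches_phrase(about, "rock climbing"))
```

Both sentences satisfy the match query, but only the first one contains the exact phrase.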

highlight search

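First store a document. A document is indexed under an index/type/ID path, as in the PUT /megacorp/employee/1 example above, so replace the path below with your own:

PUT /_search
{
    "title": "中信泰富的并购融资",
    "summary": "它利用非金融性资产能源源不断地在证券市场上融资,采取发行新股和引入风险投资相结合收购恒昌企业,结果各方均取得了满意的结果。"
}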

Highlight the characters in title and summary that match the query keyword. By default the matches are wrapped in <em></em> tags.

GET /_search
{
    "query" : {
        "multi_match": { "query": "投资" }
    },
    "highlight" : {
        "fields" : {
            "title": {},
            "summary" : {}
        }
    }
}

In the returned result, each hit has an additional part like this:

"highlight" : {
    "summary" : [
        "它利用非金融性<em>资</em>产能源源不断地在证券市场上融<em>资</em>,采取发行新股和引入风险<em>投</em><em>资</em>相结合收购恒昌企业,结果各方均取得了满意的结果。",
        "其融<em>资</em>的方式主要有发行新股、可换股债券、引入风险<em>投</em><em>资</em>等。而这些巨额的融<em>资</em>行动是和<em>投</em><em>资</em>银行紧密充分的合作分不开的。从中得出一些对我国上市公司并购融<em>资</em>有益的启示,可作为并购融<em>资</em>实践的参考。"
    ],
    "title" : [
        "中信泰富的并购融<em>资</em>"
    ]
}
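The result above wraps every single matching character, which suggests that each Chinese character was indexed as its own term. Under that assumption, the effect can be imitated with a few lines of Python; the function highlight below is my own illustration and not the Elasticsearch highlighter.

```python
def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every character of text that also occurs in query.

    Assumption: each character is a separate term, so "投资" becomes
    the two terms "投" and "资".
    """
    terms = set(query)
    return "".join(f"{pre}{ch}{post}" if ch in terms else ch for ch in text)


if __name__ == "__main__":
    print(highlight("中信泰富的并购融资", "投资"))
    print(highlight("引入风险投资", "投资"))
```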

simple aggregation

Elasticsearch does not support aggregations on text fields by default (fielddata has to be enabled on the field first). The examples below aggregate on the numeric age field, which needs no such change. First insert some test data:

PUT zhifou/doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

PUT zhifou/doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美,娇憨可爱",
  "tags":["白", "富","美"]
}

PUT zhifou/doc/3
{
  "name":"龙套偏房",
  "age":22,
  "from":"gu",
  "desc":"mmp,没怎么看,不知道怎么形容",
  "tags":["造数据", "真","难"]
}

PUT zhifou/doc/4
{
  "name":"石头",
  "age":29,
  "from":"gu",
  "desc":"粗中有细,狐假虎威",
  "tags":["粗", "大","猛"]
}

PUT zhifou/doc/5
{
  "name":"魏行首",
  "age":25,
  "from":"广云台",
  "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
  "tags":["闭月","羞花"]
}

avg

The current requirement is to query the average age of people whose from is gu.

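GET zhifou/doc/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {                      // start of the aggregations
    "my_avg": {                  // alias chosen for this aggregation
      "avg": {                   // aggregation type
        "field": "age"           // field to aggregate on
      }
    }
  },
  "_source": ["name", "age"]     // _source limits the output to the listed fields
}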

The query results are as follows

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {                     // hits describes the matched records
    "total" : 3,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "zhifou",
        "_type" : "doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "石头",
          "age" : 29
        }
      },
      {
        "_index" : "zhifou",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "顾老二",
          "age" : 30
        }
      },
      {
        "_index" : "zhifou",
        "_type" : "doc",
        "_id" : "3",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22
        }
      }
    ]
  },
  "aggregations" : {
    "my_avg" : {
      "value" : 27.0
    }
  }
}

Filtering the fields with _source is still not enough. What if we do not want to see the documents at all, only the average? Set size to 0:

GET zhifou/doc/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_avg": {
      "avg": {
        "field": "age"
      }
    }
  },
  "size": 0,
  "_source": ["name", "age"]
}

The query results are as follows

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_avg" : {
      "value" : 27.0
    }
  }
}

In the query results, we see that the total value under hits is 3, indicating that there are three pieces of data that match the result. The average value returned at the end is 27.
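The value 27.0 is easy to verify by hand. The Python sketch below repeats the calculation on a local copy of the sample documents (only the fields needed here); the function name avg_age is my own, and the equality test on from is a simplification of the match query.

```python
# Local copy of the sample data: name, age and from of the five documents.
DOCS = [
    {"name": "顾老二", "age": 30, "from": "gu"},
    {"name": "大娘子", "age": 18, "from": "sheng"},
    {"name": "龙套偏房", "age": 22, "from": "gu"},
    {"name": "石头", "age": 29, "from": "gu"},
    {"name": "魏行首", "age": 25, "from": "广云台"},
]


def avg_age(docs, origin):
    """Average the age of the documents whose from field equals origin."""
    ages = [doc["age"] for doc in docs if doc["from"] == origin]
    return sum(ages) / len(ages) if ages else None


if __name__ == "__main__":
    print(avg_age(DOCS, "gu"))  # (30 + 22 + 29) / 3
```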

max

Replace avg in the above example with max

GET zhifou/doc/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_max": {
      "max": {
        "field": "age"
      }
    }
  },
  "size": 0
}

The query results are as follows

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_max" : {
      "value" : 30.0
    }
  }
}

min, sum

Usage is the same as above, so it is not repeated here.
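Because the four metrics differ only in the aggregation type, one small helper can generate the request body for any of them. This is a sketch in Python; the name metric_agg_body and the my_<kind> alias convention are my own, modelled on the my_avg and my_max aliases used above.

```python
import json

METRICS = ("avg", "max", "min", "sum")


def metric_agg_body(kind, field, query=None):
    """Build a search body with a single metric aggregation and no hits."""
    if kind not in METRICS:
        raise ValueError(f"unsupported metric: {kind}")
    return {
        "query": query or {"match_all": {}},
        "aggs": {f"my_{kind}": {kind: {"field": field}}},
        "size": 0,  # only the aggregation result is wanted
    }


if __name__ == "__main__":
    body = metric_agg_body("min", "age", {"match": {"from": "gu"}})
    print(json.dumps(body, indent=2))
```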

Group query

simple grouping


Insert the data to be grouped by age (the same documents as before, except that the age of document 1 is now 25):

PUT zhifou/doc/1
{
  "name":"顾老二",
  "age":25,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

PUT zhifou/doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美,娇憨可爱",
  "tags":["白", "富","美"]
}

PUT zhifou/doc/3
{
  "name":"龙套偏房",
  "age":22,
  "from":"gu",
  "desc":"mmp,没怎么看,不知道怎么形容",
  "tags":["造数据", "真","难"]
}

PUT zhifou/doc/4
{
  "name":"石头",
  "age":29,
  "from":"gu",
  "desc":"粗中有细,狐假虎威",
  "tags":["粗", "大","猛"]
}

PUT zhifou/doc/5
{
  "name":"魏行首",
  "age":25,
  "from":"广云台",
  "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
  "tags":["闭月","羞花"]
}

Then group the documents by age with a terms aggregation:

GET zhifou/doc/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "fz": {
      "terms": {
        "field": "age"
      }
    }
  }
}

search result

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [ ………… ]            // the hits are omitted here
  },
  "aggregations": {
    "fz": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 25,
          "doc_count": 2         // two records have age 25
        },
        {
          "key": 18,
          "doc_count": 1
        },
        {
          "key": 22,
          "doc_count": 1
        },
        {
          "key": 29,
          "doc_count": 1
        }
      ]
    }
  }
}
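The buckets above are simply a count per distinct age. The same grouping can be reproduced locally in Python; the function terms_buckets is my own sketch, and the ordering (highest count first, ties by ascending key) is chosen to match the order seen in the result above.

```python
from collections import Counter


def terms_buckets(values):
    """Group equal values and count them, like a terms aggregation on one field."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


if __name__ == "__main__":
    ages = [25, 18, 22, 29, 25]  # ages of documents 1 to 5
    for bucket in terms_buckets(ages):
        print(bucket)
```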

range grouping

Now I want to look at the age distribution of everyone: group by the ranges 15~20, 20~25 and 25~30, and calculate the average age of each group.
Breaking the requirement down, the first step is to build the groups.

GET zhifou/doc/_search
{
  "size": 0,                     // return 0 hits: we only care about the groups, not the matched records
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_group": {               // alias
      "range": {
        "field": "age",          // field to group by
        "ranges": [              // the ranges
          {
            "from": 15,
            "to": 20
          },
          {
            "from": 20,
            "to": 25
          },
          {
            "from": 25,
            "to": 30
          }
        ]
      }
    }
  }
}

The query results are as follows

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "age_group": {
      "buckets": [
        {
          "key": "15.0-20.0",
          "from": 15,
          "to": 20,
          "doc_count": 1
        },
        {
          "key": "20.0-25.0",
          "from": 20,
          "to": 25,
          "doc_count": 1
        },
        {
          "key": "25.0-30.0",
          "from": 25,
          "to": 30,
          "doc_count": 3
        }
      ]
    }
  }
}

Next, compute the average age of the documents in each group by nesting an avg aggregation inside the range aggregation.

GET zhifou/doc/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_group": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 15,
            "to": 20
          },
          {
            "from": 20,
            "to": 25
          },
          {
            "from": 25,
            "to": 30
          }
        ]
      },
      "aggs": {
        "my_avg": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}

The query results are as follows

{
  "took": 28,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "age_group": {
      "buckets": [
        {
          "key": "15.0-20.0",
          "from": 15,
          "to": 20,
          "doc_count": 1,
          "my_avg": {
            "value": 18
          }
        },
        {
          "key": "20.0-25.0",
          "from": 20,
          "to": 25,
          "doc_count": 1,
          "my_avg": {
            "value": 22
          }
        },
        {
          "key": "25.0-30.0",
          "from": 25,
          "to": 30,
          "doc_count": 3,
          "my_avg": {
            "value": 26.333333333333332
          }
        }
      ]
    }
  }
}

Note: an aggregation works on the documents matched by the query. The query first determines the result set, and the aggregation then processes that result set.
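The bucket counts above show that each range includes its from value and excludes its to value: the two documents with age 25 land in 25~30, not in 20~25. The Python sketch below reproduces the counts and the per-bucket averages locally; the function name range_buckets is my own.

```python
def range_buckets(values, ranges):
    """Put values into [from, to) ranges and average each bucket."""
    buckets = []
    for low, high in ranges:
        members = [v for v in values if low <= v < high]  # from inclusive, to exclusive
        buckets.append({
            "key": f"{float(low)}-{float(high)}",
            "doc_count": len(members),
            "my_avg": sum(members) / len(members) if members else None,
        })
    return buckets


if __name__ == "__main__":
    ages = [25, 18, 22, 29, 25]  # ages of documents 1 to 5
    for bucket in range_buckets(ages, [(15, 20), (20, 25), (25, 30)]):
        print(bucket)
```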

count

GET /_count
{
    "query": {
        "match_all": {}          // to count only matching documents, use a query on a field instead, e.g. a match on "last_name" : "Smith"
    }
}

The query results are as follows

{
    "count": 12,
    "_shards": {
        "total": 20,
        "successful": 20,
        "skipped": 0,
        "failed": 0
    }
}
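A count request has the same shape as a search request without paging or aggregations. As a minimal Python sketch (the function count_request and its optional index argument are my own, and nothing is sent to a cluster):

```python
def count_request(query=None, index=None):
    """Build (verb, path, body) for a _count request; match_all is the default query."""
    path = f"/{index}/_count" if index else "/_count"
    return "GET", path, {"query": query or {"match_all": {}}}


if __name__ == "__main__":
    print(count_request())
    print(count_request({"match": {"last_name": "Smith"}}, index="megacorp"))
```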

monitor

cluster health

GET _cluster/health

Monitor a single node

GET _nodes/stats

index statistics

GET my_index/_stats

GET my_index,another_index/_stats

GET _all/_stats

pending tasks

GET _cluster/pending_tasks
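The three index statistics requests differ only in how the target is written: one index, several indices joined by commas, or _all. A small Python sketch of that rule follows; the function name index_stats_path is my own.

```python
def index_stats_path(*indices):
    """Build the _stats path for the given indices, or for all indices if none are given."""
    target = ",".join(indices) if indices else "_all"
    return f"{target}/_stats"


if __name__ == "__main__":
    print("GET", index_stats_path("my_index"))
    print("GET", index_stats_path("my_index", "another_index"))
    print("GET", index_stats_path())
```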

Origin blog.csdn.net/qq_15098623/article/details/103413188