Elasticsearch query operation (DSL statement mode)

Description: This article shows how to query documents in Elasticsearch (ES) with DSL statements, using the Kibana console.

Adding data

First use the API to create an index, then read the data from MySQL and load it into ES; for details, see http://t.csdn.cn/NaTHg


[Figure: structure of the student index]

1. Full-text query

A full-text query (called a fuzzy query in some tutorials) targets fields of type "text", whose content is analyzed (split into words) at index time, such as the name and all fields.

(1) Match-all query

Format:

# 1.1 Match all documents
GET /<index_name>/_search
{
  "query": {
    "match_all": {}
  }
}

The result shows that all 13 documents are returned.

(2) Single-field query

Format:

# 1.2 Single-field query
GET /<index_name>/_search
{
  "query": {
    "match": {
      "<field_name>": "<search_text>"
    }
  }
}


(3) Multi-field query

Format:

# 1.3 Multi-field query
GET /<index_name>/_search
{
  "query": {
    "multi_match": {
      "query": "<search_text>",
      "fields": ["<field_name_1>", "<field_name_2>"]
    }
  }
}

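The three request bodies above can also be assembled in code before being sent to GET /<index_name>/_search. The following is a minimal Python sketch; the helper names (match_all_query, match_query, multi_match_query) are hypothetical and not part of any Elasticsearch client. They only build the JSON body.

```python
import json


def match_all_query():
    """Body for 1.1: match every document in the index."""
    return {"query": {"match_all": {}}}


def match_query(field, text):
    """Body for 1.2: full-text search on a single field."""
    return {"query": {"match": {field: text}}}


def multi_match_query(text, fields):
    """Body for 1.3: full-text search across several fields."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


if __name__ == "__main__":
    # "name" and "all" are the text fields mentioned in this article.
    body = multi_match_query("鲁迅", ["name", "all"])
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted under a GET /student/_search line in the Kibana console.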

2. Exact query

An exact query compares the query value with the field value as-is, without splitting the query value into words. Two types are covered here: term (the field value equals the query value) and range (the field value falls within a given range).

(1) term query

Format:

GET /<index_name>/_search
{
  "query": {
    "term": {
      "<field_name>": {
        "value": "<field_value>"
      }
    }
  }
}


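(2) range query

Format:

# 2.2 Exact query: range (for example, job >= 1 and job < 3)
# gte is an inclusive lower bound; lt is an exclusive upper bound
GET /<index_name>/_search
{
  "query": {
    "range": {
      "<field_name>": {
        "gte": <lower_bound>,
        "lt": <upper_bound>
      }
    }
  }
}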

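To make the bound semantics concrete, here is a small Python sketch. The helper names (term_query, range_query, in_range) are hypothetical; in_range is only a local check that mirrors what gte/gt/lte/lt mean, not how ES evaluates the query.

```python
def term_query(field, value):
    """Body for 2.1: the field value must equal `value` exactly."""
    return {"query": {"term": {field: {"value": value}}}}


def range_query(field, gte=None, gt=None, lte=None, lt=None):
    """Body for 2.2: keep only the bounds that were actually given."""
    candidates = (("gte", gte), ("gt", gt), ("lte", lte), ("lt", lt))
    bounds = {name: v for name, v in candidates if v is not None}
    if not bounds:
        raise ValueError("at least one bound is required")
    return {"query": {"range": {field: bounds}}}


def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Local illustration of the four bound operators."""
    if gte is not None and not value >= gte:
        return False
    if gt is not None and not value > gt:
        return False
    if lte is not None and not value <= lte:
        return False
    if lt is not None and not value < lt:
        return False
    return True


if __name__ == "__main__":
    # job >= 1 and job < 3, as in the comment of the range example
    print(range_query("job", gte=1, lt=3))
    print([job for job in range(0, 5) if in_range(job, gte=1, lt=3)])
```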

3. Geographical coordinate query

ES provides geographic data types such as geo_point. If a document contains a location made up of a latitude and a longitude, you can query documents by that location. There are two ways:

(1) Rectangular range

Given two positions (the top-left and bottom-right corners), draw a rectangle and query the documents located inside it.


Format:

GET /<index_name>/_search
{
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": {
          "lat": <top_left_latitude>,
          "lon": <top_left_longitude>
        },
        "bottom_right": {
          "lat": <bottom_right_latitude>,
          "lon": <bottom_right_longitude>
        }
      }
    }
  }
}

(2) Radius range

Given a position and a distance, take the position as the center and the distance as the radius, and query the documents located inside that circle.


Format:

# 3.2 Geo query: geo_distance
GET /<index_name>/_search
{
  "query": {
    "geo_distance": {
      "distance": "<distance>",
      "location": "<latitude>,<longitude>"
    }
  }
}

The distance can be written with any supported length unit, such as 15km or 15000m.
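Both geo query bodies follow the same pattern: the key inside the query is the name of the geo_point field (location in the examples above). A minimal Python sketch follows; the helper names and the sample coordinates are hypothetical and only serve as an illustration.

```python
def geo_bounding_box_query(field, top_left, bottom_right):
    """Body for 3.1. Both corners are (lat, lon) tuples."""
    tl_lat, tl_lon = top_left
    br_lat, br_lon = bottom_right
    # Only the latitude order is checked here; longitudes are left alone
    # because a box may legitimately cross the 180 degree meridian.
    if tl_lat < br_lat:
        raise ValueError("top_left must not be south of bottom_right")
    return {
        "query": {
            "geo_bounding_box": {
                field: {
                    "top_left": {"lat": tl_lat, "lon": tl_lon},
                    "bottom_right": {"lat": br_lat, "lon": br_lon},
                }
            }
        }
    }


def geo_distance_query(field, lat, lon, distance):
    """Body for 3.2. `distance` is a string such as "15km"."""
    return {
        "query": {
            "geo_distance": {
                "distance": distance,
                field: f"{lat},{lon}",  # "latitude,longitude" string form
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical coordinates, chosen only for the example.
    print(geo_bounding_box_query("location", (31.1, 121.5), (30.9, 121.7)))
    print(geo_distance_query("location", 31.21, 121.5, "15km"))
```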

4. Compound query

(1) Score query

Every document returned by a query has a relevance score, which ES calculates with the BM25 formula. Results are sorted by score from high to low. The score can be adjusted manually for documents that meet a given condition, so that those documents rank higher.

For example, the following query raises the score of the document with ID 13 so that it is ranked first.

Format:

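# 4.1 Score query
GET /student/_search
{
  "query": {
    "function_score": {
      # Original query; it produces the query score
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          # Filter: the function applies only to documents that match it
          "filter": {
            "term": {
              "id": "13"
            }
          },
          # Weight used as the function score
          "weight": 10
        }
      ],
      # Boost mode: the operator that combines the query score with the weight
      # (final score = query score <operator> weight); multiply means multiplication
      "boost_mode": "multiply"
    }
  }
}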

Common boost_mode values are multiply (query score times weight), sum (query score plus weight), and replace (the weight replaces the query score).
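The effect of the three boost_mode values can be worked through with plain arithmetic. The Python sketch below is only an illustration of the combination step, with a hypothetical function name (combine_scores); it assumes a query score of 1.0 for every document, which is what match_all assigns, and the weight of 10 from the example above.

```python
def combine_scores(query_score, function_score, boost_mode="multiply"):
    """Combine the query score with the function score (here, the weight)."""
    if boost_mode == "multiply":
        return query_score * function_score
    if boost_mode == "sum":
        return query_score + function_score
    if boost_mode == "replace":
        return function_score
    raise ValueError(f"boost_mode not covered by this sketch: {boost_mode}")


if __name__ == "__main__":
    query_score = 1.0  # assumed score from match_all
    weight = 10        # weight applied to the document with id 13
    for mode in ("multiply", "sum", "replace"):
        print(mode, combine_scores(query_score, weight, mode))
```

Documents that do not pass the filter keep their query score, so in all three modes the document with ID 13 ends up above the others.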

(2) Boolean query

A Boolean query combines multiple conditions in a single query. It supports the following four kinds of clauses, which can be added as needed:

  • must: subqueries that must match, similar to "and"; participates in scoring.

  • should: subqueries that may match, similar to "or"; participates in scoring.

  • must_not: subqueries that must not match, similar to "not"; does not participate in scoring.

  • filter: subqueries that must match; does not participate in scoring.

For example, to find documents where gender is "1", job is not in the range (2, 4], id is 11, and the all field optionally matches 123456, the DSL statement is as follows:

# 4.2 Boolean query
GET /student/_search
{
  "query": {
    "bool": {
      # Subqueries that must match (scored)
      "must": [
        {
          "match": {
            "gender": "1"
          }
        }
      ],
      # Subqueries that must not match (not scored)
      "must_not": [
        {
          "range": {
            "job": {
              "gt": 2,
              "lte": 4
            }
          }
        }
      ],
      # Subqueries that may match (scored)
      "should": [
        {
          "match": {
            "all": "123456"
          }
        }
      ],
      # Subqueries that must match (not scored)
      "filter": [
        {
          "term": {
            "id": "11"
          }
        }
      ]
    }
  }
}

must and should clauses participate in scoring, so they affect the order of the results.

must_not and filter clauses do not participate in scoring; they only decide whether a document is included in the results.
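The matching rules of the four clause types can be expressed as a small in-memory check. This Python sketch is not how ES evaluates a bool query; it only mirrors the inclusion logic, with clauses modelled as predicate functions. The function names and the sample document are hypothetical. It also assumes the default behaviour that should clauses become mandatory (at least one must match) only when there is no must or filter clause.

```python
def bool_matches(doc, must=(), should=(), must_not=(), filter_=()):
    """Return True if `doc` would be included by the bool query."""
    required = list(must) + list(filter_)
    if not all(clause(doc) for clause in required):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # With no must/filter clause, at least one should clause has to match.
    if should and not required:
        return any(clause(doc) for clause in should)
    return True


# Predicates mirroring the example query above
def gender_is_1(doc):
    return doc["gender"] == "1"


def job_in_2_4(doc):
    return 2 < doc["job"] <= 4  # the range (2, 4]


def id_is_11(doc):
    return doc["id"] == 11


def all_has_123456(doc):
    return "123456" in doc["all"]


if __name__ == "__main__":
    doc = {"id": 11, "gender": "1", "job": 1, "all": "password 123456"}
    print(bool_matches(doc, must=[gender_is_1], must_not=[job_in_2_4],
                       should=[all_has_123456], filter_=[id_is_11]))
```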

5. Sorting

(1) Sort by position

If a document has a field that stores a position (coordinates), the query results can be sorted by their distance from a given position.

With ascending order (asc), the closer a document is to the given position, the earlier it appears; with descending order (desc), the order is reversed.

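# 5.1 Sort by coordinates
GET /<index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_geo_distance": {
        # Name of the position field in the document, and the reference position
        "<field_name>": {
          "lat": <latitude>,
          "lon": <longitude>
        },
        # Ascending order
        "order": "asc",
        # Distance unit
        "unit": "km"
      }
    }
  ]
}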
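To show what sorting by distance amounts to, the Python sketch below orders in-memory documents by great-circle distance from a reference point. It uses the haversine formula with a mean Earth radius of 6371 km; both are assumptions of this sketch, and the exact distance computation inside ES may differ. The function names, the location field name, and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # min() guards against rounding slightly above 1.0
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def sort_by_distance(docs, lat, lon, field="location", order="asc"):
    """Sort documents by distance of docs[i][field] from (lat, lon)."""
    def distance(doc):
        point = doc[field]
        return haversine_km(lat, lon, point["lat"], point["lon"])
    return sorted(docs, key=distance, reverse=(order == "desc"))


if __name__ == "__main__":
    docs = [
        {"id": 1, "location": {"lat": 31.5, "lon": 121.5}},
        {"id": 2, "location": {"lat": 31.2, "lon": 121.5}},
    ]
    print([d["id"] for d in sort_by_distance(docs, 31.21, 121.5)])
```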

(2) Sort by field value

For example, sort by the job value in descending order, so that documents with a higher job value appear first.

Format:

# 5.2 Sort by field value
GET /<index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "<field_name>": {
        "order": "desc"
      }
    }
  ]
}

In the query results, the higher the job value, the earlier the document appears.

(3) Sort by multiple field values

If several fields take part in sorting, list them in sort in order of priority.

For example, sort by job in descending order and, when the job values are equal, by gender in ascending order.

Format:

# 5.3 Sort by multiple field values
GET /<index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "<field_name_1>": {
        "order": "desc"
      },
      "<field_name_2>": {
        "order": "asc"
      }
    }
  ]
}

Each field can also be written in its own pair of curly braces inside the sort array; the effect is the same:

# 5.3 Sort by multiple field values (one object per field)
GET /<index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "<field_name_1>": {
        "order": "desc"
      }
    },
    {
      "<field_name_2>": {
        "order": "asc"
      }
    }
  ]
}
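The priority rule for multiple sort fields (the first field decides, later fields only break ties) can be reproduced on in-memory data. The Python sketch below relies on the fact that Python's sort is stable, applying the lowest-priority key first; the function name (multi_sort) and the sample documents are hypothetical.

```python
def multi_sort(docs, spec):
    """Sort by several fields.

    `spec` is a list of (field, order) pairs in priority order,
    for example [("job", "desc"), ("gender", "asc")].
    """
    result = list(docs)
    # Apply the least important key first; each later (more important)
    # stable sort keeps the earlier order among equal values.
    for field, order in reversed(spec):
        result.sort(key=lambda doc: doc[field], reverse=(order == "desc"))
    return result


if __name__ == "__main__":
    docs = [
        {"id": 1, "job": 2, "gender": "1"},
        {"id": 2, "job": 3, "gender": "1"},
        {"id": 3, "job": 3, "gender": "0"},
    ]
    ordered = multi_sort(docs, [("job", "desc"), ("gender", "asc")])
    print([d["id"] for d in ordered])
```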

6. Pagination

By default, an ES query returns only the first 10 hits; use pagination to retrieve more. ES has two commonly used pagination methods:

  • Method 1: specify the starting offset (from) and the number of hits per page (size).

  • Method 2: pass the sort values of the last hit from the previous query as the search_after parameter, set the number of hits (size), and fetch the next page.

(1) Method 1

For example, return 5 documents starting from the document with ID 5:

# 6.1 Pagination: method 1
GET /student/_search
{
  "query": {
    "match_all": {}
  },
  "from": <start_offset>,
  "size": <page_size>
}

Note that from is a zero-based offset, so subtract 1 from the starting position (to start at the 5th document, set from to 4).

(2) Method 2

For example, take the id value of the last hit returned in method 1 as the search_after parameter and query the next 4 documents:

# 6.2 Pagination: method 2
GET /student/_search
{
  "query": {
    "match_all": {}
  },
  "size": 4,
  "search_after": ["5"],
  "sort": [
    {
      "id": {
        "order": "asc"
      }
    }
  ]
}


Summary

  • With method 1, from + size must not exceed 10000 (the default value of the index.max_result_window setting); otherwise an error is reported.

4 + 9996 = 10000 does not report an error.

4 + 9997 = 10001 exceeds 10000, so an error is reported.
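The from/size arithmetic and the 10000 limit can be checked before a request is sent. The following Python sketch uses a hypothetical helper (from_size) that takes a 1-based starting position, converts it to the zero-based from value, and rejects windows above the limit; the limit of 10000 is assumed to be unchanged on the index.

```python
MAX_RESULT_WINDOW = 10000  # limit described above; assumed unchanged


def from_size(start_position, size, max_window=MAX_RESULT_WINDOW):
    """Build the from/size part of a request.

    `start_position` is 1-based (5 means "start at the 5th hit").
    """
    if start_position < 1 or size < 0:
        raise ValueError("start_position must be >= 1 and size >= 0")
    offset = start_position - 1  # from is zero-based
    if offset + size > max_window:
        raise ValueError(
            f"from + size = {offset + size} exceeds the limit of {max_window}"
        )
    return {"from": offset, "size": size}


if __name__ == "__main__":
    print(from_size(5, 5))     # start at the 5th hit, return 5 hits
    print(from_size(5, 9996))  # from 4 + size 9996 = 10000, still allowed
```

For deeper pages, method 2 (search_after) avoids this limit because it does not use an offset.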

  • With method 2 (search_after), sort on the primary key or another unique field whenever possible. Otherwise, if several documents share the sort value of the last hit, the next page starts after that value, and the remaining documents with the same value may be skipped.
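The risk of skipping documents can be reproduced with an in-memory emulation. In the Python sketch below, a page consists of the documents whose sort value is strictly greater than the last value of the previous page; this is a simplified model of search_after, not the ES implementation, and the function names and sample data are hypothetical.

```python
def search_after_page(docs, sort_field, size, after=None):
    """Return the next `size` documents after the sort value `after`."""
    ordered = sorted(docs, key=lambda doc: doc[sort_field])
    if after is not None:
        ordered = [doc for doc in ordered if doc[sort_field] > after]
    return ordered[:size]


def scan_all(docs, sort_field, size):
    """Page through all documents, feeding the last sort value back in."""
    seen, after = [], None
    while True:
        page = search_after_page(docs, sort_field, size, after)
        if not page:
            return seen
        seen.extend(page)
        after = page[-1][sort_field]


if __name__ == "__main__":
    jobs = [1, 1, 1, 2, 2, 3]
    docs = [{"id": i + 1, "job": job} for i, job in enumerate(jobs)]
    # Unique sort field: every document is visited.
    print([d["id"] for d in scan_all(docs, "id", size=2)])
    # Non-unique sort field: the third document with job == 1 is skipped.
    print([d["id"] for d in scan_all(docs, "job", size=2)])
```

Sorting on a unique field, or adding one as a tiebreaker after the main sort field, avoids the gap.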

7. Highlight

Highlighting marks the matched keywords in selected fields of the query results. For example, Baidu shows the keywords in its search results in red; this is implemented by wrapping each keyword in tags to which a CSS style is applied.

For example, the following query shows the matched keyword in the name field in italics:

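# 7. Highlight
GET /student/_search
{
  "query": {
    "match": {
      "all": "鲁迅"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        # Tag placed before the keyword
        "pre_tags": "<em>",
        # Tag placed after the keyword
        "post_tags": "</em>",
        # Whether only fields matched by the query are highlighted;
        # set to false because the query above matches on all, not on name
        "require_field_match": "false"
      }
    }
  }
}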

As the query results show, 鲁迅 (Lu Xun) is wrapped in em tags.
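The wrapping step itself is simple string processing. The Python sketch below imitates it with a plain substring replacement; this is only an illustration with a hypothetical function name (highlight), since ES highlights the terms produced by the field's analyzer rather than raw substrings.

```python
def highlight(text, keyword, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of `keyword` in `text` with the given tags."""
    if not keyword:
        return text
    return text.replace(keyword, f"{pre_tag}{keyword}{post_tag}")


if __name__ == "__main__":
    print(highlight("鲁迅", "鲁迅"))
    # Custom tags, for example to style the keyword with CSS
    print(highlight("鲁迅", "鲁迅", '<span class="hit">', "</span>"))
```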

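Summary

The DSL statements above are templates and cannot be copied and run directly; replace the placeholders (index name, field names, and field values) with real values first.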


Origin blog.csdn.net/qq_42108331/article/details/131915064