Elasticsearch query

Elasticsearch's search DSL consists of two parts: query and filter.

The main difference between the two is that a filter does not calculate relevance and its results can be cached. A filter is therefore faster than a query.

These notes first record the various queries provided by ES.

The following is only a set of reading notes. For more details, please refer to http://www.elasticsearch.org/guide/

Part 1: query

Use a query when full-text search is required and relevance needs to be calculated, that is, when a filter cannot meet the need.

(1)match query and multi-match query //and match-all query and minimum should match query

A match query has no "query parsing" step, and its field name does not support advanced features such as wildcards or prefixes. It only analyzes the given text and executes the resulting query, so it very rarely fails, which makes it suitable for a search box.

match is an analyzed query type, so the analyzer can be specified.

operator can be set to or/and.

zero_terms_query can be set to none/all.

cutoff_frequency can be set to an absolute value or a relative value.

The match_phrase query can specify a slop value; see the later search-in-depth notes.

The match_phrase_prefix query can specify max_expansions.
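As a rough illustration of the parameters above, here is a minimal sketch of two request bodies built as Python dictionaries. The field name "title" and the search text are made up for the example, and the bodies assume the JSON DSL of the Elasticsearch version these notes are based on.

```python
import json

# Hypothetical field name ("title") and search text, for illustration only.
match_query = {
    "query": {
        "match": {
            "title": {
                "query": "quick brown fox",
                "operator": "and",           # every analyzed term must match
                "zero_terms_query": "none",  # match nothing if analysis removes all terms
            }
        }
    }
}

match_phrase_query = {
    "query": {
        "match_phrase": {
            "title": {
                "query": "quick fox",
                "slop": 1,  # terms may be one position apart
            }
        }
    }
}

print(json.dumps(match_query, indent=2))
print(json.dumps(match_phrase_query, indent=2))
```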

(2)multi-match query

Executes a match query on each field separately. The rule for calculating the final _score therefore varies with the type.

fields specifies the fields to query. Field names support advanced features such as wildcards (which the match query does not), and a caret (^) sets the boost weight of an individual field.

type can take the following values to select different query behaviors:

best_fields: the _score is determined by the match clause with the highest score. Field-centric.

most_fields: all match clauses are taken into account. Field-centric.

cross_fields: treats the fields as one big field. Term-centric.

phrase and phrase_prefix: each field executes the corresponding query, and the scores are combined.

Each of these has its own application scenarios and detailed scoring rules. For details, see the later search-in-depth notes.
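A minimal sketch of a multi-match request body, assuming hypothetical field names ("title", "body", and a "*_name" wildcard pattern). It only shows where the fields, boost and type settings go; it says nothing about the scoring details of each type.

```python
import json

# Hypothetical fields: "title" is boosted 3x, "*_name" is a wildcard pattern.
multi_match_query = {
    "query": {
        "multi_match": {
            "query": "quick brown fox",
            "fields": ["title^3", "body", "*_name"],
            "type": "best_fields",  # or most_fields, cross_fields, phrase, phrase_prefix
        }
    }
}

print(json.dumps(multi_match_query, indent=2))
```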

(3)bool query

A compound query that wraps other types of queries. The following three logical relationships are supported.

must: AND

must_not: NOT

should: OR
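The three clauses can be combined in one body. The sketch below uses made-up field names ("user", "age", "tag") and is only meant to show the shape of the request.

```python
import json

# Hypothetical fields; each clause holds a list of sub-queries.
bool_query = {
    "query": {
        "bool": {
            "must": [{"term": {"user": "kimchy"}}],                     # AND
            "must_not": [{"range": {"age": {"gte": 10, "lte": 20}}}],   # NOT
            "should": [                                                 # OR
                {"term": {"tag": "wow"}},
                {"term": {"tag": "elasticsearch"}},
            ],
        }
    }
}

print(json.dumps(bool_query, indent=2))
```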

(4)boosting query

A compound query made up of a positive subquery and a negative subquery. Documents that match the positive subquery are returned, including those that also match the negative subquery.

A document that matches only the positive subquery keeps its score; a document that also matches the negative subquery has its score reduced according to the value of negative_boost.
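A minimal sketch of a boosting request body, followed by a small helper that models the score adjustment described above. The field names are hypothetical, and boosted_score is an illustrative function that assumes the reduction is a simple multiplication by negative_boost; it is not part of any Elasticsearch API.

```python
import json

# Hypothetical fields "field1" / "field2".
boosting_query = {
    "query": {
        "boosting": {
            "positive": {"term": {"field1": "value1"}},
            "negative": {"term": {"field2": "value2"}},
            "negative_boost": 0.2,
        }
    }
}

def boosted_score(positive_score, matches_negative, negative_boost):
    """Illustrative model: demote the score only if the negative query matched."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score

print(json.dumps(boosting_query, indent=2))
print(boosted_score(2.0, True, 0.5))
print(boosted_score(2.0, False, 0.5))
```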

(5)common terms query

A slightly more advanced query that takes the low importance of stop words into account to improve query accuracy.

Terms are divided into two groups: more important (low-frequency) and less important (high-frequency). Less important terms are typically stop words, e.g. "the" and "and".

The grouping threshold is set by cutoff_frequency. The two groups together form a bool query: must is applied to the low-frequency group and should is applied to the high-frequency group.

operator and minimum_should_match can be specified for each group.

If only one group remains after grouping, the query degenerates into a single-group query by default.

During execution, documents are first matched against the more important group, and the less important group is then matched only within that result, so scores are calculated only for the matched documents. This keeps the query efficient while still taking relevance fully into account.
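The grouping step can be sketched in a few lines of Python. This is only a model of the idea in the notes, not how Elasticsearch implements it: the document frequencies are invented, cutoff_frequency is treated as a relative value (a fraction of all documents), and split_by_frequency and to_bool_query are hypothetical helper names.

```python
def split_by_frequency(doc_freq, total_docs, cutoff_frequency):
    """Split terms into (low_frequency, high_frequency) by relative document frequency."""
    low, high = [], []
    for term, df in doc_freq.items():
        if df / total_docs > cutoff_frequency:
            high.append(term)
        else:
            low.append(term)
    return low, high

def to_bool_query(field, low, high):
    """Low-frequency terms become must clauses, high-frequency terms become should clauses."""
    return {
        "bool": {
            "must": [{"term": {field: t}} for t in low],
            "should": [{"term": {field: t}} for t in high],
        }
    }

# Invented document frequencies out of 1000 documents.
doc_freq = {"nelly": 12, "the": 950, "elephant": 30, "as": 700}
low, high = split_by_frequency(doc_freq, total_docs=1000, cutoff_frequency=0.1)
print(low, high)
print(to_bool_query("body", low, high))
```

With these numbers "the" and "as" fall into the high-frequency group, so they only influence the score of documents that already matched "nelly" and "elephant".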

(6)constant score query

A query that does not calculate relevance. Every matching document receives the same score, equal to the boost specified in the query.
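A minimal sketch of a constant score request body, assuming a hypothetical "user" field and the DSL of the version these notes cover, where the wrapped clause is given under filter.

```python
import json

# Hypothetical field "user"; every match gets the score given by "boost".
constant_score_query = {
    "query": {
        "constant_score": {
            "filter": {"term": {"user": "kimchy"}},
            "boost": 1.2,
        }
    }
}

print(json.dumps(constant_score_query, indent=2))
```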

(7)dismax query

The results of the subqueries are unioned, and each document's score is the maximum of its subquery scores. This kind of query is widely used for multi-field search. For details, see the later search-in-depth notes.
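A minimal sketch of a dis_max body and of the score rule. The field names are hypothetical, and dis_max_score is an illustrative helper. It also models the optional tie_breaker setting, which these notes do not cover: with tie_breaker left at 0 the result is simply the maximum subquery score.

```python
import json

# Hypothetical fields "title" and "body".
dis_max_query = {
    "query": {
        "dis_max": {
            "queries": [
                {"match": {"title": "brown fox"}},
                {"match": {"body": "brown fox"}},
            ],
            "tie_breaker": 0.0,
        }
    }
}

def dis_max_score(scores, tie_breaker=0.0):
    """Best subquery score plus tie_breaker times the sum of the other scores."""
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)

print(json.dumps(dis_max_query, indent=2))
print(dis_max_score([1.0, 2.0, 0.5]))
print(dis_max_score([1.0, 2.0, 0.5], tie_breaker=0.5))
```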

(8)filtered query

Combines another query with any filter.

If query is not specified, it defaults to match_all. When multiple filters are applied, the expert-level strategy attribute can be specified.
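A minimal sketch of building a filtered body, in the DSL of the version these notes cover. The helper name filtered and the field names are hypothetical; the helper just makes the match_all default explicit on the client side.

```python
import json

def filtered(filter_clause, query=None):
    """Build a filtered query body; fall back to match_all when no query is given."""
    if query is None:
        query = {"match_all": {}}
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}

# Hypothetical fields "text" and "year".
with_query = filtered({"term": {"year": 1999}}, {"match": {"text": "quick brown fox"}})
filter_only = filtered({"term": {"year": 1999}})

print(json.dumps(with_query, indent=2))
print(json.dumps(filter_only, indent=2))
```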

(9)fuzzy query and fuzzy like this query and fuzzy like this field query

fuzzy query: performs a distance-based match controlled mainly by fuzziness and prefix_length. How the distance is calculated depends on the field type.

For numeric types the distance acts like a range around the value. For string types it is based on the Levenshtein distance, that is, the minimum number of single-character edits needed to turn string A into string B.

If AUTO is specified, the allowed edit distance depends on the length of the term:

0-1: exact match

1-4: 1 edit

>4: 2 edits

Specifying prefix_length is recommended; it is the number of leading characters that must match exactly. If neither prefix_length nor fuzziness is specified, the query is expensive.
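To make the string distance concrete, here is a minimal sketch of the plain Levenshtein distance using the standard dynamic-programming approach. It illustrates the definition only and makes no claim about how Elasticsearch or Lucene computes the distance internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))
```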

(10)function score query

Define a function to change the doc's score

(11)geoshape query

Geo-based queries

(12)has child query and has parent query and top children query

By default these queries behave like the corresponding filters: the query is a filter wrapped in a constant_score. Scoring is also supported.

has_child: matches against child documents and returns the corresponding parent documents.

has_parent: matches against parent documents and returns the corresponding child documents.

top_children query: a variant of the has_child query. It also queries child documents, but adds tuning parameters: factor, incremental_factor and the requested size determine how many child documents each round of the subquery fetches, and rounds repeat until size is satisfied. Because several rounds of subqueries may be needed, total_hits may be inaccurate.
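A minimal sketch of the two simplest bodies, assuming a hypothetical parent type "blog" with a child type "blog_tag", and the DSL of the version these notes cover (where has_parent takes parent_type). Scoring options are left out.

```python
import json

# Returns parent documents whose children match the inner query.
has_child_query = {
    "query": {
        "has_child": {
            "type": "blog_tag",
            "query": {"term": {"tag": "something"}},
        }
    }
}

# Returns child documents whose parent matches the inner query.
has_parent_query = {
    "query": {
        "has_parent": {
            "parent_type": "blog",
            "query": {"term": {"tag": "something"}},
        }
    }
}

print(json.dumps(has_child_query, indent=2))
print(json.dumps(has_parent_query, indent=2))
```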

(13)ids query

Query the specified id.

(14)indices query

Used when searching across multiple indices. The indices parameter lists the indices to which the given query applies, and no_match_query specifies the query to run on the indices that are not in that list; its results are returned as well.
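A minimal sketch of an indices query body, with hypothetical index names and a hypothetical "tag" field.

```python
import json

indices_query = {
    "query": {
        "indices": {
            "indices": ["index1", "index2"],            # indices the main query applies to
            "query": {"term": {"tag": "wow"}},          # run on index1 and index2
            "no_match_query": {"term": {"tag": "kow"}}, # run on every other searched index
        }
    }
}

print(json.dumps(indices_query, indent=2))
```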

(15)more like this and more like this field query

Analyzes the given like_text to generate several term-based should clauses and combines them into a bool query.

min_term_freq/max_term_freq/max_term_num: limit the interesting terms.

percent_terms_to_match: the percentage of the should clauses that must match.

The more like this query can specify multiple fields; the more like this field query runs on a single field.
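A minimal sketch of a more like this body, assuming hypothetical field names and the DSL of the version these notes cover. Only like_text, fields, min_term_freq and percent_terms_to_match are shown; the values are arbitrary examples.

```python
import json

more_like_this_query = {
    "query": {
        "more_like_this": {
            "fields": ["name.first", "name.last"],  # hypothetical fields
            "like_text": "text like this one",
            "min_term_freq": 1,
            "percent_terms_to_match": 0.3,
        }
    }
}

print(json.dumps(more_like_this_query, indent=2))
```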

(16)nested query

Queries nested objects; the full path must be specified.
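A minimal sketch of a nested query body, assuming a hypothetical nested object "obj1" with a "name" field. Note that the inner field is referred to by its full path, "obj1.name".

```python
import json

nested_query = {
    "query": {
        "nested": {
            "path": "obj1",
            "query": {"match": {"obj1.name": "blue"}},  # full path, not just "name"
        }
    }
}

print(json.dumps(nested_query, indent=2))
```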

(17)prefix query

Prefix query.

(18)query string query and simple query string query

Queries based on the Lucene query syntax, specifying fields, terms, boosts, etc.

The simple query string query is similar to the query string query, except that it discards invalid parts of the query instead of throwing an exception.

The default field is _all.
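A minimal sketch of both bodies side by side, with hypothetical field names ("content", "body") and arbitrary query text.

```python
import json

query_string_query = {
    "query": {
        "query_string": {
            "default_field": "content",
            "query": "this AND that OR thus",  # Lucene query syntax
        }
    }
}

simple_query_string_query = {
    "query": {
        "simple_query_string": {
            "query": '"fried eggs" +(eggplant | potato) -frittata',
            "fields": ["body^5", "_all"],
            "default_operator": "and",
        }
    }
}

print(json.dumps(query_string_query, indent=2))
print(json.dumps(simple_query_string_query, indent=2))
```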

(19)range query and regexp query and wildcard query

range query: range search over dates, strings and numbers.

regexp query: regular expression query.

wildcard query: wildcard query.
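A minimal sketch of the three bodies, with hypothetical field names ("age", "name.first", "user").

```python
import json

range_query = {"query": {"range": {"age": {"gte": 10, "lte": 20}}}}
regexp_query = {"query": {"regexp": {"name.first": "s.*y"}}}
wildcard_query = {"query": {"wildcard": {"user": "ki*y"}}}

for body in (range_query, regexp_query, wildcard_query):
    print(json.dumps(body))
```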

(20)span-*query

(21)term query and terms query

Term-based queries.

(22)template query

Register a query template, specifying the template query.

--------------------------

Follow-up plan update:

(1) Comparison of some special queries. For example, fuzzy and more_like.

(2)search-in-depth
