Elasticsearch: Introduction to the Search API

1 Search API Overview

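The Search API queries and analyzes the data stored in es. Its endpoint is _search, for example GET my_index_search/_search.
Queries take two main forms: URI Search, which passes the query in URL parameters, and Request Body Search, which sends the query in the HTTP request body using the Query DSL.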

2 URI Search: Explanation and Demonstration

URI Search passes the query in URL parameters. The common parameters are:

  • q: the query statement, written in Query String Syntax
  • df: the default field to search when q does not name a field; if omitted, es searches all fields
  • sort: the sort order
  • timeout: the search timeout; by default there is no timeout
  • from, size: used for pagination

The Query String Syntax supports the following:

  • term and phrase
    • alfred way is equivalent to alfred OR way
    • "alfred way" is a phrase query: both words must appear, in that order
  • Generic (field-less) query
    • alfred matches the term in all fields
  • Specified field
    • name:alfred
  • Grouping: parentheses group terms into one matching rule
    • (quick OR brown) AND fox
    • status:(active OR pending) title:(full text search)

Create an index and generate test documents:
PUT my_index_search
{
  "settings": 
  {
    "number_of_shards": "5",
    "number_of_replicas": "0"
  }
}
POST my_index_search/doc/_bulk
{"index":{"_id": "1"}}
{"username": "alfred way","job": "java engineer","age": 18,"birth": "1990-01-02","isMarried":false}
{"index":{"_id": "2"}}
{"username": "alfred","job": "java senior and java specialist","age": 28,"birth": "1980-05-07","isMarried":true}
{"index":{"_id": "3"}}
{"username": "lee","job": "java and ruby engineer","age": 22,"birth": "1985-08-07","isMarried":false}
{"index":{"_id": "4"}}
{"username": "alfred junior way","job": "ruby engineer","age": 23,"birth": "1989-08-02","isMarried":false}
# Find documents that contain alfred in any field
GET my_index_search/_search?q=alfred
# Set profile to true to see the actual query that is executed
GET my_index_search/_search?q=alfred
{
  "profile": true
}
GET my_index_search/_search?q=username:alfred
GET my_index_search/_search?q=username:alfred
{
  "profile": true
}
# username:alfred and way are combined with OR
GET my_index_search/_search?q=username:alfred way
{
  "profile": true
}
# PhraseQuery: phrase query
GET my_index_search/_search?q=username:"alfred way"
{
  "profile": true
}
# The profile output below shows "description": "username:alfred username:way"
GET my_index_search/_search?q=username:(alfred way)
{
  "profile": true
}
  • Boolean operators
  • AND(&&), OR(||), NOT(!)
  • name:(tom NOT lee)
  • The operators must be uppercase; lowercase is not recognized as an operator
  • + and - correspond to must and must_not respectively
  • name:(tom +lee -alfred) or name:((lee && !alfred) || (tom && lee && !alfred))
  • In a URL, + is decoded as a space, so it must be encoded as %2B before use
GET my_index_search/_search?q=username:alfred AND way
{
  "profile": true
}
GET my_index_search/_search?q=username:(alfred AND way)
{
  "profile": true
}
GET my_index_search/_search?q=username:(alfred NOT way)
{
  "profile": true
}
GET my_index_search/_search?q=username:(alfred +way)
{
  "profile": true
}
GET my_index_search/_search?q=username:(alfred %2Bway)
{
  "profile": true
}
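The effect of + in a URL can be checked outside of es. The following is a minimal sketch in Python using only the standard library module urllib.parse; it illustrates URL encoding in general and is not part of the Elasticsearch API. The variable names are chosen for this example.

```python
from urllib.parse import parse_qs, quote, urlencode

query = "username:(alfred +way)"

# A raw "+" in a query string is decoded as a space, so the must operator is lost.
raw_decoded = parse_qs("q=username:(alfred +way)")["q"][0]

# Percent-encoding turns "+" into %2B, which survives decoding.
encoded_q = quote(query, safe="")

# urlencode applies the same kind of encoding to every parameter.
params = urlencode({"q": query, "from": 0, "size": 10})
round_trip = parse_qs(params)["q"][0]

print(repr(raw_decoded))
print(encoded_q)
print(params)
print(round_trip)
```

Building the parameter string with urlencode instead of by hand avoids having to remember which characters need escaping.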
  • Range queries: supported for numeric and date values
    • Interval notation: [] is a closed interval, {} is an open interval
      • age:[1 TO 10] means 1 <= age <= 10
      • age:[1 TO 10} means 1 <= age < 10
      • age:[1 TO *] means age >= 1
      • age:{* TO 10] means age <= 10
    • Comparison operator notation
      • age:>=1
      • age:(>=1 && <=10) or age:(+>=1 +<=10)
GET my_index_search/_search?q=username:alfred age:>20
GET my_index_search/_search?q=username:alfred AND age:>20
GET my_index_search/_search?q=birth:(>1980 AND <1990)
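To make the bracket notation concrete, here is a small sketch in Python that evaluates an interval expression against a number. It is only a model of the open and closed bound semantics described above, written for illustration; the function name in_interval and its regular expression are assumptions of this example and have nothing to do with how es parses queries internally.

```python
import re

# Matches expressions such as "[1 TO 10}" or "{* TO 10]".
_INTERVAL = re.compile(r"^([\[{])\s*(\S+?)\s+TO\s+(\S+?)\s*([\]}])$")


def in_interval(expr: str, value: float) -> bool:
    """Return True if value lies inside the interval written in bracket notation."""
    match = _INTERVAL.match(expr.strip())
    if match is None:
        raise ValueError(f"not an interval expression: {expr!r}")
    left, low, high, right = match.groups()
    if low != "*":  # "*" means the bound is unlimited
        low_num = float(low)
        # "[" includes the lower bound, "{" excludes it.
        if value < low_num or (left == "{" and value == low_num):
            return False
    if high != "*":
        high_num = float(high)
        # "]" includes the upper bound, "}" excludes it.
        if value > high_num or (right == "}" and value == high_num):
            return False
    return True


print(in_interval("[1 TO 10]", 10))
print(in_interval("[1 TO 10}", 10))
print(in_interval("[1 TO *]", 500))
```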
  • Wildcard queries
  • ? matches exactly one character, * matches zero or more characters
    • name:t?m
    • name:tom*
    • name:t*m
  • Wildcard matching is slow and uses a lot of memory, so it is not recommended
  • Unless there is a special need, do not put ? or * at the start of a term
GET my_index_search/_search?q=username:alf*
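The meaning of ? and * can be tried out locally. The sketch below uses Python's standard fnmatch module as a stand-in, because its ? and * have the same meaning as described above. This is an analogy only: fnmatch also understands [...] character classes, which are not part of the wildcard syntax described here, and the whitespace split is a simplification of the analyzer. Wildcards are matched against individual terms, so the usernames from the test documents are split into terms first.

```python
from fnmatch import fnmatchcase

usernames = ["alfred way", "alfred", "lee", "alfred junior way"]

# The distinct terms that the username values break into.
terms = sorted({term for name in usernames for term in name.split()})


def matching_terms(pattern: str) -> list:
    """Return the terms that match a pattern containing ? and * wildcards."""
    return [term for term in terms if fnmatchcase(term, pattern)]


print(matching_terms("alf*"))
print(matching_terms("?ee"))
print(matching_terms("w?y"))
```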
  • Regular expression match
GET my_index_search/_search?q=username:/[a]?l.*/
  • Fuzzy query (fuzzy matching)
  • name:roam~1
  • Matches words that differ from roam by one character, such as foam and roams
  • Proximity query (approximate phrase matching)
  • "fox quick"~5
  • The difference is measured in terms rather than characters, so "quick fox" and "quick brown fox" both match
GET my_index_search/_search?q=username:alfed~1
GET my_index_search/_search?q=job:"java engineer"~2
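The "difference of one character" in a fuzzy query is an edit distance. The following Python sketch computes the plain Levenshtein distance (insertions, deletions and substitutions) to show why foam and roams are within distance 1 of roam, and why the misspelled alfed~1 can find alfred. It is a textbook implementation written for this article, not the code es runs; in particular, es by default also counts a swap of two adjacent characters as a single edit, which this sketch does not model.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


for word in ["foam", "roams", "room", "dreams"]:
    print(word, edit_distance("roam", word))
print("alfed -> alfred", edit_distance("alfed", "alfred"))
```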

3 Query DSL Overview

The query is sent to es in the HTTP request body, which contains the following main parameters:

  • query: a query written in Query DSL syntax
  • from, size
  • timeout
  • sort

The Query DSL is a JSON-based query language. Its queries fall into two main types:

  • Field-level queries
    • Such as term, match, range; each one queries a single field only
  • Compound queries
    • Such as the bool query; a compound query combines one or more field-level queries or other compound queries
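As an illustration of how these request body parameters fit together, the sketch below assembles a body in Python and prints it as JSON. The helper build_search_body, its 1-based page argument, and the chosen timeout and sort values are assumptions made for this example; only the parameter names query, from, size, timeout and sort come from the list above.

```python
import json


def build_search_body(query: dict, page: int, page_size: int = 10) -> dict:
    """Build a request body for the given 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "query": query,
        # from is the offset of the first hit, size is the number of hits per page.
        "from": (page - 1) * page_size,
        "size": page_size,
        "timeout": "1s",            # example value
        "sort": [{"age": "desc"}],  # example value
    }


body = build_search_body({"match": {"username": "alfred"}}, page=2, page_size=2)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET my_index_search/_search in the same way as the requests in this article.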

4 Introduction to Field-Level Queries and match-query

  • Field-level queries fall into the following categories:

  • Full-text matching

    • Full-text search on text fields. The query statement is analyzed (tokenized) first. Examples: match, match_phrase and similar query types
  • Term matching

    • The query statement is not analyzed; it is matched directly against the field's inverted index. Examples: term, terms, range and similar query types
  • match is the most basic and most commonly used query for full-text search on a field. API examples follow:

GET my_index_search/_search
{
  "query": {
    "match": {
      "username": "alfred way"
    }
  }
}
# Inspect the executed query
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match": {
      "username": "alfred way"
    }
  }
}

  • The operator parameter controls the relationship between the matched terms; the options are or and and
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match": {
      "username": {
        "query": "alfred way",
        "operator": "and"
      }
    }
  }
}
  • The minimum_should_match parameter controls the minimum number of terms that must match
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match": {
      "job": {
        "query": "java ruby engineer",
        "minimum_should_match": 2
      }
    }
  }
}
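The effect of operator and minimum_should_match can be modelled on the job values of the four test documents. The Python sketch below is a simplified model written for this article: the whitespace tokenizer stands in for the analyzer, and the function match only counts how many query terms a document contains. It does not score or rank the hits.

```python
docs = {
    1: "java engineer",
    2: "java senior and java specialist",
    3: "java and ruby engineer",
    4: "ruby engineer",
}


def tokenize(text: str) -> list:
    # Simplified stand-in for the analyzer: lowercase and split on whitespace.
    return text.lower().split()


def match(query: str, operator: str = "or", minimum_should_match=None) -> list:
    """Return the ids of documents that contain enough of the query terms."""
    query_terms = set(tokenize(query))
    if operator == "and":
        required = len(query_terms)      # every term must be present
    elif minimum_should_match is not None:
        required = minimum_should_match  # at least this many terms
    else:
        required = 1                     # plain "or": any term is enough
    hits = []
    for doc_id, text in docs.items():
        matched = query_terms & set(tokenize(text))
        if len(matched) >= required:
            hits.append(doc_id)
    return hits


print(match("java ruby engineer"))
print(match("java ruby engineer", operator="and"))
print(match("java ruby engineer", minimum_should_match=2))
```

Under this model, document 2 drops out once two terms are required, because java is the only query term it contains.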

5 Relevance Scoring

  • Relevance scoring measures how relevant a document is to the query statement; the English term is relevance
    • The list of documents that match the query is obtained from the inverted index
    • Returning them is essentially a ranking problem, and the ranking is based on the relevance score
  • Several important concepts in relevance scoring:
    • Term Frequency (TF): the number of times a term occurs in the document. The higher the term frequency, the higher the relevance
    • Document Frequency (DF): the number of documents in which the term occurs
    • Inverse Document Frequency (IDF): the inverse of document frequency, which can be simply understood as 1 / DF. The fewer documents a term occurs in, the higher the relevance
    • Field-length Norm: the shorter the field, the higher the relevance
  • ES currently has two main relevance scoring models:
    • The TF/IDF model
    • The BM25 model, the default since 5.x
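These concepts can be computed by hand on the job values of the test documents. The Python sketch below uses the commonly cited form of the BM25 formula with k1 = 1.2 and b = 0.75. It is an approximation for illustration only: it treats the four documents as one collection and uses exact field lengths, whereas es computes statistics per shard and its constant factors vary between versions, so the numbers will not equal the _score values es returns.

```python
import math

docs = {
    1: "java engineer",
    2: "java senior and java specialist",
    3: "java and ruby engineer",
    4: "ruby engineer",
}
tokens = {doc_id: text.split() for doc_id, text in docs.items()}
N = len(tokens)
avg_len = sum(len(t) for t in tokens.values()) / N


def tf(term: str, doc_id: int) -> int:
    """Term frequency: occurrences of the term in one document."""
    return tokens[doc_id].count(term)


def df(term: str) -> int:
    """Document frequency: number of documents containing the term."""
    return sum(1 for t in tokens.values() if term in t)


def idf(term: str) -> float:
    """BM25 inverse document frequency: rarer terms get a larger weight."""
    n = df(term)
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


def bm25(term: str, doc_id: int, k1: float = 1.2, b: float = 0.75) -> float:
    """BM25 score of one term for one document."""
    freq = tf(term, doc_id)
    # Field-length norm: longer than average fields are penalized.
    norm = 1 - b + b * len(tokens[doc_id]) / avg_len
    return idf(term) * freq * (k1 + 1) / (freq + k1 * norm)


for term in ["java", "ruby", "senior"]:
    print(term, "df =", df(term), "idf =", round(idf(term), 3))
print("ruby in doc 3:", round(bm25("ruby", 3), 3))
print("ruby in doc 4:", round(bm25("ruby", 4), 3))
```

In this model ruby scores higher in document 4 than in document 3, because both contain the term once but document 4 is shorter.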

6 match-phrase-query

  • Searches a field for the query terms as a phrase: the terms must appear in the given order. API examples follow:
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match_phrase": {
      "job": {
        "query": "java engineer"
      }
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match_phrase": {
      "job": {
        "query": "engineer java"
      }
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match_phrase": {
      "job": {
        "query": "java engineer",
        "slop": 1
      }
    }
  }
}

GET my_index_search/_search
{
  "profile": true,
  "query": {
    "match_phrase": {
      "job": {
        "query": "java engineer",
        "slop": 2
      }
    }
  }
}
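The slop parameter in the last two requests relaxes the requirement that the terms be adjacent. The Python sketch below is a simplified model of that idea, written for this article and not taken from es: it shifts each term's position in the document by its position in the phrase and reports the spread of the shifted positions as the slop a match would need. The function name required_slop is hypothetical, the whitespace split stands in for the analyzer, and the model assumes the phrase does not repeat a term.

```python
from itertools import product


def required_slop(phrase: str, text: str):
    """Smallest slop under which the phrase would match, or None if a term is missing."""
    query_terms = phrase.lower().split()
    doc_terms = text.lower().split()
    # For each query term, the positions at which it occurs in the document.
    candidates = [
        [pos for pos, tok in enumerate(doc_terms) if tok == term]
        for term in query_terms
    ]
    if any(not positions for positions in candidates):
        return None
    best = None
    for combo in product(*candidates):
        # An exact phrase match gives the same shifted position for every term.
        shifted = [doc_pos - query_pos for query_pos, doc_pos in enumerate(combo)]
        spread = max(shifted) - min(shifted)
        if best is None or spread < best:
            best = spread
    return best


print(required_slop("java engineer", "java engineer"))
print(required_slop("java engineer", "java and ruby engineer"))
print(required_slop("engineer java", "java engineer"))
print(required_slop("java engineer", "ruby engineer"))
```

Under this model, "java engineer" needs a slop of 2 to match "java and ruby engineer", which is why the request with slop 1 and the request with slop 2 are worth comparing.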

7 query-string-query

GET my_index_search/_search
{
  "query": {
    "query_string": {
      "default_field": "username",
      "query": "alfred AND way"
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "fields": [
          "username",
          "job"
        ],
      "query": "alfred OR (java AND ruby)"
    }
  }
}

8 simple-query-string-query

  • Similar to query_string, but it ignores query syntax errors and supports only part of the query syntax
  • Its main differences in usage: keywords such as AND, OR, NOT cannot be used; symbols are used instead:
    • + stands for AND
    • | stands for OR
    • - stands for NOT
# Must contain way; may contain alfred
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "simple_query_string": {
      "query": "alfred +way",
      "fields": ["username"]
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "simple_query_string": {
      "query": "alfred +way AND java",
      "fields": ["username"]
    }
  }
}

Comparison of query_string and simple_query_string

GET my_index_search/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "fields": ["username"],
      "query": "alfred OR (\"java AND ruby)"
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "simple_query_string": {
      "query": "alfred +way AND \"java",
      "fields": ["username"]
    }
  }
}

9 term/terms-query

  • term-query treats the query statement as a single term; the query statement is not analyzed
  • terms-query passes in several terms at once and matches documents that contain any of them
# term query
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "term": {
      "username": "alfred"
    }
  }
}
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "term": {
      "username": "alfred way"
    }
  }
}
# terms query
GET my_index_search/_search
{
  "profile": true,
  "query": {
    "terms": {
      "username": [
        "alfred",
        "way"
      ]
    }
  }
}
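Why the term query for "alfred way" behaves differently from the one for "alfred" becomes clear from a toy inverted index. The Python sketch below builds one from the username values of the test documents. It is a simplified model: the whitespace tokenizer stands in for the analyzer, and the helper names term_query and terms_query are chosen for this example.

```python
from collections import defaultdict

usernames = {1: "alfred way", 2: "alfred", 3: "lee", 4: "alfred junior way"}

# Index time: each value is analyzed, and every token points to its documents.
inverted_index = defaultdict(set)
for doc_id, value in usernames.items():
    for token in value.lower().split():
        inverted_index[token].add(doc_id)


def term_query(term: str) -> list:
    """Look the query string up as one term, without analyzing it."""
    return sorted(inverted_index.get(term, set()))


def terms_query(terms: list) -> list:
    """Return documents that contain any of the given terms."""
    hits = set()
    for term in terms:
        hits |= inverted_index.get(term, set())
    return sorted(hits)


print(term_query("alfred"))
print(term_query("alfred way"))
print(terms_query(["alfred", "way"]))
```

In this model the index holds the separate tokens alfred and way but no token "alfred way", so the unanalyzed two-word term finds nothing.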

10 range-query

GET my_index_search/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}
GET my_index_search/_search
{
  "query": {
    "range": {
      "birth": {
        "gte": "1980-01-01"
      }
    }
  }
}
GET my_index_search/_search
{
  "query": {
    "range": {
      "birth": {
        "gte": "now-35y"
      }
    }
  }
}

11 Introduction to Compound Queries and ConstantScore

12 bool-query

13 count-and-source-filtering

Origin blog.csdn.net/qq_39337886/article/details/103943424