Elasticsearch query operations

1. Prefix query (prefix)

First, index some sample data:

PUT /my_index/address/1
{ "postcode": "W1 3DG" }

PUT /my_index/address/2
{ "postcode": "W2F 8HW" }

PUT /my_index/address/3
{ "postcode": "W1F 7HW" }

PUT /my_index/address/4
{ "postcode": "WC1N 1LZ" }

PUT /my_index/address/5
{ "postcode": "SW5 0BE" }   

To find all postcodes beginning with W1, you can use a simple prefix query.

This is similar to the SQL: select * from table where xx like 'xx%';

GET /my_index/address/_search
{
    "query": {
        "prefix": {
            "postcode": "W1"
        }
    }
}
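To make the mechanics concrete, here is a minimal Python sketch of how a prefix lookup over a sorted term list might work. It is an illustration only, not Elasticsearch's actual implementation. The `postings` dictionary and the `prefix_search` function are hypothetical names, the postcodes are modeled on the sample data, and the sketch assumes the postcode field is indexed as a single exact value (not analyzed), which the original does not show.

```python
from bisect import bisect_left

# Hypothetical stand-in for the term dictionary of an exact-value "postcode" field.
# Each term maps to the IDs of the documents that contain it.
postings = {
    "SW5 0BE": [5],
    "W1 3DG": [1],
    "W1F 7HW": [3],
    "W2F 8HW": [2],
    "WC1N 1LZ": [4],
}
terms = sorted(postings)


def prefix_search(prefix):
    """Return the sorted IDs of documents with a term that starts with prefix."""
    doc_ids = set()
    # Jump to the first term that is >= the prefix, then walk forward.
    start = bisect_left(terms, prefix)
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        doc_ids.update(postings[term])
    return sorted(doc_ids)


print(prefix_search("W1"))  # [1, 3]
```

The shorter the prefix, the more terms have to be visited: "W" walks four terms here, while "W1" walks only two.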

 

2. Phrase match query (match_phrase)

When running a phrase match query, Elasticsearch first analyzes the query string and builds the phrase query from the resulting terms. A document matches only if it contains every term of the phrase, with the terms in the same relative positions as in the query:

POST /_search
{  
   "from":1,
   "size":100,
   "fields":[ "eventname"],
   "query":{  
      "match_phrase":{  
         "eventname":"Open Source"
      }
   }
}
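The position check described above can be sketched in Python as follows. This is a simplified model under stated assumptions: `analyze` is a crude stand-in for a real analyzer (lowercase plus whitespace split), the index layout is hypothetical, and the eventname values are invented for illustration.

```python
from collections import defaultdict


def analyze(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each term to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def match_phrase(index, phrase):
    """Return sorted IDs of docs that contain the phrase terms at consecutive positions."""
    terms = analyze(phrase)
    if not terms or any(t not in index for t in terms):
        return []
    # Only documents that contain every term can match.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            # Term i of the phrase must sit exactly i positions after the first term.
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical eventname values, not taken from the original data set.
docs = {
    1: "Open Source Hack Night",
    2: "Source code is open",
    3: "Open Data Day",
}
index = build_index(docs)
print(match_phrase(index, "Open Source"))  # [1]
```

Document 2 contains both words but in the wrong order and not adjacent, so it is not returned.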

3. Phrase prefix match query (match_phrase_prefix)

The match_phrase_prefix query is essentially the same as match_phrase, except that the last word of the query text is matched as a prefix. The max_expansions parameter controls how many terms that final prefix may be rewritten (expanded) into; the default is 50. The more expansions allowed, the more documents can be found; if the limit is too low, the relevant term may never be tried and matching documents will be missed. The query below, for example, finds documents whose eventname contains "Open Source Hack Night".

POST /_search
{  
   "from":1,
   "size":100,
   "fields":[ "eventname" ],
   "query":{  
      "match_phrase_prefix":{  
         "eventname":{  
            "query":"Open Source hac",
            "max_expansions":50
         }
      }
   }
}
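The effect of max_expansions can be illustrated with a small Python model. It is a sketch, not the engine's real code: it assumes the final word is expanded into the first max_expansions matching terms in sorted order, and the helper names and eventname values are hypothetical.

```python
from collections import defaultdict


def analyze(text):
    """Rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each term to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_docs(index, terms):
    """IDs of docs where the terms occur at consecutive positions."""
    if any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, []) for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


def match_phrase_prefix(index, query, max_expansions=50):
    tokens = analyze(query)
    if not tokens:
        return []
    *head, last = tokens
    # Expand the final word into at most max_expansions indexed terms.
    expansions = [t for t in sorted(index) if t.startswith(last)][:max_expansions]
    hits = set()
    for term in expansions:
        hits |= phrase_docs(index, head + [term])
    return sorted(hits)


# Hypothetical eventname values for illustration.
docs = {
    1: "Open Source Hack Night",
    2: "Open Source Hackathon",
    3: "Open Source Summit",
}
index = build_index(docs)
print(match_phrase_prefix(index, "Open Source hac"))                    # [1, 2]
print(match_phrase_prefix(index, "Open Source hac", max_expansions=1))  # [1]
```

With max_expansions=1 only the term "hack" is tried, so the document containing "hackathon" is missed, which is the missing-data risk mentioned above.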

By comparison, a match query tends to perform very well. To find W1 it looks the term up in the inverted index and can stop as soon as it reaches W1, because that entry already lists the documents containing it (two in this example); there is no need to go on scanning other terms.

4. Wildcard and regular expression queries (wildcard, regexp)

Like the prefix query, the wildcard query is a low-level, term-based query. Unlike the prefix query, it lets you specify a pattern rather than just a prefix. It uses standard shell wildcard syntax: ? matches any single character, and * matches zero or more characters.

This query matches the documents containing W1F 7HW and W2F 8HW:

GET /my_index/address/_search
{
   "query": {
       "wildcard": {
           "postcode": "W?F*HW" 
       }
   }
}

 

The ? matches the 1 and the 2, while the * matches the space and the 7 and 8.

Now imagine that we want to match all postcodes in the W area only. A prefix match would also include postcodes starting with WC, and a wildcard match runs into a similar problem. To match only postcodes that begin with W followed by a digit, the regexp query lets you write such more complex patterns:
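The wildcard semantics can be checked with a short Python sketch. The translation function below is a hypothetical helper written for this illustration; it only models ? and * as described here and assumes the pattern is tested against each whole term.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into a regular expression; everything else is literal."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "*":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_search(terms, pattern):
    regex = wildcard_to_regex(pattern)
    # Each term is tested against the whole pattern.
    return [t for t in terms if regex.fullmatch(t)]


postcodes = ["W1 3DG", "W2F 8HW", "W1F 7HW", "WC1N 1LZ", "SW5 0BE"]
print(wildcard_search(postcodes, "W?F*HW"))  # ['W2F 8HW', 'W1F 7HW']
```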

 

GET /my_index/address/_search
{
   "query": {
       "regexp": {
           "postcode": "W[0-9].+" 
       }
   }
}

 

 

The same query built with the Java API:

QueryBuilders.regexpQuery("postcode", "W[0-9].+");

This regular expression requires the term to begin with W, followed by a single digit from 0 to 9, followed by one or more other characters.
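A quick way to sanity-check the pattern is to try it in Python. This is only an approximation: Python's re module and the Lucene regular expression syntax differ in general, but for this simple pattern they mean the same thing. fullmatch is used because the pattern has to cover the whole term. The postcode list is modeled on the sample data.

```python
import re

postcodes = ["W1 3DG", "W2F 8HW", "W1F 7HW", "WC1N 1LZ", "SW5 0BE"]
pattern = re.compile(r"W[0-9].+")

# fullmatch: the pattern must match the entire term, not just part of it.
regexp_matches = [p for p in postcodes if pattern.fullmatch(p)]

# For contrast, a plain "W" prefix also picks up the WC area.
prefix_matches = [p for p in postcodes if p.startswith("W")]

print(regexp_matches)  # ['W1 3DG', 'W2F 8HW', 'W1F 7HW']
print(prefix_matches)  # ['W1 3DG', 'W2F 8HW', 'W1F 7HW', 'WC1N 1LZ']
```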

The wildcard and regexp queries work on the same principle as the prefix query: they have to scan the terms in the inverted index to find every term that matches, so performance is poor. Preparing the data at index time helps make prefix matching more efficient, whereas wildcard and regular expression matching can only be done at query time. These queries have their uses, but they should be used with caution.
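One common form of such index-time preparation is to index the leading substrings (edge n-grams) of each value, so that a prefix lookup becomes an exact term lookup. The original does not say which technique it has in mind, so the Python sketch below is an assumption chosen for illustration; the function names and the min_gram and max_gram defaults are hypothetical.

```python
from collections import defaultdict


def edge_ngrams(term, min_gram=1, max_gram=10):
    """All leading substrings of term with length between min_gram and max_gram."""
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]


def build_prefix_index(docs):
    """Index every edge n-gram of every value, pointing back at the document IDs."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for gram in edge_ngrams(value):
            index[gram].add(doc_id)
    return index


docs = {1: "W1 3DG", 2: "W2F 8HW", 3: "W1F 7HW", 4: "WC1N 1LZ", 5: "SW5 0BE"}
index = build_prefix_index(docs)

# A prefix search is now a single dictionary lookup instead of a term scan.
print(sorted(index.get("W1", set())))  # [1, 3]
```

The trade-off is a larger index in exchange for cheaper queries.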

Origin www.cnblogs.com/JimShi/p/11520621.html