ElasticSearch fuzzy query: is it match, fuzzy, or wildcard? How they differ from SQL LIKE

Introduction: Is a fuzzy query in the DSL the same as a fuzzy query in SQL?

Hello everyone, I'm Ma'er.
Today I will talk about fuzzy queries. In a relational database, a fuzzy query uses LIKE together with wildcards:

Wildcard          Description
%                 Any string of zero or more characters
_ (underscore)    Any single character
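
To see the two SQL wildcards in action before moving on to ElasticSearch, here is a minimal sketch using Python's built-in sqlite3 module. The table name figures and the helper like are hypothetical names of my own, and the rows are the sample names used later in this article (孙权 is my assumed spelling of Sun Quan).

```python
import sqlite3

# Sample names from the ElasticSearch examples in this article.
# 孙权 (Sun Quan) is an assumed spelling; the table name is hypothetical.
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def like(pattern):
    """Return the names matching a SQL LIKE pattern, in insertion order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE figures (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO figures (name) VALUES (?)", [(n,) for n in NAMES]
        )
        rows = conn.execute(
            "SELECT name FROM figures WHERE name LIKE ? ORDER BY id", (pattern,)
        )
        return [row[0] for row in rows]
    finally:
        conn.close()


prefix_matches = like("张三%")  # 张三 followed by zero or more characters
one_char_after = like("张_")    # 张 followed by exactly one character
contains_san = like("%三%")     # 三 anywhere in the name

print(prefix_matches)
print(one_char_after)
print(contains_san)
```

The % pattern accepts 张三 itself because "zero or more" includes zero, while _ insists on exactly one character.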

So what is a fuzzy query in ElasticSearch? We know that term is an exact query. Some sources call match fuzzy, others call wildcard fuzzy, and there is even a query that is literally named fuzzy. What is the difference between them?

Fuzzy query in ElasticSearch

For example, suppose the name field in the listofhistoricalfigures index holds the following values:

  1. Zhang San (张三)
  2. Zhang Sanfeng (张三丰)
  3. Zhang Fei (张飞)
  4. San Dezi (三德子)
  5. Zhang Erfeng (张二丰)
  6. Sun Quan (孙权)
  7. Ma Sanfeng (马三丰)

The field is mapped as follows: the text type supports tokenized (word segmentation) queries, and the keyword sub-field supports exact queries. For details, please refer to the article on the difference between using term with and without .keyword in ElasticSearch.

 
    "name": {
        "type": "text",
        "fields": {
            "keyword": {
                "type": "keyword"
            }
        }
    }
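
To show where this fragment sits and how the sub-field gets its name, here is a small sketch. The surrounding "mappings"/"properties" wrapper assumes Elasticsearch 7 or later (typeless mappings), and field_paths is a hypothetical helper of my own, not part of any client library.

```python
import json

# Hypothetical index-creation body. Only the "name" part comes from the
# mapping above; the "mappings"/"properties" wrapper assumes Elasticsearch 7+.
create_index_body = {
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword"}
                },
            }
        }
    }
}


def field_paths(properties, prefix=""):
    """Map each queryable field path to its type, including sub-fields."""
    paths = {}
    for field, spec in properties.items():
        path = prefix + field
        paths[path] = spec["type"]
        # Sub-fields declared under "fields" are addressed as <field>.<subfield>.
        paths.update(field_paths(spec.get("fields", {}), prefix=path + "."))
    return paths


paths = field_paths(create_index_body["mappings"]["properties"])
print(json.dumps(create_index_body, ensure_ascii=False, indent=2))
print(paths)
```

This is why the queries below can address either name (the tokenized text) or name.keyword (the whole, untokenized value).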

match: word-segmentation (tokenized) matching search

match
UK [mætʃ] US [mætʃ]
n. match; contest; rival; a well-suited person or thing
v. to match; to be the same, similar, or consistent; to find a counterpart; to pair

Literally, match means to be similar or consistent, to find a corresponding person or thing, or to pair.

GET listofhistoricalfigures/_search
{
    "query": {
        "match": {
            "name": "张三"
        }
    }
}

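We use match with the default tokenizer, which splits Chinese text into single characters, so 张三 (Zhang San) is segmented into the tokens 张 (Zhang) and 三 (San). Any name that contains at least one of these tokens is retrieved. The matching results are: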

张三
张三丰
张飞
三德子
张二丰
马三丰
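
The following is a rough sketch of why match returns so many names. It is a simplified model under my own assumptions, not ElasticSearch's real implementation: analyze just emits one token per character, there is no relevance scoring, and hits come back in document order rather than ranked. The function names are hypothetical.

```python
# Sample names from this article (孙权 is an assumed spelling of Sun Quan).
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def analyze(text):
    """Simplified stand-in for the default analyzer: one token per character."""
    return [ch for ch in text if not ch.isspace()]


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def match(docs, index, query):
    """Return documents containing at least one query token (OR, no scoring)."""
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return [docs[i] for i in sorted(hits)]


index = build_inverted_index(NAMES)
match_hits = match(NAMES, index, "张三")
print(match_hits)
```

Only 孙权 is left out, because it shares no character with the query. A real cluster would additionally rank the hits by score.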

wildcard: wildcard search

wildcard
US [ˈwaɪldˌkɑrd]
n. an unknown or unpredictable factor; (sports) a "wild card" entry given to a competitor without the usual qualifications; a "wild card" player;
(computing) a character used to stand in for any character or string

Literally, wildcard means a wildcard character.

GET listofhistoricalfigures/_search
{
    "query": {
        "wildcard": {
            "name.keyword": "张三*"
        }
    }
}

Using wildcard is equivalent to SQL's LIKE: * can be placed before and/or after the text, and it matches zero or more arbitrary characters.
Adding .keyword makes the pattern match against the complete, untokenized value.
The result will be:

张三
张三丰
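
As a sketch of the matching rule only (not of how ElasticSearch executes it), a wildcard pattern can be translated into a regular expression that must cover the whole keyword value. Here * means zero or more characters and ? means exactly one; backslash escaping is left out for brevity, and the function names are hypothetical.

```python
import re

# Sample names from this article (孙权 is an assumed spelling of Sun Quan).
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def wildcard_to_regex(pattern):
    """Translate * (zero or more chars) and ? (one char) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard(values, pattern):
    """Keep the values that the pattern matches from start to end."""
    regex = wildcard_to_regex(pattern)
    return [v for v in values if regex.fullmatch(v)]


prefix_hits = wildcard(NAMES, "张三*")    # like SQL: LIKE '张三%'
contains_hits = wildcard(NAMES, "*三*")   # like SQL: LIKE '%三%'
single_hits = wildcard(NAMES, "张?")      # like SQL: LIKE '张_'
print(prefix_hits, contains_hits, single_hits)
```

Because the whole value must match, 马三丰 is not a hit for 张三* even though it contains 三.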

fuzzy: error-correcting search

fuzzy
UK [ˈfʌzi] US [ˈfʌzi]
adj. covered with fluff; downy; tightly curled; frizzy; (of a shape or sound) blurred, indistinct

Literally, fuzzy means vague.

GET listofhistoricalfigures/_search
{
    "query": {
        "fuzzy": {
            "name.keyword": "张三"
        }
    }
}

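Using fuzzy works like a Baidu search: you can type a slightly misspelled version of "Deng Ziqi" and still find "Deng Ziqi". It has a certain error-correcting ability; how many character edits are tolerated is controlled by the fuzziness parameter.
Using .keyword makes it match against the complete value.
The result will be: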

张三
张三丰
张飞
张二丰
马三丰
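
The error tolerance is based on edit distance, so here is a minimal sketch of that idea. It uses plain Levenshtein distance (insert, delete, substitute), which is a simplification I am assuming for illustration, and the names edit_distance, fuzzy, and max_edits are hypothetical. The set a real cluster returns depends on its fuzziness setting, so this is not meant to reproduce the result list above.

```python
# Sample names from this article (孙权 is an assumed spelling of Sun Quan).
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def edit_distance(a, b):
    """Plain Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def fuzzy(values, term, max_edits):
    """Keep the values within max_edits edits of the search term."""
    return [v for v in values if edit_distance(term, v) <= max_edits]


distances = {name: edit_distance("张三", name) for name in NAMES}
one_edit_hits = fuzzy(NAMES, "张三", max_edits=1)
print(distances)
print(one_edit_hits)
```

With one edit allowed, 张三丰 (one insertion) and 张飞 (one substitution) are accepted, while 三德子 needs three edits and is rejected.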

Conclusion

1. match is a tokenized matching search: it segments the query text and finds more matching content. Combined with different tokenizers, it gives different results.

2. wildcard is a wildcard search that works just like the traditional SQL LIKE. If the data is in ES and you want the traditional "fuzzy query" behavior, use wildcard.

3. fuzzy is an error-correcting search that makes the query tolerant of mistakes in the input.


Origin blog.csdn.net/weixin_43859729/article/details/108134329