Introduction: Is a fuzzy query in the DSL the same as a fuzzy query in SQL?
Hello, everyone, I’m Ma’er.
Today I will talk about fuzzy queries. In a relational database, a fuzzy query uses LIKE together with wildcards:
Wildcard | Description |
---|---|
% | Any sequence of zero or more characters |
_ (underscore) | Exactly one character |
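To make the two wildcards concrete, here is a minimal sketch that runs LIKE against an in-memory SQLite database from Python. The table name `figures` and the helper `like_search` are hypothetical, and the sample names are assumed to be the Chinese spellings of the figures used later in this article.

```python
import sqlite3

# Assumed sample data: the Chinese spellings of the names used in this article.
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def like_search(pattern):
    """Return the names matching a SQL LIKE pattern, in insertion order."""
    conn = sqlite3.connect(":memory:")
    try:
        # "figures" is a hypothetical table name used only for this sketch.
        conn.execute("CREATE TABLE figures (name TEXT)")
        conn.executemany(
            "INSERT INTO figures (name) VALUES (?)", [(n,) for n in NAMES]
        )
        rows = conn.execute(
            "SELECT name FROM figures WHERE name LIKE ? ORDER BY rowid",
            (pattern,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


prefix_matches = like_search("张三%")  # % matches zero or more characters
single_matches = like_search("张_")    # _ matches exactly one character
print(prefix_matches)
print(single_matches)
```

With this data, `张三%` keeps every name that starts with 张三, while `张_` keeps only the two-character names that start with 张.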
So what is a fuzzy query in ElasticSearch? We know that term is an exact query. Some sources call match fuzzy, some call wildcard fuzzy, and there is even a query that is literally named fuzzy. All of them are loosely described as "fuzzy", so what is the difference between them?
Fuzzy query in ElasticSearch
For example, suppose the name field in the listofhistoricalfigures index contains the following values:
- Zhang San (张三)
- Zhang Sanfeng (张三丰)
- Zhang Fei (张飞)
- Sandezi (三德子)
- Zhang Erfeng (张二丰)
- Sun Quan (孙权)
- Ma Sanfeng (马三丰)
The mapping is as follows: the text type supports tokenized (full-text) queries, and the keyword sub-field supports exact queries.
For details, please refer to the article on the difference between using term with and without .keyword in ElasticSearch.
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
match: tokenized matching search
match
British [mætʃ], American [mætʃ]
n. match; competition; rival; well-matched pair
v. match; be the same as; be similar to; be consistent with; find a counterpart for; pair
Literally, match means similar, consistent, or paired.
GET listofhistoricalfigures/_search
{
"query": {
"match": {
"name": "张三"
}
}
}
With match and the default tokenizer, the query text 张三 (Zhang San) is split into the tokens 张 (Zhang) and 三 (San), and a document is returned if its name contains either token.
The matching results are:
张三
张三丰
张飞
三德子
张二丰
马三丰
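The behaviour above can be imitated in a few lines of Python. This is only a simplified simulation, assuming the default analyzer emits one token per Chinese character and that match combines tokens with OR. It ignores relevance scoring, and `analyze` and `match` are hypothetical helper names, not Elasticsearch APIs.

```python
# Assumed sample data: the Chinese spellings of the names in the index.
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def analyze(text):
    """Simplified stand-in for the default analyzer: one token per character."""
    return [ch for ch in text if not ch.isspace()]


def match(query, docs):
    """Return the docs that share at least one token with the query (OR)."""
    query_tokens = set(analyze(query))
    return [doc for doc in docs if query_tokens & set(analyze(doc))]


match_hits = match("张三", NAMES)
print(match_hits)
```

Only 孙权 is left out, because it contains neither 张 nor 三.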
wildcard: wildcard search
wildcard
American [ˈwaɪldˌkɑrd]
n. unknown quantity; unpredictable factor; "wild card" (a place in a competition given to someone who has not qualified in the normal way); "wild card" player;
(a symbol used to stand for any character or string) wildcard
Literally, wildcard means a wildcard character.
GET listofhistoricalfigures/_search
{
"query": {
"wildcard": {
"name.keyword": "张三*"
}
}
}
Using wildcard is equivalent to SQL's LIKE: * can be placed before or after the text and matches zero or more arbitrary characters.
Adding .keyword matches the pattern against the complete, untokenized value.
The result will be:
张三
张三丰
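As a rough illustration, the same pattern can be applied to the whole stored value with Python's `fnmatch` module. This is a sketch of the matching idea only: `wildcard` is a hypothetical helper, and `fnmatch` also gives `[...]` a special meaning, which is not assumed to apply to Elasticsearch patterns.

```python
from fnmatch import fnmatchcase

# Assumed sample data: the Chinese spellings of the names in the index.
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def wildcard(pattern, docs):
    """Match the pattern against each complete value, as with name.keyword."""
    # In fnmatch, * matches zero or more characters and ? matches exactly one.
    return [doc for doc in docs if fnmatchcase(doc, pattern)]


wildcard_hits = wildcard("张三*", NAMES)
print(wildcard_hits)
print(wildcard("*三*", NAMES))
```

Because the pattern is tested against the complete value, 张飞 and 张二丰 no longer match `张三*`, even though they share the character 张.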
fuzzy: error-correcting search
fuzzy
British [ˈfʌzi], American [ˈfʌzi]
adj. covered with fluff; downy; tightly curled; frizzy; (of a shape or sound) blurred, indistinct
Literally, fuzzy means vague.
GET listofhistoricalfigures/_search
{
"query": {
"fuzzy": {
"name.keyword": "张三"
}
}
}
Using fuzzy works like a Baidu search: if you type the name "Deng Ziqi" with a wrong character, you can still find the correctly written "Deng Ziqi". It has a certain error-correcting ability.
Adding .keyword compares the query with the complete value.
The result will be:
张三
张三丰
张飞
张二丰
马三丰
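Error correction of this kind is based on edit distance: the number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below computes the plain Levenshtein distance from 张三 to each name. It illustrates the metric only; transpositions are not modelled, and which distances are accepted depends on the query's fuzziness setting, so it is not meant to reproduce Elasticsearch's exact result set.

```python
# Assumed sample data: the Chinese spellings of the names in the index.
NAMES = ["张三", "张三丰", "张飞", "三德子", "张二丰", "孙权", "马三丰"]


def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute ca with cb, or keep it
                )
            )
        prev = curr
    return prev[-1]


distances = {name: edit_distance("张三", name) for name in NAMES}
for name, dist in distances.items():
    print(name, dist)
```

The smaller the distance, the closer the stored name is to what the user typed; 三德子 is the furthest from 张三 in this data set.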
Conclusion
1. match is a tokenized matching search: the query text is split into tokens, so more matching content is found, and different tokenizers give different results.
2. wildcard search works just like the traditional SQL LIKE. If the data is in ES and you want the traditional "fuzzy query" behavior, use wildcard.
3. fuzzy is an error-correcting search that makes the query tolerant of mistakes in the input.
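The three request bodies used in this article differ only in the query type, the field, and the value. A small sketch that builds them from Python follows. `build_query` is a hypothetical helper, not part of any Elasticsearch client, and it only produces the JSON body without sending a request.

```python
import json

SUPPORTED_KINDS = {"match", "wildcard", "fuzzy"}


def build_query(kind, field, value):
    """Build a search body shaped like the examples in this article."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported query type: {kind}")
    return {"query": {kind: {field: value}}}


bodies = [
    build_query("match", "name", "张三"),              # tokenized matching
    build_query("wildcard", "name.keyword", "张三*"),  # LIKE-style pattern
    build_query("fuzzy", "name.keyword", "张三"),      # error-tolerant lookup
]
for body in bodies:
    print(json.dumps(body, ensure_ascii=False, indent=2))
```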