I ran into a problem online today that got me thinking about the difference between query_string and match.
curl -XGET 'http://localhost:9200/*/offer/_search?pretty' -d '{
  "from" : 0,
  "size" : 10,
  "fields" : ["title"],
  "query" : {
    "query_string" : {
      "query" : "100CrMo7 +圆钢",
      "fields" : ["title"]
    }
  }
}' | grep "100CrMo7"
This example searches for documents whose title contains 100CrMo7 and 圆钢 (round steel); the + prefix marks 圆钢 as required.
curl -XGET 'http://localhost:9200/alias-offer/offer/_search?pretty' -d '{
  "from" : 0,
  "size" : 10,
  "fields" : ["title"],
  "query" : {
    "query_string" : {
      "query" : "100CrMo7 -圆钢",
      "fields" : ["title"]
    }
  }
}' | grep "100CrMo7"
This example searches for documents whose title contains 100CrMo7 but not 圆钢 (round steel); the - prefix excludes the term.
Note: Replacing query_string with simple_query_string gave unsatisfactory results here: one or two documents that should have been excluded were not filtered out.
Now look at the following example:
curl -XGET 'http://localhost:9200/*/offer/_search?pretty' -d '{
  "size" : 10,
  "fields" : ["title"],
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "_all" : {
            "query" : "100CrMo7 -圆钢",
            "type" : "boolean",
            "operator" : "AND",
            "boost" : 1.0
          }
        }
      }
    }
  }
}' | grep "100CrMo7"
Whether the query is "100CrMo7 -圆钢" or "100CrMo7 +圆钢", the results do not change.
The documentation explains why:
The match family of queries does not go through a "query parsing" process. It does not support field name prefixes, wildcard characters, or other "advanced" features.
In other words, the text given to match is never parsed for query syntax. It is still analyzed into terms, but + and - are not treated as operators.
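To make the difference concrete, here is a toy model in Java. It is not Lucene's or Elasticsearch's parser; the class and method names (OperatorSketch, parseLikeQueryString, treatLikeMatch) are invented for illustration. It only shows the idea that a query_string-style parser reads a leading + or - as an operator, while match hands the whole string over as plain text.

```java
import java.util.ArrayList;
import java.util.List;

public class OperatorSketch {

    // Toy model only: split on whitespace and read a leading + or - as an operator,
    // roughly what a query_string-style parser does before analysis.
    static List<String> parseLikeQueryString(String query) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.length() > 1 && token.startsWith("+")) {
                clauses.add("MUST:" + token.substring(1));
            } else if (token.length() > 1 && token.startsWith("-")) {
                clauses.add("MUST_NOT:" + token.substring(1));
            } else {
                clauses.add("SHOULD:" + token);
            }
        }
        return clauses;
    }

    // match-style handling: no operator parsing, the whole string is just text
    // that will later be handed to the analyzer.
    static List<String> treatLikeMatch(String query) {
        List<String> clauses = new ArrayList<>();
        clauses.add("TEXT:" + query.trim());
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(parseLikeQueryString("100CrMo7 -圆钢"));
        System.out.println(treatLikeMatch("100CrMo7 -圆钢"));
    }
}
```

In this model the - only changes the outcome in the first method, which mirrors why swapping + and - made no difference in the match example.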
So if you call QueryParser.escape(keyword.trim()) before a match query and the keyword contains a special character such as "-" (for example 100CrMo7-3), the query may return no results. Whether to call QueryParser.escape(keyword.trim()) therefore depends on the query type: it is meant for queries that go through query parsing, such as query_string, not for match.
Let's take a look at its source code:
public static String escape(String s) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // These characters are part of the query syntax and must be escaped
        if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')' || c == ':'
            || c == '^' || c == '[' || c == ']' || c == '\"' || c == '{' || c == '}' || c == '~'
            || c == '*' || c == '?' || c == '|' || c == '&' || c == '/') {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}
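The empty result comes from the - symbol being escaped: escape() turns 100CrMo7-3 into 100CrMo7\-3, and match, which does no query parsing, receives the backslash as part of the text.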
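One way to handle the two cases is to escape only when the keyword will go through query parsing. The sketch below is a minimal illustration, not a definitive implementation: KeywordEscaping and prepareKeyword are hypothetical names, and escape is a local copy of the rule shown above so that the example runs without Lucene on the classpath. In real code you would call QueryParser.escape directly.

```java
public class KeywordEscaping {

    // Same character set as the escape() source shown above.
    private static final String SPECIAL_CHARS = "\\+-!():^[]\"{}~*?|&/";

    // Local copy of the escaping rule (assumption: kept here only to stay self-contained).
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL_CHARS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Hypothetical helper: escape for parsed queries (query_string),
    // pass the trimmed keyword through unchanged for match.
    static String prepareKeyword(String keyword, boolean usesQueryParser) {
        String trimmed = keyword.trim();
        return usesQueryParser ? escape(trimmed) : trimmed;
    }

    public static void main(String[] args) {
        System.out.println(prepareKeyword(" 100CrMo7-3 ", true));   // prints 100CrMo7\-3
        System.out.println(prepareKeyword(" 100CrMo7-3 ", false));  // prints 100CrMo7-3
    }
}
```

Note that escaping a keyword for query_string also neutralizes any + or - the user typed on purpose, so it fits cases where the keyword should be searched literally.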