A word segmentation query takes two steps: 1. set the mapping on an ES index; 2. run a query against the mapped field. The
premise is that the ES word segmentation plug-in (IK in this post) has been installed first. Reference: http://ludizhang.iteye.com/blog/2323939
1. Set the index mapping
Send an HTTP PUT request to ES,
url: http://ip:port/indexName
postBody (the "--" notes are annotations, remove them before sending):
{
  "mappings": {
    "testBase2": {
      "properties": {
        "field1": {
          "type": "string",
          "index": "analyzed",
          "analyzer": "ik",
          "search_analyzer": "ik",
          "store": "yes"
        },
        "field2": {
          "type": "string",
          "index": "analyzed",
          "analyzer": "ik",
          "search_analyzer": "ik", -- specify the tokenizer
          "store": "yes"
        }
      }
    }
  }
}
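The PUT request above can be sketched in Python roughly as follows. This is only a sketch under assumptions: the helper names build_mapping and put_mapping are hypothetical, the host in the commented call is a placeholder, and the mapping syntax (string type, "index": "analyzed", the "ik" analyzer) is simply the one used in this post, which other ES versions may not accept.

```python
import json
import urllib.request


def build_mapping(type_name, field_names, analyzer="ik"):
    """Build the mapping body from this post for the given field names."""
    field_def = {
        "type": "string",
        "index": "analyzed",
        "analyzer": analyzer,
        "search_analyzer": analyzer,
        "store": "yes",
    }
    # Give every field its own copy of the definition.
    properties = {name: dict(field_def) for name in field_names}
    return {"mappings": {type_name: {"properties": properties}}}


def put_mapping(base_url, index_name, body):
    """Send the body with HTTP PUT to base_url/index_name and return the parsed reply."""
    request = urllib.request.Request(
        url="{}/{}".format(base_url.rstrip("/"), index_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


mapping_body = build_mapping("testBase2", ["field1", "field2"])
print(json.dumps(mapping_body, indent=2))
# To actually send it (hypothetical host and index name):
# put_mapping("http://localhost:9200", "indexName", mapping_body)
```

Building the body as a dict and serializing it with json.dumps avoids hand-written JSON mistakes such as a repeated field name.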
How can you tell whether the setting succeeded? In the screenshot of the index information, the mappings information is in the black box. It is not a mapping inside settings, but a direct mappings node. The page in the screenshot is the ES head plugin (/_plugin/head/).
2. Word segmentation query
Send the query to ES as an HTTP POST request,
url: http://ip:port/indexName/typeName/_search
postBody:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "analyzer": "ik", -- analyzer, based on IK word segmentation
            "default_field": "field1", -- query field
            "query": "China and the US", -- match content
            "boost": 6 -- query weight
          }
        },
        {
          "query_string": {
            "analyzer": "ik",
            "default_field": "field2",
            "query": "China and America",
            "boost": 4
          }
        }
      ]
    }
  },
  "size": 10, -- paging: number of entries per page
  "from": 0 -- paging: start index
}
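As a rough illustration, the search body can also be assembled in Python. The helper names query_string_clause and build_search_body are hypothetical. Each query_string clause is built as its own object in the must array, because a Python dict (like a JSON object parsed by Python's json module) keeps only the last value when a key is repeated.

```python
import json


def query_string_clause(field, text, boost, analyzer="ik"):
    """One query_string clause for a single field, analyzed with IK."""
    return {
        "query_string": {
            "analyzer": analyzer,
            "default_field": field,
            "query": text,
            "boost": boost,
        }
    }


def build_search_body(clauses, size=10, start=0):
    """Wrap the clauses in bool/must and add the paging settings."""
    return {
        "query": {"bool": {"must": list(clauses)}},
        "size": size,   # number of entries per page
        "from": start,  # start index
    }


search_body = build_search_body(
    [
        query_string_clause("field1", "China and the US", 6),
        query_string_clause("field2", "China and America", 4),
    ]
)
print(json.dumps(search_body, indent=2))
```

The printed JSON can be used as the postBody of the _search request.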
Query result
{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4, -- total number of hits
    "max_score": 1,
    "hits": [
      {
        "_index": "index",
        "_type": "testBase2",
        "_id": "3187",
        "_score": 1,
        "_source": {
          "brandName": "test5",
          "classifyId": 23,
          "labelWord": "Label 3",
          "videoName": "Two segments, 9-12,14-16",
          "brandId": 6,
          "videoDesc": "Video Introduction 5",
          "videoId": 3187,
          "classifyName": "Life aa",
          "keyWord": "Key 4"
        }
      },
      ....
    ]
  }
}
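Reading such a result could look like the sketch below. The function name summarize_hits is hypothetical, and the sample is an abbreviated copy of the response above. It assumes hits.total is a plain number, as in this post; ES versions that return it in another shape would need an adjustment.

```python
def summarize_hits(response):
    """Return (total, rows) from a search response shaped like the one above."""
    hits = response["hits"]
    rows = [
        {"id": hit["_id"], "score": hit["_score"], "source": hit.get("_source", {})}
        for hit in hits["hits"]
    ]
    return hits["total"], rows


# Abbreviated copy of the response shown above (one hit, two source fields).
sample_response = {
    "took": 7,
    "timed_out": False,
    "hits": {
        "total": 4,
        "max_score": 1,
        "hits": [
            {
                "_index": "index",
                "_type": "testBase2",
                "_id": "3187",
                "_score": 1,
                "_source": {"videoId": 3187, "videoName": "Two segments, 9-12,14-16"},
            }
        ],
    },
}

total, rows = summarize_hits(sample_response)
print(total, rows[0]["id"], rows[0]["source"]["videoName"])
```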
******************** Update: a simpler way to perform complex queries
{
  "fields": ["videoName", "videoDesc"], -- specify the returned fields
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": ["videoName^9", "videoDesc^1"], -- query fields + weights
            "analyzer": "ik",
            "query": "Solution"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 60
}
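The "field^weight" strings in this simpler query can be generated instead of typed by hand. The sketch below assumes hypothetical helper names (weighted_fields, build_multi_field_body) and reuses the top-level "fields" parameter exactly as this post does.

```python
import json


def weighted_fields(weights):
    """Turn (field, weight) pairs into the "field^weight" strings used by query_string."""
    return ["{}^{}".format(name, weight) for name, weight in weights]


def build_multi_field_body(text, weights, returned_fields, analyzer="ik", start=0, size=60):
    """One query_string over several weighted fields, returning only the chosen fields."""
    return {
        "fields": list(returned_fields),
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "fields": weighted_fields(weights),
                            "analyzer": analyzer,
                            "query": text,
                        }
                    }
                ]
            }
        },
        "from": start,
        "size": size,
    }


body = build_multi_field_body(
    "Solution",
    [("videoName", 9), ("videoDesc", 1)],
    ["videoName", "videoDesc"],
)
print(json.dumps(body, indent=2))
```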
The returned result has a format similar to that of the previous query method. The difference is that the returned data part
contains only the fields we specified:
{
  "took": 38,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2444,
    "max_score": 1,
    "hits": [
      {
        "_index": "index",
        "_type": "testBase2",
        "_id": "1230",
        "_score": 1,
        "fields": {
          "videoName": [ "All the way north" ],
          "videoDesc": [ "Driving a Mercedes-Benz, all the way north. On every road, it's fun to play, if I can change freely, I will run to the end. | Mercedes-Benz" ]
        }
      }
    ]
  }
}
Inside each hit, the _source field has been replaced by a fields field, and each returned value has changed from a plain JSON value to a JSON array nested in the JSON object, which is a little more troublesome to parse. Which form to use depends on personal preference.
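If the array form is inconvenient, it can be flattened after parsing. The sketch below uses a hypothetical helper, flatten_fields, and assumes that a one-element array should become a plain value while longer arrays are kept as lists.

```python
def flatten_fields(hit):
    """Turn {"name": ["value"]} from a hit's "fields" into {"name": "value"}."""
    flat = {}
    for name, values in hit.get("fields", {}).items():
        # Unwrap single-element arrays, keep multi-valued fields as lists.
        flat[name] = values[0] if len(values) == 1 else values
    return flat


# Abbreviated copy of one hit from the response above.
sample_hit = {
    "_index": "index",
    "_type": "testBase2",
    "_id": "1230",
    "_score": 1,
    "fields": {
        "videoName": ["All the way north"],
        "videoDesc": ["Driving a Mercedes-Benz, all the way north."],
    },
}

print(flatten_fields(sample_hit))
```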