Introduction | Notes on the thinking behind running DSL scripts in Kibana in practice
1. Elasticsearch Script History: the history of the script engine in distributed full-text search
Early versions of ES used MVEL scripts; Groovy scripts were then introduced to address MVEL's security risks.
Groovy in turn brought its own security holes and memory leaks, so the Painless script language was officially introduced with ES 5.0, and it has now been in developers' hands for several years.
2. Elasticsearch Script Application Scenarios: where the script engine fits in distributed full-text search
As most readers know, every version series of the Elasticsearch full-text search engine provides a rich DSL syntax for create, read, update and delete operations. The 6.x series, specifically 6.8.6, is used as the example here.
In more than 80% of business scenarios plain create, read, update and delete requests are enough; scripts come into play in the relatively complex scenarios:
Multi-field custom updates, custom reindex, dynamically appending to array fields ...
Of course, it is also possible to develop custom plug-ins on top of the script engine.
As the name suggests, Painless is meant to be "painless" to use and to close the loopholes of its predecessors, but a few points still need attention: do not start ES under the root account, and do not disclose the ES installation path to other users.
Judging from the official introduction to scripting, the first consideration is performance and the second is which business scenarios fit. eBay has also written up its performance optimization practice in English, and a Chinese version is noted here as well.
For the 80%-plus of everyday business scenarios, refer to the following compilation of Elasticsearch + Kibana DSL CRUD requests:
GET _search
{
"query": {
"match_all": {}
}
}
# Node information
GET _cat/nodes?v
# Disk usage and shard allocation per node
GET _cat/allocation?v
# Index information
GET _cat/indices?v
# Shard information
GET _cat/shards?v
# Register a snapshot repository (shared file system repository)
PUT _snapshot/my_backup
{
"type": "fs",
"settings": {
"location": "/home/user/yxd179/es/backup"
}
}
# View repository information
GET /_snapshot/my_backup?pretty
# List the registered snapshot repositories
GET _snapshot
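# Create a snapshot. By default this backs up all open indices into the my_backup repository under the given snapshot name (snapshot_yd here).
# The call normally returns immediately and the snapshot runs in the background.
# To make a script wait until the snapshot is done, add the wait_for_completion flag; the call then blocks until the snapshot completes (a large snapshot can take a long time to return).
# In this request only the index 809iJpOmSI2ZmJrUqKRR0Q is backed up.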
PUT /_snapshot/my_backup/snapshot_yd?wait_for_completion=true
{
"indices": "809iJpOmSI2ZmJrUqKRR0Q",
"ignore_unavailable": true,
"include_global_state": false,
"metadata": {
"taken_by": "phr",
"taken_because": "backup before upgrading"
}
}
# View a snapshot
GET /_snapshot/my_backup/snapshot_yd
# View all snapshots
GET /_snapshot/my_backup/_all
# Delete a snapshot
DELETE /_snapshot/my_backup/snapshot_yd
# Monitor snapshot creation or restore progress
GET /_snapshot/my_backup/snapshot_yd/_status
# Restore a snapshot
POST /_snapshot/my_backup/snapshot_yd/_restore
# Index template with dynamic templates
PUT /_template/yxd179_tpl
{
"index_patterns": [
"yxd179-2021*"
],
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
},
"mappings": {
"yd": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"index": true,
"copy_to": "full_context",
"analyzer": "ik_max_word",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
],
"properties": {
"full_context": {
"type": "text",
"analyzer": "ik_max_word",
"fielddata": true,
"store": true
}
}
}
}
}
# Set the number of replica shards
PUT /yxd179-2021/_settings
{
"number_of_replicas": "1"
}
# Paginated query
GET /yxd179-2021/yd/_search
{
"from": 0,
"size": 30
}
# Get by ID
GET /yxd179-2021/yd/647461503271768064
# bool query DSL
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"regNumber": "20203030651"
}
}
]
}
},
{
"term": {
"status": "1"
}
}
]
}
},
"sort": [
{
"createTime": {
"order": "desc"
}
}
],
"from": 0,
"size": 10
}
# Set the maximum result window (from + size) that ES allows for the index
PUT /yxd179-2021/_settings
{
"index": {
"max_result_window": 13000000
}
}
# Inspect how a field's analyzer tokenizes a text
POST /yxd179-2021/_analyze
{
"field": "regNumber",
"text": "国械标准20203030651号"
}
# Wildcard (fuzzy) match query
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"wildcard": {
"regNumber.keyword": "*20203030651*"
}
}
]
}
},
{
"term": {
"status": "1"
}
}
]
}
},
"sort": [
{
"createTime": {
"order": "desc"
}
}
],
"from": 0,
"size": 10
}
# Query a field with a specific analyzer
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"hdsd0001004": {
"query": "1828551417",
"analyzer": "char_analyzer"
}
}
}
]
}
},
"from": 0,
"size": 30
}
# Wildcard (fuzzy) match query
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"hdsd0001002.keyword": "*yxd179*"
}
}
]
}
},
"from": 0,
"size": 30
}
# Close the index:
POST yxd179-2021/_close
# Open the index:
POST yxd179-2021/_open
# Set an analyzer on a specific field
PUT /yxd179-2021/_mapping/yd
{
"properties": {
"hdsd0001004": {
"type": "text",
"analyzer": "char_analyzer"
}
}
}
# View the mapping
GET yxd179-2021/_mapping
# Configure the custom analyzer
PUT yxd179-2021/_settings
{
"analysis": {
"analyzer": {
"char_analyzer": {
"tokenizer": "char_tokenizer",
"filter": "lowercase"
}
},
"tokenizer": {
"char_tokenizer": {
"type": "pattern",
"pattern": "|"
}
}
}
}
#minimum_should_match
GET /yxd179-2021/yd/_search
{
"query": {
"query_string": {
"query": "182855141y7",
"type": "phrase",
"operator": "AND",
"minimum_should_match": "100%",
"fields": [
"hdsd0001004"
]
}
}
}
# Return only selected fields
GET /yxd179-2021/yd/_search
{
"_source": {
"includes": [
"id",
"productId"
]
},
"query": {
"bool": {
"must": [
{
"terms": {
"productId": [
636654265306419462
]
}
}
]
}
},
"from": 0,
"size": 30
}
# Highlight query
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": []
}
},
{
"term": {
"status": "1"
}
},
{
"term":{
"id":636662671736099971
}
}
]
}
},
"sort": [
{
"id": {
"order": "asc"
}
}
],
"highlight": {
"pre_tags": [
"<span class='title-key'>"
],
"post_tags": [
"</span>"
],
"fields": {
"commonName": {
"type": "plain"
}
}
},
"from": 0,
"size": 10
}
#read_only_allow_delete
PUT /yxd179-2021/_settings
{
"index":{
"blocks":{
"read_only_allow_delete":"false"
}
}
}
# List index templates
GET /_template
GET /yxd179-2021*/yd/_search
{
"from": 0,
"size": 30
}
# bool query on a single field
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"id": "636651493706133509"
}
}
]
}
},
"from": 0,
"size": 30
}
# Bulk indexing
POST /_bulk
{"index":{"_index":"yxd179-2021","_type":"yd","_id":"65965969996688"}}
{"id":"65965969996688","HDSD0001002":"sdff","HDSD0001008":"fsdf","HDSD0001006":"000000000000000000","create_time":"2021-07-29","cancel_flag":0}
{"index":{"_index":"yxd179-2021","_type":"yd","_id":"66049829996688"}}
{"id":"66049829996688","HDSD0001002":"sdgsdg","HDSD0001008":"fsdfsdf","HDSD0001006":"000000000000000000","create_time":"2021-07-29","cancel_flag":1}
# Outer intersection (must) query
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"regNumber": "国sd20182642128"
}
}
]
}
},
{
"term": {
"status": "1"
}
}
]
}
},
"sort": [
{
"createTime": {
"order": "desc"
}
}
],
"from": 0,
"size": 10
}
# Complex bool query with boosts (scoring and sort)
GET /yxd179-2021/yd/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"term": {
"cancelFlag": {
"value": "0",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
{
"bool": {
"should": [
{
"match": {
"yhe": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
},
{
"match": {
"yhr": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
},
{
"match": {
"yht": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
},
{
"match": {
"yhg": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"explain": true,
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
# Profile query timing
GET /yxd179-2021/yd/_search
{
"profile": true,
"query":{
"term":{
"tu":6583120
}
}
}
# Update by ID
POST /yxd179-2021/yd/b00e89b652484b0b8da16e090302e012/_update
{
"doc":{
"fd":"1"
}
}
# Update via _update_by_query with the painless script engine
POST /yxd179-2021/_update_by_query
{
"query":{
"term":{
"fdh":6583120
}
},
"script":{
"lang":"painless",
"source": "ctx._source.cancelFlag=params.cancelFlag;ctx._source.updateTime=params.updateTime",
"params": {
"cancelFlag":"0",
"updateTime":"2021-07-28T01:17:36.000Z"
}
}
}
# Intersection query (AND, keeping all conditions)
GET /yxd179-2021/yd/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"cancelFlag": "0"
}
},
{
"bool": {
"must": [
{
"wildcard": {
"hdsd0001002.keyword": "*yxd179*"
}
},
{
"match": {
"hdsd0001003": "2"
}
}
]
}
}
]
}
},
"sort": [
{
"id": {
"order": "desc"
}
}
],
"highlight": {
"pre_tags": [
"<span class='title-key'>"
],
"post_tags": [
"</span>"
],
"fields": {
"hdsd0001002": {
"type": "plain"
}
}
},
"from": 0,
"size": 30
}
# Outer intersection query with an inner intersection query
GET /yxd179-2021/yd/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"term": {
"cancelFlag": {
"value": "0",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
{
"bool": {
"must": [
{
"match": {
"hdsd0001002": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
},
{
"match": {
"hdsd0001003": {
"query": "2",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"explain": true
}
# Union (should) query
GET /yxd179-2021/yd/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"term": {
"cancelFlag": {
"value": "0",
"boost": 1
}
}
}
],
"should": [
{
"match": {
"hdsd0001002": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"explain": true
}
# Union query returning selected fields
GET /yxd179-2021/yd/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"match": {
"cancelFlag": {
"query": "0",
"operator": "AND",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"should": [
{
"match": {
"hdsd0001002": {
"query": "张",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
},
{
"match": {
"hdsd0001002.pinyin": {
"query": "zhang",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"explain": true,
"_source": {
"includes": [
"id",
"th001Id",
"createTime",
"updateTime",
"hdsd0001001",
"hdsd0001002",
"cancelFlag"
],
"excludes": []
}
}
# If changes need to become visible more often, force a refresh through the ES API
GET /yxd179-2021/_refresh
# Delete by ID
DELETE /yxd179-2021/yd/ud6-5XkBwVbB7HKjg5k0
# Delete the index
DELETE /yxd179-2021
# Delete the template (dynamic mapping)
DELETE /_template/yxd179_tpl
# Sort
GET /yxd179-2021/yd/_search
{
"sort": [
{
"createTime": {
"order": "desc"
}
}
],
"from": 0,
"size": 30
}
3. Elasticsearch Script in Practice: using the script engine in distributed full-text search
Only Update-By-Query is taken as an example here:
In that request, lang specifies the script engine (painless), source is the script body, and params holds the script's parameter values.
Passing values through params keeps requests from running into ES's limit on script compilations, although the compilation limit itself can also be raised with the following setting:
PUT /_cluster/settings
{
"transient": {
"script.max_compilations_rate": "40/1m"
}
}
Important: ES compiles and parses every distinct script separately. In a large bulk update, if each script has to be compiled on the fly, the compilation limit is reached very quickly. To know not only what happens but also why: ES compiles a script only the first time it sees it and does not need to compile it again afterwards. When changing values are hard-coded as constants in the script, however, each variant counts as a new script and ES compiles it on the fly. So pass the variable values through params wherever possible; this solves the script compilation limit at its root.
Next, consider how to build a TCP (transport) client in Java against version 6.8.6 to execute a painless script.
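The caching behaviour described above can be illustrated with a toy model. The sketch below is not how ES is implemented; it assumes a hypothetical `ScriptCache` class that, like the description, compiles a script only when its source text has not been seen before. Python is used only to keep the example self-contained, and Python's built-in `compile` stands in for the Painless compiler.

```python
class ScriptCache:
    """Toy stand-in for a script compilation cache keyed by script source."""

    def __init__(self):
        self._compiled = {}
        self.compilations = 0

    def run(self, source, doc, params):
        # Compile only when this exact source text has not been seen before.
        if source not in self._compiled:
            self.compilations += 1
            self._compiled[source] = compile(source, "<script>", "exec")
        exec(self._compiled[source], {}, {"doc": doc, "params": params})


update_times = [f"2021-07-28T01:17:{s:02d}.000Z" for s in range(5)]

# Variant A: the value is hard-coded, so every request carries a new source text.
inline = ScriptCache()
for t in update_times:
    inline.run(f"doc['updateTime'] = '{t}'", {}, {})

# Variant B: the source text is constant and the value travels in params.
parameterized = ScriptCache()
docs = []
for t in update_times:
    doc = {}
    parameterized.run("doc['updateTime'] = params['updateTime']", doc, {"updateTime": t})
    docs.append(doc)

print(inline.compilations, parameterized.compilations)
```

In this model the hard-coded variant compiles once per request while the parameterized variant compiles once in total, which is the reason the params approach stays under the compilation limit.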
Note: A call to the updateByQuery API starts by taking a snapshot of the index, and indexes any documents it finds using internal versioning.
A version conflict occurs when a document changes between the time of the snapshot and the time the index request is processed. When the versions match, updateByQuery updates the document and increments its version number. To prevent updateByQuery from being aborted by version conflicts, abortOnVersionConflict(false) can be set.
The reason for doing so is that the request may be trying to pick up an online mapping change, and a version conflict means the conflicting document was updated between the start of updateByQuery and the moment it tried to update that document. This is fine, because that other update will already have picked up the online mapping change. updateByQuery can also use ingest nodes by specifying a pipeline. In addition, the UpdateByQueryRequestBuilder API supports filtering the documents to update, limiting the total number of documents to update, updating documents with a script, refreshing immediately after the update, setting the number of retries, and so on.
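The snapshot-then-update flow and the effect of not aborting on version conflicts can be sketched as a small simulation. This is a simplified model under stated assumptions, not the ES implementation: the `update_by_query` function, its `abort_on_version_conflict` argument, and the in-memory `index` dictionary are all hypothetical names chosen for illustration.

```python
class VersionConflict(Exception):
    """Raised when a document changed after the snapshot was taken."""


def update_by_query(index, snapshot, script, abort_on_version_conflict=True):
    """Apply script to every document whose version still matches the snapshot."""
    updated = 0
    conflicts = 0
    for doc_id, seen_version in snapshot.items():
        doc = index[doc_id]
        if doc["_version"] != seen_version:
            if abort_on_version_conflict:
                raise VersionConflict(doc_id)
            conflicts += 1  # count the conflict and move on
            continue
        script(doc["_source"])
        doc["_version"] += 1
        updated += 1
    return {"updated": updated, "version_conflicts": conflicts}


index = {
    "a": {"_version": 1, "_source": {"cancelFlag": "1"}},
    "b": {"_version": 1, "_source": {"cancelFlag": "1"}},
}
# The snapshot records the version of each document at the start of the call.
snapshot = {doc_id: doc["_version"] for doc_id, doc in index.items()}

# Another writer touches document "b" after the snapshot was taken.
index["b"]["_source"]["fd"] = "1"
index["b"]["_version"] += 1


def set_cancel_flag(source):
    source["cancelFlag"] = "0"


result = update_by_query(index, snapshot, set_cancel_flag, abort_on_version_conflict=False)
print(result)
```

With the abort switched off, document "a" is updated and document "b" is only counted as a conflict; with the default setting the same call would stop at the first conflicting document.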
Retry:
Suppose clients A and B fetch the same document at almost the same time and both receive its _version information, with _version=1 at that moment.
Client A then modifies part of the document and writes the change to the index. On the write, Elasticsearch compares the version submitted by client A (still 1) with the version of the existing document (also 1), finds them equal, performs the write, and sets _version=2.
Client B also modifies part of the document, but its write reaches the index slightly later. During this write, ES finds that the version submitted by client B is 1 while the existing document is at version 2, so a conflict occurs and this partial update fails and has to be retried.
Concurrency control strategy: partial updates are protected by optimistic locking.
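The client A / client B scenario above can be replayed in a few lines. The sketch is an in-memory model, not ES code: `Store`, `partial_update` and the `before_write` hook are hypothetical helpers, and the `retry_on_conflict` name is borrowed from the ES update API only as a label for the retry count.

```python
class VersionConflictError(Exception):
    """Raised when the submitted version differs from the stored version."""


class Store:
    """A single document with a version counter, checked on every write."""

    def __init__(self, source):
        self.source = dict(source)
        self.version = 1

    def get(self):
        return self.version, dict(self.source)

    def write(self, expected_version, new_source):
        if expected_version != self.version:
            raise VersionConflictError(
                f"expected {expected_version}, current {self.version}"
            )
        self.source = dict(new_source)
        self.version += 1


def partial_update(store, changes, retry_on_conflict=3, before_write=None):
    """Read, merge, write; on a conflict re-read and try again."""
    for attempt in range(retry_on_conflict + 1):
        version, source = store.get()
        if before_write is not None and attempt == 0:
            before_write()  # simulate another client writing in between
        source.update(changes)
        try:
            store.write(version, source)
            return attempt  # number of retries that were needed
        except VersionConflictError:
            continue
    raise VersionConflictError("retries exhausted")


store = Store({"cancelFlag": "1"})


def client_a():
    # Client A writes while client B sits between its read and its write.
    partial_update(store, {"fd": "1"})


retries_b = partial_update(store, {"cancelFlag": "0"}, before_write=client_a)
print(retries_b, store.version, store.source)
```

Client B's first write is rejected because it still carries version 1; after re-reading the document at version 2 its change is merged on top of client A's, so neither update is lost.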
A small test case: how can several fields be updated at once through the script engine?
Method No.1:
ctx._source.putAll(params)
Method No.2:
for (k in params.keySet()){if (!k.equals('ctx')){ctx._source.put(k, params.get(k))}}
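The difference between the two methods is easiest to see on plain dictionaries. The sketch below mirrors both scripts in Python; `method_1` and `method_2` are hypothetical names, and the `ctx` entry in `params` is added by hand for illustration. Whether params actually carries such an entry depends on the script context and ES version, which is presumably why Method No.2 filters it out.

```python
def method_1(source, params):
    # Counterpart of ctx._source.putAll(params): copy every entry.
    source.update(params)


def method_2(source, params):
    # Counterpart of the loop: copy every entry except the key 'ctx'.
    for k in params.keys():
        if k != "ctx":
            source[k] = params[k]


params = {
    "cancelFlag": "0",
    "updateTime": "2021-07-28T01:17:36.000Z",
    "ctx": {"op": "index"},  # hand-made entry, assumed for illustration only
}

doc_1 = {"id": "1", "cancelFlag": "1"}
doc_2 = {"id": "1", "cancelFlag": "1"}
method_1(doc_1, params)
method_2(doc_2, params)

print(sorted(doc_1.keys()))
print(sorted(doc_2.keys()))
```

Both methods overwrite cancelFlag and add updateTime; only the first one also copies the stray ctx entry into the document, so the loop form is the safer choice when params may contain keys that are not document fields.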