Several ways to update data in Elasticsearch


As a mature framework, Elasticsearch (ES) provides rich APIs for manipulating data. In this article, we will look at several ways to update data in ES.



(1) Update the document

(1) Partial update:

java api:
````
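        // Assumes an existing `client`. Fields to merge into the stored document.
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","woshigcs");
        data.put("age",25);
        // Arguments are index, type, id; the document must already exist.
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-21", "active", "18");
        urb.setDoc(data);
        urb.execute().actionGet();

        System.out.println("update ok......");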
````


Note that a partial update requires both the index and the document to exist already. If either one is missing, the update fails and the corresponding exception is thrown.

curl:
````
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "doc" : {
        "name" : "new_name"
    }
}'
````





(2) Use detect_noop

java api:
````
        // Assumes an existing `client`. Fields to merge into the stored document.
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","woshigcs");
        data.put("age",25);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-21", "active", "18");
        urb.setDoc(data);
        urb.setDetectNoop(false);//The default is true
        urb.execute().actionGet();

        System.out.println("update ok......");
````

curl method:

````
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "doc" : {
        "name" : "new_name"
    },
    "detect_noop": false
}'

````
Note the meaning of detect_noop, which defaults to true.

By default, the document is reindexed only if the new source differs from the original source in some field. If they are identical, it is not reindexed. With detect_noop=false, the document is reindexed whether or not the content changed. You can observe this through the change in the version value.

The document being updated must exist in advance. Unless you update with the script + upsert method, a document missing exception will be reported.
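
The effect of detect_noop on the version can be pictured with a small in-memory model. This is only a sketch in plain Java, not Elasticsearch code: the `NoopModel` class, its `index` and `update` methods, and the use of `IllegalStateException` in place of the document missing exception are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a partial update; not Elasticsearch code.
class NoopModel {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    // Plain index operation: stores the document and starts at version 1.
    void index(String id, Map<String, Object> source) {
        docs.put(id, new HashMap<>(source));
        versions.put(id, 1L);
    }

    // Partial update: merges `doc` into the stored source and returns the version.
    long update(String id, Map<String, Object> doc, boolean detectNoop) {
        Map<String, Object> current = docs.get(id);
        if (current == null) {
            // Stands in for the document missing exception.
            throw new IllegalStateException("document missing: " + id);
        }
        Map<String, Object> merged = new HashMap<>(current);
        merged.putAll(doc);
        if (detectNoop && merged.equals(current)) {
            return versions.get(id); // nothing changed, version stays the same
        }
        docs.put(id, merged);
        return versions.merge(id, 1L, Long::sum); // rewritten, version increases
    }

    public static void main(String[] args) {
        NoopModel model = new NoopModel();
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "new_name");
        model.index("1", doc);
        System.out.println(model.update("1", doc, true));  // identical content, version unchanged
        System.out.println(model.update("1", doc, false)); // identical content, version bumped anyway
    }
}
```

Sending the same content twice leaves the version alone when the noop check is on and increases it when the check is off, which is the behavior described above.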


(2) The script + upsert update method:


java api
````

        // Assumes an existing `client`; the script replaces _source with the `source` parameter.
        HashMap<String,Object> params=new HashMap<>();
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","12345");
        params.put("source",data);

        String scriptSource = "ctx._source=source";
        Script script = new Script(scriptSource, ScriptService.ScriptType.INLINE, "groovy", params);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setScript(script);
        urb.setUpsert(data); // inserted as the new document if id 16 does not exist
        urb.execute().actionGet();
        System.out.println("Update finished...");

````

curl
````
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "script" : {
        "inline": "ctx._source.counter += count",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 1
    }
}'
````



(3) The usage of scripted_upsert:

The example on the official website did not work for me. The following is rewritten from an example on Stack Overflow and runs successfully in Postman.

java api:

````
        // Assumes an existing `client`. `data` is used on create, `newdata` on update.
        HashMap<String,Object> params=new HashMap<>();
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","12345");

        HashMap<String,Object> newdata=new HashMap<>();
        newdata.put("name","789");

        params.put("data",data);
        params.put("newdata",newdata);

        String scriptSource = "if (ctx.op == \"create\") ctx._source=data; else ctx._source=newdata";
        Script script = new Script(scriptSource, ScriptService.ScriptType.INLINE, "groovy", params);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setScript(script);
        urb.setScriptedUpsert(true);
        urb.setUpsert("{}");//This value must be present, otherwise a document missing exception will be reported
        urb.execute().actionGet();
        System.out.println("Update finished...");
````



curl method (run in Postman here). First, the URL of the POST request:

````
http://192.168.201.5:9200/active2018-03-11/active/11/_update
````


Then, for the request body, select raw with type JSON (application/json) and enter the following:
````
{
    "scripted_upsert":true,
    "script" : {
        "script":"if (ctx.op == \"create\") ctx._source=data; else ctx._source=newdata ",
        "params" : {
            "data":{
                "ct":11,
                "aid":"a22",
                "tid":"t11"
            },
            "newdata":{
                "ct":1000,
                "aid":"a2qq2",
                "tid":"qq"
            }
        }
        
    },
    "upsert" : {}
}
````
When the request above is executed, ES first checks whether the index exists and creates it if it does not. It then checks whether a document with id 11 exists. If not, the content of `data` is used as the first inserted document. If it already exists, the original content is replaced by `newdata`, which can be understood as an update.

Note that if you rely on dynamic mapping, you need to pay attention to data types. Under dynamic mapping nothing forces the same field in `data` and `newdata` to have the same type, which is flexible but also risky. For strictly typed data, static mapping is therefore recommended, so that field types are strictly constrained.
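
The decision flow described above, and how it differs from a plain script + upsert, can be sketched with an in-memory model. This is an illustration in plain Java under stated assumptions, not Elasticsearch code: the `UpsertModel` class and its `update` method are hypothetical, and passing `"index"` as the operation for an existing document is this sketch's own choice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical in-memory model of script + upsert; not Elasticsearch code.
class UpsertModel {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // `script` receives the operation name and the current source and
    // returns the new source, playing the role of the Groovy script above.
    Map<String, Object> update(String id,
                               BiFunction<String, Map<String, Object>, Map<String, Object>> script,
                               Map<String, Object> upsert,
                               boolean scriptedUpsert) {
        Map<String, Object> current = docs.get(id);
        Map<String, Object> result;
        if (current != null) {
            result = script.apply("index", current);                // exists: run the script
        } else if (scriptedUpsert) {
            result = script.apply("create", new HashMap<>(upsert)); // missing: script runs on the upsert doc
        } else {
            result = new HashMap<>(upsert);                         // missing: upsert doc stored as is
        }
        docs.put(id, result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("ct", 11);
        Map<String, Object> newdata = new HashMap<>();
        newdata.put("ct", 1000);

        UpsertModel model = new UpsertModel();
        BiFunction<String, Map<String, Object>, Map<String, Object>> script =
                (op, source) -> "create".equals(op) ? data : newdata;

        System.out.println(model.update("11", script, new HashMap<>(), true)); // first call stores data
        System.out.println(model.update("11", script, new HashMap<>(), true)); // second call stores newdata
    }
}
```

With `scriptedUpsert` set to false, the first call would store the empty upsert document instead, because the script is not consulted for a missing document.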


(4) doc_as_upsert method:

This method is essentially a concise version of the previous two: if the document does not exist it is inserted, and if it does exist it is overwritten. Note that overwriting here means the given fields are written over the existing document, not that the original is deleted and a new one inserted. Under dynamic mapping you can even send a field with a different type, but this is not recommended.

java api:

````
        // Assumes an existing `client`. The doc doubles as the upsert document.
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","234");
        data.put("age",123);
        data.put("address","Beijing Haidian District");
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setDoc(data);
        urb.setDocAsUpsert(true);

        urb.execute().actionGet();
        System.out.println("The operation succeeded...");
````


curl method:

````
curl -XPOST 'http://192.168.201.5:9200/active2018-03-11/active/12/_update' -d '{
    "doc" : {
        "name" : "6755",
        "age":12,
        "address":"Beijing Chaoyang"
    },
    "doc_as_upsert" : true
}'
````
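
The insert-or-overwrite behavior of doc_as_upsert can be pictured with a short in-memory model. This is a sketch in plain Java, not Elasticsearch code, and the `DocAsUpsertModel` class and its `update` method are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of doc_as_upsert; not Elasticsearch code.
class DocAsUpsertModel {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Inserts `doc` when the id is new, otherwise writes its fields over the stored source.
    Map<String, Object> update(String id, Map<String, Object> doc) {
        Map<String, Object> stored = docs.computeIfAbsent(id, k -> new HashMap<>());
        stored.putAll(doc); // fields in doc win, other stored fields are kept
        return new HashMap<>(stored);
    }

    public static void main(String[] args) {
        DocAsUpsertModel model = new DocAsUpsertModel();

        Map<String, Object> first = new HashMap<>();
        first.put("name", "6755");
        first.put("age", 12);
        System.out.println(model.update("12", first));  // id 12 is new, so the doc is inserted

        Map<String, Object> second = new HashMap<>();
        second.put("age", 13);
        System.out.println(model.update("12", second)); // age is overwritten, name is kept
    }
}
```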



Summary:


The above are several ways to update data in ES. Generally speaking, updating with a script is the most powerful method and can handle complex business scenarios, such as accumulating numeric values or adding and removing elements of a collection field. The other methods are suited to simple update operations.

No matter which update method is used, concurrency must be considered. As earlier articles in this series explained, updates and deletes in ES are pseudo operations, updates in particular. The actual processing flow of an update in ES is:

(1) Query the old document data

(2) Apply the modification to get the latest data

(3) Reindex the entire document


If another process modifies the same document during these three stages, a conflict occurs. ES uses the version field to detect it: step (1) reads the old document together with its version, and step (3) sends that version back with the write. If the version no longer matches at that point, the write conflicts and an exception is thrown.

So in practice, first try to avoid concurrent writers by design. If that is unavoidable, use the version field that ES provides to control concurrency through optimistic locking. If the operation is a simple increment or decrement, the simpler approach of retrying on conflict is enough. In short, each scenario needs its own analysis. I may write a separate article on ES concurrency later if time allows.
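
The version check and the retry-on-conflict idea can be sketched in plain Java. This is an in-memory model under stated assumptions, not Elasticsearch code: `VersionedCounter`, `Snapshot`, `write`, and `updateWithRetry` are hypothetical names. In the Java API used in this article, `UpdateRequestBuilder.setRetryOnConflict(int)` asks ES to perform a comparable retry itself.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical in-memory model of version-based optimistic locking; not Elasticsearch code.
class VersionedCounter {
    private int value = 0;
    private long version = 1;

    // What a read returns: the value together with the version it was read at.
    static final class Snapshot {
        final int value;
        final long version;
        Snapshot(int value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    synchronized Snapshot read() {
        return new Snapshot(value, version);
    }

    // Writes only if the caller's version is still current; returns false on conflict.
    synchronized boolean write(int newValue, long expectedVersion) {
        if (expectedVersion != version) {
            return false; // someone else wrote in between
        }
        value = newValue;
        version++;
        return true;
    }

    // Read, modify, write; on conflict start over, up to maxRetries extra attempts.
    boolean updateWithRetry(IntUnaryOperator change, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Snapshot snapshot = read();
            if (write(change.applyAsInt(snapshot.value), snapshot.version)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        VersionedCounter counter = new VersionedCounter();
        Snapshot stale = counter.read();
        counter.write(5, stale.version);                            // first writer wins
        System.out.println(counter.write(6, stale.version));        // stale version, write is rejected
        System.out.println(counter.updateWithRetry(v -> v + 1, 3)); // re-reads, then succeeds
        System.out.println(counter.read().value);
    }
}
```

Retrying is safe here because the increment is recomputed from a fresh read on every attempt, which is why it suits simple accumulation.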


If you have any questions, follow the WeChat public account "I am the siege division" (woshigcs) and leave a message there. Neither technical debt nor health debt should be left unpaid. On the road of seeking the Tao, I walk with you.



