Several ways to update data in Elasticsearch

As a mature framework, Elasticsearch (es) provides a rich set of APIs for manipulating data. This article walks through several ways to update data in es.

(1) Update documents

(1.1) Partial update:

java api:

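        // client is assumed to be an already-connected Elasticsearch client
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","woshigcs");
        data.put("age",25);
        // arguments: index, type, document id
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-21", "active", "18");
        urb.setDoc(data); // only these fields are merged into the existing document
        urb.execute().actionGet();

        System.out.println("update ok......");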

Note that a partial update requires both the index and the document to exist already. If either one is missing, the update fails with the corresponding exception.

curl:

curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "doc" : {
        "name" : "new_name"
    }
}'
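
To make the merge behaviour concrete, here is a minimal sketch in plain Java. It assumes a hypothetical in-memory map in place of a real index; the class PartialUpdateSketch and its methods are illustrative names, not Elasticsearch APIs. It models only the two rules described above: fields in doc are merged into the stored source, and a missing document raises an error. Only top-level fields are merged here, whereas the update API also merges nested objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an index; not an Elasticsearch API.
class PartialUpdateSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Store a full document under the given id.
    void index(String id, Map<String, Object> source) {
        docs.put(id, new HashMap<>(source));
    }

    // Merge the partial doc into the stored source; fail if the document is missing.
    Map<String, Object> update(String id, Map<String, Object> partialDoc) {
        Map<String, Object> source = docs.get(id);
        if (source == null) {
            throw new IllegalStateException("document missing: " + id);
        }
        source.putAll(partialDoc); // top-level merge only (assumption of this sketch)
        return source;
    }

    public static void main(String[] args) {
        PartialUpdateSketch store = new PartialUpdateSketch();
        Map<String, Object> original = new HashMap<>();
        original.put("name", "old_name");
        original.put("age", 25);
        store.index("1", original);

        Map<String, Object> partial = new HashMap<>();
        partial.put("name", "new_name");
        Map<String, Object> result = store.update("1", partial);
        System.out.println(result.get("name")); // new_name
        System.out.println(result.get("age"));  // 25 (untouched field is kept)
    }
}
```

Calling update with an id that was never indexed throws, which mirrors the document missing exception mentioned above.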

(1.2) Use detect_noop

java api:

        HashMap<String,Object> data=new HashMap<>();
        data.put("name","woshigcs");
        data.put("age",25);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-21", "active", "18");
        urb.setDoc(data);
        urb.setDetectNoop(false); // the default is true
        urb.execute().actionGet();

        System.out.println("update ok......");

curl method:

curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "doc" : {
        "name" : "new_name"
    },
    "detect_noop": false
}'

Note the meaning of detect_noop:

By default, detect_noop=true.

With the default, the document is reindexed only if the merged result differs from the existing source; if they are exactly the same, the update is skipped as a no-op. With detect_noop=false, the document is reindexed regardless of whether the content has changed. You can observe the difference through the _version value, which increases only when the document is actually rewritten.
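
The effect on the version can be illustrated with a small plain-Java sketch. It is a simulation under stated assumptions, not how Elasticsearch implements the check: DetectNoopSketch is a hypothetical class that holds a single document and a version counter starting at 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-document model; not an Elasticsearch API.
class DetectNoopSketch {
    private Map<String, Object> source = new HashMap<>();
    private long version = 1;

    DetectNoopSketch(Map<String, Object> initial) {
        source.putAll(initial);
    }

    // Applies a partial update and returns the document version afterwards.
    long update(Map<String, Object> partialDoc, boolean detectNoop) {
        Map<String, Object> merged = new HashMap<>(source);
        merged.putAll(partialDoc);
        if (detectNoop && merged.equals(source)) {
            return version; // nothing changed: skip the write, version stays the same
        }
        source = merged;
        return ++version;   // document rewritten: version increases
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "new_name");
        DetectNoopSketch store = new DetectNoopSketch(doc);

        System.out.println(store.update(doc, true));  // 1 (identical content, no-op)
        System.out.println(store.update(doc, false)); // 2 (rewritten even though unchanged)
    }
}
```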

The document being updated must already exist, unless the update is combined with an upsert (see the script + upsert method below); otherwise a document missing exception is thrown.

(2) Script + upsert update method:

java api


        HashMap<String,Object> params=new HashMap<>();
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","12345");
        params.put("source",data); // exposed to the script as the variable "source"

        // Groovy script: replace the whole _source with the "source" parameter
        String scriptSource = "ctx._source=source";
        Script script = new Script(scriptSource, ScriptService.ScriptType.INLINE, "groovy", params);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setScript(script);
        urb.setUpsert(data); // inserted as-is when the document does not exist yet
        urb.execute().actionGet();
        System.out.println("update finished......");

curl

curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
    "script" : {
        "inline": "ctx._source.counter += count",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 1
    }
}'
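
The behaviour of the request above can be sketched in plain Java as follows. This is a simulation under assumptions, with a hypothetical ScriptUpsertSketch class and a Java lambda standing in for the script: when the document is missing, the upsert body is stored as-is and the script does not run; when it exists, the script is applied to the stored source.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for an index; not an Elasticsearch API.
class ScriptUpsertSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    Map<String, Object> update(String id,
                               Consumer<Map<String, Object>> script,
                               Map<String, Object> upsert) {
        Map<String, Object> source = docs.get(id);
        if (source == null) {
            source = new HashMap<>(upsert); // missing: insert the upsert body, skip the script
            docs.put(id, source);
        } else {
            script.accept(source);          // present: run the script against _source
        }
        return source;
    }

    public static void main(String[] args) {
        ScriptUpsertSketch store = new ScriptUpsertSketch();
        Map<String, Object> upsert = new HashMap<>();
        upsert.put("counter", 1);
        // Stand-in for "ctx._source.counter += count" with count = 4
        Consumer<Map<String, Object>> addFour =
                src -> src.put("counter", (Integer) src.get("counter") + 4);

        System.out.println(store.update("1", addFour, upsert).get("counter")); // 1
        System.out.println(store.update("1", addFour, upsert).get("counter")); // 5
    }
}
```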

(3) scripted_upsert usage:

The examples on the official website did not work for me. The version below is adapted from an example on Stack Overflow and has been verified in Postman.

java api:

        HashMap<String,Object> params=new HashMap<>();
        HashMap<String,Object> data=new HashMap<>();
        data.put("name","12345");

        HashMap<String,Object> newdata=new HashMap<>();
        newdata.put("name","789");

        params.put("data",data);
        params.put("newdata",newdata);

        // ctx.op is "create" when the document does not exist yet
        String scriptSource = "if (ctx.op == \"create\") ctx._source=data; else ctx._source=newdata";
        Script script = new Script(scriptSource, ScriptService.ScriptType.INLINE, "groovy", params);
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setScript(script);
        urb.setScriptedUpsert(true);
        urb.setUpsert("{}"); // this value is required, otherwise a document missing exception is thrown
        urb.execute().actionGet();
        System.out.println("update finished......");

curl method (sent as a POST request from Postman). First, the URL of the POST request:

http://192.168.201.5:9200/active2018-03-11/active/11/_update

Then select raw as the body type with JSON (application/json), and use the following body:

{
    "scripted_upsert":true,
    "script" : {
        "script":"if (ctx.op == \"create\") ctx._source=data; else ctx._source=newdata ",
        "params" : {
            "data":{
                "ct":11,
                "aid":"a22",
                "tid":"t11"
            },
            "newdata":{
                "ct":1000,
                "aid":"a2qq2",
                "tid":"qq"
            }
        }
        
    },
    "upsert" : {}
}

When the request above is executed, es first checks whether the index exists and creates it if it does not. It then checks whether a document with id 11 exists. If it does not, the contents of data are inserted as the initial document; if it does, the existing source is replaced with the contents of newdata, which can be understood as an update.

Note that with dynamic mapping you need to pay attention to field types. Dynamic mapping does not stop data and newdata from carrying different types for the same field, which is flexible but also risky. For strictly typed data, static mapping is recommended so that field types are explicitly constrained.
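
The difference from the plain script + upsert method is that the script runs in both cases. A minimal plain-Java simulation, assuming a hypothetical ScriptedUpsertSketch class and a lambda in place of the Groovy script, might look like this. The op value passed for existing documents ("index") is an assumption of this sketch; the article's script only tests for "create".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical in-memory stand-in for an index; not an Elasticsearch API.
class ScriptedUpsertSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // The script receives (op, currentSource) and returns the new _source.
    Map<String, Object> update(String id,
                               BiFunction<String, Map<String, Object>, Map<String, Object>> script,
                               Map<String, Object> upsert) {
        Map<String, Object> existing = docs.get(id);
        String op = (existing == null) ? "create" : "index"; // "index" is assumed here
        Map<String, Object> start = (existing == null) ? new HashMap<>(upsert) : existing;
        Map<String, Object> result = script.apply(op, start); // script runs either way
        docs.put(id, result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("ct", 11);
        Map<String, Object> newdata = new HashMap<>();
        newdata.put("ct", 1000);

        // Stand-in for: if (ctx.op == "create") ctx._source=data; else ctx._source=newdata
        BiFunction<String, Map<String, Object>, Map<String, Object>> script =
                (op, src) -> "create".equals(op) ? new HashMap<>(data) : new HashMap<>(newdata);

        ScriptedUpsertSketch store = new ScriptedUpsertSketch();
        Map<String, Object> emptyUpsert = new HashMap<>();
        System.out.println(store.update("11", script, emptyUpsert).get("ct")); // 11
        System.out.println(store.update("11", script, emptyUpsert).get("ct")); // 1000
    }
}
```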

(4) doc_as_upsert method:

This method is effectively a concise version of the previous two: if the document does not exist, doc is inserted; if it exists, the fields in doc overwrite the existing values. Note that this is an overwrite of the supplied fields rather than a delete-then-insert of the whole document. With dynamic mapping, a field can even be sent with a different type, but this is not recommended.

java api:

        HashMap<String,Object> data=new HashMap<>();
        data.put("name","234");
        data.put("age",123);
        data.put("address","北京海淀区");
        UpdateRequestBuilder urb= client.prepareUpdate("active2018-03-11", "active", "16");
        urb.setDoc(data);
        urb.setDocAsUpsert(true); // use doc itself as the upsert body when the document is missing

        urb.execute().actionGet();
        System.out.println("operation succeeded......");

curl method:

http://192.168.201.5:9200/active2018-03-11/active/12/_update
{
    "doc" : {
        "name" : "6755",
        "age":12,
        "address":"北京朝阳"
        
    },
    "doc_as_upsert" : true
}
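
As a rough illustration of the insert-or-overwrite rule, the plain-Java sketch below uses a hypothetical DocAsUpsertSketch class (not an Elasticsearch API). With the flag set, a missing document is created from doc; an existing one keeps the fields that doc does not mention.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an index; not an Elasticsearch API.
class DocAsUpsertSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    Map<String, Object> update(String id, Map<String, Object> doc, boolean docAsUpsert) {
        Map<String, Object> source = docs.get(id);
        if (source == null) {
            if (!docAsUpsert) {
                throw new IllegalStateException("document missing: " + id);
            }
            source = new HashMap<>(doc); // missing: doc becomes the new document
            docs.put(id, source);
        } else {
            source.putAll(doc);          // present: overwrite only the supplied fields
        }
        return source;
    }

    public static void main(String[] args) {
        DocAsUpsertSketch store = new DocAsUpsertSketch();
        Map<String, Object> first = new HashMap<>();
        first.put("name", "6755");
        first.put("age", 12);
        store.update("12", first, true);

        Map<String, Object> second = new HashMap<>();
        second.put("age", 13);
        Map<String, Object> result = store.update("12", second, true);
        System.out.println(result.get("name")); // 6755 (kept, not deleted)
        System.out.println(result.get("age"));  // 13
    }
}
```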

Summary:

The sections above cover several ways to update data in es. Generally speaking, updating with a script is the most powerful: it can handle complex business scenarios, such as accumulating numeric values or adding and removing elements of a collection field. The other methods are suitable for simple update operations.

No matter which update method is used, we need to consider concurrency. As earlier articles in this series explained, updates and deletions in es are pseudo operations. An update in particular is actually processed as follows:

(1) Query the old document data

(2) Modify to the latest data

(3) Then rebuild the entire document

If another process modifies the same document at any point during these three stages, a conflict occurs. es uses the version field to detect it: the version is read together with the old data in step (1) and checked again when writing in step (3). If the version no longer matches, the write is rejected and an exception is thrown.

It is therefore best to avoid updating the same document from multiple threads. If that is unavoidable, use the version field provided by es to control concurrency through optimistic locking. If the operation is a simple increment or decrement, a simpler approach is to retry on conflict. In short, each scenario needs its own analysis. I may write a separate article on es concurrency issues later if time permits.
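
The read, modify, conditional-write cycle and the retry-on-conflict idea can be sketched in plain Java as below. This is an in-memory simulation under assumptions, not Elasticsearch code: OptimisticRetrySketch, Versioned, writeIfVersion and updateWithRetry are hypothetical names. In es itself, the update API offers a retry_on_conflict parameter for the same purpose.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.IntUnaryOperator;

// Hypothetical in-memory model of one versioned document; not an Elasticsearch API.
class OptimisticRetrySketch {
    // Immutable snapshot: the document's counter field plus its version.
    static final class Versioned {
        final int counter;
        final long version;
        Versioned(int counter, long version) {
            this.counter = counter;
            this.version = version;
        }
    }

    private final AtomicReference<Versioned> doc =
            new AtomicReference<>(new Versioned(0, 1));

    Versioned get() {
        return doc.get();
    }

    // The write is accepted only if nobody changed the document
    // since the caller read expectedVersion.
    boolean writeIfVersion(long expectedVersion, int newCounter) {
        Versioned current = doc.get();
        if (current.version != expectedVersion) {
            return false; // version conflict
        }
        return doc.compareAndSet(current, new Versioned(newCounter, expectedVersion + 1));
    }

    // Read, modify, write; on conflict start over, up to maxRetries extra attempts.
    boolean updateWithRetry(IntUnaryOperator change, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Versioned seen = doc.get();                    // step (1): read old data and version
            int updated = change.applyAsInt(seen.counter); // step (2): modify
            if (writeIfVersion(seen.version, updated)) {   // step (3): conditional write
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticRetrySketch store = new OptimisticRetrySketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    // keep retrying until this increment has been applied
                    while (!store.updateWithRetry(c -> c + 1, 3)) { }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(store.get().counter); // 4000
        System.out.println(store.get().version); // 4001
    }
}
```

No increment is lost even though four threads write concurrently, because every write that was based on a stale version is rejected and recomputed from fresh data.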

