"Elasticsearch from Beginner to Practice" - CRUD and common Java operations

Table of contents

"Elasticsearch from Beginner to Practice"

  1. Installing Elasticsearch + Kibana on Windows and querying with the Java client
  2. "Elasticsearch from Beginner to Practice" - CRUD and common Java operations

Foreword

In the previous article, "Installing Elasticsearch + Kibana on Windows and querying with the Java client", we installed the Elasticsearch + Kibana environment and ran a data query with the Java client. This article uses Kibana, the Elasticsearch RESTful API, and the Java client to walk through adding, updating, deleting, bulk-adding, and bulk-deleting data, as well as creating, deleting, and rebuilding indexes.

Index management

Create index

We use PUT /indexName to create an index. If a document is written to an index that does not exist, Elasticsearch creates the index automatically by default, but an automatically created index may not meet our needs, so it is best to disable automatic index creation by adding action.auto_create_index: false to elasticsearch.yml.

PUT /article
{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1,
    "refresh_interval": "1s",
    "max_result_window":"20000"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "title": {
        "type": "text",
        "analyzer": "standard"
      },
      "tags": {
        "type": "keyword"
      },
      "read_count": {
        "type": "long"
      },
      "like_count": {
        "type": "long"
      },
      "comment_count": {
        "type": "long"
      },
      "rank": {
        "type": "double"
      },
      "location": {
        "type": "geo_point"
      },
      "pub_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd HH:mm||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}
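
The pub_time mapping above accepts several formats separated by ||. As a rough illustration, the sketch below checks on the client side which alternative a value would match before it is sent to Elasticsearch. The class PubTimeFormats and the method matchingFormat are hypothetical names, and java.time pattern parsing is only an approximation of Elasticsearch's own date parsing, so treat it as a sanity check rather than a guarantee.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

class PubTimeFormats {
    // The three pattern alternatives from the pub_time mapping, in the same order.
    static final List<String> PATTERNS =
            List.of("yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd HH:mm", "yyyy-MM-dd");

    /** Returns the first alternative that matches the value, or null if none does. */
    static String matchingFormat(String value) {
        for (String pattern : PATTERNS) {
            try {
                // parse() must consume the whole string, so a shorter pattern
                // does not accidentally match a longer value.
                DateTimeFormatter.ofPattern(pattern).parse(value);
                return pattern;
            } catch (DateTimeParseException e) {
                // Not this alternative; try the next one.
            }
        }
        // epoch_millis: a plain integer number of milliseconds since the epoch.
        if (value.matches("-?\\d{1,18}")) {
            return "epoch_millis";
        }
        return null;
    }

    public static void main(String[] args) {
        // Values taken from the sample documents in this article, plus one bad value.
        System.out.println(matchingFormat("2023-07-29 09:47"));
        System.out.println(matchingFormat("2023-06-29 14:49:38"));
        System.out.println(matchingFormat("29/07/2023"));
    }
}
```

A value for which matchingFormat returns null would most likely be rejected by Elasticsearch at index time as well, so it is cheaper to catch it before sending the request.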

We save the article index configuration as article.json in the project's resources directory and create the index from Java:

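    @Test
    public void testCreateIndex() throws IOException, URISyntaxException {
        // Load the index definition (settings + mappings) from article.json on the classpath
        URL url = this.getClass().getClassLoader().getResource("article.json");
        // Path.of(URI) resolves the file on any OS, so the Windows-only substring(1) trick is not needed
        var indexSetting = Files.readString(Path.of(url.toURI()), StandardCharsets.UTF_8);
        JSONObject json = JSON.parseObject(indexSetting);
        JSONObject settings = json.getJSONObject("settings");
        JSONObject indexMapping = json.getJSONObject("mappings");
        // Create the "article" index with the settings and mappings read from the file
        CreateIndexRequest request = new CreateIndexRequest("article")
                .mapping(indexMapping.toString(), XContentType.JSON)
                .settings(settings);
        CreateIndexResponse response = esClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
    }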

After the index is created, we can view the configuration of the index with the following command:

GET /article/_mapping
GET /article/_settings

Here we need to pay attention to four settings: number_of_shards, number_of_replicas, refresh_interval, and max_result_window. For more settings, please refer to the official documentation (settings configuration).

Settings are divided into static and dynamic. A static setting, such as number_of_shards, cannot be changed once the index has been created. A dynamic setting, such as number_of_replicas, refresh_interval, or max_result_window, can be changed at any time after the index is created.

  • number_of_shards: the number of primary shards
    The number of shards should be a multiple of the number of data nodes in the cluster, so that the shards of the index can be distributed evenly across the nodes and the load on each machine stays roughly balanced. For example, if your cluster has 3 data nodes, you can set number_of_shards to 3, 6, 9, ... 3N.

  • number_of_replicas: the number of replicas
    number_of_replicas is the number of copies kept of each shard. If you do not have high requirements for data safety, need high write speed, and do not want to spend much on storage, you can set it to 0; setting it to 1 is generally recommended.

  • refresh_interval: the refresh interval
    The default is 1s. This is how often newly written data becomes visible to search. If you do not need writes to be searchable almost immediately, you can increase it. It should not be set too small, because refreshing too often slows down writes. Here it is set to 1s.

  • max_result_window: the maximum number of results that can be paged through; the default is 10,000
    This setting caps from + size for a search on the index. The default is 10,000. Separately, the total hit count reported by a search stops at 10,000 by default; if you need an exact total above 10,000, add "track_total_hits": true to the query, or call searchSourceBuilder.trackTotalHits(true) in Java. Otherwise at most 10,000 hits are reported.
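
The two sizing rules in the list above can be expressed as small helpers. This is a minimal sketch in plain Java; IndexSettingRules and its method names are made up for illustration and are not part of any Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.List;

class IndexSettingRules {
    /** First `count` shard numbers that are multiples of the data-node count: N, 2N, 3N, ... */
    static List<Integer> shardCountCandidates(int dataNodes, int count) {
        if (dataNodes <= 0 || count < 0) {
            throw new IllegalArgumentException("dataNodes must be positive and count non-negative");
        }
        List<Integer> candidates = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            candidates.add(dataNodes * i);
        }
        return candidates;
    }

    /** A page can be fetched with from/size paging only if from + size <= max_result_window. */
    static boolean fitsResultWindow(int from, int size, int maxResultWindow) {
        // Widen to long so that very large arguments cannot overflow.
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        // 3 data nodes, as in the example above.
        System.out.println(shardCountCandidates(3, 3));
        // With the default window of 10,000, the page starting at 10,000 is out of reach ...
        System.out.println(fitsResultWindow(10_000, 10, 10_000));
        // ... but it is reachable with the 20,000 configured for the article index.
        System.out.println(fitsResultWindow(10_000, 10, 20_000));
    }
}
```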

Dynamic settings can also be changed in Kibana by selecting the corresponding index under Index Management.
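
Dynamic settings can also be changed through the REST API with PUT /article/_settings. The sketch below builds such a request with the JDK's java.net.http classes instead of the Elasticsearch client. The address http://localhost:9200, the value 30s, and the class name UpdateIndexSettings are assumptions for illustration, and authentication is not handled.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class UpdateIndexSettings {
    /** JSON body that changes only refresh_interval, a dynamic setting. */
    static String refreshIntervalBody(String interval) {
        return "{\"index\":{\"refresh_interval\":\"" + interval + "\"}}";
    }

    /** Builds (but does not send) PUT {baseUrl}/{index}/_settings. */
    static HttpRequest buildRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local single-node address; adjust for your own cluster.
        HttpRequest request =
                buildRequest("http://localhost:9200", "article", refreshIntervalBody("30s"));
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Building the request separately from sending it keeps the example runnable without a cluster and makes the URL and body easy to inspect.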

Delete index

Deleting an index is relatively simple:

DELETE /article

Java code:

    @Test
    public void deleteIndex() throws IOException {
        // Deletes the whole "article" index, including all of its documents
        DeleteIndexRequest request = new DeleteIndexRequest("article");
        esClient.indices().delete(request, RequestOptions.DEFAULT);
    }

It is very important to note that in a production environment this one short command deletes all the data in the index in an instant, which makes it a very dangerous operation.

To guard against this kind of accidental deletion, we can create a role in Kibana with restricted privileges and assign it to users, so that they do not have permission to delete indexes.

Update index

The mapping of an existing field in an Elasticsearch index cannot be changed in place. If you change the mapping, for example by adding fields, and need the existing data to follow it, you have to rebuild the index. For the specific method, please refer to my other article, "Elasticsearch: How to Rebuild an Index After Adding Fields".

Data management

Adding data

Single add:

POST /article/_doc/1
{
    "comment_count": 600,
    "id": 1,
    "like_count": 2000,
    "location": [
        118.55199,
        24.78144
    ],
    "pub_time": "2023-07-29 09:47",
    "rank": 0,
    "read_count": 10000,
    "tags": [
        "台风",
        "杜苏芮",
        "福建"
    ],
    "title": "台风“杜苏芮”登陆福建晋江 多部门多地全力应对"
}

Bulk add:

POST _bulk
{"create": {"_index": "article", "_id": 1}}
{"comment_count":600,"id":1,"like_count":2000,"location":[118.55199,24.78144],"pub_time":"2023-07-29 09:47","rank":0.0,"read_count":10000,"tags":["台风","杜苏芮","福建"],"title":"台风“杜苏芮”登陆福建晋江 多部门多地全力应对"}
{"create": {"_index": "article", "_id": 2}}
{"comment_count":60,"id":2,"like_count":200,"location":[116.23128,40.22077],"pub_time":"2023-06-29 14:49:38","rank":0.0,"read_count":1000,"tags":["台风","杜苏芮","北京"],"title":"受台风“杜苏芮”影响 北京7月29日至8月1日将有强降雨"}
{"create": {"_index": "article", "_id": 3}}
{"comment_count":6,"id":3,"like_count":20,"location":[120.21201,30.208],"pub_time":"2020-07-29 14:49:38","rank":0.99,"read_count":100,"tags":["台风","杭州"],"title":"杭州解除台风蓝色预警信号"}

Java single data insertion:

    @Test
    public void testAdd() throws IOException {
        News news = new News();
        news.setId(1L);
        news.setTitle("台风“杜苏芮”登陆福建晋江 多部门多地全力应对");
        news.setTags(Arrays.asList("台风;杜苏芮;福建".split(";")));
        news.setRead_count(10000L);
        news.setLike_count(2000L);
        news.setComment_count(600L);
        news.setRank(0.0);
        news.setLocation(List.of(118.55199, 24.78144));
        news.setPub_time("2023-07-29 09:47");

        // Index into "article" and use the document's own id as the Elasticsearch _id
        IndexRequest indexRequest = new IndexRequest("article");
        indexRequest.id(news.getId().toString());
        indexRequest.source(JSON.toJSONString(news), XContentType.JSON);
        var index = esClient.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(index.getResult());
    }

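Java batch data insertion:

    @Test
    public void testBulkAdd() throws IOException {
        // Fill this list with the documents to insert; an empty BulkRequest fails validation
        List<News> news = new ArrayList<>();
        BulkRequest bulkRequest = new BulkRequest();
        news.forEach(x -> bulkRequest.add(new IndexRequest("article").id(x.getId().toString()).source(JSON.toJSONString(x), XContentType.JSON)));
        BulkResponse bulk = esClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        // A bulk call can succeed as a whole while individual items fail, so check the result
        System.out.println(bulk.hasFailures() ? bulk.buildFailureMessage() : "all items indexed");
    }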
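
For reference, the body of a _bulk request is newline-delimited JSON: each action line is followed by its source line, every line is a single-line JSON object, and the body ends with a newline. The following is a minimal sketch of assembling such a body by hand, assuming the source documents are already serialized to single-line JSON and that the index name and ids need no escaping. BulkBodyBuilder is a hypothetical helper, not part of the Java client, which builds this body for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BulkBodyBuilder {
    /** Builds an NDJSON _bulk body: one "create" action line plus one source line per document. */
    static String buildCreateBody(String index, Map<String, String> sourceJsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> doc : sourceJsonById.entrySet()) {
            // Action line: which operation, which index, which _id.
            body.append("{\"create\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(doc.getKey()).append("\"}}\n");
            // Source line: the document itself, on a single line.
            body.append(doc.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the documents in insertion order.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"id\":1,\"read_count\":10000}");
        docs.put("2", "{\"id\":2,\"read_count\":1000}");
        System.out.print(buildCreateBody("article", docs));
    }
}
```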

There are a few things to note here:

  1. Elasticsearch does not modify a document in place after it has been inserted
  2. It does not support updating a single field in place the way MySQL does
  3. When the amount of data to insert is relatively large, using bulk inserts greatly improves write speed
  4. If you want to update only some fields, you can use _update_by_query, introduced in "Elasticsearch: How to Rebuild an Index After Adding Fields", to specify which fields to update:

POST /bucket_size_alias/_update_by_query
{
    "query": {
        "bool": {
            "must_not": {
                "exists": {
                    "field": "bucket_name"
                }
            }
        }
    },
    "script": {
        "source": "ctx._source.bucket_name = 'default_bucket_name'",
        "lang": "painless"
    }
}
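
Regarding point 3 above: when the list of documents is very large, one way to keep each bulk request at a manageable size is to split the list into fixed-size batches and send one BulkRequest per batch. The sketch below shows only the splitting step. BatchSplitter is a hypothetical helper, and the batch size of 1000 mentioned in the usage note is an arbitrary assumption to tune for your own data.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitter {
    /** Splits items into consecutive batches of at most batchSize elements each. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the original list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5);
        // Each inner list would become one bulk request.
        System.out.println(partition(ids, 2));
    }
}
```

In the batch insertion test shown earlier, this would mean looping over partition(news, 1000) and building one BulkRequest per batch instead of adding every document to a single request.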

Delete data

Delete API:

DELETE /article/_doc/1

Note:
Data can be deleted in bulk with POST /article/_delete_by_query.
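
A _delete_by_query request takes a query in its body and deletes every document that matches it. As a hedged sketch, the code below builds a term query on the tags keyword field and the matching POST request using only JDK classes. The address http://localhost:9200, the tag value test-data, and the class name DeleteByQuerySketch are assumptions for illustration, not values from this article's data set.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class DeleteByQuerySketch {
    /** Escapes backslashes and double quotes so the value can sit inside a JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Body matching every document whose field holds exactly the given value. */
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + escape(field)
                + "\":{\"value\":\"" + escape(value) + "\"}}}}";
    }

    /** Builds (but does not send) POST {baseUrl}/{index}/_delete_by_query. */
    static HttpRequest buildRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_delete_by_query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = termQueryBody("tags", "test-data");
        HttpRequest request = buildRequest("http://localhost:9200", "article", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Because the request removes everything the query matches, it is worth running the same query with a normal search first to see what would be deleted.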

Java deletes a single piece of data:

    @Test
    public void deleteById() throws IOException {
        DeleteRequest deleteRequest = new DeleteRequest("article", "1");
        // Refresh immediately so the deletion is visible to the next search
        deleteRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
        esClient.delete(deleteRequest, RequestOptions.DEFAULT);
    }

Java deletes data in batches:

    @Test
    public void deleteByIds() throws IOException {
        // Fill this list with the ids to delete
        List<String> ids = new ArrayList<>();
        BulkRequest bulkRequest = new BulkRequest();
        ids.forEach(x -> bulkRequest.add(new DeleteRequest("article", x)));
        BulkResponse bulk = esClient.bulk(bulkRequest, RequestOptions.DEFAULT);
    }

Update data

Update a field:

POST /article/_update_by_query
{
  "script": {
    "source": "ctx._source['title'] = \"测试只更新标题\""
  },
  "query": {
    "term": {
      "_id": {
        "value": "1"
      }
    }
  }
}

Java updates a field:

    @Test
    public void updateData() throws IOException {
        UpdateRequest updateRequest = new UpdateRequest("article", "1");
        // Partial document: only the fields listed here are changed
        XContentBuilder builder = XContentFactory.jsonBuilder();
        builder.startObject();
        builder.field("title", "测试只更新标题");
        builder.endObject();
        updateRequest.doc(builder);
        updateRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
        esClient.update(updateRequest, RequestOptions.DEFAULT);
    }

Note: if we PUT a document that contains only title, the entire document is replaced and every field other than title is lost. Elasticsearch does not update a single field in place; _update_by_query and the Java client's UpdateRequest simply fetch the original document from Elasticsearch, merge it with your input, and then index the result again.

PUT /article/_doc/1
{
    "title": "测试只更新标题"
}
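
The difference between a full PUT and a partial update can be modelled with plain maps. This is only a conceptual sketch: ReplaceVsMerge and its methods are hypothetical, nothing here talks to Elasticsearch, and the merge shown is shallow (top-level fields only).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ReplaceVsMerge {
    /** Full PUT: the stored document becomes exactly the incoming body. */
    static Map<String, Object> put(Map<String, Object> stored, Map<String, Object> incoming) {
        return new LinkedHashMap<>(incoming);
    }

    /** Partial update: read the stored document, overlay the changed fields, write the result back. */
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> changes) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(changes);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", 1);
        stored.put("title", "old title");
        stored.put("read_count", 10000);

        Map<String, Object> changes = Map.of("title", "new title");

        // After a PUT only "title" is left; id and read_count are gone.
        System.out.println(put(stored, changes));
        // After a partial update all three fields remain, with the new title.
        System.out.println(partialUpdate(stored, changes));
    }
}
```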

Query data

Query by ID

GET /article/_doc/1

Java query by ID:

    @Test
    public void getById() throws IOException {
        GetRequest request = new GetRequest("article", "1");
        GetResponse getResponse = esClient.get(request, RequestOptions.DEFAULT);
        var sourceAsMap = getResponse.getSourceAsMap();
        sourceAsMap.forEach((k, v) -> System.out.println(k + ":" + v));
    }

Java batch query:

    @Test
    public void getByIds() throws IOException {
        List<String> ids = List.of("1", "2", "3");
        MultiGetRequest request = new MultiGetRequest();
        for (String id : ids) {
            request.add("article", id);
        }
        List<Map<String, Object>> sourceAsMap = new ArrayList<>();
        MultiGetResponse getResponse = esClient.mget(request, RequestOptions.DEFAULT);
        MultiGetItemResponse[] responses = getResponse.getResponses();
        if (responses != null) {
            for (MultiGetItemResponse response : responses) {
                // Skip items that failed or whose id does not exist
                if (!response.isFailed() && response.getResponse().isExists()) {
                    sourceAsMap.add(response.getResponse().getSourceAsMap());
                }
            }
        }
        System.out.println(JSON.toJSONString(sourceAsMap));
    }

Summary

This article introduced how to use the Elasticsearch RESTful API and the Java client to add, update, delete, bulk-add, and bulk-delete data, and to create, delete, and rebuild indexes. It also covered the points to watch out for when creating an index and when updating data. The most important feature of Elasticsearch is search, and the following chapters will introduce the search features one by one.

Origin blog.csdn.net/whzhaochao/article/details/132009987