Advanced - Lesson 85 - Mastering the ES Java API - Bulk download of monthly sales data with scroll

 

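Suppose we need to download a large volume of data from ES into an Excel file, for example monthly or yearly sales records. There can be a lot of them: thousands, tens of thousands, or even hundreds of thousands of rows.

This is a job for the ES scroll API covered earlier, which fetches and processes a large result set in batches.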

Insert test data

PUT /car_shop/sales/4
{
    "brand": "宝马",
    "name": "宝马320",
    "price": 320000,
    "produce_date": "2017-01-01",
    "sale_price": 280000,
    "sale_date": "2017-01-25"
}

 

 

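The goal is to pull the sales records for BMW (brand value "宝马").

As a demo with two BMW records in the index, each batch downloads one sales record, so the download completes in two batches.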
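To make the batch arithmetic concrete, here is a minimal in-memory sketch of the same fetch-until-empty pattern. It does not call Elasticsearch: `ScrollSimulation`, `nextBatch`, and `countBatches` are hypothetical names standing in for the scroll cursor, and the record strings are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** In-memory stand-in for a scroll cursor (hypothetical; makes no Elasticsearch calls). */
public class ScrollSimulation {
    private final List<String> records;
    private final int batchSize;
    private int position = 0;

    public ScrollSimulation(List<String> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.records = new ArrayList<>(records);
        this.batchSize = batchSize;
    }

    /** Returns the next batch, or an empty list once everything has been read. */
    public List<String> nextBatch() {
        if (position >= records.size()) {
            return Collections.emptyList();
        }
        int end = Math.min(position + batchSize, records.size());
        List<String> batch = new ArrayList<>(records.subList(position, end));
        position = end;
        return batch;
    }

    /** Drains the cursor and returns how many non-empty batches were fetched. */
    public static int countBatches(List<String> records, int batchSize) {
        ScrollSimulation cursor = new ScrollSimulation(records, batchSize);
        int batchCount = 0;
        List<String> batch = cursor.nextBatch();
        while (!batch.isEmpty()) {
            batchCount++;
            batch = cursor.nextBatch();
        }
        return batchCount;
    }

    public static void main(String[] args) {
        // Two records, one per batch, as in the demo described above.
        List<String> sales = Arrays.asList("sale-1", "sale-2");
        System.out.println(countBatches(sales, 1)); // prints 2
    }
}
```

The empty batch is the stop signal, which is the same condition the real scroll loop checks on each response.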

Scroll query test

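import java.net.InetAddress;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class ScrollSalesDownload {

   public static void main(String[] args) throws Exception {
      Settings settings = Settings.builder()
            .put("cluster.name", "elasticsearch")
            .build();

      TransportClient client = new PreBuiltTransportClient(settings)
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));

      // Initial search: opens the scroll context (kept alive for 60 s) and returns the first batch
      SearchResponse searchResponse = client.prepareSearch("car_shop")
            .setTypes("sales")
            .setQuery(QueryBuilders.termQuery("brand.keyword", "宝马"))
            .setScroll(new TimeValue(60000))
            .setSize(1)
            .get();

      int batchCount = 0;

      // An empty batch means every matching document has been returned
      while (searchResponse.getHits().getHits().length != 0) {
         System.out.println("batch: " + ++batchCount);
         for (SearchHit searchHit : searchResponse.getHits().getHits()) {
            System.out.println(searchHit.getSourceAsString());
            // In practice, fetch one batch at a time (say 1000 rows) and write it to a local Excel file.
            // Querying hundreds of thousands of rows in one go is not realistic; the JVM could run out of memory.
         }

         // Fetch the next batch and renew the scroll keep-alive
         searchResponse = client.prepareSearchScroll(searchResponse.getScrollId())
               .setScroll(new TimeValue(60000))
               .get();
      }

      // Release the scroll context on the server instead of waiting for it to time out
      client.prepareClearScroll()
            .addScrollId(searchResponse.getScrollId())
            .get();

      client.close();
   }
}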
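The comment inside the loop mentions writing each batch to a local Excel file. One dependency-free way to sketch that step is to append each batch to a CSV stream, which Excel can open. This is a swapped-in simplification, not a real .xlsx writer (a library such as Apache POI would be the usual route for that). `CsvBatchWriter`, `writeBatch`, and the column layout are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

/** Appends batches of rows to a CSV stream (hypothetical helper, not part of the ES API). */
public class CsvBatchWriter {
    private final Writer out;

    public CsvBatchWriter(Writer out) {
        this.out = out;
    }

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    public static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Writes one batch; each inner list is one row. Only this batch is held in memory. */
    public void writeBatch(List<List<String>> rows) throws IOException {
        for (List<String> row : rows) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(escape(row.get(i)));
            }
            out.write(line.append("\r\n").toString());
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter keeps the demo self-contained; a FileWriter would target a real file.
        StringWriter buffer = new StringWriter();
        CsvBatchWriter writer = new CsvBatchWriter(buffer);
        // Header row, then one batch using values from the test document above.
        writer.writeBatch(Arrays.asList(Arrays.asList("price", "sale_price", "sale_date")));
        writer.writeBatch(Arrays.asList(Arrays.asList("320000", "280000", "2017-01-25")));
        System.out.print(buffer);
    }
}
```

In the scroll loop, each response's hits would be mapped to rows and passed to `writeBatch`, so memory use stays bounded by the batch size rather than by the total number of records.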

 

 


Reposted from blog.csdn.net/qq_35524586/article/details/88706217