Analysis of three storage code styles
Survive by day and develop by night.
Talk about the important business, show your perfect code, stay fully busy, skip the hardship, make a better result, wait for change, challenge survival.
Happy to take on the hard work of solving dependencies.
Overview
Web crawling is a very common need.
Requirements:
Design ideas
Implementation approach analysis
1. Single-record type
private void processSingle(List<Map<String, Object>> list1) {
    // 1. Iterate over the source records
    for (int i = 0; i < list1.size(); i++) {
        // 2. Convert the record into a map with camelCase keys
        Map<String, Object> dataMap = list1.get(i);
        Map<Object, Object> dm = new HashMap<>();
        for (Map.Entry<String, Object> entry : dataMap.entrySet()) {
            dm.put(lineToHump(entry.getKey()), entry.getValue());
        }
        // Fixed sample values, set once per record instead of once per entry
        dm.put("description", "description");
        dm.put("year", 2008);
        dm.put("trxId", "transaction ID");
        dm.put("contractNo", "12332131");
        dm.put("deadline", 12332L);
        // 3. Convert to the entity and store it, one call per record
        ArcDocument arcDocument = arcDocumentConvert.convert(dm);
        arcDocumentService.createDoc(arcDocument);
        log.info("Do create action, id={}, record index={}", arcDocument.getId(), i);
    }
}
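The code above calls lineToHump, which the article does not show. Judging by the name and the camelCase keys used alongside it (trxId, contractNo), it presumably turns underscore-separated column names into camelCase. A minimal sketch under that assumption follows; the class name NameConverter and the choice to lowercase the input first are hypothetical, not taken from the original project.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the lineToHump helper used in the article.
final class NameConverter {
    // Matches an underscore followed by a lowercase letter or digit.
    private static final Pattern UNDERSCORE_CHAR = Pattern.compile("_([a-z0-9])");

    private NameConverter() {}

    /** Converts a name such as "contract_no" or "ARC_ID" to camelCase. */
    static String lineToHump(String name) {
        if (name == null || name.isEmpty()) {
            return name;
        }
        // Assumption: column names may arrive in upper case, so normalize first.
        Matcher matcher = UNDERSCORE_CHAR.matcher(name.toLowerCase(Locale.ROOT));
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            // Drop the underscore and upper-case the character that follows it.
            matcher.appendReplacement(result, matcher.group(1).toUpperCase(Locale.ROOT));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

With this sketch, a key such as contract_no would be stored in the map as contractNo, and a key without underscores passes through unchanged apart from lowercasing.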
2. Batch type
private void processBatch(List<Map<String, Object>> list1) {
    List<ArcDocument> docList = new ArrayList<>();
    // 1. Iterate over the source records
    for (int i = 0; i < list1.size(); i++) {
        // 2. Convert the record into a map with camelCase keys
        Map<String, Object> dataMap = list1.get(i);
        Map<Object, Object> dm = new HashMap<>();
        for (Map.Entry<String, Object> entry : dataMap.entrySet()) {
            dm.put(lineToHump(entry.getKey()), entry.getValue());
        }
        // Fixed sample values, set once per record instead of once per entry
        dm.put("description", "description");
        dm.put("year", 2008);
        dm.put("trxId", "transaction ID");
        dm.put("contractNo", "12332131");
        dm.put("deadline", 12332L);
        // 3. Convert to the entity and collect it for the batch
        ArcDocument arcDocument = arcDocumentConvert.convert(dm);
        docList.add(arcDocument);
        log.info("batch action, id={}, record index={}", arcDocument.getId(), i);
    }
    // 4. Store all collected documents in a single call
    arcDocumentService.insertBatch(docList);
}
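processBatch hands the whole list to insertBatch in one call. For large inputs it is common to cap the size of each batch. The following is a minimal sketch of splitting a list into fixed-size chunks; the class name BatchSplitter and the idea of chunking before insertBatch are assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most batchSize items.
final class BatchSplitter {
    private BatchSplitter() {}

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Inside processBatch this could be used as a loop over BatchSplitter.partition(docList, 500) that calls arcDocumentService.insertBatch for each chunk, where 500 is an arbitrary illustrative size to be tuned against the actual database.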
3. Elasticsearch bulk type
/**
 * Imports data into Elasticsearch.
 */
public void addElasticsearchData(List<Map<String, Object>> addEsDataMapList) {
    // Obtain the client connection
    RestHighLevelClient client = restHighLevelClient();
    try {
        // Create the bulk request
        BulkRequest bulkRequest = new BulkRequest();
        for (Map<String, Object> addEsDataMap : addEsDataMapList) {
            // Use a fresh map per record so keys from an earlier record cannot leak into a later one
            Map<Object, Object> dataMap = new HashMap<>();
            for (Map.Entry<String, Object> entry : addEsDataMap.entrySet()) {
                dataMap.put(lineToHump(entry.getKey()), entry.getValue());
            }
            // Fixed sample values, set once per record
            dataMap.put("description", "description");
            dataMap.put("year", 2008);
            dataMap.put("trxId", "transaction ID");
            dataMap.put("contractNo", "12332131");
            dataMap.put("deadline", 12332L);
            ArcDocument arcDocument = arcDocumentConvert.convert(dataMap);
            // Create one index request per record. The REST high-level client sends the
            // bulk request over HTTP, not UDP, so where this variable is declared does not
            // affect delivery. arc_document is the index name, defined elsewhere.
            // Note: IndexRequest.source(String, XContentType) takes a JSON string; if
            // ArcDocument is a POJO it must be serialized to JSON first.
            IndexRequest requestData = new IndexRequest(arc_document, "_doc", dataMap.get("arcId").toString())
                    .source(arcDocument, XContentType.JSON);
            bulkRequest.add(requestData);
        }
        log.info("Number of records to sync to ES: {}", bulkRequest.numberOfActions());
        // Set the refresh policy: make the changes visible to search immediately
        bulkRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
        // Submit only when the request contains at least one action
        if (bulkRequest.numberOfActions() >= 1) {
            BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            if (bulkResponse.hasFailures()) {
                log.error("ES write failed: {}", bulkResponse.buildFailureMessage());
            } else {
                log.info("Real-time message written to ES successfully");
            }
        }
    } catch (Exception e) {
        // Pass the exception as the last argument so the stack trace is logged
        log.error("ES data sync failed, data: {}", addEsDataMapList, e);
    } finally {
        try {
            client.close();
        } catch (IOException e) {
            log.warn("Failed to close the ES client", e);
        }
    }
}
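One detail worth checking in the bulk import is whether the field map is created once outside the loop or once per record. If a single map is reused, a key that appears in an earlier record but not in a later one stays in the map and ends up in the later document too. The sketch below shows the difference using plain java.util types only; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: shared field map versus one fresh map per record.
final class FieldMapDemo {
    private FieldMapDemo() {}

    /** Reuses one map for all records; returns a snapshot of it after each record. */
    static List<Map<String, Object>> withSharedMap(List<Map<String, Object>> records) {
        List<Map<String, Object>> snapshots = new ArrayList<>();
        Map<String, Object> shared = new HashMap<>();
        for (Map<String, Object> record : records) {
            shared.putAll(record); // keys from earlier records are never removed
            snapshots.add(new HashMap<>(shared));
        }
        return snapshots;
    }

    /** Builds a new map for every record, so nothing carries over. */
    static List<Map<String, Object>> withFreshMap(List<Map<String, Object>> records) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> record : records) {
            Map<String, Object> fresh = new HashMap<>();
            fresh.putAll(record);
            results.add(fresh);
        }
        return results;
    }
}
```

Given a first record with the keys arcId and remark and a second record with only arcId, the shared-map version still reports remark for the second record, while the fresh-map version does not.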
Extended implementation
Reference here: github: Simple implementation of the above process:
Entry-level implementation:
: [Partial source code implementation]
: Source code implementation
Performance parameter test:
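The article gives no measurements here. As a starting point, the three methods could be timed with a small harness like the sketch below. The class name StoreTimer is hypothetical, and a single wall-clock run like this only gives a rough comparison, not a proper benchmark.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical timing helper for comparing the storage methods above.
final class StoreTimer {
    private StoreTimer() {}

    /** Runs storeAction once over the records and returns the elapsed time in milliseconds. */
    static <T> long timeMillis(List<T> records, Consumer<List<T>> storeAction) {
        long start = System.nanoTime();
        storeAction.accept(records);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

From inside the class that defines the three methods, a call such as StoreTimer.timeMillis(list1, this::processSingle) could be repeated for processBatch and addElasticsearchData with the same input list.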