1. Prepare a pool of valid waybill numbers in advance, then pull and process data by waybill number.
Waybill-number table: status defaults to 1.
01 Use findAndModify to update the status in the waybill-number table to 2; loop until 100 waybill numbers have been read.
02 Batch-query the Aladin_WayBillStatus table by waybill number to fetch the data.
03 Assemble the new SQL statements.
04 Submit to HBase in batches.
05 Batch-update the status in the waybill-number table to 3.
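The claim-and-mark cycle in steps 01 and 05 can be sketched as follows. This is only an in-memory stand-in, not the MongoDB API: the class WaybillPoolSketch and its methods are hypothetical, compareAndSet plays the role of the atomic findAndModify, and the reading of the status values (1 = pending, 2 = being processed, 3 = done) is an assumption based on the steps above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the waybill-number table (hypothetical; not the MongoDB API).
// Assumed status values: 1 = pending, 2 = being processed, 3 = done.
class WaybillPoolSketch {
    private final Map<String, AtomicInteger> status = new ConcurrentHashMap<>();

    void add(String waybillNo) {
        status.put(waybillNo, new AtomicInteger(1));
    }

    // Mimics one findAndModify call: atomically flip a single entry from 1 to 2.
    // Returns the claimed waybill number, or null when nothing is pending.
    String claimOne() {
        for (Map.Entry<String, AtomicInteger> e : status.entrySet()) {
            if (e.getValue().compareAndSet(1, 2)) {
                return e.getKey();
            }
        }
        return null;
    }

    // Step 01: call claimOne in a loop until the batch holds up to `size` numbers.
    List<String> claimBatch(int size) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < size) {
            String no = claimOne();
            if (no == null) {
                break; // pool exhausted
            }
            batch.add(no);
        }
        return batch;
    }

    // Step 05: mark the whole batch as done (2 -> 3).
    void markDone(List<String> batch) {
        for (String no : batch) {
            status.get(no).compareAndSet(2, 3);
        }
    }

    int statusOf(String waybillNo) {
        return status.get(waybillNo).get();
    }
}
```

Because each claim is a single atomic transition, two workers calling claimBatch(100) at the same time never receive the same waybill number, which is the property that lets several nodes run side by side.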
Advantages:
Simple and crude. Development is easy, no more than 200 lines of code. Because findAndModify is atomic, N nodes can be deployed.
Disadvantages:
Throughput is low and there is almost no room for optimization; fetching waybill numbers with multiple threads actually takes longer.
Throughput depends on how fast data can be fetched from the table.
Puts pressure on the existing database.
2. Prepare a time-period table in advance, then migrate the data by time period.
01 Use findAndModify to claim an arbitrary time period.
02 Pull a batch of data for that time period.
03 Assemble the new SQL statements.
04 Submit to HBase in batches.
05 Update the status in the time-period table to 3.
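The time-period table used in the steps above has to be generated before the job starts. A minimal sketch, assuming fixed-width periods over a half-open range [start, end): the names TimePeriodSketch, Period and split are hypothetical, and the initial status of 1 is assumed to mirror the convention of approach 1.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class TimePeriodSketch {
    // One row of the hypothetical time-period table: [start, end) plus a status.
    static final class Period {
        final LocalDateTime start;
        final LocalDateTime end;
        int status = 1; // assumed: 1 = not yet claimed

        Period(LocalDateTime start, LocalDateTime end) {
            this.start = start;
            this.end = end;
        }
    }

    // Splits [from, to) into consecutive periods of at most `width` each.
    static List<Period> split(LocalDateTime from, LocalDateTime to, Duration width) {
        if (width.isZero() || width.isNegative()) {
            throw new IllegalArgumentException("width must be positive");
        }
        List<Period> periods = new ArrayList<>();
        LocalDateTime start = from;
        while (start.isBefore(to)) {
            LocalDateTime end = start.plus(width);
            if (end.isAfter(to)) {
                end = to; // the last period may be shorter
            }
            periods.add(new Period(start, end));
            start = end;
        }
        return periods;
    }
}
```

Fixed widths keep the generation rule trivial, but they do nothing to even out the amount of data each period holds.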
Advantages:
Much more efficient than approach 1.
Because findAndModify is used, multi-node deployment is possible.
Disadvantages:
The amount of data fetched each time is not controllable: during business peaks a single time period can hold a very large amount of data, while off-peak periods hold very little, so the rules for generating time periods become troublesome.
Puts pressure on the existing database.
3. Scan the data with a MongoDB query cursor.
By default, a find query starts from the oldest data.
Because _id values are ordered, _id can be queried with $gt to resume from a known position.
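    // mt (MongoDB access) and mq (message producer) are fields defined elsewhere in the class.
    public void test_2(ObjectId o) {
        DBCursor s;
        if (o == null) {
            // First run: scan the whole collection.
            s = mt.getCollection("orderid").find();
        } else {
            // Resume: only fetch documents whose _id is greater than the last one sent.
            DBObject lisi = new BasicDBObject();
            lisi.put("_id", new BasicDBObject("$gt", o));
            s = mt.getCollection("orderid").find(lisi);
        }
        // Sort explicitly by _id so that resuming with $gt cannot skip documents.
        s = s.sort(new BasicDBObject("_id", 1));
        try {
            while (s.hasNext()) {
                DBObject item = s.next();
                String me = ((BasicDBObject) item).toJson();
                mq.send(new Message("mgtomq", me.getBytes(RemotingHelper.DEFAULT_CHARSET)));
                // Advance the resume point only after the message was sent successfully.
                o = (ObjectId) item.get("_id");
                System.out.println(o);
            }
        } catch (Exception e) {
            // The cursor or the send failed: close the cursor and resume after the last _id sent.
            s.close();
            test_2(o);
            return;
        }
        s.close();
    }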
Advantages:
Does not put much pressure on the database.
Data is read from oldest to newest.
Disadvantages:
Cannot be deployed on multiple nodes; when reading and processing run together, throughput is low.
Solution: decouple the two with message middleware. The reader reads the data and produces messages; the processor consumes the messages and submits once every 100 messages.
This approach does not use datex to read the data, but the idea is the same.
If the data needs no processing, datex can be used directly.
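The consumer side of the decoupled solution in approach 3, which submits once every 100 messages, might look like the sketch below. It is an illustration only: BatchingConsumerSketch is a hypothetical class, messages are plain strings, and the sink callback stands in for the real HBase batch submit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Buffers consumed messages and hands them to a sink in fixed-size batches.
class BatchingConsumerSketch {
    private final int batchSize;
    private final Consumer<List<String>> sink; // hypothetical stand-in for the HBase batch submit
    private final List<String> buffer = new ArrayList<>();

    BatchingConsumerSketch(int batchSize, Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per consumed message; submits when the buffer is full.
    void onMessage(String body) {
        buffer.add(body);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Submits whatever is buffered; call this on shutdown so a partial batch is not lost.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // pass a copy so the sink may keep the list
        buffer.clear();
    }
}
```

This sketch is not thread-safe; it assumes a single consumer thread per instance, so several consumers would each hold their own buffer.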