One, the problems encountered.
In our project, after data processing completes, the merged data is put into a LinkedTransferQueue, and several long-running threads continuously take from the queue and write to the database.
This leads to the following problems:
1) LinkedTransferQueue is unbounded. Using transfer + take mitigates the problem somewhat, but when many data-processing threads are running inside the process there is no limit on queue growth, and Full GC occurs.
2) LinkedTransferQueue is lock-free, but internally it is a linked list of nodes. Constantly adding and removing elements allocates and discards nodes, causing frequent GC.
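The first problem is easy to reproduce with a minimal, self-contained sketch (the batch size and count below are hypothetical, just to show that `offer` never blocks or fails):

```java
import java.util.concurrent.LinkedTransferQueue;

public class UnboundedGrowthDemo {
    public static void main(String[] args) {
        LinkedTransferQueue<int[]> queue = new LinkedTransferQueue<>();
        // offer() on LinkedTransferQueue never blocks and never returns false:
        // every element allocates a new linked node, so a fast producer can
        // outrun slow consumers and grow the heap until Full GC.
        for (int i = 0; i < 1_000; i++) {
            queue.offer(new int[1024]); // simulated merged-data batch
        }
        System.out.println("queued batches: " + queue.size());
    }
}
```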
Based on the above problems, two approaches come to mind:
1) Implement a subclass of LinkedTransferQueue and assist with flow control at the outer layer. (The idea comes from the thread pool designs of Motan and Tomcat.)
Attached is the modified LinkedTransferQueue from the Motan thread pool:
class ExecutorQueue extends LinkedTransferQueue<Runnable> {

    private static final long serialVersionUID = -265236426751004839L;
    StandardThreadExecutor threadPoolExecutor;

    public ExecutorQueue() {
        super();
    }

    public void setStandardThreadExecutor(StandardThreadExecutor threadPoolExecutor) {
        this.threadPoolExecutor = threadPoolExecutor;
    }

    // Note: code taken from Tomcat
    public boolean force(Runnable o) {
        if (threadPoolExecutor.isShutdown()) {
            throw new RejectedExecutionException("Executor not running, can't force a command into the queue");
        }
        // forces the item onto the queue, to be used if the task is rejected
        return super.offer(o);
    }

    // Note: minor changes to the Tomcat code
    public boolean offer(Runnable o) {
        int poolSize = threadPoolExecutor.getPoolSize();
        // we are maxed out on threads, simply queue the object
        if (poolSize == threadPoolExecutor.getMaximumPoolSize()) {
            return super.offer(o);
        }
        // we have idle threads, just add it to the queue
        // note that we don't use getActiveCount(), see BZ 49730
        if (threadPoolExecutor.getSubmittedTasksCount() <= poolSize) {
            return super.offer(o);
        }
        // if we have fewer threads than the maximum, force creation of a new thread
        if (poolSize < threadPoolExecutor.getMaximumPoolSize()) {
            return false;
        }
        // if we reached here, we need to add it to the queue
        return super.offer(o);
    }
}
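Since StandardThreadExecutor and getSubmittedTasksCount() are Motan-specific, here is a reduced, JDK-only sketch of the same pattern for context (class and field names are hypothetical): the queue refuses offers while the pool can still grow, so the executor creates threads up to maximumPoolSize before queueing, and a rejection handler forces overflow back onto the queue.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedTransferDemo {

    static class TaskQueue extends LinkedTransferQueue<Runnable> {
        volatile ThreadPoolExecutor executor;

        @Override
        public boolean offer(Runnable r) {
            // Refuse while the pool can still grow, forcing thread creation
            if (executor.getPoolSize() < executor.getMaximumPoolSize()) {
                return false;
            }
            return super.offer(r); // maxed out on threads: queue the task
        }

        boolean force(Runnable r) { // used by the rejection handler
            return super.offer(r);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TaskQueue queue = new TaskQueue();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60, TimeUnit.SECONDS, queue,
                // rejected because the pool was already at max: queue it
                (r, e) -> ((TaskQueue) e.getQueue()).force(r));
        queue.executor = pool;

        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("completed: " + done.get());
    }
}
```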
2) Replace the queue with Disruptor.
1. Disruptor uses a lock-free design, which speeds things up.
Each producer or consumer thread first claims the position of an element it can operate on; once the claim succeeds, it writes or reads data at that position directly.
2. Disruptor uses a ring structure, which avoids GC and is cache-friendly. Setting the size of the RingBuffer bounds the buffer, avoiding the unbounded-growth problems above.
The entries are a pre-allocated array rather than a linked list, so no nodes are allocated or collected during operation, and the layout is friendlier to the processor's cache.
3. Positioning of elements.
The ring buffer length is 2^n, so a slot can be located with a bit operation, which is faster than a modulo. Sequence numbers grow monotonically, so there is no worry about index collisions.
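The power-of-two sizing means locating a slot reduces to a single bitwise AND. A minimal sketch (size 8 is a hypothetical value, not a Disruptor default):

```java
public class RingIndexDemo {
    public static void main(String[] args) {
        int size = 8;        // ring buffer size must be 2^n
        int mask = size - 1; // 0b0111
        // Sequences grow monotonically; the slot index wraps via AND,
        // which is cheaper than the % operator and never goes negative.
        for (long sequence = 5; sequence < 13; sequence++) {
            long index = sequence & mask; // equivalent to sequence % size
            System.out.println("sequence " + sequence + " -> slot " + index);
        }
    }
}
```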
Two, using Disruptor.
1. Demonstration
1) Define the message class to be delivered:
public class QueueMessage implements Serializable {

    public QueueMessage() {}

    private List<String> messageList;

    public void setMessageList(List<String> messageList) {
        this.messageList = messageList;
    }

    public List<String> getMessageList() {
        return this.messageList;
    }
}
2) Producer:
public class DisrutorProducerDemo {

    private final RingBuffer<QueueMessage> ringBuffer;

    public DisrutorProducerDemo(RingBuffer<QueueMessage> ringBuffer) {
        this.ringBuffer = ringBuffer;
    }

    public void onData(String orderId) {
        long sequence = ringBuffer.next();
        try {
            QueueMessage message = ringBuffer.get(sequence);
            List<String> messageList = new ArrayList<String>();
            messageList.add(orderId);
            message.setMessageList(messageList);
        } finally {
            ringBuffer.publish(sequence);
        }
    }
}
3) Consumer:
public class EventHandlerDemo implements EventHandler<QueueMessage>, WorkHandler<QueueMessage> {

    @Override
    public void onEvent(QueueMessage event, long sequence, boolean endOfBatch) throws Exception {
        System.out.println("data:" + event.getMessageList().get(0) + ", sequence: " + sequence + ", endOfBatch:" + endOfBatch);
    }

    @Override
    public void onEvent(QueueMessage event) throws Exception {
        System.out.println("event:" + event.getMessageList().get(0));
    }
}
4) Combined use:
public class DisrutorMainDemo {

    public static void main(String[] args) throws InterruptedException {
        Disruptor<QueueMessage> disruptor = new Disruptor<QueueMessage>(
                QueueMessage::new,
                1024 * 1024,
                Executors.defaultThreadFactory(),
                ProducerType.SINGLE,
                new YieldingWaitStrategy()
        );
        // Multiple handlers can be passed; each handler consumes every event
        disruptor.handleEventsWith(new EventHandlerDemo());
        // With a worker pool, each event is consumed by only one handler
        //disruptor.handleEventsWithWorkerPool(new EventHandlerDemo());
        disruptor.start();
        RingBuffer<QueueMessage> ringBuffer = disruptor.getRingBuffer();
        DisrutorProducerDemo eventProducer = new DisrutorProducerDemo(ringBuffer);
        for (int i = 0; i < 100; i++) {
            eventProducer.onData(UUID.randomUUID().toString());
        }
    }
}
2. Disruptor wait strategies

| Name | Mechanism | Applicable scenario |
|---|---|---|
| BlockingWaitStrategy | Locking | CPU resources are scarce and throughput and latency are not critical |
| BusySpinWaitStrategy | Spinning | Reduces latency by retrying continuously, avoiding the system calls caused by thread switching; recommended when threads are bound to fixed CPU cores |
| PhasedBackoffWaitStrategy | Spin + yield + custom strategy | CPU resources are scarce and throughput and latency are not critical |
| SleepingWaitStrategy | Spin + yield + sleep | A good compromise between performance and CPU usage; latency is uneven |
| TimeoutBlockingWaitStrategy | Locking, with a timeout | CPU resources are scarce and throughput and latency are not critical |
| YieldingWaitStrategy | Spin + yield + spin | A good compromise between performance and CPU usage; latency is fairly even |
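The spin-then-yield shape of the yielding family can be illustrated with a simplified sketch (this is not Disruptor's actual YieldingWaitStrategy code; names and the spin count are hypothetical):

```java
import java.util.concurrent.atomic.AtomicLong;

public class YieldWaitDemo {
    static final int SPIN_TRIES = 100;

    // Wait until the published cursor reaches the wanted sequence:
    // busy-spin a bounded number of times for lowest latency, then
    // fall back to Thread.yield() to give up the CPU slice.
    static long waitFor(long wanted, AtomicLong cursor) {
        int counter = SPIN_TRIES;
        long available;
        while ((available = cursor.get()) < wanted) {
            if (counter > 0) {
                counter--;      // spin phase
            } else {
                Thread.yield(); // yield phase
            }
        }
        return available;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong cursor = new AtomicLong(-1);
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 10; i++) cursor.incrementAndGet();
        });
        producer.start();
        long seen = waitFor(9, cursor);
        producer.join();
        System.out.println("caught up to sequence " + seen);
    }
}
```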
Three, performance comparison test:
1. Results before and after using Disruptor
| | Data volume | Processing start time | Processing end time | Time taken |
|---|---|---|---|---|
| Before optimization | 10,000,000 | 2020-08-25 18:57:46 | 2020-08-25 19:02:52 | 5.1 minutes |
| After optimization | 10,000,000 | 2020-08-26 11:54:10 | 2020-08-26 11:56:58 | 2.8115 minutes |