How Flink implements asynchronous I/O (AsyncIO): a source code analysis

First, a diagram to give an overall picture of asynchronous I/O in Flink

 

Async I/O was contributed to Flink by Alibaba. I won't go over its advantages, since the official site already covers them; in short, writes to a database do not block, so performance is better.

Next, Flink's asynchronous I/O comes in two main modes:

  One is ordered (Ordered)

  One is unordered (Unordered)

The main difference is the order in which results are output downstream. (Note that this is not the order of the database writes: both modes write asynchronously, so the write order is never guaranteed.) Ordered mode emits results downstream in the order the records were received; unordered mode emits whichever record finishes processing first.
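To make the difference concrete, here is a minimal sketch using only the JDK's CompletableFuture. It is not Flink code: EmitOrderDemo and its methods are names invented for illustration, and the futures are completed by hand so the completion order (c, a, b) is fixed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative only: EmitOrderDemo is not a Flink class.
class EmitOrderDemo {

    // Unordered: a result is emitted as soon as its own future completes.
    static List<String> wireUnordered(List<CompletableFuture<String>> inputs) {
        List<String> emitted = Collections.synchronizedList(new ArrayList<>());
        for (CompletableFuture<String> f : inputs) {
            f.thenAccept(emitted::add);
        }
        return emitted;
    }

    // Ordered: results wait in arrival order; only a completed head may leave.
    static List<String> wireOrdered(List<CompletableFuture<String>> inputs) {
        List<String> emitted = new ArrayList<>();
        Deque<CompletableFuture<String>> pending = new ArrayDeque<>(inputs);
        for (CompletableFuture<String> f : inputs) {
            f.thenRun(() -> {
                synchronized (pending) {
                    // Drain every completed element sitting at the head.
                    while (!pending.isEmpty() && pending.peek().isDone()) {
                        emitted.add(pending.poll().join());
                    }
                }
            });
        }
        return emitted;
    }

    // Records arrive as a, b, c but their async work finishes as c, a, b.
    static List<String> run(boolean ordered) {
        CompletableFuture<String> a = new CompletableFuture<>();
        CompletableFuture<String> b = new CompletableFuture<>();
        CompletableFuture<String> c = new CompletableFuture<>();
        List<CompletableFuture<String>> inputs = Arrays.asList(a, b, c);
        List<String> emitted = ordered ? wireOrdered(inputs) : wireUnordered(inputs);
        c.complete("c");
        a.complete("a");
        b.complete("b");
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println("ordered:   " + run(true));   // [a, b, c]
        System.out.println("unordered: " + run(false));  // [c, a, b]
    }
}
```

In both runs the "writes" finish in the same order; only the order of emission downstream differs.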

Two diagrams show how these two modes are implemented

 

Ordered: each record is written to the database by an asynchronous thread. The Emitter is a daemon thread that keeps pulling the element at the head of the queue. If the head element's asynchronous write has completed, the Emitter sends it downstream; if it has not completed yet, the Emitter blocks.

Unordered: each record is written to the database by an asynchronous thread. There are two queues. A record starts in uncompletedQueue and is moved into completedQueue as soon as its asynchronous write succeeds. The Emitter is a daemon thread; as long as completedQueue has data, it keeps pulling from that queue and sending the data downstream.

    

As you can see, the principle is very simple. To sum it up in a few words: asynchrony is achieved with Java queues and threads. Now let's look at the source code.

AsyncIO in Flink is designed as an operator, so the natural place to look is among the implementation classes of OneInputStreamOperator.

Then take a look at AsyncWaitOperator.java

  

Look at its open method (open is called uniformly when the TaskManager starts the job; see the previous article for details)

Here it starts a daemon thread, the Emitter. Let's see what this thread actually does

 

 At marker 1 it pulls data; at marker 2 it emits the pulled data downstream in the usual way. How the Emitter pulls data differs between ordered and unordered mode, so we will not go into it yet.

 So far we know what the Emitter does: it loops, pulling data and sending it downstream.

 Back in AsyncWaitOperator.java, the open method initializes the Emitter. So how does the operator handle the data it receives? Look at its processElement() method

 

    

 

 In fact, there are three main steps

First, the record is wrapped in a wrapper class, StreamRecordQueueEntry. The important part of this wrapper's constructor is that it creates a CompletableFuture (its complete method is only called from the user's code, so the user decides when it completes)

At marker 1, the element is added to the corresponding queue; here too there are two kinds, ordered and unordered

 

We will not go into how the two modes differ when adding data just yet

At marker 2, the user's code is called. Let's look at the asynchronous I/O example from the official site

 

 The user's method is given a ResultFuture as a parameter. The user starts a thread of their own to write to or read from the database, and so on. (Think about why a new thread is needed for asynchronous execution: if no thread is started, processElement blocks, and nothing is asynchronous.) The user's code then calls the parameter's complete method (the parameter is the wrapper holding the CompletableFuture created earlier) and passes in a result

Look at the source of the complete method

 

 The resultFuture is the StreamRecordQueueEntry that wraps each record, and one of its fields is a CompletableFuture

 Now it is clear: when the user's logic finishes in the thread the user started, it calls complete on this future and passes in the result

 So what is this used for?
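The hand-off described above can be sketched with plain JDK classes. This is an assumption-laden illustration, not Flink's code: SketchQueueEntry stands in for StreamRecordQueueEntry, SketchAsyncFunction for the user's function, and slowLookup for a database call; all three names are hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the wrapper: it owns a CompletableFuture that
// only the user's code completes.
class SketchQueueEntry<OUT> {
    final CompletableFuture<Collection<OUT>> future = new CompletableFuture<>();

    void complete(Collection<OUT> result) {
        future.complete(result);
    }
}

// Hypothetical user function: the slow work runs on the user's own threads.
class SketchAsyncFunction {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    void asyncInvoke(String input, SketchQueueEntry<String> resultFuture) {
        pool.execute(() -> {
            String row = slowLookup(input);                          // blocking call
            resultFuture.complete(Collections.singletonList(row));  // hand back result
        });
        // Returns immediately, so the caller (processElement) is not blocked.
    }

    // Pretend database read; the sleep simulates network latency.
    static String slowLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key + "-value";
    }

    void close() {
        pool.shutdown();
    }
}

class AsyncHandOffDemo {
    static Collection<String> run() {
        SketchAsyncFunction fn = new SketchAsyncFunction();
        SketchQueueEntry<String> entry = new SketchQueueEntry<>();
        fn.asyncInvoke("k", entry);                        // step: call user code
        Collection<String> result = entry.future.join();   // demo only: wait here
        fn.close();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If asyncInvoke called slowLookup directly instead of handing it to the pool, the caller would wait for every lookup in turn, which is exactly the blocking behaviour the new thread avoids.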

 

As the diagrams at the beginning showed, ordered mode uses one queue and unordered mode uses two. They correspond respectively to:

OrderedStreamElementQueue class

 

 UnorderedStreamElementQueue class

 

Earlier we skipped two details: first, how the Emitter pulls data in each mode; second, how data is added to the queue (OrderedStreamElementQueue or UnorderedStreamElementQueue) in each mode

Ordered mode:

1. First, ordered mode: how data is added (how the Emitter pulls data comes next)

    Its tryPut() method

      

      

     The onComplete method

       

       The onCompleteHandler method

        

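  This part is a little roundabout. The received element is first added to the queue. Then, in onComplete(), getFuture() returns the CompletableFuture inside the element's wrapper class. When that future completes (which happens when the user's code calls complete), the accept method of the object passed in is invoked asynchronously. accept calls onCompleteHandler(), which checks whether the queue is empty and whether the head element of the queue has finished the user's asynchronous call. If it has, waiters are woken by calling signalAll() on the headIsCompleted object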

 

2. Next, how the Emitter pulls data in ordered mode

       

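   The logic for pulling data in ordered mode is clear. While the queue is empty or the head element has not finished the user's asynchronous call, the thread waits on headIsCompleted (as shown above, signalAll() is called once an element is in the queue and the head element has completed its asynchronous call). Then the head element is returned and sent downstream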

 

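This is how ordered emission is achieved: the Emitter only ever pulls the head element, and only once that head element has finished the user's asynchronous call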
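The ordered mechanism above (tryPut, the completion handler, and a condition named headIsCompleted) can be sketched as follows. This is a simplified illustration rather than Flink's OrderedStreamElementQueue: OrderedQueueSketch, takeHead, and the demo class are names chosen for the sketch, entries are bare CompletableFutures instead of wrapper objects, and peeking and removing the head are merged into one call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch, not Flink's OrderedStreamElementQueue.
class OrderedQueueSketch<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition headIsCompleted = lock.newCondition();
    private final ArrayDeque<CompletableFuture<T>> queue = new ArrayDeque<>();
    private final int capacity;

    OrderedQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    // Append in arrival order and register the completion callback.
    boolean tryPut(CompletableFuture<T> entry) {
        lock.lock();
        try {
            if (queue.size() >= capacity) {
                return false;
            }
            queue.addLast(entry);
            entry.whenComplete((value, error) -> onCompleteHandler());
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Runs whenever any entry completes; wakes the emitter only if the
    // element at the head of the queue is done.
    private void onCompleteHandler() {
        lock.lock();
        try {
            if (!queue.isEmpty() && queue.peek().isDone()) {
                headIsCompleted.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Emitter side: wait until the head is complete, then remove and return it.
    T takeHead() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (queue.isEmpty() || !queue.peek().isDone()) {
                headIsCompleted.await();
            }
            return queue.poll().join();
        } finally {
            lock.unlock();
        }
    }
}

class OrderedQueueDemo {
    static List<String> run() {
        OrderedQueueSketch<String> q = new OrderedQueueSketch<>(10);
        CompletableFuture<String> a = new CompletableFuture<>();
        CompletableFuture<String> b = new CompletableFuture<>();
        CompletableFuture<String> c = new CompletableFuture<>();
        q.tryPut(a);
        q.tryPut(b);
        q.tryPut(c);

        List<String> emitted = new ArrayList<>();
        Thread emitter = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    emitted.add(q.takeHead());   // "send downstream"
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        emitter.setDaemon(true);
        emitter.start();

        // The async work finishes in reverse order of arrival.
        c.complete("c");
        b.complete("b");
        a.complete("a");
        try {
            emitter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run());   // [a, b, c]
    }
}
```

Even though c and b finish first, nothing is emitted until a completes, because the emitter only looks at the head of the queue.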

 

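Unordered mode:

  This is much the same as ordered mode. The difference is that received data is added directly to uncompletedQueue; when an element finishes its asynchronous call, it is moved into completedQueue and signalAll() is called. As long as completedQueue has data, the Emitter pulls it and sends it downstream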

 

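This is how unordered mode is achieved: whichever asynchronous write finishes first goes straight into the completed queue and is sent downstream, regardless of the order in which the data was received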
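A matching sketch of the two-queue idea, again with plain JDK classes. It only models what is described above (an uncompleted collection, a completed queue, and a signal on completion) and leaves out everything else the real UnorderedStreamElementQueue does; UnorderedQueueSketch, hasCompletedEntries, and take are names chosen for the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch, not Flink's UnorderedStreamElementQueue.
class UnorderedQueueSketch<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition hasCompletedEntries = lock.newCondition();
    private final Set<CompletableFuture<T>> uncompleted = new HashSet<>();
    private final ArrayDeque<CompletableFuture<T>> completedQueue = new ArrayDeque<>();

    // Every new entry starts out in the uncompleted collection.
    void put(CompletableFuture<T> entry) {
        lock.lock();
        try {
            uncompleted.add(entry);
            entry.whenComplete((value, error) -> onCompleteHandler(entry));
        } finally {
            lock.unlock();
        }
    }

    // On completion, move the entry to the completed queue and wake the emitter.
    private void onCompleteHandler(CompletableFuture<T> entry) {
        lock.lock();
        try {
            if (uncompleted.remove(entry)) {
                completedQueue.addLast(entry);
                hasCompletedEntries.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Emitter side: take whatever has completed, in completion order.
    T take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (completedQueue.isEmpty()) {
                hasCompletedEntries.await();
            }
            return completedQueue.poll().join();
        } finally {
            lock.unlock();
        }
    }
}

class UnorderedQueueDemo {
    static List<String> run() {
        UnorderedQueueSketch<String> q = new UnorderedQueueSketch<>();
        CompletableFuture<String> a = new CompletableFuture<>();
        CompletableFuture<String> b = new CompletableFuture<>();
        CompletableFuture<String> c = new CompletableFuture<>();
        q.put(a);
        q.put(b);
        q.put(c);

        List<String> emitted = new ArrayList<>();
        Thread emitter = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    emitted.add(q.take());   // "send downstream"
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        emitter.setDaemon(true);
        emitter.start();

        // Same completion order as before: c, then b, then a.
        c.complete("c");
        b.complete("b");
        a.complete("a");
        try {
            emitter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run());   // [c, b, a]
    }
}
```

With the same arrival and completion orders as in the ordered sketch, the output now follows completion order instead of arrival order.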

 


Origin www.cnblogs.com/ljygz/p/11864176.html