Evaluating End-to-End Optimization for Data Analytics Applications in Weld

Reference: see the article on executor optimization techniques as seen from the Weld paper

The problem to solve:

Today's data analytics applications use many libraries, such as NumPy, Pandas, TensorFlow, Spark, etc.

The interfaces and data structures of these libraries differ, so if you want to improve the performance of your application, you can only tune each library one by one, and it is hard to optimize the application as a whole. What is missing is a way to optimize end to end.

Weld sets out to do exactly this. Note that Weld only considers optimizing in-memory execution on a single machine.

The general idea: Weld provides an IR, and each library has to rewrite its operators in terms of that IR. Execution is lazy, so nothing is computed until a result is actually needed.

A library does not execute its operators internally; it returns IR instead. Weld has a runtime that collects the IR from all libraries and forms a combined IR, so at that point it no longer matters which library a fragment came from.

Weld has an optimizer, which optimizes the combined IR end to end and finally generates machine code.

The idea is good, but it rests on the premise that every library is ported to Weld, which is likely to be hard to achieve in practice.

 

Weld IR

The design of the IR is critical. Like relational algebra for relational databases, it is the cornerstone of the whole system.

 

 

The basic elements of the IR are:

Datatype

 

Computation

Operators, which are called builders here

 

Some of these are named builder and some merger; there is no essential difference between them.

All builders support three operations:

merge, which adds data to the builder

result, which returns the builder's final result; calling result destroys the builder, so it can be called only once

for, which applies merges over the input in parallel
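
The semantics of the three operations can be mimicked in plain Python. This is only a sketch of the behaviour described above, not Weld's implementation: the class names `VecBuilder` and `Merger` and the helper `for_loop` are hypothetical, and the loop here runs sequentially, whereas Weld would run it in parallel.

```python
class Builder:
    """Write-only accumulator: values go in via merge, come out once via result."""

    def __init__(self):
        self._finished = False

    def merge(self, value):
        if self._finished:
            raise RuntimeError("builder already consumed by result()")
        self._add(value)
        return self  # returning the builder lets merges be chained

    def result(self):
        if self._finished:
            raise RuntimeError("result() may only be called once")
        self._finished = True
        return self._finish()


class VecBuilder(Builder):
    """Collects merged values into a list (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self._items = []

    def _add(self, value):
        self._items.append(value)

    def _finish(self):
        return self._items


class Merger(Builder):
    """Folds merged values into one scalar with a binary function (hypothetical name)."""

    def __init__(self, func, identity):
        super().__init__()
        self._func = func
        self._acc = identity

    def _add(self, value):
        self._acc = self._func(self._acc, value)

    def _finish(self):
        return self._acc


def for_loop(data, builder, body):
    """Call body(builder, index, element) for every element, then return the builder."""
    for i, x in enumerate(data):
        body(builder, i, x)
    return builder


squares = for_loop([1, 2, 3], VecBuilder(), lambda b, i, x: b.merge(x * x)).result()
total = for_loop([1, 2, 3], Merger(lambda a, b: a + b, 0), lambda b, i, x: b.merge(x)).result()
print(squares, total)
```

The loop knows nothing about what is being built, and the builder knows nothing about how the data is traversed. That separation is what the fusion discussion later relies on.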

As you can see, the IR is really very simple.

Here is one example:

 

 

 

The benefit of the Weld IR is that it can express the operators of the libraries mentioned earlier. Fusion comes up here: because the loop is separated from the builder, loops are easier to fuse.

Look at an example:

A map operator contains its own loop, so calling map twice means looping twice.

With the Weld IR, a single for pass is enough, so there is only one loop.
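
The difference between the two cases can be sketched in Python. This only illustrates the idea; the function names are made up for the example, and Weld performs the rewrite on its IR rather than on Python code.

```python
def map_unfused(data, f, g):
    """Two separate map calls: two loops and one intermediate list."""
    tmp = [f(x) for x in data]   # loop 1 materializes the intermediate result
    return [g(y) for y in tmp]   # loop 2 reads it back


def map_fused(data, f, g):
    """One loop: each element flows through f and then g and is stored once."""
    out = []
    for x in data:
        out.append(g(f(x)))
    return out


def inc(x):
    return x + 1


def double(x):
    return x * 2


print(map_unfused([1, 2, 3], inc, double))
print(map_fused([1, 2, 3], inc, double))
```

Both versions return the same list; the fused one traverses the input once and never allocates `tmp`.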

 

 

Weld Runtime

The emphasis is on laziness: each library submits the IR it wants executed through the Runtime API instead of running it immediately.

 

Listed below are the Runtime API and an example of its use.

Execution really happens only when Evaluate is called.
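
The lazy behaviour can be imitated with a small Python sketch. Everything in it is an assumption made for illustration: `LazyObject`, `lib_a_scale` and `lib_b_sum` are hypothetical names, and the deferred work is stored as Python callables that are simply interpreted, whereas Weld collects IR and compiles it.

```python
executed = []  # records which operations actually ran, for demonstration only


class LazyObject:
    """Wraps either in-memory data or a deferred computation over other objects."""

    def __init__(self, data=None, deps=(), op=None, name="data"):
        self.data = data
        self.deps = deps
        self.op = op
        self.name = name

    def evaluate(self):
        if self.op is None:
            return self.data          # plain data, nothing to compute
        args = [d.evaluate() for d in self.deps]
        executed.append(self.name)
        return self.op(*args)


# "Library A": an element-wise operator that returns a lazy object
def lib_a_scale(v, k):
    return LazyObject(deps=(v,), op=lambda xs: [x * k for x in xs], name="scale")


# "Library B": a reduction that also returns a lazy object
def lib_b_sum(v):
    return LazyObject(deps=(v,), op=sum, name="sum")


source = LazyObject(data=[1, 2, 3])
pending = lib_b_sum(lib_a_scale(source, 10))
ran_before = list(executed)   # snapshot: nothing has executed yet
value = pending.evaluate()    # the whole dependency graph runs here
print(ran_before, value, executed)
```

Because the calls into both "libraries" only build up a dependency graph, whatever evaluates the graph sees the whole workflow at once, which is the point at which cross-library optimization becomes possible.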

 

 

Weld Optimizer

As we can see, the optimizer closely resembles a database optimizer.

It is divided into rule-based optimization and adaptive optimization.

 

 

Rule-based optimization

The main one is fusion, the term that comes up most often in execution optimization.

 

There are two kinds of fusion. One is pipelining: after pipelining, the intermediate result v1 no longer needs to be produced.

The other is horizontal fusion: loops over the same input that produce different outputs are merged into one.
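
Horizontal fusion can be sketched in Python as follows. The function names are made up for the example, and a non-empty input is assumed so that `data[0]` is valid.

```python
def stats_unfused(data):
    """Two independent loops over the same input."""
    total = 0
    for x in data:          # pass 1 computes the sum
        total += x
    largest = data[0]
    for x in data:          # pass 2 computes the maximum
        if x > largest:
            largest = x
    return total, largest


def stats_fused(data):
    """One loop feeds both accumulators."""
    total, largest = 0, data[0]
    for x in data:
        total += x
        if x > largest:
            largest = x
    return total, largest


print(stats_unfused([3, 7, 2]), stats_fused([3, 7, 2]))
```

Unlike pipelining, no intermediate result is eliminated here; the gain is that the input is read once instead of twice.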

 

 

Another crucial optimization is vectorization.

 

 

 

Adaptive optimization is somewhat similar to cost-based optimization (CBO), but in practice there is not much statistical data available here to base decisions on.

Two techniques are mentioned here: adaptive predication and adaptive data structures.
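
To give a feel for adaptive predication, here is a rough Python sketch of a filtered sum written in two ways, with a sampled selectivity deciding between them. The decision rule, the cutoffs `low` and `high`, the sample size and all function names are assumptions made for illustration, not Weld's actual policy. Python itself gains nothing from avoiding branches, so the sketch shows only the structure of the choice, not the speedup.

```python
import random


def sum_branching(data, threshold):
    """Branch per element: skips the addition when the predicate is false."""
    total = 0
    for x in data:
        if x > threshold:
            total += x
    return total


def sum_predicated(data, threshold):
    """No branch in the loop body: always adds, multiplying by 0 or 1."""
    total = 0
    for x in data:
        total += x * (x > threshold)   # a Python bool acts as 0 or 1
    return total


def estimate_selectivity(data, threshold, sample_size=64, seed=0):
    """Estimate the fraction of elements passing the predicate from a random sample."""
    rng = random.Random(seed)
    n = min(sample_size, len(data))
    sample = [data[rng.randrange(len(data))] for _ in range(n)]
    return sum(1 for x in sample if x > threshold) / n


def adaptive_sum(data, threshold, low=0.1, high=0.9):
    """Pick a variant at run time (hypothetical rule and cutoffs)."""
    if not data:
        return 0
    s = estimate_selectivity(data, threshold)
    # Assumed rule: a branch that almost always goes the same way is cheap to
    # predict, so branch; otherwise fall back to the predicated version.
    if s < low or s > high:
        return sum_branching(data, threshold)
    return sum_predicated(data, threshold)


values = list(range(100))
print(adaptive_sum(values, 49), adaptive_sum(values, 1000))
```

What makes this adaptive rather than cost-based is that the choice uses a measurement taken at run time instead of statistics known in advance.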

 

 

 

 

 

 

 

 


Origin www.cnblogs.com/fxjwind/p/12511081.html