A comprehensive interpretation of Databricks' Delta Engine

On the first day of this year's Spark + AI Summit, Databricks announced the Delta Engine, a 100% Apache Spark-compatible vectorized query engine that takes advantage of modern CPU architecture, together with the improved query optimizer and caching of Spark 3.0. These features significantly improve Delta Lake's query performance. For now, the engine is only available in Databricks Runtime 7.0.

The purpose of developing Delta Engine

Over the past ten years, storage speed has increased from about 50 MB/s (HDD) to 16 GB/s (NVMe); network speed has increased from 1 Gbps to 100 Gbps; but CPU clock frequency has stayed essentially flat at around 3 GHz since 2010.
[Figure: hardware trends over the past decade: storage and network throughput have grown by orders of magnitude while CPU clock frequency has stagnated]

NVM Express (NVMe), in full the Non-Volatile Memory Host Controller Interface Specification (NVMHCIS), is a logical device interface specification. Like AHCI, it is a bus transport protocol at the device logical-interface level (roughly the application layer of a communication protocol), used to access non-volatile storage media attached via the PCI Express (PCIe) bus, such as solid-state drives based on flash memory, although in theory the PCIe bus protocol is not strictly required. Historically, most SSDs connected to the computer's interface bus through SATA, SAS, or Fibre Channel. As solid-state drives became popular in the mass market, SATA became the most common way to attach SSDs in personal computers; however, SATA was designed primarily as an interface for mechanical hard disk drives (HDDs), and over time it became harder and harder for it to keep up with ever-faster SSDs. As a result, the data-rate growth of many mass-market SSDs has slowed: unlike mechanical drives, some SSDs are already limited by SATA's maximum throughput.
Before NVMe, high-end SSDs could only be built on the PCI Express bus using non-standard interfaces. With a standardized SSD interface, an operating system needs only one driver to work with every SSD that conforms to the specification, and SSD manufacturers no longer have to spend extra resources writing drivers for their own interfaces.

Excerpted from https://zh.wikipedia.org/zh-hans/NVM_Express

As the figure above shows, CPU frequency has become a major bottleneck for data analysis.

In addition, as the pace of business increases, data teams have less and less time to model data properly. Trading careful modeling for business agility leads to poor query performance. For example:

• Most columns are not declared NOT NULL;
• Strings are convenient, so many people use strings to store dates;
• Data keeps arriving and is becoming more and more irregular.

Delta Engine: High-performance query engine

To address these problems, Databricks built the Delta Engine, designed to accelerate data analysis and adapt flexibly to a variety of workloads. As the figure below shows, Delta Engine consists of three components: an improved query optimizer, a caching layer that sits between the execution layer and cloud object storage, and a native vectorized execution engine written in C++ (Photon). Together they accelerate SQL and DataFrame workloads against Delta Lake.

[Figure: Delta Engine architecture: improved query optimizer, caching layer, and the Photon native vectorized execution engine]

Delta Engine's query optimizer extends the functionality already in Spark 3.0, including the cost-based optimizer (CBO), adaptive query execution, and dynamic runtime filters, with more advanced statistics, delivering up to 18x performance improvement on star schema workloads.

Delta Engine's caching layer automatically chooses which input data to cache for the user, transcoding it into a CPU-efficient format to take advantage of the higher storage throughput of NVMe SSDs. This delivers up to 5x faster scan performance for almost all workloads. Notably, building a cache into the compute engine appears in many products, such as Snowflake's data warehouse (see the paper "The Snowflake Elastic Data Warehouse"); in China, Alibaba Cloud is moving in the same direction with products such as Data Lake Analytics (DLA).

Delta Engine's biggest innovation for the challenges data teams face is its native execution engine, called Photon. The engine has been rewritten from scratch to get the most out of modern cloud hardware, and it brings performance improvements to all workload types. Importantly, it remains fully compatible with the open-source Spark API.

Photon: Native vectorized execution engine

Photon, the most important component of Delta Engine, is implemented entirely in C++. It gains its computing power from data-level parallelism and instruction-level parallelism, greatly speeding up Spark SQL queries on Delta Engine, and it optimizes structured and unstructured workloads to varying degrees.

Although CPU clock frequency has not changed much in years, parallelism has improved considerably, mainly in two forms: data-level parallelism and instruction-level parallelism.
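
To make data-level parallelism concrete, here is a minimal sketch of my own (not Photon's actual code): a tight loop with no branches and no dependencies between iterations, which an optimizing compiler can turn into SIMD instructions that process several values per instruction.


#include <cstdint>
#include <cstddef>

// Hypothetical example of a data-parallel aggregation loop.
// Because each iteration is independent, a compiler building with
// optimizations (e.g. -O3) can emit SIMD additions that handle
// 4 to 16 values per instruction instead of one.
int64_t sumColumn(const int32_t* valueCol, size_t batchSize) {
    int64_t sum = 0;
    for (size_t i = 0; i < batchSize; ++i) {
        sum += valueCol[i];
    }
    return sum;
}

Instruction-level parallelism, by contrast, is about keeping many independent instructions (especially memory accesses) in flight at once, which is what the next example illustrates.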

Take the query select sum(value) from table group by key. A straightforward implementation of this query looks like the following code:


for (int32_t i = 0; i < batchSize; ++i) {
    // hash, probe, compare, and add are interleaved for every row
    int32_t bucket = hash(keyCol[i]) % htSize;
    if (ht[bucket].key == keyCol[i]) {
        ht[bucket].value += valueCol[i];
    }
}

In the code above, the memory access (ht[bucket]), the hash computation (hash(keyCol[i]) % htSize), the key comparison (ht[bucket].key == keyCol[i]), and the addition (ht[bucket].value += valueCol[i]) are all mixed in one large loop body. With so much other work between consecutive memory accesses, the CPU can keep only a few loads in flight at a time. The fix is to break the large loop body into several smaller ones, as follows:


// Pass 1: compute every bucket index (pure computation, SIMD-friendly)
for (int32_t i = 0; i < batchSize; ++i) {
    buckets[i] = hash(keyCol[i]) % htSize;
}

// Pass 2: gather the stored keys; the loop contains almost nothing
// but loads, so the CPU can issue many of them in parallel
for (int32_t i = 0; i < batchSize; ++i) {
    keys[i] = ht[buckets[i]].key;
}

// Pass 3: compare and aggregate
for (int32_t i = 0; i < batchSize; ++i) {
    if (keys[i] == keyCol[i]) {
        ht[buckets[i]].value += valueCol[i];
    }
}

After this instruction-level restructuring, Delta Engine with Photon shows large performance improvements over the traditional engine.

[Figure: performance improvement of Delta Engine with Photon over the traditional engine]

In tests on a 30 TB TPC-DS dataset, performance improved by 3.3x.

[Figure: TPC-DS 30 TB benchmark results]

String optimization

Photon also puts considerable work into string processing. For example, with the execution engine implemented in C++, the string functions UPPER and SUBSTRING run dramatically faster than their JVM implementations, as shown below:

[Figure: UPPER and SUBSTRING performance, C++ implementation vs. JVM implementation]
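
The size of that gain is easier to see with a sketch (my own illustration, not Photon's actual implementation): in native code, UPPER over an ASCII byte buffer is a tight loop over raw bytes, with none of the object headers, bounds checks, or garbage-collection overhead that a JVM String implementation carries.


#include <cstddef>

// Hypothetical sketch of a native UPPER over an ASCII buffer.
// ASCII letters 'a'..'z' differ from 'A'..'Z' only in bit 0x20,
// so uppercasing is a simple branch-light byte loop.
void upperAscii(const char* src, char* dst, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        char c = src[i];
        dst[i] = (c >= 'a' && c <= 'z') ? static_cast<char>(c & ~0x20) : c;
    }
}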

Although the C++ implementation already beats the JVM on strings, the Databricks team optimized further, combining UTF-8 variable-length encoding with an ASCII fixed-length fast path to speed up the Photon engine's string operations even more:

[Figure: further string performance gains from the combined UTF-8/ASCII encoding optimization]
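
A plausible form of this optimization (a sketch under my own assumptions, not Databricks' published code) is to first check whether a string is pure ASCII: if it is, every character occupies exactly one byte, so functions such as UPPER and SUBSTRING can use a fixed-length fast path like the one above, and only strings containing multi-byte characters pay for full UTF-8 decoding.


#include <cstdint>
#include <cstddef>

// Hypothetical sketch: in UTF-8, any byte with the high bit set
// belongs to a multi-byte sequence, so a string is pure ASCII
// exactly when every byte is below 0x80.
bool isAscii(const uint8_t* data, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        if (data[i] & 0x80) {
            return false;   // found a multi-byte UTF-8 sequence
        }
    }
    return true;
}

An engine can then dispatch per string: if isAscii(buf, len) holds, take the one-byte-per-character path; otherwise fall back to a general UTF-8 routine.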

Reference links

1. https://databricks.com/blog/2020/06/24/introducing-delta-engine.html
2. https://zh.wikipedia.org/zh-hans/NVM_Express
3. https://www.iteblog.com/archives/9833.html
