【An Introduction to Apache ORC】

Apache ORC is the smallest, fastest columnar storage for Hadoop workloads.

The Apache ORC file format is a columnar storage format for the Hadoop ecosystem. It dates back to early 2013 and originated in Apache Hive, where it was created to reduce Hadoop storage footprint and speed up Hive queries.


ACID Support

Includes support for ACID transactions and snapshot isolation.

Built-in Indexes

Jump to the right row with indexes including minimum, maximum, and bloom filters for each column.

Complex Types

Supports all of Hive's types including the compound types: structs, lists, maps, and unions.

ORC (Optimized RC File) evolved from the RC (Record Columnar File) format. RC is a columnar storage format with poor support for schema evolution (changing the schema requires regenerating the data). ORC improves on RC, but its schema-evolution support is still weak; the improvements are mainly in compression encoding and query performance. RC/ORC was first used inside Hive, gathered momentum, and eventually spun off into a separate project. Hive 1.x's support for transactions and update operations is built on ORC (the other storage formats do not support it yet). Today ORC offers some quite advanced features, such as update operations, ACID support, and complex types like struct and array. With these complex types you can build nested data structures similar to Parquet's, but when there are many levels of nesting this becomes tedious and verbose to write, whereas Parquet's schema language expresses deeply nested data types more naturally.
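To make the compound types concrete, here is a minimal sketch using the ORC Java API's org.apache.orc.TypeDescription; the field names are invented purely for illustration:

import org.apache.orc.TypeDescription;

public class ComplexSchemaExample {
    public static void main(String[] args) {
        // A nested schema combining struct, array (list), and map; uniontype<...>
        // is accepted by the same type-string syntax but is omitted here.
        TypeDescription schema = TypeDescription.fromString(
                "struct<id:int,"
              + "tags:array<string>,"
              + "attrs:map<string,string>,"
              + "address:struct<city:string,zip:int>>");
        System.out.println(schema); // prints the canonical type string back
    }
}

Each additional level of nesting is just another struct<...> inside the type string, which is exactly the verbosity the paragraph above complains about once the hierarchy gets deep.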

ACID support

Historically, the only way to atomically add data to a table in Hive was to add a new partition. Updating or deleting data in a partition required removing the old partition and adding it back with the new data, and it wasn't possible to do that atomically.

However, users' data is continually changing, and as Hive matured, users required reliability guarantees despite the churning data lake. Thus, we needed to implement ACID transactions that guarantee atomicity, consistency, isolation, and durability. Although we support ACID transactions, they are not designed to support OLTP requirements: a single transaction can update millions of rows, but the system cannot support millions of transactions an hour.

Additionally, we wanted to support streaming ingest into Hive tables, where streaming applications like Flume or Storm could write data into Hive with transactions committing roughly once a minute, and queries would see either all of a transaction or none of it.

HDFS is a write-once file system and ORC is a write-once file format, so edits are implemented using base files and delta files in which insert, update, and delete operations are recorded.
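As a rough illustration of how this looks from the user's side, a transactional ORC table can be created and modified through Hive's JDBC interface. This is only a hedged sketch: the HiveServer2 URL, table name, and bucket count are assumptions, the Hive JDBC driver must be on the classpath, and the server has to be configured with a transaction manager and compactor for ACID to work.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveAcidExample {
    public static void main(String[] args) throws Exception {
        // Placeholder HiveServer2 endpoint; adjust to your cluster.
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            // ACID tables must be stored as ORC and marked transactional
            // (and, in Hive 1.x, bucketed).
            stmt.execute("CREATE TABLE people (name string, age int) "
                       + "CLUSTERED BY (name) INTO 4 BUCKETS "
                       + "STORED AS ORC TBLPROPERTIES ('transactional'='true')");
            stmt.execute("INSERT INTO people VALUES ('Alice', 101)");
            // The UPDATE does not rewrite the base files in place; it is recorded
            // in a delta file that the reader merges with the base at query time.
            stmt.execute("UPDATE people SET age = 102 WHERE name = 'Alice'");
        }
    }
}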

Indexes

ORC provides three levels of indexes within each file:

file level - statistics about the values in each column across the entire file

stripe level - statistics about the values in each column for each stripe

row level - statistics about the values in each column for each set of 10,000 rows within a stripe

The file and stripe level column statistics are in the file footer so that they are easy to access to determine if the rest of the file needs to be read at all. Row level indexes include both the column statistics for each row group and the position for seeking to the start of the row group.
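As a small, hedged example (the file name is a placeholder), the file-level statistics can be read from the footer with the ORC Java API without scanning any of the data:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.ColumnStatistics;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;

public class OrcFooterStats {
    public static void main(String[] args) throws Exception {
        Reader reader = OrcFile.createReader(new Path("people.orc"),
                OrcFile.readerOptions(new Configuration()));
        System.out.println("rows in file: " + reader.getNumberOfRows());
        // One ColumnStatistics entry per column; entry 0 is the root struct.
        ColumnStatistics[] stats = reader.getStatistics();
        for (int i = 0; i < stats.length; i++) {
            System.out.println("column " + i + ": " + stats[i]);
        }
    }
}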

Column statistics always contain the count of values and whether null values are present. Most primitive types also include the minimum and maximum values, and numeric types additionally include the sum. As of Hive 1.2, the indexes can include bloom filters, which provide a much more selective filter.
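Both the row-group size and the bloom filters are chosen when the file is written. The sketch below shows the relevant writer options from the ORC Java API; the schema, file name, and column choice are assumptions for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class OrcIndexOptions {
    public static void main(String[] args) throws Exception {
        TypeDescription schema = TypeDescription.fromString("struct<name:string,age:int>");
        Writer writer = OrcFile.createWriter(new Path("people.orc"),
                OrcFile.writerOptions(new Configuration())
                        .setSchema(schema)
                        .rowIndexStride(10000)       // rows per row group (row-level index granularity)
                        .bloomFilterColumns("name")  // also build a bloom filter index on "name"
                        .bloomFilterFpp(0.05));      // target false-positive probability
        // ... add VectorizedRowBatch data here before closing ...
        writer.close();
    }
}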

The reader uses the indexes at all levels via Search ARGuments, or SARGs, which are simplified expressions that restrict the rows of interest. For example, if a query were looking for people older than 100 years, the SARG would be "age > 100", and only files, stripes, or row groups that could contain people over 100 years old would be read.
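A hedged sketch of pushing that same predicate down through the ORC Java API follows; the file name and schema are placeholders, and the exact layout of the column-name array passed to searchArgument is a simplifying assumption rather than a canonical recipe. The pruning happens at file/stripe/row-group granularity, so surviving batches may still need a final row-by-row filter on the query side.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.RecordReader;

public class SargScan {
    public static void main(String[] args) throws Exception {
        Reader reader = OrcFile.createReader(new Path("people.orc"),
                OrcFile.readerOptions(new Configuration()));
        // "age > 100" expressed as NOT (age <= 100), using the builder's lessThanEquals.
        SearchArgument sarg = SearchArgumentFactory.newBuilder()
                .startNot()
                .lessThanEquals("age", PredicateLeaf.Type.LONG, 100L)
                .end()
                .build();
        RecordReader rows = reader.rows(reader.options()
                .searchArgument(sarg, new String[]{"age"}));
        VectorizedRowBatch batch = reader.getSchema().createRowBatch();
        long selected = 0;
        while (rows.nextBatch(batch)) {
            selected += batch.size; // rows that survived index-based pruning
        }
        rows.close();
        System.out.println("rows surviving index pruning: " + selected);
    }
}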


Reposted from gaojingsong.iteye.com/blog/2373693