In-depth analysis of Delta Lake: the transaction log in detail

The transaction log is the key to understanding Delta Lake. Many of Delta Lake's most important features are built on it, including ACID transactions, scalable metadata handling, and time travel. This article explores what the transaction log is, how it works at the file level, and how it elegantly solves the problem of concurrent reads and writes.

What is a transaction log?

The Delta Lake transaction log (DeltaLog for short) is an ordered record of every transaction that has been performed on a Delta Lake table since the table was created.
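Because the log is an ordered record, the state of a table can be reconstructed by replaying its commits in sequence. The following is a minimal sketch of that idea, not Delta Lake's actual implementation. The `add` and `remove` action names follow the Delta protocol, but the in-memory `commits` dictionary, the file names, and the `replay` function are hypothetical stand-ins for the real log files.

```python
from typing import Dict, List, Optional, Set

def replay(commits: Dict[int, List[dict]], up_to: Optional[int] = None) -> Set[str]:
    """Replay commits in version order and return the set of live data files.

    `commits` maps a version number to the list of actions in that commit.
    If `up_to` is given, replay stops after that version (a simplified
    form of time travel).
    """
    live_files: Set[str] = set()
    for version in sorted(commits):
        if up_to is not None and version > up_to:
            break
        for action in commits[version]:
            if "add" in action:
                # The commit added a data file to the table.
                live_files.add(action["add"]["path"])
            elif "remove" in action:
                # The commit logically removed a data file from the table.
                live_files.discard(action["remove"]["path"])
    return live_files

# Hypothetical log: version 0 writes two files, version 1 rewrites one of them.
commits = {
    0: [{"add": {"path": "part-0001.parquet"}},
        {"add": {"path": "part-0002.parquet"}}],
    1: [{"remove": {"path": "part-0001.parquet"}},
        {"add": {"path": "part-0003.parquet"}}],
}

latest = replay(commits)               # files visible at the latest version
as_of_v0 = replay(commits, up_to=0)    # files visible as of version 0
```

Replaying the same ordered log always yields the same result, which is why every reader can arrive at a consistent view of the table, and why stopping the replay early gives the table as it was at an earlier version.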

What is the role of the transaction log?

Single source of truth

Delta Lake is built on Apache Spark and allows multiple users to read and write the same table at the same time. The transaction log acts as the single source of truth: it tracks every change that users make to the table, so that each user sees a correct view of the table at any time.
When a user accesses a Delta Lake table for the first time, or submits a new query against an open table but the table

Origin yq.aliyun.com/articles/718093