Data warehouse fact table classification

1) In data warehousing, the first type of fact table is the transaction fact table.

The transaction fact table is one of the three basic types of fact table in a dimensionally modeled data warehouse. The other two are the periodic snapshot fact table and the accumulating snapshot fact table.

Transaction fact tables use the same conformed dimensions as periodic snapshot and accumulating snapshot fact tables, but they describe business facts very differently.

A transaction fact table records transaction-level facts and stores the most atomic data, so it is also known as an "atomic fact table". Its rows are generated after a transaction event occurs, and the grain is usually one row per transaction. Once a transaction is committed and its row is inserted into the fact table, the row is no longer changed; the table is loaded incrementally (insert-only).
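The insert-only, one-row-per-transaction behavior described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a reference design: the table name `fact_sales_transaction`, its columns, and the helper `load_increment` are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per sales transaction (the atomic grain).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fact_sales_transaction (
        transaction_id INTEGER PRIMARY KEY,
        date_key       INTEGER NOT NULL,  -- date the transaction occurred
        product_key    INTEGER NOT NULL,
        customer_key   INTEGER NOT NULL,
        quantity       INTEGER NOT NULL,
        amount_cents   INTEGER NOT NULL
    )
    """
)


def load_increment(rows):
    """Append newly committed transactions; existing rows are never updated."""
    conn.executemany(
        "INSERT INTO fact_sales_transaction VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


# Two incremental loads, each adding only new transactions.
load_increment([(1, 20240105, 10, 100, 2, 1998), (2, 20240105, 11, 101, 1, 500)])
load_increment([(3, 20240106, 10, 100, 1, 999)])

row_count = conn.execute(
    "SELECT COUNT(*) FROM fact_sales_transaction"
).fetchone()[0]
print(row_count)
```

Each load only appends rows, so the table grows with the number of transactions and earlier rows keep the values they were loaded with.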

The date dimension of a transaction fact table records the date on which the transaction occurred, and the facts it records describe the transaction activity itself. Users can analyze transaction behavior in fine detail through the transaction fact table.

Aggregate fact tables can also be built on top of transaction fact tables to give users faster analytical queries.
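As a rough sketch of building an aggregate fact table from a transaction fact table, the following Python example rolls transactions up to one row per month and product. The table names, columns, and the `YYYYMMDD` integer date key are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical transaction fact table with a YYYYMMDD integer date key.
conn.execute(
    """
    CREATE TABLE fact_sales_transaction (
        transaction_id INTEGER PRIMARY KEY,
        date_key       INTEGER NOT NULL,
        product_key    INTEGER NOT NULL,
        quantity       INTEGER NOT NULL,
        amount_cents   INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO fact_sales_transaction VALUES (?, ?, ?, ?, ?)",
    [
        (1, 20240105, 10, 2, 1998),
        (2, 20240120, 10, 1, 999),
        (3, 20240203, 10, 3, 2997),
        (4, 20240110, 11, 1, 500),
    ],
)

# Roll the atomic rows up to one row per (month, product).
# Integer division by 100 turns 20240105 into the month key 202401.
conn.execute(
    """
    CREATE TABLE agg_sales_by_product_month AS
    SELECT date_key / 100    AS month_key,
           product_key,
           SUM(quantity)     AS total_quantity,
           SUM(amount_cents) AS total_amount_cents,
           COUNT(*)          AS transaction_count
    FROM fact_sales_transaction
    GROUP BY date_key / 100, product_key
    """
)

agg_rows = conn.execute(
    "SELECT * FROM agg_sales_by_product_month ORDER BY month_key, product_key"
).fetchall()
for row in agg_rows:
    print(row)
```

Queries that only need monthly totals can read the much smaller aggregate table, while the transaction fact table remains available for detailed analysis.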

2) In data warehousing, the second type of fact table is the periodic snapshot fact table.

Periodic snapshot fact tables record facts at regular, predictable intervals, such as daily, monthly, or yearly. Typical examples are a daily sales snapshot table and a daily inventory snapshot table.

The grain of a periodic snapshot fact table is one row per time period, which is usually coarser than the grain of a transaction fact table. It is typically an aggregate table built on the transaction fact table. A periodic snapshot fact table usually has fewer dimensions than the transaction fact table but records more facts.

The date dimension of a periodic snapshot fact table is usually the end date of the recorded period, and the recorded facts are values aggregated over that period. Rows are not changed once inserted, and the table is loaded incrementally.
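A daily inventory snapshot is one way to illustrate these properties. The sketch below assumes a hypothetical movement-level transaction table and appends one snapshot row per product per day, keyed by the end date of the period. All table names, column names, and the helper `take_daily_snapshot` are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical transaction fact table: one row per stock movement.
    CREATE TABLE fact_inventory_movement (
        movement_id     INTEGER PRIMARY KEY,
        date_key        INTEGER NOT NULL,
        product_key     INTEGER NOT NULL,
        quantity_change INTEGER NOT NULL  -- positive = receipt, negative = shipment
    );
    -- Periodic snapshot: one row per product per day.
    CREATE TABLE fact_inventory_daily_snapshot (
        snapshot_date_key INTEGER NOT NULL,  -- end date of the period
        product_key       INTEGER NOT NULL,
        quantity_received INTEGER NOT NULL,
        quantity_shipped  INTEGER NOT NULL,
        quantity_on_hand  INTEGER NOT NULL,
        PRIMARY KEY (snapshot_date_key, product_key)
    );
    """
)
conn.executemany(
    "INSERT INTO fact_inventory_movement VALUES (?, ?, ?, ?)",
    [
        (1, 20240105, 10, 50),
        (2, 20240105, 10, -8),
        (3, 20240106, 10, -5),
        (4, 20240106, 11, 20),
    ],
)


def take_daily_snapshot(date_key):
    """Append one snapshot row per product for the given day (insert-only)."""
    conn.execute(
        """
        INSERT INTO fact_inventory_daily_snapshot
        SELECT :d,
               product_key,
               SUM(CASE WHEN date_key = :d AND quantity_change > 0
                        THEN quantity_change ELSE 0 END),
               SUM(CASE WHEN date_key = :d AND quantity_change < 0
                        THEN -quantity_change ELSE 0 END),
               SUM(quantity_change)
        FROM fact_inventory_movement
        WHERE date_key <= :d
        GROUP BY product_key
        """,
        {"d": date_key},
    )
    conn.commit()


take_daily_snapshot(20240105)
take_daily_snapshot(20240106)

snapshot_rows = conn.execute(
    "SELECT * FROM fact_inventory_daily_snapshot "
    "ORDER BY snapshot_date_key, product_key"
).fetchall()
for row in snapshot_rows:
    print(row)
```

In this sketch the snapshot drops the movement-level detail but carries several facts per row (received, shipped, on hand), and each day's rows are appended without touching earlier snapshots.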

3) In data warehousing, the third type of fact table is the accumulating snapshot fact table (also called the cumulative snapshot fact table).

Accumulating snapshot fact tables and periodic snapshot fact tables are somewhat similar: both store snapshot information about transaction data. But there is also a big difference between them. A periodic snapshot fact table records data for a fixed, predetermined period, while an accumulating snapshot fact table records data for a period of indeterminate length.

An accumulating snapshot fact table represents a time span that completely covers the life cycle of a transaction or product, and it typically has multiple date fields that record the key milestones in that life cycle. It also has an additional date field indicating the date of the last update. Because many of these dates are not known when a row is first loaded, a surrogate key must be used to represent undefined dates, and rows in this kind of fact table are updated after loading to fill in the dates as they become known.

For example, an accumulating snapshot fact table for orders might have the following columns:

Order Date | Scheduled Delivery Date | Actual Shipping Date | Actual Delivery Date | Quantity | Amount | Freight
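The following Python sketch shows one way such a table could be loaded and then updated as milestones occur. It is an illustration under stated assumptions: the table `fact_order_fulfillment`, the `last_update_date_key` column name, the helper `record_milestone`, and the choice of `0` as the surrogate key for a date that is not yet known are all hypothetical.

```python
import sqlite3

UNKNOWN_DATE_KEY = 0  # hypothetical surrogate key meaning "date not yet known"

# Milestone columns that may be filled in after the initial load.
MILESTONE_COLUMNS = {"actual_ship_date_key", "actual_delivery_date_key"}

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fact_order_fulfillment (
        order_id                    INTEGER PRIMARY KEY,
        order_date_key              INTEGER NOT NULL,
        scheduled_delivery_date_key INTEGER NOT NULL,
        actual_ship_date_key        INTEGER NOT NULL,
        actual_delivery_date_key    INTEGER NOT NULL,
        last_update_date_key        INTEGER NOT NULL,
        quantity                    INTEGER NOT NULL,
        amount_cents                INTEGER NOT NULL,
        freight_cents               INTEGER NOT NULL
    )
    """
)

# Initial load: shipping and delivery dates are unknown when the order is placed.
conn.execute(
    "INSERT INTO fact_order_fulfillment VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 20240105, 20240110, UNKNOWN_DATE_KEY, UNKNOWN_DATE_KEY,
     20240105, 2, 1998, 500),
)


def record_milestone(order_id, column, date_key):
    """Update the existing row in place when a milestone date becomes known."""
    if column not in MILESTONE_COLUMNS:
        raise ValueError(f"not a milestone column: {column}")
    conn.execute(
        f"UPDATE fact_order_fulfillment "
        f"SET {column} = ?, last_update_date_key = ? WHERE order_id = ?",
        (date_key, date_key, order_id),
    )
    conn.commit()


# The order ships two days later; the same row is updated, not re-inserted.
record_milestone(1, "actual_ship_date_key", 20240107)

order_row = conn.execute(
    """
    SELECT order_date_key, scheduled_delivery_date_key, actual_ship_date_key,
           actual_delivery_date_key, last_update_date_key
    FROM fact_order_fulfillment WHERE order_id = 1
    """
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM fact_order_fulfillment").fetchone()[0]
print(order_row, row_count)
```

The order keeps a single row for its whole life cycle. The delivery date still holds the placeholder key until that milestone is recorded, which is the main contrast with the insert-only loading of the other two fact table types.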





Comparison of differences:

According to Kimball's data warehouse theory, fact tables are divided into three types: transaction fact tables, periodic snapshot fact tables, and accumulating snapshot fact tables. The differences between these types of fact table are as follows.

| Feature | Transaction fact table | Periodic snapshot fact table | Accumulating snapshot fact table |
| --- | --- | --- | --- |
| Time/period | Point in time | Period | Multiple points in time within a short time span |
| Grain | Each row represents a transaction event | Each row represents a time period | Each row represents a business cycle |
| Fact table loading | Insert | Insert | Insert and update |
| Fact table update | Not updated | Not updated | Updated when new events occur |
| Time dimension | Business date | End of period | Completion dates of multiple business processes |
| Facts | Transaction activity | Performance over the period | Performance across multiple business phases |
