Introduction to Data Warehouse Modeling

Dimensional modeling

Dimensional modeling builds the data warehouse around fact tables and dimension tables.

Modeling is generally done at the DWD layer of the data warehouse.

There are three variants: the star model, the snowflake model, and the constellation model.

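Star model: a fact table can be associated with multiple dimension tables, and the dimension tables have no relationships with each other.
Snowflake model: a fact table is associated with dimension tables, and those dimension tables are in turn associated with further dimension tables, so the dimension tables are related to each other.
Constellation model: multiple fact tables are associated with the same dimension tables.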
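As a minimal sketch of the star model, the example below uses Python's built-in sqlite3 module to create one fact table joined directly to two dimension tables that do not reference each other. The table names, column names, and sample rows (fact_order, dim_product, dim_date) are assumptions made up for illustration; they do not come from any particular warehouse.

```python
import sqlite3

# In-memory database; all table names and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT          -- kept in the dimension itself (star, not snowflake)
);
CREATE TABLE dim_date (
    date_id       INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE fact_order (
    order_id   INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES
    (1, 'Laptop', 'Electronics'),
    (2, 'Phone',  'Electronics'),
    (3, 'Desk',   'Furniture');
INSERT INTO dim_date VALUES
    (20210301, '2021-03-01', '2021-03'),
    (20210401, '2021-04-01', '2021-04');
INSERT INTO fact_order VALUES
    (1, 1, 20210301, 1000.0),
    (2, 2, 20210301,  500.0),
    (3, 3, 20210301,  200.0),
    (4, 1, 20210401, 1000.0);
""")

# The fact table joins to each dimension directly; dimensions never join each other.
rows = conn.execute("""
    SELECT p.category_name, d.month, SUM(f.amount) AS total_amount
    FROM fact_order AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date    AS d ON d.date_id    = f.date_id
    GROUP BY p.category_name, d.month
    ORDER BY p.category_name, d.month
""").fetchall()

for row in rows:
    print(row)
```

In a snowflake variant of the same sketch, category_name would move into its own table referenced from dim_product. In a constellation variant, a second fact table (for example, refunds) would reuse dim_product and dim_date.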

Typical representative: star model

Benefits:

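The main benefit of the star model is better query efficiency. The fact table has already been preprocessed and holds most of the data, so many queries can be answered by scanning the fact table alone, without a large number of joins. Dimension tables are also usually small, so when a join is needed they can be loaded directly into memory, which speeds the join up. Beyond that, the fact table of a star model is fairly readable: most of the core information is available without joining many tables, and the design is relatively simple to maintain. In short, the star model trades a large amount of redundancy for fewer table lookups and therefore faster queries. It is also well supported by OLAP analysis engines, which is particularly evident in Kylin.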
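The point about small dimension tables being held in memory during a join can be sketched in plain Python. The dimension is a dictionary keyed by its primary key, and the fact rows are scanned once, each doing a constant-time lookup. The names dim_product, fact_rows, and revenue_by_category, along with the data, are hypothetical. Real engines implement this idea in their own way, often under names such as map-side join or broadcast join.

```python
from collections import defaultdict

# Small dimension table held entirely in memory, keyed by product_id (sample data).
dim_product = {
    1: {"product_name": "Laptop", "category": "Electronics"},
    2: {"product_name": "Phone", "category": "Electronics"},
    3: {"product_name": "Desk", "category": "Furniture"},
}

# Fact rows; in a real warehouse these would be streamed from a much larger table.
fact_rows = [
    {"order_id": 1, "product_id": 1, "amount": 1000.0},
    {"order_id": 2, "product_id": 2, "amount": 500.0},
    {"order_id": 3, "product_id": 3, "amount": 200.0},
    {"order_id": 4, "product_id": 1, "amount": 1000.0},
]


def revenue_by_category(facts, dim):
    """Scan the fact rows once, resolving each dimension key via an in-memory lookup."""
    totals = defaultdict(float)
    for row in facts:
        category = dim[row["product_id"]]["category"]
        totals[category] += row["amount"]
    return dict(totals)


totals = revenue_by_category(fact_rows, dim_product)
print(totals)
```

The fact table is read exactly once and the dimension table is never scanned, which is why the join stays cheap as long as the dimension fits in memory.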

Dimensional modeling steps:

1. Select the business process to analyze for decision-making.

The business process can be a single business event, such as a transaction payment or a refund; it can be the status of something, such as the current account balance; or it can be a process composed of a series of related business events. Which one applies depends on whether the analysis targets the occurrence of certain events, the current state, or the efficiency with which events flow through the process.

2. Choose the granularity

In event analysis, we need to anticipate how finely every analysis requirement will have to be broken down, and choose the granularity accordingly. The granularity is expressed as a combination of dimensions.

3. Determine the dimensions

After selecting the granularity, design the dimension tables based on it, including the dimension attributes used for grouping and filtering during analysis.

4. Determine the facts

Determine the indicators (measures) that the analysis needs to quantify.
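The four steps above can be sketched on a toy example. Assume the business process is transaction payment, the granularity is one row per user per day, the dimensions are user and date, and the facts are the payment count and the payment amount. All of these choices, the event fields, and the helper build_fact_rows are assumptions for illustration, not a prescribed design.

```python
from collections import defaultdict

# Step 1: business process = transaction payment (hypothetical raw events).
payment_events = [
    {"user_id": "u1", "paid_at": "2021-03-01T09:15:00", "amount": 30.0},
    {"user_id": "u1", "paid_at": "2021-03-01T18:40:00", "amount": 20.0},
    {"user_id": "u2", "paid_at": "2021-03-01T11:05:00", "amount": 15.0},
    {"user_id": "u1", "paid_at": "2021-03-02T08:00:00", "amount": 10.0},
]

# Step 2: granularity = one fact row per (user, day), a combination of dimensions.
# Step 3: the dimensions are therefore user_id and pay_date.
GRAIN = ("user_id", "pay_date")


def build_fact_rows(events):
    """Roll raw events up to the chosen grain and compute the measures."""
    # Step 4: facts (measures) = payment_count and payment_amount.
    facts = defaultdict(lambda: {"payment_count": 0, "payment_amount": 0.0})
    for event in events:
        pay_date = event["paid_at"][:10]  # keep only the YYYY-MM-DD part
        key = (event["user_id"], pay_date)
        facts[key]["payment_count"] += 1
        facts[key]["payment_amount"] += event["amount"]
    # Emit one row per grain key, sorted for stable output.
    return [dict(zip(GRAIN, key), **measures) for key, measures in sorted(facts.items())]


fact_rows = build_fact_rows(payment_events)
for row in fact_rows:
    print(row)
```

Choosing a coarser grain, such as one row per user per month, would shrink the fact table but would make daily analysis impossible afterwards, which is why the grain is fixed before the dimensions and facts.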


Origin blog.csdn.net/weixin_47699191/article/details/115012522