Logical layering of data warehouse

Logical layering

The data warehouse is built in layers, generally following the overall structure ods -> dw -> dm. Different enterprises and business scenarios have derived different layered architectures, for example the classic four-layer architecture ods -> dwd -> dws -> ads, or bdl -> fdl -> gdl -> adl.

Layered basis

At the ods, da, and dim layers, the basis for layering is generally unambiguous. The key is the layering basis of the dw layer, which is also the core of building a layered data warehouse.

The basis for dividing each layer is as follows:

  • ods layer: stores the original data. In principle, no data cleaning is performed here, and the data is kept consistent with the data source.
  • dw layer: the common data layer, which is the focus of data warehouse construction. It generally consists of log sub-tables and some wide tables, and mainly handles data cleaning, conversion, and similar work.
  • dm layer: the data mart layer, which most directly reflects the system's data assets. It generally holds summarized data, has gradually been de-emphasized, and is oriented toward data mining, data analysis, and similar uses.
  • da layer: the data application layer, which holds highly summarized data and is mainly used for report display.
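
The division of responsibilities above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the order records, field names, and the `build_dw`, `build_dm`, and `build_da` helpers are hypothetical, and a real warehouse would do this work in Hive, Spark, or Flink rather than in memory.

```python
from collections import defaultdict

# ods layer: raw records kept exactly as the (hypothetical) source delivered them.
ods_orders = [
    {"order_id": "1", "user": " alice ", "amount": "30.5", "dt": "2020-04-01"},
    {"order_id": "2", "user": "bob", "amount": "bad", "dt": "2020-04-01"},
    {"order_id": "3", "user": "alice", "amount": "12.0", "dt": "2020-04-02"},
]


def build_dw(ods_rows):
    """dw layer: clean and convert; rows whose amount cannot be parsed are dropped."""
    cleaned = []
    for row in ods_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append({
            "order_id": int(row["order_id"]),
            "user": row["user"].strip(),
            "amount": amount,
            "dt": row["dt"],
        })
    return cleaned


def build_dm(dw_rows):
    """dm layer: summarize the cleaned detail per user and day."""
    totals = defaultdict(float)
    for row in dw_rows:
        totals[(row["user"], row["dt"])] += row["amount"]
    return [
        {"user": user, "dt": dt, "total_amount": total}
        for (user, dt), total in sorted(totals.items())
    ]


def build_da(dm_rows):
    """da layer: highly summarized daily totals, ready for a report."""
    daily = defaultdict(float)
    for row in dm_rows:
        daily[row["dt"]] += row["total_amount"]
    return dict(sorted(daily.items()))


dw_orders = build_dw(ods_orders)
dm_user_daily = build_dm(dw_orders)
da_daily_report = build_da(dm_user_daily)
```

Each function reads only from the layer directly beneath it, which is the property that layering is meant to preserve.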

Technology selection: traditional data warehouses are generally built on Oracle, Greenplum, Teradata, and similar systems. Internet data warehouses are generally dominated by the Hadoop ecosystem, with Hive at the core of offline processing, Spark at the core of near-real-time processing, and Flink at the core of real-time processing.

Data research

Business research: align with the business side and follow the relational database modeling method, moving from the conceptual model (cdm) to the logical model (ldm) to the physical model (pdm). This is a process of analysis, design, and development that is continuously refined from the abstract to the concrete.
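
To make the cdm -> ldm -> pdm refinement concrete, here is a rough Python sketch. The entity names, the attribute list, the logical-to-Hive type mapping, and the `to_physical_ddl` helper are all assumptions made for illustration; they are not taken from any particular modeling tool.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualEntity:
    """cdm: only business entities and how they relate to each other."""
    name: str
    related_to: list = field(default_factory=list)


@dataclass
class LogicalEntity:
    """ldm: the same entity with attributes, logical types, and a key."""
    name: str
    attributes: dict  # attribute name -> logical type
    primary_key: str

    def __post_init__(self):
        if self.primary_key not in self.attributes:
            raise ValueError(f"primary key {self.primary_key!r} is not an attribute")


# Assumed mapping from logical types to Hive column types.
HIVE_TYPES = {
    "integer": "BIGINT",
    "text": "STRING",
    "decimal": "DECIMAL(18,2)",
    "date": "DATE",
}


def to_physical_ddl(entity, table_name):
    """pdm: render the logical entity as a Hive-style CREATE TABLE statement."""
    columns = ",\n".join(
        f"  {column} {HIVE_TYPES[logical_type]}"
        for column, logical_type in entity.attributes.items()
    )
    return f"CREATE TABLE {table_name} (\n{columns}\n)"


order_cdm = ConceptualEntity(name="Order", related_to=["Customer", "Product"])
order_ldm = LogicalEntity(
    name=order_cdm.name,
    attributes={
        "order_id": "integer",
        "customer_id": "integer",
        "amount": "decimal",
        "order_date": "date",
    },
    primary_key="order_id",
)
order_ddl = to_physical_ddl(order_ldm, "dw_trade_order_detail")
print(order_ddl)
```

Each step adds detail without changing what the previous step decided: the conceptual model names the entity, the logical model fixes its attributes, and the physical model commits to engine-specific types.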

Demand research: survey existing BI report requirements, statistical requirements, user portraits, recommendation systems, and other data applications.

Database research: understand the structure and form of the data in the database tables, grasp the overall flow of data through the business processes, and map the real business processes to the data structures.

Subject area division

Abstract the business at a high level. First determine the enterprise's business unit (BU) modules, then divide the first-level subject areas according to the conceptual model (cdm), determine the conformed dimensions and fact flows, and build the bus matrix.
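
A bus matrix is essentially a grid of business processes against dimensions. The sketch below shows one possible way to derive it in Python; the process names, dimension names, and helper functions are hypothetical examples, not the matrix from Kimball's book.

```python
# Hypothetical business processes and the dimensions each one is analyzed by.
process_dimensions = {
    "place_order": ["date", "customer", "product"],
    "ship_order": ["date", "customer", "product", "warehouse"],
    "process_refund": ["date", "customer"],
}


def build_bus_matrix(process_dimensions):
    """Return the sorted dimension list and a process -> {dimension: bool} grid."""
    dimensions = sorted({d for dims in process_dimensions.values() for d in dims})
    matrix = {
        process: {d: d in dims for d in dimensions}
        for process, dims in process_dimensions.items()
    }
    return dimensions, matrix


def conformed_dimensions(dimensions, matrix):
    """Dimensions shared by more than one business process."""
    return [d for d in dimensions if sum(row[d] for row in matrix.values()) > 1]


def render(dimensions, matrix):
    """Format the matrix as plain text, marking each used dimension with an X."""
    width = max(len(process) for process in matrix)
    lines = [" " * width + " | " + " | ".join(dimensions)]
    for process, row in matrix.items():
        cells = " | ".join(("X" if row[d] else "").center(len(d)) for d in dimensions)
        lines.append(process.ljust(width) + " | " + cells)
    return "\n".join(lines)


dimensions, bus_matrix = build_bus_matrix(process_dimensions)
shared = conformed_dimensions(dimensions, bus_matrix)
print(render(dimensions, bus_matrix))
```

Dimensions that appear in more than one row are the candidates for conformed dimensions, since they must mean the same thing in every process that uses them.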

Picture source: Kimball, "The Data Warehouse Toolkit, 3rd Edition"

Follow Kimball's classic four-step dimensional modeling process: select the business process -> declare the grain -> identify the dimensions -> identify the facts.
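
The four steps can be recorded as an explicit design object before any table is built. The following Python sketch assumes a hypothetical retail sales process; the `DimensionalDesign` class, the column names, and the `to_fact_row` helper are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionalDesign:
    business_process: str  # step 1: select the business process
    grain: str             # step 2: declare the grain
    dimensions: tuple      # step 3: identify the dimensions
    facts: tuple           # step 4: identify the facts


design = DimensionalDesign(
    business_process="retail sales",
    grain="one row per product per sales transaction",
    dimensions=("date_key", "product_key", "store_key"),
    facts=("quantity", "sales_amount"),
)


def to_fact_row(design, source_row):
    """Project a source row onto the declared dimensions and facts."""
    columns = design.dimensions + design.facts
    missing = [c for c in columns if c not in source_row]
    if missing:
        raise KeyError(f"source row is missing columns: {missing}")
    return {c: source_row[c] for c in columns}


source_row = {
    "date_key": 20200401,
    "product_key": 17,
    "store_key": 3,
    "quantity": 2,
    "sales_amount": 19.98,
    "cashier_note": "gift wrap",  # not part of the design, so it is dropped
}
fact_row = to_fact_row(design, source_row)
```

Declaring the grain before choosing dimensions and facts keeps every column in the fact table consistent with what a single row represents.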

Data Warehouse Specification

To build an enterprise-level data warehouse, it is essential to formulate data warehouse specifications, including naming conventions, process specifications, design specifications, development specifications, and so on. Nothing can be done well without rules, and data warehouse construction is no exception.

Examples of development specifications:
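
As one hypothetical illustration, a table naming convention could be enforced with a small check like the one below. The `<layer>_<subject>_<name>` pattern and the list of layer prefixes are assumptions made for this sketch, not rules taken from the original article.

```python
import re

# Assumed layer prefixes; adjust to the layers your warehouse actually uses.
LAYERS = ("ods", "dw", "dm", "da", "dim")

# Assumed convention: <layer>_<subject>_<name>, lowercase with underscores.
TABLE_NAME = re.compile(
    r"^(?P<layer>" + "|".join(LAYERS) + r")"
    r"_(?P<subject>[a-z][a-z0-9]*)"
    r"_(?P<name>[a-z][a-z0-9_]*)$"
)


def check_table_name(table_name):
    """Return the parsed parts of a conforming name, or None if it does not conform."""
    match = TABLE_NAME.match(table_name)
    return match.groupdict() if match else None


for candidate in ("dw_trade_order_detail", "dim_user_profile", "OrderDetail", "dw_order"):
    status = "ok" if check_table_name(candidate) else "violates convention"
    print(f"{candidate}: {status}")
```

A check of this kind could run in code review or before a table is created, so that violations are caught before they spread through downstream jobs.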

Origin www.cnblogs.com/songyuejie/p/12731922.html