The difference between a data mart and a data warehouse

A data warehouse operates at the enterprise level and can provide decision support for every department of the enterprise. A data mart operates at the departmental level and generally serves only the managers of one particular area, so it is also called a department-level database.

1. Two data mart structures

Data marts fall into the following two types according to the source of their data:

(1) Subordinate (dependent) data mart

"Subordinate" here means that the mart's data comes directly from the central data warehouse. This structure keeps the mart's data consistent with the warehouse.

Subordinate data marts are generally built for key business departments that access the data warehouse frequently, because serving those departments from their own mart improves query response speed.

(2) Independent data mart
The data subsets of an independent data mart come directly from the various production systems. When many companies plan a data warehouse, investment considerations often lead them to build independent data marts instead, in order to solve the more urgent decision-making problems of individual departments.
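The two structures above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (warehouse_sales, pos_orders, retail_mart, retail_mart_independent) and all values are hypothetical and do not come from the original article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical central warehouse table holding sales for all departments.
cur.execute("CREATE TABLE warehouse_sales (dept TEXT, product TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("retail", "widget", 120.0),
        ("retail", "gadget", 80.0),
        ("wholesale", "widget", 900.0),
    ],
)

# Subordinate (dependent) mart: loaded only from the central warehouse,
# so it agrees with the warehouse at load time.
cur.execute(
    "CREATE TABLE retail_mart AS "
    "SELECT product, amount FROM warehouse_sales WHERE dept = 'retail'"
)

# Independent mart: loaded straight from a production system
# (a hypothetical point-of-sale table), bypassing the warehouse.
cur.execute("CREATE TABLE pos_orders (product TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO pos_orders VALUES (?, ?)",
    [("widget", 125.0), ("gadget", 80.0)],
)
cur.execute(
    "CREATE TABLE retail_mart_independent AS "
    "SELECT product, amount FROM pos_orders"
)

warehouse_total = cur.execute(
    "SELECT SUM(amount) FROM warehouse_sales WHERE dept = 'retail'"
).fetchone()[0]
dependent_total = cur.execute("SELECT SUM(amount) FROM retail_mart").fetchone()[0]
independent_total = cur.execute(
    "SELECT SUM(amount) FROM retail_mart_independent"
).fetchone()[0]

print(warehouse_total, dependent_total, independent_total)
```

In this toy data the subordinate mart matches the warehouse total, while the independent mart can drift from it because nothing ties the production feed to the warehouse.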

2. The difference between data warehouse and data mart

(1) Data warehouse: built on the data model of the entire enterprise and oriented to enterprise-wide subjects.

Data mart: built on the data model of a specific department. Because each department has its own needs, expectations for the data mart differ from department to department. Departmental subjects may or may not be related to enterprise subjects.

(2) A data warehouse stores highly detailed data for the entire enterprise; a data mart holds data at a lower level of detail and contains more summarized and aggregated data.

(3) The data organization of the data mart generally adopts a star model. The data organization of large data warehouses can be either star or snowflake.

(4) Data marts seldom retain historical data.
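Points (2) and (3) can be illustrated together with a minimal star-schema sketch in Python's built-in sqlite3 module. The schema (fact_sales, dim_product, dim_date) and the rows are assumptions made up for this example; the detailed fact rows stand for warehouse-level data, and the grouped query stands for the kind of summary a data mart might hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: descriptive attributes, one row per product or date.
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT)")

# Fact table at the center of the star: one detailed row per sale.
cur.execute(
    "CREATE TABLE fact_sales ("
    "product_id INTEGER REFERENCES dim_product(product_id), "
    "date_id INTEGER REFERENCES dim_date(date_id), "
    "amount REAL)"
)

cur.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "widget"), (2, "gadget")])
cur.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, "2023-01"), (2, "2023-02")])
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 1, 10.0), (1, 1, 15.0), (2, 1, 7.0), (1, 2, 20.0)],
)

# Summary level: join the fact table to its dimensions and aggregate.
monthly_summary = cur.execute(
    "SELECT d.month, p.name, SUM(f.amount) "
    "FROM fact_sales f "
    "JOIN dim_product p ON p.product_id = f.product_id "
    "JOIN dim_date d ON d.date_id = f.date_id "
    "GROUP BY d.month, p.name "
    "ORDER BY d.month, p.name"
).fetchall()

for row in monthly_summary:
    print(row)
```

A snowflake schema would further split a dimension into sub-tables (for example, a product category table referenced by dim_product), at the price of one more join per query.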

@二东东
RE: (3) The data organization of the data mart generally adopts a star model. The data organization of large data warehouses can be either star or snowflake.

In the era of cloud data warehouses (from around 2016, once Redshift, BigQuery, and Snowflake were available), data marts generally organize their data as wide tables, because BI tools and analysts can analyze a wide table directly: they are used to working on a single two-dimensional table rather than joining the tables of a star model. The cost advantage of the star model no longer exists, because in the cloud era labor, not data storage, is the most expensive resource, so a wide table that saves time and effort and can be analyzed directly has the advantage in cost. As for the star schema's other two advantages, performance and understandability, it is not as good as the wide table on either.
In summary, data marts in this era should organize their data as wide tables.
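The commenter's argument can be made concrete with a small sketch in Python's built-in sqlite3 module. The tables (fact_orders, dim_customer, wide_orders) and their rows are hypothetical; the sketch only shows that a pre-joined wide table answers the same question without a join, and it makes no claim about performance on a real cloud warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A tiny star: one fact table and one dimension table.
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT)")
cur.execute(
    "CREATE TABLE fact_orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "north"), (2, "south")])
cur.executemany(
    "INSERT INTO fact_orders VALUES (?, ?, ?)",
    [(100, 1, 50.0), (101, 2, 30.0), (102, 1, 20.0)],
)

# Wide table: join once at load time and store the flattened rows.
cur.execute(
    "CREATE TABLE wide_orders AS "
    "SELECT o.order_id, o.amount, c.region "
    "FROM fact_orders o JOIN dim_customer c ON c.customer_id = o.customer_id"
)

# Star-model query: the analyst has to write the join.
star_result = cur.execute(
    "SELECT c.region, SUM(o.amount) "
    "FROM fact_orders o JOIN dim_customer c ON c.customer_id = o.customer_id "
    "GROUP BY c.region ORDER BY c.region"
).fetchall()

# Wide-table query: a single two-dimensional table, no join.
wide_result = cur.execute(
    "SELECT region, SUM(amount) FROM wide_orders "
    "GROUP BY region ORDER BY region"
).fetchall()

print(star_result)
print(wide_result)
```

The trade-off is visible in wide_orders: the region value is repeated on every order row, which costs storage and must be refreshed when the dimension changes, in exchange for simpler queries.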


Origin blog.csdn.net/weixin_44976611/article/details/129215965