Data Warehouse Dimensional Modeling Overview

Overall data warehouse architecture in a big data system

Data warehouse overall architecture

Data warehouse data processing flow

Data warehouse features

  1. Subject-oriented. Data in an operational database is organized around transaction processing tasks, and each business system is isolated from the others. Data in a data warehouse is instead organized by subject area, for example: subjects, agreements, institutions, finances, events, products, etc.
  2. Integrated. Data in a data warehouse comes from many different data sources. The source applications are often inconsistent in their coding schemes, naming conventions, physical properties (different databases), field data types, and so on. As this data enters the data warehouse it is transformed, reformatted, rearranged, and aggregated. This removes the inconsistencies in the source data, so that the information in the data warehouse is consistent, global information about the entire enterprise.
  3. Non-volatile. Data in a data warehouse is usually loaded and accessed in bulk. In a data warehouse environment, existing records are generally not updated in place (modifications and deletions are rare). Data is loaded as a static snapshot, and when the source data later changes, a new snapshot record is written to the data warehouse. In this way, the data warehouse preserves the history of the data.
  4. Time-variant. Every unit of data in a data warehouse is accurate only as of a certain point in time. Data warehouse table structures always contain a time element, and a timestamp is added to indicate the date the data refers to.
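The "integrated", "non-volatile", and "time-variant" features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular warehouse product works: the source names (`crm`, `billing`), the gender code mappings, and the `CustomerSnapshotTable` class are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical code mappings: two source systems encode gender differently,
# and the warehouse unifies both into a single convention (integration).
GENDER_CODES = {
    "crm": {"M": "male", "F": "female"},
    "billing": {"1": "male", "2": "female"},
}


@dataclass(frozen=True)  # frozen: a loaded row is never modified (non-volatile)
class SnapshotRow:
    customer_id: int
    gender: str
    city: str
    load_date: date  # the time element carried by every row (time-variant)


class CustomerSnapshotTable:
    """Append-only table: a change is recorded as a new snapshot row."""

    def __init__(self) -> None:
        self._rows: List[SnapshotRow] = []

    def load(self, source: str, record: dict, load_date: date) -> SnapshotRow:
        # Translate the source-specific code into the warehouse-wide code.
        gender = GENDER_CODES[source][record["gender"]]
        row = SnapshotRow(record["customer_id"], gender, record["city"], load_date)
        self._rows.append(row)  # never update or delete existing rows
        return row

    def history(self, customer_id: int) -> List[SnapshotRow]:
        # All snapshots for one customer, oldest first.
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return sorted(rows, key=lambda r: r.load_date)

    def as_of(self, customer_id: int, day: date) -> Optional[SnapshotRow]:
        # Latest snapshot loaded on or before the given day, if any.
        rows = [r for r in self.history(customer_id) if r.load_date <= day]
        return rows[-1] if rows else None


table = CustomerSnapshotTable()
table.load("crm", {"customer_id": 1, "gender": "F", "city": "Hangzhou"}, date(2023, 1, 1))
table.load("billing", {"customer_id": 1, "gender": "2", "city": "Shanghai"}, date(2023, 2, 1))
```

Because rows are only ever appended, `table.as_of(1, date(2023, 1, 15))` still returns the January snapshot after the February load, which is the sense in which the warehouse keeps history.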

Similarities and differences between ODS, data mart, data warehouse, and data mining

  • Basic concepts
  • ODS: Operational Data Store.
  • Data warehouse (DW): at enterprise scope, abbreviated EDW (Enterprise Data Warehouse).
  • Data mart: abbreviated DM, a local DW built around the needs of one particular application. A DM only cares about the data it needs. It does not take the enterprise's overall data architecture and applications into account, and each application has its own DM. A DM can therefore be built on top of a data warehouse or built independently.
  • Data mining: generally refers to the process of using algorithms to find information hidden in large amounts of data. Data mining is closely tied to computer science and draws on many methods, such as statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition.
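To make the relationship between a data mart and a data warehouse concrete, here is a minimal sketch of a data mart derived from warehouse data: it keeps only the rows and fields one application needs and aggregates them. The table contents, field names, and the `build_sales_mart` function are hypothetical and serve only as an illustration. A data mart built independently of a warehouse would read from source systems instead.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical warehouse rows covering all regions and channels.
WAREHOUSE_SALES: List[dict] = [
    {"order_id": 1, "region": "east", "product": "loan", "amount": 100.0, "channel": "web"},
    {"order_id": 2, "region": "east", "product": "loan", "amount": 50.0, "channel": "branch"},
    {"order_id": 3, "region": "east", "product": "card", "amount": 30.0, "channel": "web"},
    {"order_id": 4, "region": "west", "product": "loan", "amount": 70.0, "channel": "web"},
]


def build_sales_mart(warehouse_rows: List[dict], region: str) -> Dict[str, float]:
    """Build a small data mart: total sales amount per product for one region.

    Fields the application does not need (order_id, channel) are dropped.
    """
    totals: Dict[str, float] = defaultdict(float)
    for row in warehouse_rows:
        if row["region"] == region:
            totals[row["product"]] += row["amount"]
    return dict(totals)


east_mart = build_sales_mart(WAREHOUSE_SALES, "east")
```

Here `east_mart` holds only the per-product totals for the east region, while the warehouse rows remain the complete, enterprise-wide record.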

Comparison of model design approaches


