[Data Warehouse] The difference between a data warehouse based on big data and a traditional data warehouse

Table of Contents

Data warehouse based on big data

Traditional data warehouse

The similarities between traditional data warehouse and data warehouse based on big data


Data warehouse based on big data

With the emergence of new application scenarios (personalized recommendation, user profiling, machine learning, data analysis, risk control, precision operations), data volumes have exploded, and data warehouses based on big data have emerged in response. They are characterized by rapidly changing business, fast iteration, the ability to process massive data, and real-time requirements. ER modeling is weakened, while the DWS and DM layers adopt dimensional modeling. Modeling as a whole is less strict and leans toward a hybrid approach; data quality is average, and the focus is on meeting demands quickly and flexibly.

More emphasis is placed on the value of data as an asset. Once the warehouse reaches a certain scale, teams generally move toward data governance (data quality, data security, data standards, data lineage, master data, and metadata management).

  1. High timeliness requirements
  2. Flexible, fast-changing business
  3. Diverse data sources
  4. Uneven data quality
  5. Complex application scenarios
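
Since uneven data quality is one of the reasons data governance comes up at all, here is a minimal sketch of what a rule-based quality check might look like. The function name `quality_report`, the field names, and the sample rows are all made up for illustration; a real governance platform would cover far more than a required-field rule.

```python
def quality_report(rows, required_fields):
    """Count rows that are missing, or have an empty value for, any required field."""
    total = len(rows)
    invalid = sum(
        1
        for row in rows
        if any(row.get(field) in (None, "") for field in required_fields)
    )
    # With no rows at all there is nothing invalid, so report a ratio of 1.0.
    valid_ratio = (total - invalid) / total if total else 1.0
    return {"total": total, "invalid": invalid, "valid_ratio": valid_ratio}


# Hypothetical event records from mixed sources (assumed shape, not from the article).
rows = [
    {"user_id": "u1", "event": "click"},
    {"user_id": "", "event": "view"},   # empty user_id
    {"event": "click"},                 # user_id missing entirely
    {"user_id": "u4", "event": "buy"},
]

report = quality_report(rows, ["user_id", "event"])
print(report)
```

In this sample, two of the four rows fail the required-field rule, so half of the data would be flagged before it moves further into the warehouse.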

Technology stack: HDFS, Hive, HBase, Flume, Kafka, Flink, Spark, Sqoop, Elasticsearch (ES), Oozie, etc.

Traditional data warehouse

Traditional data warehouses have relatively stable business scenarios, relatively reliable data quality, and relatively stable requirements. Warehouse construction follows fairly complete project management and control procedures, and data modeling follows strict, stable standards. However, the construction cycle is long, the data structures handled are limited in variety, and processing complex data is costly. These warehouses mainly support data analysis and management decision-making.

The DWD layer generally adopts ER modeling, the DWS and DM layers adopt dimensional modeling, and data quality is high.
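
To make "dimensional modeling" a bit more concrete, here is a small sketch of a star schema: one fact table joined to one dimension table and aggregated. The table names (`dim_product`, `fact_sales`), columns, and sample rows are invented for illustration, and Python's built-in `sqlite3` is used only as a stand-in so the example runs anywhere; it is not one of the warehouse engines discussed in this article.

```python
import sqlite3


def build_star_schema(conn):
    """Create one dimension table and one fact table, then load sample rows."""
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            category   TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            amount     REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "books"), (2, "games")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0)],
    )


def sales_by_category(conn):
    """Join the fact table to its dimension and sum the measure per category."""
    rows = conn.execute(
        """
        SELECT d.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.category
        ORDER BY d.category
        """
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
summary = sales_by_category(conn)
print(summary)
```

The fact table holds the measures (here `amount`) and the dimension table holds the descriptive attributes used for grouping, which is the basic shape that DWS and DM tables tend to follow.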

Technology stack: Teradata (TD), Oracle

The similarities between traditional data warehouse and data warehouse based on big data

1. Layered architecture

2. ER and dimensional modeling theory apply to both

3. The data service layer generally includes BI and a metrics (indicator) system
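
The layered architecture shared by both kinds of warehouse can be sketched roughly as follows. This is only an in-memory illustration under assumed names: the raw records, the function names (`to_dwd`, `to_dws`, `to_dm`), and the spending threshold are all hypothetical, and a real warehouse would run each layer as scheduled jobs over tables rather than Python lists.

```python
from collections import defaultdict

# Hypothetical raw order records as they might arrive from a source system.
RAW_ORDERS = [
    {"user": "u1", "amount": "10.5"},
    {"user": "u2", "amount": "bad"},  # malformed amount, dropped during cleaning
    {"user": "u1", "amount": "4.5"},
    {"user": "u2", "amount": "7"},
]


def to_dwd(raw_rows):
    """Detail layer (DWD): keep only rows whose amount parses as a number."""
    cleaned = []
    for row in raw_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append({"user": row["user"], "amount": amount})
    return cleaned


def to_dws(dwd_rows):
    """Summary layer (DWS): total amount per user."""
    totals = defaultdict(float)
    for row in dwd_rows:
        totals[row["user"]] += row["amount"]
    return dict(totals)


def to_dm(dws_totals, threshold):
    """Data mart (DM): sorted list of users whose total reaches the threshold."""
    return sorted(user for user, total in dws_totals.items() if total >= threshold)


dwd = to_dwd(RAW_ORDERS)
dws = to_dws(dwd)
dm = to_dm(dws, threshold=10.0)
print(dwd, dws, dm)
```

Each layer only reads from the one beneath it, so a change in the raw data format affects the cleaning step without forcing changes to the summary or mart logic.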

Source: blog.csdn.net/henku449141932/article/details/112647200