Big Data: Data Base Architecture / Data Center Construction

Foreword

As enterprises move from informatization to digitalization, they accumulate massive amounts of data, and the volume keeps growing explosively. Yet only a small share of that data actually generates value. Data is typically scattered, inconsistent across systems, and lacks a unified definition and structure, so finding data that is both relevant and usable keeps getting harder.
Aggregating and connecting data, and breaking down data silos and data monopolies, is therefore particularly important.

1. The goal of the data base

  1. Manage structured and unstructured data in a unified way. Both are data assets and should be traceable to their data producers, consumers, and business sources;
  2. Open up the data service supply channel: provide rich data raw materials, semi-finished products, and finished products for data consumption, meeting the needs of scenarios such as the company's self-service analysis and digital operations;
  3. Ensure that corporate data is complete, consistent, and shared. Monitor the data status of every link along the full data pipeline; from a storage perspective, diagnose data redundancy, duplication, and "zombie" data problems to reduce data maintenance and usage costs;
  4. Ensure data security and controllability: based on data security policies, use data permission control, data service encapsulation, and other technical means so that confidential and private data is consumed legally and compliantly.
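
As a rough illustration of the storage diagnosis in goal 3, the sketch below flags duplicate and "zombie" tables from catalog metadata. It is not taken from the source: the `TableMeta` fields, the content fingerprint, and the 180-day idle threshold are all hypothetical, and "zombie" is assumed here to mean a table that no consumer has read for a long time.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class TableMeta:
    name: str
    content_hash: str  # hypothetical fingerprint of schema + data
    last_read: date    # last time any consumer read the table


def find_duplicates(tables):
    """Group table names that share the same content fingerprint."""
    groups = {}
    for t in tables:
        groups.setdefault(t.content_hash, []).append(t.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def find_zombies(tables, today, max_idle_days=180):
    """Return names of tables nobody has read within max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(t.name for t in tables if t.last_read < cutoff)


tables = [
    TableMeta("sales_orders", "a1", date(2024, 5, 1)),
    TableMeta("sales_orders_copy", "a1", date(2023, 1, 10)),
    TableMeta("hr_staff", "b2", date(2024, 4, 20)),
]
duplicates = find_duplicates(tables)
zombies = find_zombies(tables, today=date(2024, 6, 1))
```

In practice the fingerprint and last-read time would come from the metadata catalog and access logs rather than being hard-coded.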

2. Data base architecture diagram

[Figure: data base architecture diagram]

The data base cannot be built overnight; its construction needs to be planned and advanced as a whole. The data owner is the first person responsible for building the data base of each business line. The construction principles are as follows:

  1. Data security principle: follow management requirements such as user permissions, data confidentiality levels, and basic privacy protection;
  2. Demand-and-planning two-wheel-drive principle: construction is driven by both business planning and business demand, with priority given to core data assets;
  3. Multi-scenario data supply principle: provide different data supply channels (offline/real-time/physical/virtual) according to business needs to serve different consumption scenarios;
  4. Information architecture compliance principle: data assets in the data base should comply with the company's information architecture.
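
The data security principle can be pictured as a permission check plus field masking. This is only a minimal sketch under stated assumptions: the three confidentiality levels, the business-domain check, and the function names `can_read` and `mask_private` are hypothetical and do not come from the source.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical confidentiality levels, lowest to highest."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


def can_read(user_clearance, user_domains, asset_level, asset_domain):
    """Allow access only inside the user's business domains and clearance."""
    return asset_domain in user_domains and user_clearance >= asset_level


def mask_private(record, private_fields):
    """Return a copy of the record with privacy-sensitive fields masked."""
    return {k: ("***" if k in private_fields else v) for k, v in record.items()}


record = {"customer": "Li Lei", "phone": "13800000000", "amount": 120}
allowed = can_read(Level.INTERNAL, {"sales"}, Level.INTERNAL, "sales")
denied = can_read(Level.INTERNAL, {"sales"}, Level.CONFIDENTIAL, "sales")
masked = mask_private(record, {"phone"})
```

Wrapping checks like these inside a data service, rather than exposing tables directly, is one way to keep consumption of confidential data controllable.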

3. Bringing data into the lake

Standards for data entering the lake

  1. Identify a clear data owner
  2. Publish data standards
  3. Certify the data source
  4. Define the data classification
  5. Assess data quality
  6. Register metadata
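
The six standards above can be treated as a gate that a data asset must pass before it enters the lake. The following sketch is illustrative only: the `LakeEntryRequest` fields and the rule that every item must be satisfied are assumptions, not something the source specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LakeEntryRequest:
    """Hypothetical record describing a data asset that wants to enter the lake."""
    asset: str
    owner: Optional[str] = None
    standard_published: bool = False
    source_certified: bool = False
    classification: Optional[str] = None
    quality_assessed: bool = False
    metadata_registered: bool = False


def unmet_standards(req):
    """Return the names of the lake-entry standards the request does not meet."""
    checks = [
        ("data owner", bool(req.owner)),
        ("data standard", req.standard_published),
        ("certified source", req.source_certified),
        ("data classification", bool(req.classification)),
        ("quality assessment", req.quality_assessed),
        ("metadata registration", req.metadata_registered),
    ]
    return [name for name, ok in checks if not ok]


request = LakeEntryRequest(
    asset="sales_orders",
    owner="sales-data-owner",
    standard_published=True,
    source_certified=True,
    classification="internal",
)
missing = unmet_standards(request)
ready = not missing
```

Here the request is rejected until quality assessment and metadata registration are completed.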

How data enters the lake

  1. Batch integration
  2. Data replication and synchronization
  3. Message integration
  4. Stream integration
  5. Data virtualization
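
To make the difference between two of these methods concrete, here is a small in-memory sketch contrasting batch integration (a full extract and load) with replication-style synchronization (applying only rows changed since a watermark). The row layout, the `updated_at` counter, and the function names are hypothetical; real pipelines would rely on an integration tool or change data capture rather than Python dictionaries.

```python
def batch_load(source_rows):
    """Batch integration: rebuild the lake copy from a full extract."""
    return {row["id"]: dict(row) for row in source_rows}


def sync_changes(lake, source_rows, watermark):
    """Replication sync: upsert only rows changed after the watermark."""
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            lake[row["id"]] = dict(row)
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


source = [
    {"id": 1, "amount": 100, "updated_at": 1},
    {"id": 2, "amount": 250, "updated_at": 2},
]
lake = batch_load(source)  # initial full load
watermark = 2              # highest updated_at seen so far

# The source system changes: one update and one new row.
source[0] = {"id": 1, "amount": 120, "updated_at": 3}
source.append({"id": 3, "amount": 75, "updated_at": 4})
watermark = sync_changes(lake, source, watermark)
```

Message and stream integration push changes as events instead of polling, and data virtualization leaves the data in the source and queries it on demand; neither is shown here.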

For more detail, see: Big Data - Data Lake Construction

4. Data topic connection: transforming data into information

Five connection methods link the data of different business lines, processing the raw materials in the data lake into semi-finished and finished products that support the data consumption needs of different scenarios:

  1. Multidimensional model
  2. Graph model
  3. Labels
  4. Indicator data
  5. Algorithms and models
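
Three of these connection methods can be sketched on toy data: a multidimensional model (a fact table joined to a dimension), an indicator computed from it, and a label derived per customer. Everything in the example is hypothetical, including the table contents, the `sales_by_region` and `label_customers` names, and the 1000 spending threshold.

```python
from collections import defaultdict

# Dimension: descriptive attributes keyed by customer id.
customers = {
    "c1": {"region": "north"},
    "c2": {"region": "south"},
}

# Fact: measurable events that reference the dimension.
orders = [
    {"customer_id": "c1", "amount": 300},
    {"customer_id": "c1", "amount": 900},
    {"customer_id": "c2", "amount": 150},
]


def sales_by_region(orders, customers):
    """Indicator: total order amount per region, via a fact-to-dimension join."""
    totals = defaultdict(int)
    for order in orders:
        region = customers[order["customer_id"]]["region"]
        totals[region] += order["amount"]
    return dict(totals)


def label_customers(orders, threshold=1000):
    """Label: tag each customer by total spend against a threshold."""
    spend = defaultdict(int)
    for order in orders:
        spend[order["customer_id"]] += order["amount"]
    return {c: ("high_value" if s >= threshold else "standard")
            for c, s in spend.items()}


indicator = sales_by_region(orders, customers)
labels = label_customers(orders)
```

Graph models and algorithm models connect the same data differently, through entity relationships and trained or rule-based models respectively, and are outside this sketch.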

For more detail, see: Big Data - Data Modeling & Topic Joins

Reference:
"The Way of Huawei Data"


Origin blog.csdn.net/csdn_tiger1993/article/details/125817484