Data Governance: An Inventory of the Mainstream Data Governance Architectures in the Industry

Guide: Industries and enterprises differ in their characteristics, nature, level of informatization, and business and management needs, so the focus of data governance differs as well. When designing a data governance platform framework, an enterprise should start from its actual and future needs, design a data governance architecture that suits it, and solve its own pain points. Industry best practices can serve only as a reference and cannot be copied wholesale. The biggest mistake is to chase an architecture that is as large and all-encompassing as possible.

 

1. Overview of Data Architecture

There is in fact no official, authoritative definition of data architecture. Most understanding of it comes from enterprise architecture (EA), of which data architecture is an important part. Enterprise architecture generally comprises business architecture, data architecture, application architecture, and technical architecture. Data architecture abstracts the enterprise's business entities into information objects, abstracts its business operations into the attributes and methods of those objects, and builds an object-oriented data model. In doing so it transforms the business model into a data model, maps business requirements to information functions, and abstracts basic enterprise data into enterprise information. Put simply, data architecture is a logical description of the relationships among the businesses in the business architecture, and it describes the data composition, interrelationships, and storage methods of each application module. It sits between the business architecture and the application architecture and links the two.

picture

Functionally, the data architecture involved in what we call data governance covers information resource catalog management, master data management, metadata management, data quality management, data standard management, data security management, and data lifecycle management.

2. Design method of data architecture

Data architecture design is part of enterprise architecture design. There are many mature models and frameworks for enterprise architecture design, such as TOGAF, Zachman, FEA, and DoDAF. The most widely used in China is TOGAF. In the TOGAF framework, data architecture is one of the four main components of the enterprise architecture.

picture

The TOGAF framework divides the planning and design of the enterprise architecture into a preliminary phase and eight design phases, as shown in the figure below. This article does not go into how to use TOGAF. No two companies have the same business, data, application systems, corporate nature, management and control model, or corporate culture, so when designing a data governance architecture, any advanced framework or best practice can serve only as a reference and cannot be copied. The most important thing is to design a data architecture that meets the enterprise's requirements by combining its characteristics and needs. On this point, I like the TOGAF framework very much: every phase and every step requires us to plan and design around the needs of the enterprise.

picture

Drawing on the TOGAF framework, the author believes that data architecture design in a data governance project should follow these steps:

1. Strategic understanding. Fully understand the corporate vision and development strategy. This covers not only business strategy, such as vision and mission, but also the company's IT strategy. The company's positioning of data, its organizational structure, and its talent strategy also need to be considered.

2. Business analysis. Identify the main value chain of the enterprise's business and, with that value chain at the core, fully understand how the business links work together and what problems exist. Find the three points of the enterprise's business needs: pain points, itch points, and excitement points. These three points are mostly used in marketing, but many years of experience tell me that finding them and designing for them sensibly is also an important guarantee of project success.

3. Architecture design. The data architecture aims to solve business problems and meet business needs. It takes application functions as its starting point, connecting upward to the business architecture and downward to the application architecture. It must cover not only relatively static data, such as metadata, master data, and data models, but also relatively dynamic data, such as transaction data, ETL, application access data, integrated data, and mobile data. Data standards, data quality, data security, and data lifecycle management must also be considered.

4. Model design. Design the data model based on the understanding of corporate strategy and the business analysis. A data model is an abstraction of the real world that describes the static characteristics, dynamic behavior, and constraints of a system. Following the principle of layered design, data models are divided into conceptual, logical, and physical models. The conceptual model faces users and the objective world and describes the conceptual structure of the real world. The logical model is oriented to the database system and describes the structure and relationships of data objects. The physical model is oriented to the physical storage medium and describes how data is structured on that medium.

5. Data standards. Using the data model, define the business meaning, business rules, data structure, quality rules, managing department, and responsible manager of each data item. Note that model design and data standards overlap: model design generally already covers part of the data standards. Beyond the content described by the data model, data standards also include data classification standards, data coding standards, data quality standards, and data security standards.
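
Step 5 can be made concrete with a small sketch. The Python below is illustrative only and does not come from any particular governance product: the `DataStandard` class, its field names, and the `customer_code` element with its coding rule are all assumptions. It records, for one string-valued data element, the items listed above (business meaning, structure, quality rule, managing department, responsible manager) and checks a value against them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DataStandard:
    """One entry in a data standard catalog (all field names are hypothetical)."""
    name: str                            # attribute name taken from the data model
    business_meaning: str                # what the element means to the business
    max_length: int                      # structural constraint on the value
    quality_rule: Callable[[str], bool]  # business rule a value must satisfy
    department: str                      # managing department
    steward: str                         # responsible manager


def check(standard: DataStandard, value: object) -> List[str]:
    """Return the violations of `standard` found in `value` (empty if compliant)."""
    if not isinstance(value, str):
        return [f"{standard.name}: expected a string"]
    problems = []
    if len(value) > standard.max_length:
        problems.append(f"{standard.name}: longer than {standard.max_length} characters")
    if not standard.quality_rule(value):
        problems.append(f"{standard.name}: quality rule failed")
    return problems


# Hypothetical standard: a customer code is "C" followed by digits, at most 10 characters.
customer_code = DataStandard(
    name="customer_code",
    business_meaning="Unique identifier of a customer across systems",
    max_length=10,
    quality_rule=lambda v: v.startswith("C") and v[1:].isdigit(),
    department="Customer Management",
    steward="(placeholder)",
)

print(check(customer_code, "C000123"))  # compliant value
print(check(customer_code, "X12"))      # breaks the coding rule
```

Keeping the standard as data rather than as scattered validation code means the same definition can feed the model documentation, the quality checks, and the ownership records.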

3. Inventory of the currently popular data governance architectures

Based on the industries and companies the author has worked with and understands, the following describes the characteristics of their data governance architectures.

1. Metadata-driven data governance architecture

New technologies have brought challenges to traditional industries, and even the much-envied banking industry is not immune. Traditional banks also built their information systems first and governed them later. Large numbers of silo-style (chimney) systems have produced data islands, overlapping business, duplicated functions, redundant data, low data quality, and inconsistent standards, and have made centralized management difficult. Problems such as limited means of collection and processing, scattered storage, insufficient data mining capability, data fragmentation, and insufficient sharing remain common in most banks. Together with the impact of Internet finance, this has put the banking industry in a difficult period.

picture

Data is an enterprise asset, and this is especially true in banking. With the extensive use of big data in marketing, risk control, and inclusive finance, data has evolved from a tool for improving operational efficiency and regulatory effectiveness into a core asset of the banking industry and an important basis for realizing regulatory intentions. The typical data governance architecture in banking today is metadata-driven: it sorts out enterprise data assets, establishes a data standard system and a data quality management system, and implements data governance on that basis. A metadata management platform handles the collection, modification, deletion, and retrieval of metadata; the extraction, transformation, and loading of data are driven by that metadata; and a data resource catalog is built to inventory enterprise data assets. Customer master data management and data quality management then provide unified, standardized external data services and support product and service innovation. Data governance of this kind plays an important supporting role in optimizing banking business, building and maintaining good customer relationships, and increasing sales opportunities.
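
To illustrate what "extraction, transformation, and loading driven by metadata" can mean, here is a minimal Python sketch. It is not the design of any real bank or platform: the table names, column names, and the three transform names are hypothetical. The point is that the mapping lives in metadata, and both the load step and the data resource catalog entry are derived from that same metadata.

```python
# Metadata describing how a source table maps to a target table.
# All table and column names here are hypothetical.
CUSTOMER_MAPPING = {
    "source": "core_banking.cust",
    "target": "dw.customer",
    "columns": [
        {"from": "CUST_NO", "to": "customer_id", "transform": "strip"},
        {"from": "CUST_NM", "to": "customer_name", "transform": "title"},
        {"from": "BR_CD", "to": "branch_code", "transform": "upper"},
    ],
}

# Transformations that the metadata may refer to by name.
TRANSFORMS = {
    "strip": lambda s: s.strip(),
    "upper": lambda s: s.strip().upper(),
    "title": lambda s: s.strip().title(),
}


def run_mapping(mapping, rows):
    """Convert source rows to target rows using only the mapping metadata."""
    out = []
    for row in rows:
        target_row = {}
        for col in mapping["columns"]:
            transform = TRANSFORMS[col["transform"]]
            target_row[col["to"]] = transform(row[col["from"]])
        out.append(target_row)
    return out


def catalog_entry(mapping):
    """Derive a data resource catalog entry from the same metadata."""
    return {
        "asset": mapping["target"],
        "lineage": mapping["source"],
        "fields": [col["to"] for col in mapping["columns"]],
    }


source_rows = [{"CUST_NO": " 1001 ", "CUST_NM": "li wei", "BR_CD": "sh01"}]
loaded = run_mapping(CUSTOMER_MAPPING, source_rows)
print(loaded)
print(catalog_entry(CUSTOMER_MAPPING))
```

Adding a new source column then means editing the metadata, not the loading code, which is what makes the catalog and the actual data flow hard to drift apart.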

2. Data governance architecture driven by master data

For manufacturing enterprises, "cost reduction, efficiency increase, and quality improvement" are permanent goals. As an enterprise grows, its businesses become more closely interrelated, and fragmented business systems together with inconsistent, non-standard, incorrect, and incomplete data severely constrain collaboration between businesses, which in turn undermines those goals. In a manufacturing enterprise, when departments and businesses communicate through their systems, inconsistent codes and inconsistent names often hinder communication, raise communication costs, and reduce business efficiency.

picture

The approach is to sort out and identify enterprise data resources in a unified way and establish master data standards, including classification standards, coding standards, data model standards, data quality rule standards, and data integration standards. A master-data-driven data governance platform connects the data channels of the business systems and forms a single source and a unified view of master data, achieving one code per object along with unified management, unified distribution, and unified application of master data. Master data resolves the non-standard and inconsistent data found across heterogeneous systems, ensures business continuity and the consistency, integrity, and accuracy of data, and improves collaboration between business lines. High-quality master data also supports management decision-making.
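
The idea of "one code per object" with unified distribution can be sketched as a toy registry. Everything in this Python example is an assumption made for illustration: the class name, the `MAT` prefix, the system names `ERP` and `MES`, and above all the matching rule, which here is just case and whitespace normalization of the name. Real master data matching is far richer.

```python
class MasterDataRegistry:
    """Toy master data hub: one master code per real-world object."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._by_key = {}        # normalized business key -> master code
        self._xref = {}          # (system, local code) -> master code
        self._subscribers = []   # callbacks that receive newly created records

    @staticmethod
    def _normalize(name):
        # Deliberately simple matching rule, chosen only for this sketch.
        return " ".join(name.lower().split())

    def subscribe(self, callback):
        """Register a downstream system that wants new master records."""
        self._subscribers.append(callback)

    def register(self, system, local_code, name):
        """Map a system-local record to its master code, creating one if needed."""
        key = self._normalize(name)
        master_code = self._by_key.get(key)
        if master_code is None:
            master_code = f"{self._prefix}{len(self._by_key) + 1:06d}"
            self._by_key[key] = master_code
            for notify in self._subscribers:
                notify(master_code, name)  # unified distribution
        self._xref[(system, local_code)] = master_code
        return master_code

    def lookup(self, system, local_code):
        """Translate a system-local code into the shared master code."""
        return self._xref[(system, local_code)]


distributed = []
hub = MasterDataRegistry(prefix="MAT")
hub.subscribe(lambda code, name: distributed.append(code))

# Two systems describe the same material with different local codes and spellings.
erp_code = hub.register("ERP", "10-2231", "Hex Bolt M8")
mes_code = hub.register("MES", "BOLT_M8", "hex  bolt m8")
print(erp_code, mes_code, distributed)
```

Both systems end up pointing at the same master code, and the record is distributed once, when it is first created.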

3. Data Governance Architecture of Microservice Model

Microservices, a decentralized information system architecture characterized by service componentization, automated deployment, flexibility, and agility, are popular with Internet companies and some open, consumer-facing (2C) industries. As the microservice architecture was put into practice, people found that although it improved the development model, it also introduced problems, the most important of which concerns data. The microservice architecture emphasizes thorough componentization and servitization. Each microservice can be deployed and put into production independently, and many microservices have their own independent databases. This raises two questions: 1) How can data be integrated once it is spread across these separate databases? 2) How can the data be further analyzed and mined? Such requirements may call for analysis of the full data set, and that analysis must not affect running business.

picture

The picture above shows a hotel's data governance architecture based on microservices. The overall design uses a three-tier model consisting of a data layer, a service layer, and an application layer, with the application layer isolated from the data layer. Microservices are identified and divided according to the logic of master data: highly shared applications, which here are the master data applications, are turned into microservices, for example the membership center, points center, product center, and store center. Front-end business systems cannot manipulate data directly; they obtain back-end data by calling the microservices in the service layer. When statistical analysis of the full data set is needed, the relevant data is moved and consolidated into a data lake using data movement technology and then processed as the analysis requires.
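
The separation described here can be sketched in Python. This is a simplified, in-process illustration, not the hotel's actual system: the two service classes, their method names, and the dictionary standing in for the data lake are hypothetical. Each service owns a private store, callers only go through service methods, and full-data analysis runs on snapshots copied into the lake.

```python
import copy


class MemberService:
    """Hypothetical 'membership center' service that owns its own data."""

    def __init__(self):
        self._members = {}  # private store, never handed out directly

    def create_member(self, member_id, name):
        self._members[member_id] = {"member_id": member_id, "name": name}

    def get_member(self, member_id):
        return copy.deepcopy(self._members[member_id])  # callers get a copy

    def export_snapshot(self):
        return copy.deepcopy(list(self._members.values()))


class PointsService:
    """Hypothetical 'points center' service with a separate store."""

    def __init__(self):
        self._points = {}

    def add_points(self, member_id, amount):
        self._points[member_id] = self._points.get(member_id, 0) + amount

    def export_snapshot(self):
        return [{"member_id": m, "points": p} for m, p in self._points.items()]


def load_into_lake(lake, **services):
    """Copy each service's snapshot into the lake under the given name."""
    for name, service in services.items():
        lake[name] = service.export_snapshot()


def points_by_member_name(lake):
    """Full-data analysis that reads the lake, not the live services."""
    names = {m["member_id"]: m["name"] for m in lake["members"]}
    return {names[p["member_id"]]: p["points"] for p in lake["points"]}


members, points = MemberService(), PointsService()
members.create_member("M1", "Zhang")
points.add_points("M1", 120)
points.add_points("M1", 30)

lake = {}
load_into_lake(lake, members=members, points=points)
report = points_by_member_name(lake)
print(report)
```

Because the report joins data inside the lake, it neither needs access to the services' databases nor adds load to them while it runs.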

4. Data governance architecture based on hybrid cloud

According to the "China Hybrid Cloud Market Survey Report (2018)", hybrid cloud has become the main theme of enterprise cloud migration. The report points out that lower infrastructure investment, together with a degree of business customization and security, are the most important reasons enterprises choose hybrid cloud. Data governance in a hybrid cloud is therefore an issue that enterprises will have to consider in the future.

picture

In a hybrid-cloud data governance model, data resources defined by national and industry standards are gathered into a public data resource pool, deployed in the public cloud, and exposed through API services for enterprises to call. Each API can be regarded as a DSaaS service. To make the most of the public data resource pool, the data can also be opened through OpenAPI for more application developers to use. For enterprises, the essence of data governance is to improve data quality. Since the public cloud holds high-quality standard data, enterprises can use that data internally and integrate public cloud standard data resources into their own data governance. This lowers the cost of managing and maintaining enterprise data and improves its reliability.
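
As a rough illustration of using public standard data inside the enterprise, the Python sketch below validates local records against a standard code table. The `StandardDataApi` class is a stand-in invented for this example: it is an in-memory table so the code runs offline, whereas a real deployment would call a remote API. The record layout and the two sample region codes are likewise only illustrative.

```python
class StandardDataApi:
    """Stand-in for a public-cloud standard data service.

    A real deployment would make a remote call here; this sketch uses an
    in-memory table so that it runs without any network access.
    """

    def __init__(self, code_table):
        self._code_table = dict(code_table)

    def get_name(self, code):
        """Return the standard name for `code`, or None if the code is unknown."""
        return self._code_table.get(code)


def govern_records(records, api):
    """Split local records into cleaned records and rejects using standard data."""
    cleaned, rejected = [], []
    for record in records:
        standard_name = api.get_name(record["region_code"])
        if standard_name is None:
            rejected.append(record)
        else:
            # Replace the locally typed name with the standard one.
            cleaned.append({**record, "region_name": standard_name})
    return cleaned, rejected


# Sample values for illustration only.
region_api = StandardDataApi({"110000": "Beijing", "310000": "Shanghai"})
local_records = [
    {"id": 1, "region_code": "110000", "region_name": "BJ"},
    {"id": 2, "region_code": "999999", "region_name": "Unknown"},
]
cleaned, rejected = govern_records(local_records, region_api)
print(cleaned)
print(rejected)
```

The enterprise maintains no region table of its own in this setup; it only decides what to do with records the standard data cannot confirm.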

5. Data governance system of big data architecture

In the era of big data, data is dispersed throughout the enterprise in structured, semi-structured, unstructured, and various other formats. The volume, variety, and velocity of available data continue to grow at an alarming rate. In addition, the data sources are often not under the control of the teams that need to manage the data. Enterprises face two pressing challenges: how to uncover actionable insights in this data, and how to secure it. Both depend directly on the ability to govern data.

picture

How can efficient data governance be achieved in a big data environment? The picture above shows the big data governance architecture of a telecom company. The data governance platform includes metadata management, data quality management, master data management, data standard management, and data security management. The data structures, quality rules, and data standards of the big data platform are defined in the data governance platform, which is how the big data platform is controlled and governed. The analysis results of the big data platform can in turn be fed back to the data governance platform to form more reliable data services. The relationship between the modules of the big data platform and those of the data governance platform is as follows:

picture
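
The define-then-feed-back loop between the two platforms can be sketched in Python. This is only an illustration under stated assumptions: the rule names, the record fields, the `GovernancePlatform` class, and the 0.9 threshold are all hypothetical and are not taken from the telecom company's system. Rules are defined on the governance side, evaluated against a batch on the data side, and the pass rates are reported back.

```python
# Quality rules as the governance platform might publish them.
# Rule names and fields are hypothetical.
QUALITY_RULES = [
    {"name": "subscriber_id_present", "field": "subscriber_id",
     "check": lambda v: v is not None},
    {"name": "duration_non_negative", "field": "duration_s",
     "check": lambda v: v is not None and v >= 0},
]


def evaluate(rules, rows):
    """Apply each rule to every row and return the pass rate per rule."""
    results = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule["check"](row.get(rule["field"])))
        results[rule["name"]] = passed / len(rows) if rows else 1.0
    return results


class GovernancePlatform:
    """Holds the rules and the quality results reported back by the data platform."""

    def __init__(self, rules):
        self.rules = rules
        self.history = []

    def report(self, dataset, results):
        """Feedback path: the data platform reports its measured pass rates."""
        self.history.append((dataset, results))

    def failing(self, threshold):
        """Return (dataset, rule) pairs whose reported pass rate is below threshold."""
        return [(dataset, rule)
                for dataset, results in self.history
                for rule, rate in results.items()
                if rate < threshold]


platform = GovernancePlatform(QUALITY_RULES)
call_records = [
    {"subscriber_id": "A", "duration_s": 60},
    {"subscriber_id": None, "duration_s": 12},
    {"subscriber_id": "B", "duration_s": 0},
    {"subscriber_id": "C", "duration_s": 5},
]
results = evaluate(platform.rules, call_records)
platform.report("call_records", results)
flagged = platform.failing(threshold=0.9)
print(results, flagged)
```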

4. Summary

Data architecture design defines the blueprint of the enterprise's overall IT system assets and lays the foundation for managing and using enterprise data assets. Data architecture supports data storage, access, integration, and analysis. Its design must consider not only relatively static data, such as metadata, data models, master data, and the standardization of shared data, but also the control and governance of relatively dynamic data, such as transaction data, data transfer, big data, ETL, application access, and the data lifecycle. The design of a data governance architecture should follow the characteristics of the industry and the needs of the enterprise, produce a data architecture that fits the enterprise's needs and development, strengthen the management of data governance, the data lifecycle, and data security, continuously improve data quality, and ensure the reliability of enterprise data assets, so that data becomes a solid foundation of core competitiveness that guides corporate strategic planning and business development.

picture

Author: Shi Xiufeng

An IT veteran who has worked in the data field for more than 10 years. A practitioner of enterprise data governance, data assetization, and data businessization!

