2022 Build an enterprise-level data governance system

Data governance is an essential part of enterprise data construction.

A good data governance system can invigorate the entire data pipeline and make the collection, storage, computation, and use of enterprise data as controllable and traceable as possible.

How do you build an enterprise data governance system? What issues need attention along the way? In short, you can't get fat from a single bite; the road has to be walked one step at a time.

Below, drawing on my own experience, I will walk through the whole process of building an enterprise-level data governance system from zero to one, to help you sort out the main content of data governance and the pitfalls you will encounter along the way.

If there are any omissions, please leave a message in the comment area to discuss.

1 What exactly is data governance?

1.1 A short story

Before the main text, let me introduce a short story.

At the end of the year, Xiao Zhang, the company's financial administrator, needs to take stock of the company's finances. After a busy year, the boss urgently wants to know how the company is doing.

What points should Xiao Zhang consider:

"
  1. What assets does the company currently have?

  2. Where do these assets come from? Where are they used?

  3. Are all assets used in accordance with the rules and policies?

"

Fortunately, Xiao Zhang had already formulated a set of management standards at the beginning of the year. Every asset coming in or going out is recorded, usage is strictly controlled, and the whole process can be traced and reviewed.

In the end, Xiao Zhang received unanimous praise from the leaders.

1.2 What data governance does

In the story, Xiao Zhang supervises all of the company's financial and asset activities to ensure that assets are used in an orderly and efficient way. Data governance plays a similar role for data.

"

The core work of data governance: throughout enterprise data construction, ensure that the enterprise's data assets are managed correctly and effectively.

"

Generally speaking, after data is generated externally or internally, it is processed with big data technologies and then flows to different business endpoints, where it supports the enterprise's upper-layer applications.

The whole process is shown in the figure and consists of the following steps:

  • First, work such as data synchronization brings the data into the big data system

  • Once the data is in, it needs to be managed and stored, that is, a data warehouse is built with reference to modeling theory and actual scenarios

  • The data is then processed through steps such as subject-area planning, dimension definition, and label calculation and output

  • Finally, the data is output to reports and consumed by applications

The data governance system supervises this entire process. How good is the quality of the data entering and leaving the system? Can the data be turned into data assets? Is the data lineage traceable? Is the data secure?

"

Dirty, messy data is unusable and can even become a serious land mine.

"

2 Why do data governance

Some enterprises are vague on this question: they think their current data volume is small enough to manage by hand, so data governance is not needed for the time being.

However, many problems still arise in practice:

  • Data supervision is insufficient, so dirty data appears

  • The data system gradually grows larger, and its management becomes chaotic

  • Data lineage is lost, and legacy data cannot be traced back

Regardless of the scale of data in the enterprise, I think it is still necessary to plan for data governance. Considering the cost, it can be done in stages.

"

Why Data Governance:

  1. Is your data really usable? How are missing values and outliers handled?

  2. Where did the data come from, and has any lineage information been lost?

  3. Is data access secure? Is the data stored in plain text or encrypted?

  4. Which specification does new data processing follow, and is there a standard for managing dimensions and labels?

"

Having a sword in hand and choosing not to use it is very different from having no sword to use. Planning data governance well in advance saves later retrofitting costs and avoids redundant process rebuilds or overhauls.

Data governance effectively ensures that data construction takes place under a reasonable and efficient supervision system, ultimately delivering business data that is high-quality, secure, and traceable through the whole process.

3 Data Governance System

The enterprise data governance system includes data quality management, metadata management, master data management, data asset management, data security, data standards, and so on.

1) Data quality

Data quality is generally measured with standards commonly used in the industry: completeness, accuracy, consistency, and timeliness.

  • Completeness: whether the records and information in the data are complete, with nothing missing

  • Accuracy: whether the information recorded in the data is accurate, with no anomalies or errors

  • Consistency: data shared between multiple business data warehouses must be the same in every warehouse

  • Timeliness: data is produced on time, and alerts are raised in time
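The four dimensions above can be turned into simple measurable checks. The following is only a minimal sketch: the sample rows, the field names, the age range rule, and the deadline are all assumptions made for illustration, not part of any particular platform.

```python
from datetime import datetime

# Hypothetical sample rows; the field names are assumptions for illustration.
rows = [
    {"customer_id": "C001", "age": 34},
    {"customer_id": "C002", "age": None},
    {"customer_id": "C003", "age": 230},
]


def completeness(rows, field):
    """Completeness: share of rows where the field is present and not None."""
    if not rows:
        return 1.0
    return sum(1 for r in rows if r.get(field) is not None) / len(rows)


def accuracy(rows, field, is_valid):
    """Accuracy: share of non-missing values that pass a validity rule."""
    values = [r[field] for r in rows if r.get(field) is not None]
    if not values:
        return 1.0
    return sum(1 for v in values if is_valid(v)) / len(values)


def consistency(warehouse_a, warehouse_b):
    """Consistency: share of shared keys whose values match in both warehouses."""
    shared = warehouse_a.keys() & warehouse_b.keys()
    if not shared:
        return 1.0
    return sum(1 for k in shared if warehouse_a[k] == warehouse_b[k]) / len(shared)


def is_timely(finished_at, deadline):
    """Timeliness: True if the data was produced no later than the deadline."""
    return finished_at <= deadline


age_completeness = completeness(rows, "age")
age_accuracy = accuracy(rows, "age", lambda v: 0 <= v <= 120)
name_consistency = consistency(
    {"C001": "Alice", "C002": "Bob"},
    {"C001": "Alice", "C002": "Bobby", "C009": "Zed"},
)
on_time = is_timely(datetime(2022, 7, 25, 2, 0), datetime(2022, 7, 25, 3, 0))
```

Each function returns a ratio (or a boolean for timeliness), so the results can be compared against thresholds that the governance team agrees on.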

2) Metadata management

Metadata is information about the organization of data, data domains and their relationships. In general, metadata is data that describes data.

Metadata comprises technical metadata and business metadata. It helps data analysts understand clearly what data the enterprise has, where it is stored, and how to extract, clean, and maintain it, that is, the data lineage.

  • Help build a business knowledge system and establish the interpretability of data business meaning

  • Improve data integration and traceability capabilities, and maintain lineage relationships

  • Establish a data quality audit system, classify management and monitoring
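To make the idea of traceable lineage concrete, here is a small sketch that stores "table A feeds table B" edges and walks them backwards. The `LineageGraph` class and the table names (`ods_orders`, `dwd_orders`, and so on) are hypothetical and chosen only for illustration; real metadata systems usually collect these edges automatically from job definitions.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Minimal lineage store: records which tables feed which other tables."""

    def __init__(self):
        # Maps each table to the set of tables it is directly derived from.
        self._upstream = defaultdict(set)

    def add_edge(self, source, target):
        """Record that `target` is produced from `source`."""
        self._upstream[target].add(source)

    def trace_upstream(self, table):
        """Return every table that `table` depends on, directly or indirectly."""
        seen = set()
        queue = deque([table])
        while queue:
            current = queue.popleft()
            for parent in self._upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


graph = LineageGraph()
graph.add_edge("ods_orders", "dwd_orders")
graph.add_edge("dwd_orders", "dws_sales")
graph.add_edge("ods_customers", "dws_sales")
graph.add_edge("dws_sales", "ads_report")

report_sources = graph.trace_upstream("ads_report")
```

With edges like these in place, a question such as "where did this report's data come from?" becomes a graph traversal instead of an archaeology project.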

3) Master data management

Enterprise master data refers to the consistent, shared business entities within the enterprise; in plain terms, it is the data shared across the group's specialized companies and business systems.

Common master data includes company employees, customer data, organization information, supplier information, and so on. This data is authoritative and global, and can be regarded as part of the company's corporate assets.

Master data management generally follows these points:

  • Manage and supervise the access to master data of various organizations, subsidiaries and departments, and formulate access specifications and management principles

  • Conduct regular master data assessments to determine progress toward established goals

  • Organize relevant personnel and institutions to unify and improve the construction of master data

  • Provide technical and business process support, and concentrate the efforts of the whole group

4) Data asset management

Enterprises generally take stock of their data assets during digital transformation. Is your data being used properly? How can it generate maximum value? These are the core concerns of data asset management.

When building enterprise data assets, the business perspective and the technical perspective are generally both considered and then merged to produce a unified data asset analysis, and a unified data asset query service is provided externally.

The goal is to put data to work, form data assets, and provide a complete panoramic view of them, so that operators can keep track of enterprise assets at a global, macro level.

5) Data security

Data security is an indispensable part of enterprise data construction. Our data sits on disks large and small, and query and computing services of varying scope are exposed externally.

It is necessary to regularly audit the data, encrypt sensitive fields, and control access permissions to ensure that the data can be used safely.
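As a rough illustration of protecting sensitive fields and controlling access, the sketch below masks a phone number, pseudonymizes an ID with a keyed hash, and filters fields by role. Note the substitution: it uses HMAC-SHA256, which is one-way keyed hashing, as a stand-in for encryption; a real deployment would use a vetted encryption library and a managed key. The key, field names, and roles are all assumptions.

```python
import hashlib
import hmac

# Assumption: in practice the key would come from a key management service.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical role-to-field permissions.
ROLE_PERMISSIONS = {
    "analyst": {"name", "city"},
    "risk_officer": {"name", "city", "phone", "id_card"},
}


def pseudonymize(value, key=SECRET_KEY):
    """Replace a sensitive value with a keyed hash (one-way, not decryptable)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_phone(phone):
    """Keep the first 3 and last 4 characters and mask everything in between."""
    if len(phone) <= 7:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def read_record(record, role):
    """Return only the fields the given role is allowed to see."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}


record = {"name": "Zhang San", "city": "Beijing", "phone": "13812345678", "id_card": "X123"}
masked_phone = mask_phone(record["phone"])
analyst_view = read_record(record, "analyst")
```

An unknown role gets an empty result here, which follows the usual "deny by default" choice for access control.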

6) Data Standards

In plain terms, we need to define a set of norms about data within the organization, so that everyone understands what a given piece of data means.

Today Zhang San says the customer number identifies a customer who has applied for a bank card, and tomorrow Li Si says it identifies a customer who has taken out a loan. On comparison, the field types and lengths of the two are the same. Which definition should be adopted?

Data standards are normative constraints that guarantee consistent and accurate use and exchange of data inside and outside the organization; they unify specifications and eliminate ambiguity.
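One way to picture a data standard is as a single registry entry per field that fixes its type, length, and business meaning. The sketch below is hypothetical: the `FieldStandard` class, the `customer_no` entry, and the definition chosen for it are assumptions for illustration, not a decision about which opinion in the story is right.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldStandard:
    """One agreed definition for a field: type, length, and business meaning."""
    name: str
    data_type: type
    max_length: int
    definition: str


# Hypothetical registry; the chosen definition is only an example.
STANDARDS = {
    "customer_no": FieldStandard(
        name="customer_no",
        data_type=str,
        max_length=16,
        definition="Identifier of a customer who has applied for a bank card",
    ),
}


def check_value(field_name, value):
    """Return a list of violations of the registered standard (empty if none)."""
    standard = STANDARDS.get(field_name)
    if standard is None:
        return [f"{field_name}: no standard registered"]
    problems = []
    if not isinstance(value, standard.data_type):
        problems.append(f"{field_name}: expected {standard.data_type.__name__}")
    elif isinstance(value, str) and len(value) > standard.max_length:
        problems.append(f"{field_name}: longer than {standard.max_length} characters")
    return problems


ok_result = check_value("customer_no", "C0001")
type_result = check_value("customer_no", 12345)
```

Because the business definition lives next to the type and length, two fields that merely look alike can no longer be silently treated as the same thing.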

4 Enterprise Data Governance Implementation Process

4.1 Data Governance Implementation Framework

A data governance system is an organization, process and tool established to standardize various management tasks and activities in business data specifications, data standards, data quality and data security.

Through a standing data governance organization, establish a long-term mechanism for centralized data management, standardize the data management and control process, improve data quality, promote consistent data standards, and ensure that data is shared and used securely, thereby improving the operational efficiency and management level of the enterprise.

4.2 Data Governance Organizational Structure

Beyond the technical implementation architecture, the enterprise data governance system also needs the support of a management organizational structure.

Generally, in the early stage of data governance construction, the group first establishes a data governance management committee. From top to bottom, it consists of a decision-making layer, a management layer, and an execution layer. The decision-making layer makes decisions, the management layer formulates plans, and the execution layer implements them, giving hierarchical management and unified coordination.

4.2.1 Organizational Structure

1) Decision-making layer

Provides the decision-making function for data standard management; in plain terms, it gives final approval to plans.

2) Management layer

  • Review policies related to data standard management

  • Discuss and decide on difficult cross-department disputes about data standard management

  • Manage major data standard matters and submit them to the Information Technology Management Committee for consideration

3) Execution layer

  • Business department: responsible for the formulation, modification, review of data standards for business lines, promotion and implementation of data standards, etc.

  • Technology development: undertake the implementation of governance platforms, data standards, data quality, etc.; follow data standards in system design and development

  • Technology operation: responsible for the formulation of technical standards and technology promotion

4.2.2 Management responsibilities

1) Project Manager

  • Determine project goals, scope and plan

  • Develop project milestones

  • Manage cross-project collaboration

2) Expert review team

Review the project plan and determine the rationality of the plan

3) PMO

  • Ensure projects are executed as planned

  • Manage major project risks

  • Execute cross-project collaboration and communication

  • Organize key project reviews

4) Data Governance Special Group

Carry out the implementation and operational rollout of each project, and drive data governance technology delivery and project progress at the execution layer.

4.2.3 Execution layer responsibilities

Data architects, data governance experts and business specialists form an "iron triangle" of data governance and work closely to promote data governance and data architecture implementation.

1) Business Specialist

As the business department's point of contact for data governance, the business specialist organizes business personnel to carry out work in areas such as standards, quality, and application.

  • Define data rules

  • Guarantee data quality

  • Raise data requirements

2) Data governance experts

As members of the data governance team, data governance experts are responsible for designing the data architecture and operating data assets; they lead business and IT in achieving data governance goals.

  • Build the logical data model

  • Monitor data quality

  • Operate data assets

3) Data Architect

As an expert in the IT development department, data architects are responsible for the implementation of data standards and models, and assist in solving data quality problems.

  • Implement data standards

  • Implement logical models

  • Implement physical models

4.3 Data Governance Platform

Once the technical implementation plan and the organizational management structure are settled, the next step is to put the data governance system into practice.

In large enterprises, a complete data governance platform is generally developed to provide these functions as a platform service.

1) Core functions

As a product system of data governance, the data governance platform aims to ensure that the data of the data platform is safe, reliable, standard and valuable.

  • Data asset management: provides user-oriented, scenario-based search and a panoramic data asset map, making it quick to find and analyze assets

  • Data standard management: unifies custom data standards, covering fields, code values, and the data dictionary, and ensures that business data and middle-platform data follow the same standards

  • Data quality monitoring: provides a data quality system covering before, during, and after the event, and supports quality monitoring rule configuration, alarm management, and other functions

  • Data security: provides data masking, security classification, and monitoring

  • Data modeling center: unified modeling, providing business system modeling and model management

2) Metadata management

As the front-end portal of the data governance platform, the metadata management system enables fast retrieval of data assets and improves the effectiveness and efficiency of data use.

By establishing a complete and consistent metadata management strategy, it provides centralized, unified and standardized metadata information access, query and invocation functions.

3) Data quality

  • Data quality monitoring: all users can configure data quality monitoring rules

  • Rule blocking: configure blocking rules for data quality monitoring. If a data quality deviation occurs, downstream jobs can be blocked in real time, stopping incorrect results from propagating along the pipeline.

  • Alarm: if data quality deviates beyond a preset threshold, an alert is sent promptly so that the problem can be fixed in time.
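The monitoring, blocking, and alarm behavior described above can be sketched as a small "quality gate" that runs before downstream jobs. Everything here is hypothetical: `QualityRule`, `QualityGateError`, `run_quality_gate`, and the `non_null_ratio` helper are names invented for this sketch, and the `notify` callback stands in for whatever alert channel a real platform uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualityRule:
    """A configurable monitoring rule with a threshold and a blocking flag."""
    name: str
    check: Callable[[list], float]  # returns a score between 0 and 1
    threshold: float
    blocking: bool


class QualityGateError(Exception):
    """Raised when a blocking rule fails, so downstream jobs do not run."""


def non_null_ratio(field):
    """Build a check that scores the share of rows with a non-null field."""
    def check(rows):
        if not rows:
            return 1.0
        return sum(1 for r in rows if r.get(field) is not None) / len(rows)
    return check


def run_quality_gate(rows, rules, notify=print):
    """Evaluate every rule; alert on deviations and stop on blocking failures."""
    alerts = []
    for rule in rules:
        score = rule.check(rows)
        if score < rule.threshold:
            message = f"{rule.name}: score {score:.2f} below threshold {rule.threshold:.2f}"
            alerts.append(message)
            notify(message)  # alarm: send an early warning
            if rule.blocking:
                raise QualityGateError(message)  # rule blocking
    return alerts
```

A scheduler would call `run_quality_gate` after a table is produced and only trigger downstream jobs if no `QualityGateError` is raised; non-blocking rules just generate alerts.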

4) Data Standards

Support a customizable, unified data standard platform covering field standard management, code value standard management, and dictionary management, so that business source data and middle-platform data follow the same standards.

5) Data security

Based on the group's data assets, the platform implements tiered data security management and automatically identifies security-related information; it also monitors data access behavior so that access risks are identified in a timely manner.

4.4 Data Governance Assessment

After the development and operation of the data governance platform is completed, it is necessary to verify and evaluate the effect of the overall data governance system.

"

1) Whether "dirty, messy, and poor" data has been eliminated
2) Whether the value of data assets is maximized
3) Whether the lineage of all data is complete and traceable...

"

1) Data assets

By building a data asset management system, it can achieve full coverage of assets, and support global search and accurate positioning of target assets.

  • Realize global search and provide users with scene-based retrieval services

  • Supports multiple retrieval dimensions such as labels, data maps, table names and field names

  • Support filtering results by data map and by source business data dictionary

  • For example, support PV/UV user search and asset display, and clarify service goals

2) Data Standards

By consolidating old and new data standards, the data modeling tool, the data standard library, and the word-root standard library are connected, and data standards and word roots are put into practice.

  • Achieve 100% alignment with the data standard library

  • Intelligent identification of data standards and references

  • The client synchronously updates data standards and word roots

3) Data security

Follow the principles of establishing policies beforehand, applying technical controls during the process, and monitoring and auditing afterward, to build a whole-process data security management and control system.

Based on the above data security management and control system, it supports data security grading and builds a flexible data security sharing process.

4) Data quality

Through the data quality radar chart, the data and task quality are scored regularly to comprehensively inspect the data quality effect.

  • Data integrity: check whether the data item information is comprehensive and complete, with nothing missing

  • Alarm response level: daily management, emergency response, impact reduction; avoid data damage and loss

  • Monitor coverage: ensure data adheres to consistent data standards and specifications

  • Job stability: monitor job stability and whether there are any problems such as job anomalies

  • Job timeliness: Check whether the acquisition of data item information corresponding to the task meets the expected requirements
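The five dimensions above are the axes of the radar chart. As a rough sketch of how regular scoring might work, the code below combines per-dimension scores into one overall score and finds the weakest dimension. The 0 to 100 scale, the equal weights, and the sample scores are assumptions; the article does not prescribe a scoring formula.

```python
# Assumption: each dimension is scored from 0 to 100 and weighted equally.
DEFAULT_WEIGHTS = {
    "data_integrity": 1.0,
    "alarm_response": 1.0,
    "monitor_coverage": 1.0,
    "job_stability": 1.0,
    "job_timeliness": 1.0,
}


def overall_score(scores, weights=DEFAULT_WEIGHTS):
    """Weighted average of the per-dimension scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def weakest_dimension(scores):
    """Name of the dimension with the lowest score, the first one to improve."""
    return min(scores, key=scores.get)


# Hypothetical scores for one scoring period.
period_scores = {
    "data_integrity": 90,
    "alarm_response": 80,
    "monitor_coverage": 70,
    "job_stability": 100,
    "job_timeliness": 60,
}

total = overall_score(period_scores)
focus = weakest_dimension(period_scores)
```

Tracking both numbers over time shows whether governance is improving overall and where the next round of effort should go.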

5 Misunderstandings of Data Governance

1) Whether data governance should be done comprehensively

"

This is a classic question. The extent to which data governance is implemented generally differs for enterprises at different stages and of different sizes. It is generally best to proceed in stages according to the current state of your own data, avoid blindly expanding the scope, and adjust along the way.

"

2) Data governance is purely a technical matter

"

As mentioned in the article, data governance is not just a matter for the technical team; it requires the collaboration of the entire group, including the various business lines and other management organizations. Without a good implementation plan and collaboration mechanism, you often get half the result with twice the effort.

"

3) Data governance can be effective in the short term

"

Data governance is a long-term process that is adjusted in step with changes in the scale of enterprise data and in data warehouse planning. Some functions may show results in the short term, but a complete system is difficult to achieve quickly.

"

4) There must be a tool platform to carry out data governance

"

As the saying goes, to do good work you must first sharpen your tools. Good tools certainly help, but only if you already have a mature data governance plan and strategy. The tools and technologies on the market are already very mature, so lay the theoretical groundwork first.

"

5) Data governance feels vague, and the final result is unclear

"

Data governance is a long-term task that requires practitioners to build and adjust it according to the enterprise's current data status and management model. It is recommended to review and summarize as you practice. Moving forward in small, quick steps is a good approach.

"

Origin blog.csdn.net/ytp552200ytp/article/details/125987322