Cai Chunjiu: How to build master data standardization

The first session of Yixinhuachen's "2023 Digital Empowerment Season", devoted to master data management, was held successfully. We invited Cai Chunjiu, an expert on data standardization and governance in China, to cover master data management from theory down to tooling. The talk was substantive and in-depth throughout. The full text of the speech follows.

Cai Chunjiu: Data asset expert of China Big Data Technical Standards Promotion Committee, member of the Standing Committee of Enterprise Information Standardization Committee of China Electronics Standardization Association, and founder of China Data Craftsman Club.

My topic tonight is "How to Build Master Data Standardization". It covers the challenges and trends of master data management in China, the "two systems and one platform" of master data management, and the implementation method and difficulty analysis of master data governance projects.

Common issues and challenges in master data management

First, let's look at the challenges and trends of master data management in China. Many large domestic conglomerates have been working on informatization for more than 10 years. Along the way they inevitably run into data quality problems: data is hard to find, incomplete, and inconsistent, data standards are missing, and the threshold for using data is high. The data accuracy problems that float on the surface are rooted in deeper data governance problems hidden underwater, and these are what really constrain scientific decision-making and improvements in business management.

Next, I will start with five common problems.

① The first is missing information, for example missing product, customer, or industry information. This leaves our records incomplete, distorts risk management and control, and makes data analysis and business decisions harder.

② The second is inconsistent definitions. The same business concept is stored in different forms and under different conceptual categories in different systems, and the same master data is maintained by multiple parties. This causes data quality problems such as one name carrying different meanings and one meaning appearing under different names.

③ The third is scattered data. In a large conglomerate, for example, customer information is spread across multiple business systems, and business data is spread across the nodes of a process. As a result, we lack comprehensive control over the data and cannot form a 360-degree portrait of the customer, and maintaining the same data in several places leads to data conflicts.

④ The fourth is duplicated information. One customer corresponds to multiple customer records, which leads to statistical errors, business indicators that cannot be synchronized accurately, promptly, and completely, and difficulty in confirming the authoritative data source.

⑤ The fifth is information islands. Data in the business systems of different departments, branches, and regions is stored in layers and scattered, so business data is segregated and hard to aggregate and circulate. These are some of the common problems we have encountered throughout the informatization process.

Moreover, we lack an enterprise-level perspective on data standards. Take an I-beam as an example: the engineering department, the purchasing department, and the equipment department describe it in subtly different ways. As a result, the material may already be in the warehouse, but it cannot be automatically consolidated during material procurement, which indirectly drives up inventory costs and causes great trouble for statistical reports and analysis. The underlying reason is the lack of enterprise-level data standards, which leaves no basis for sharing across departments, organizations, business units, and sectors. In addition, without the support of dedicated data tools, data quality cannot be guaranteed, and scattered data is hard to manage and plan in a unified way.
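To make the I-beam problem concrete, here is a minimal Python sketch of how subtly different descriptions could be reduced to one comparable key. The three descriptions, the department names, and the normalization rules are all invented for illustration; a real project would rely on agreed description standards rather than string cleanup alone.

```python
import re


def normalize_description(text: str) -> str:
    """Reduce a free-text material description to a comparable key."""
    text = text.lower()
    # Treat the common "by" separators in dimensions as the same character.
    text = text.replace("×", "x").replace("*", "x")
    # Drop whitespace and punctuation that carry no meaning here.
    return re.sub(r"[\s\-_,;/]+", "", text)


# Hypothetical descriptions of the same I-beam from three departments.
descriptions = {
    "engineering": "I-Beam 200×100 Q235",
    "purchasing": "i beam 200*100 q235",
    "equipment": "I_BEAM 200x100, Q235",
}

keys = {dept: normalize_description(d) for dept, d in descriptions.items()}
same_material = len(set(keys.values())) == 1
print(keys, same_material)
```

Once all three descriptions collapse to the same key, the warehouse stock and the purchase request can be recognized as the same material.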

Let's look at four common challenges in master data management. The first is that enterprises do not pay attention to the overall planning of master data and lack top-level design. The second is an objective one: master data based on common standards, such as international, national, and industry standards, is often held at the national level under decentralized management, and there are no convenient, reliable channels for obtaining it, so the data is hard to get. The third is internal to the enterprise: master data is already managed in a decentralized way inside the enterprise, without unified standards or data associations. The fourth exists mainly in large group enterprises: because they have many systems built over a long span of time, some early systems have a low degree of data standardization and are difficult and costly to transform, which makes integrating master data applications very hard.

Those of us who do enterprise data management also run into some common problems. The first is "two layers of skin". Many standards are never really implemented in management; in practice they are usually shelved. For example, many coding standards and master data standards only surface in annual summary reports or external audits. The second is "half-cooked rice": the standards are out of touch with how the enterprise is actually managed and are hard to operate, so the management and operations layers are at a loss and the standards can hardly guide informatization work. The third is "stand aside". Data governance and master data management are "important in words, secondary in action, and dropped when things get busy". Under the pressure of "tight schedules and heavy tasks", standardized management often has to make way for business systems, which holds back the standardized management of the enterprise. "Two layers of skin", "half-cooked rice", and "stand aside" are a true portrayal of the difficulties we data managers encounter.

Terms and definitions related to master data

Next, a brief introduction to some master data terms. As we all know, master data is the basic information that meets the needs of cross-departmental business collaboration and reflects the state attributes of core business entities. Put simply, data shared by two or more systems is called master data. Compared with transaction data, master data has relatively stable attributes and higher accuracy requirements. Master data has three characteristics: accuracy, uniqueness, and consistency across heterogeneous systems.

Master data has 5 distinct characteristics, which we call 5 transcendences:

① Beyond the department. Master data exists to meet the needs of cross-departmental business collaboration. It is the data that all functional departments need when conducting business, the "greatest common denominator" of data for all functional departments and their business processes.

② Beyond the process. Master data does not depend on any specific business process, yet all major business processes require it. The core of master data is to reflect the state attributes of objects. It does not change with a specific process; it is the invariant element throughout the whole process.

③ Beyond the theme. Master data is the core information about business entities. It does not depend on a specific business subject but serves all business subjects.

④ Beyond the system. The master data management system is the foundation of information system construction and should remain relatively independent. It serves the other business information systems but sits above them. Therefore, master data should be managed in a centralized, systematic, and standardized way. At present, many master data tools in our industry are part of the data platform, and I think this is understandable. On the one hand, the tool guarantees data consistency and uniqueness for front-end business systems; on the other hand, it provides master data services to the data platform, data warehouses, and data centers.

⑤ Beyond technology. Master data must rely on technology that is compatible with all kinds of heterogeneous systems. In this sense, service-oriented architecture (SOA) provides an effective tool for implementing master data.

Take an organization as an example. At the top is the basic view, which includes the unit name, nature, mailing address, and so on. The basic view is relatively generic. For internal units, there are also fields that the human resources department cares about: the personnel view contains management levels, personnel unit levels, and so on. For the finance department, the financial view includes holding ratios, business segments, and so on.
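The idea of one organization record with several department views can be sketched in Python as follows. The field names come from the examples in the talk, but the record layout, the sample values, and the `view_for` helper are hypothetical.

```python
# One organization record split into a shared basic view and department views.
# All values are made up for illustration.
ORG_RECORD = {
    "basic": {
        "unit_name": "Example Power Co.",
        "nature": "subsidiary",
        "mailing_address": "(example address)",
    },
    "personnel": {"management_level": 2, "personnel_unit_level": "secondary"},
    "finance": {"holding_ratio": 0.51, "business_segment": "new energy"},
}


def view_for(record: dict, department_view: str) -> dict:
    """Return the basic fields plus the fields of one department view."""
    merged = dict(record["basic"])
    if department_view != "basic":
        merged.update(record[department_view])
    return merged


finance_view = view_for(ORG_RECORD, "finance")
print(finance_view)
```

Each department sees the common fields plus its own, while the basic fields are maintained only once.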

In addition to associations between master data objects, there are also hierarchical relationships within master data. For example, materials are divided into large, medium, and small categories, and the organizational structure runs from the company to the office to the position. These are all hierarchical relationships inside master data, and the hierarchies themselves are also managed as master data.
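A hierarchy such as large, medium, and small material categories can be held as parent links. The following Python sketch assumes invented category codes and names; only the three-level structure comes from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    code: str
    name: str
    parent: Optional[str] = None  # code of the parent category, None at the top


# Hypothetical three-level material classification.
CATEGORIES = {
    c.code: c
    for c in [
        Category("10", "Electrical materials"),
        Category("1001", "Cables", parent="10"),
        Category("100101", "Power cables", parent="1001"),
    ]
}


def category_path(code: str) -> list:
    """Walk the parent links and return names from large to small category."""
    path = []
    current = code
    while current is not None:
        category = CATEGORIES[current]
        path.append(category.name)
        current = category.parent
    return list(reversed(path))


print(category_path("100101"))
```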

Here is a specific application scenario. Consider a piece of equipment in an industrial enterprise. Its KKS code information includes the safety level, installation location, quality assurance level, and so on. From the perspective of the material supply chain, it involves information such as the specification, model, and material. From the perspective of the individual piece of equipment, it has a purchase time, purchase value, manufacturer, serial number, and so on. The same thing may be a different data object in different application scenarios. Linking these objects through association relationships improves the efficiency of master data maintenance and reduces repeated manual entry and redundant storage, which goes beyond simply turning manual forms into electronic ones.

Master Data Management System

Master data management involves two systems and one platform. Let's first talk about the management system of master data, which consists of the following three parts.

① Master data standard system: this is the top priority of master data management. It includes master data business standards (coding rules, classification rules, description rules, etc.), master data model standards, and the set of code tables derived from them.

② Master data guarantee system: it involves five parts: master data management organization, policies, processes, application management, and evaluation.

③ Master data management tools: these include functions such as data modeling, data integration, data management, data services, basic management, and standards management.

Let's focus on the master data standard system. It has three parts. The first is master data application standards and specifications, such as coding rules, classification standards, naming conventions, master data models, and guidelines for submission and review. The second is master data management standards and specifications, which cover the organization, the policies, and the standardized management processes for master data. The third is master data integration service standards and specifications, mainly master data format specifications, original system access specifications, and so on.

In a group enterprise, there are many types of master data. At the bottom is the general basic class, also called reference data, such as administrative divisions, currencies, and languages. Large groups generally have about 40 or 50 types of general basic data. Most of it follows national and industry standards and changes slowly, if at all. The group also has a large amount of human resources, finance, and merchant data, as well as diversified sectors such as new energy, real estate, and finance, and some sectors have data specific to their own fields. So we first need to sort out a master data asset catalog, so that master data can be promoted and applied across the headquarters and each professional sector.

There are five main categories of application standards for master data. The first is classification standards; we generally classify by natural attributes. The second is coding rules; we generally recommend serial (sequential) codes. The third is naming conventions; every type of data object in master data has one. This is very complicated in industrial enterprises, and I will introduce it in detail later. The fourth is the data model, which defines the fields that the master data has in the master data system. The fifth is the submission and review guidelines, which guide how master data is filled in and reported. Some of these five types of standards are simple and some are complex. For a relatively simple master data object, they can be combined into one document; for a more complicated one, each needs its own document.
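The serial-code recommendation can be illustrated with a small Python generator. The prefix, the code width, and the class name are assumptions for this sketch; the talk only states that serial codes are generally recommended.

```python
import itertools


class SerialCodeGenerator:
    """Issue fixed-width serial codes that carry no business meaning."""

    def __init__(self, prefix: str = "M", width: int = 8, start: int = 1):
        self._prefix = prefix
        self._width = width
        self._counter = itertools.count(start)

    def next_code(self) -> str:
        # Zero-pad the running number so every code has the same length.
        return f"{self._prefix}{next(self._counter):0{self._width}d}"


generator = SerialCodeGenerator()
first = generator.next_code()
second = generator.next_code()
print(first, second)
```

Because the code itself encodes nothing about category or attributes, it does not need to change when a classification is later adjusted; the classification lives in separate fields.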

Let's take asset-intensive industries as examples, such as energy, electricity, petroleum and petrochemicals, and mining. Follow a data object through its whole life cycle. An item designed by the engineering design department in an engineering project carries an engineering material code. It is handled under a material code when it reaches the purchasing department, the legal department, and the warehousing department. It also has a fixed asset code, and in the production and operation stage, which involves equipment management and spare-parts requirements, it is handled under a material code again. Across this company-level structure there are engineering material codes, material codes, codes for the entire engineering data, and equipment codes; if equipment is installed in different locations, KKS codes are involved as well, and there are also fault codes and so on. There are many types of codes. In management we therefore talk about "three codes in one" and "six codes in one". "Six codes in one" brings in fault codes and KKS codes as well and builds the six core codes into one correlation system, so that the master data can interoperate.
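One way to picture a code correlation system is a link table that ties every code of an object to one internal object identifier. The Python sketch below is only an assumption about how such a table might look; the code types, code values, and the `CodeLinkTable` class are invented, and the talk does not enumerate exactly which six codes are linked.

```python
from collections import defaultdict
from typing import Optional


class CodeLinkTable:
    """Link the different codes of one physical object to each other."""

    def __init__(self):
        self._by_object = defaultdict(dict)  # object_id -> {code_type: code}
        self._index = {}                     # (code_type, code) -> object_id

    def link(self, object_id: str, code_type: str, code: str) -> None:
        self._by_object[object_id][code_type] = code
        self._index[(code_type, code)] = object_id

    def translate(self, code_type: str, code: str, target_type: str) -> Optional[str]:
        """Look up the code of another type for the same object, if linked."""
        object_id = self._index.get((code_type, code))
        if object_id is None:
            return None
        return self._by_object[object_id].get(target_type)


# Hypothetical codes for a single pump.
links = CodeLinkTable()
links.link("obj-1", "material", "M00000017")
links.link("obj-1", "fixed_asset", "FA-2023-0042")
links.link("obj-1", "equipment", "EQ-000311")
links.link("obj-1", "kks", "1LAC10AP001")

print(links.translate("kks", "1LAC10AP001", "fixed_asset"))
```

With such a table, a system that only knows the KKS code can still find the fixed asset or material record for the same object.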

Next, let's look at the differences between fixed assets, equipment, and materials. Fixed assets are viewed from a financial perspective. Equipment is usually viewed from a production perspective. Materials are viewed mainly from the perspective of purchasing, sales, production, and maintenance. The same object may need different codes at different links of the supply chain, and the relationships between those codes need to be established.

Materials are the most complex master data in industrial enterprises. Material data standards have four parts: material classification, description rules, coding standards, and submission guidelines.

In large manufacturing enterprises, materials are generally divided into large, medium, and small categories and compiled into a classification book. All functional departments need to run their statistical analysis against this unified classification by natural attributes. Classification is particularly important: changing one part often affects the whole, so once it is settled it generally cannot be adjusted lightly.

The description rule (naming convention) of master data means splitting and describing materials by their natural attributes. Take a cable as an example. Its natural attributes include name, combustion characteristics, voltage level, and so on. We can split it according to the national standard and then produce a structured material description template with a unified format, so that naming does not vary from person to person. Once the attributes in the template are filled in, the system automatically generates a standardized name, and this name helps ensure the uniqueness and accuracy of the material.
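A structured description template can be sketched in Python as an ordered list of attributes from which the name is generated. The three attributes are the ones mentioned for cables in the talk; the attribute values, the separator, and the `build_description` helper are assumptions for illustration.

```python
# Ordered attributes of the (hypothetical) cable description template.
CABLE_TEMPLATE = ("name", "combustion_characteristic", "voltage_level")


def build_description(template: tuple, attributes: dict) -> str:
    """Generate a standardized name from attributes, in template order."""
    missing = [field for field in template if not attributes.get(field)]
    if missing:
        raise ValueError(f"missing attributes: {', '.join(missing)}")
    return " ".join(str(attributes[field]).strip() for field in template)


# The order in which a user enters attributes does not affect the result.
description = build_description(
    CABLE_TEMPLATE,
    {
        "voltage_level": "0.6/1kV",
        "name": "Power cable",
        "combustion_characteristic": "flame-retardant",
    },
)
print(description)
```

Because the system assembles the name, two people entering the same attributes always get the same description, which makes duplicates much easier to detect.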

The management standards and specifications of master data cover the organization and policies of master data management, the master data management processes, master data application management, and master data management evaluation. These policies and processes provide an effective basis and guidance for carrying out master data management and are an important guarantee for its operation. Sound policies and processes = the correct method + guaranteed implementation.

All companies that do master data well have dedicated master data roles, such as experts, reviewers, a standards group, and a quality group, to ensure that the full set of master data standards can operate normally.

Integration service standards for master data are also very important. Because master data needs to provide shared services for all systems, all users, and all business departments, it involves the standardized format of master data, selection criteria for integrated data, and so on. Clarifying master data integration service standards can ensure that master data can provide better services.

Master Data Operating System

Next, let's introduce the master data operation system. The operation system involves the establishment of the management organization, system, process and knowledge base of master data, including the master data management platform.

Large conglomerates generally need to establish an organizational guarantee system with two levels of maintenance, "headquarters" and "subgroups/professional fields", and assign a leading business department to each important type of data, so that the standards stay "fresh". Here is an example. A user applies for a master data code; the business department conducts a preliminary review through the master data management platform, and then a professional team conducts a professional review. Different data objects are of course matched with different professional review teams. After both approvals, the code enters the master data code library and is served to business systems in various forms. We therefore need to build this kind of part-time or full-time team in the enterprise and establish this operating mechanism to ensure the continuity of master data management. In addition, maintenance rules for master data must be formulated to keep the data running normally.
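The two-stage review described here can be sketched as a small state machine in Python. The status names and the `review` function are hypothetical; only the sequence of application, preliminary review by the business department, professional review, and entry into the code library comes from the talk.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    PRELIMINARY_APPROVED = "preliminary_approved"
    PUBLISHED = "published"  # entered into the master data code library
    REJECTED = "rejected"


# Where an approval at each review stage moves the application.
NEXT_ON_APPROVAL = {
    Status.SUBMITTED: Status.PRELIMINARY_APPROVED,  # business department review
    Status.PRELIMINARY_APPROVED: Status.PUBLISHED,  # professional team review
}


def review(status: Status, approved: bool) -> Status:
    """Apply one review decision and return the new status."""
    if status not in NEXT_ON_APPROVAL:
        raise ValueError(f"{status.value} applications cannot be reviewed")
    return NEXT_ON_APPROVAL[status] if approved else Status.REJECTED


after_first = review(Status.SUBMITTED, approved=True)
after_second = review(after_first, approved=True)
print(after_first.value, after_second.value)
```

Only applications that pass both stages reach the published state, so business systems never receive a code that skipped a review.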

Master Data Management Platform

Traditional master data management tools include functions such as master data collection, model coding management, and distribution services.

Taking a group company as an example, the service architecture of master data includes the general basic domain, the financial domain, the human resources domain, and so on, as well as master data in professional fields such as real estate and finance. All of these can be shared externally as services through the data service platform by means of APIs. Some of our master data comes from business systems, and some may come from external data. For example, the merchant database can be checked against Qichacha and Tianyancha. The master data platform has built-in interfaces. If a business system wants to use master data, it must first call the service through these interfaces, which ensures that data is managed globally in the master data system. The application and approval function then ensures the consistency, uniqueness, and accuracy of master data in the business systems.

At present, with the spread of big data applications, managing only static fields may no longer meet the needs of business departments. The new-generation master data management platform manages not only static fields but also unstructured data, semi-structured data organized around data objects, and related internal and external data. In other words, we use master data as the starting point for big data analysis and use big data scenarios to match multiple data domains and extract business insights.

Multi-dimensional management of master data may gradually blur the original concept of static management. Let's take the well/wellbore in the petrochemical industry as an example to see the new master data management.

The whole life cycle of well data includes well deployment design, pre-drilling engineering, drilling engineering, and so on. Drilling and logging produce videos, pictures, and drilling data, as well as a large amount of document data. We can bring all of this together through semantic recognition. In this way, we can see the whole life cycle of the wellbore master data from development to retirement, which is of real help to the business. If you only care about a code and a name, the business value may not show. Future master data must develop in this direction so that it can better solve business pain points instead of just managing static data.

The master data service needs to introduce a data service bus and a microservice gateway, and to publish the finished master data to an API gateway. Master data services include query, quick addition, operation and maintenance, in-depth analysis, and so on. With comprehensive analysis across structured, semi-structured, and unstructured data, users can find the master data objects they need at a glance.

At present, many Fortune 500 companies in China have started doing master data again, because their original tools can no longer meet their needs. Master data construction can be built on this kind of data integration platform.

Implementation method and difficulty analysis of master data governance project

Let me share how to implement master data and where the difficulties lie.

We divide master data implementation into 7 stages and about 28 steps.

One of the harder parts is the research and analysis of the current state of master data. The specific master data requirements have to be determined through business research and IT research. The construction of the master data standard system in the third stage accounts for about 30% of the total workload. Master data standards are dynamic rather than static: as the granularity of master data management becomes finer, the standards are adjusted accordingly. Master data cleaning takes up a large share of the whole process, about 40%. Once the master data standards are established, the data in our business systems has to be cleaned. After that, the tool platform is integrated with all business systems in the form of services. The last stage is building the master data operation system. Establishing data standards and cleaning data can be completed in about half a year, but no company starts from a blank sheet of paper, and different companies have progressed to different degrees. Switching between new and old systems may take three to five years or even longer, and this is the greatest difficulty and risk. There is therefore no turning back once a master data project is launched: the first, second, and third phases may run for many years.

Putting master data standards into practice is also relatively complicated, especially when an enterprise has many overlapping systems. For systems under construction or still to be built, it is relatively easy to apply the new standards directly. It is harder for systems that are already in place. One way is to replace the original master data with data that follows the latest master data standards; the second is mapping between old and new. Mapping is not a particularly good way, but sometimes there is no alternative, and once you go down that road the workload is also very large.

So we roughly have three options. The first is to relaunch the system, which amounts to re-initializing it and has a relatively large impact on the enterprise. The second is to make a systematic adjustment to the original system, mainly converting old object codes to new ones. The third is to adjust the original system in stages. This is not the best solution either, since it trades time for space.
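The conversion of old object codes to new ones can be sketched in Python as applying a mapping table to legacy records. The record layout, field names, and sample codes are invented; the sketch also keeps the old code on each converted record, which is a design assumption for traceability and not something the talk prescribes.

```python
def convert_codes(records: list, old_to_new: dict, field: str = "material_code"):
    """Replace legacy codes using a mapping table and collect unmapped records."""
    converted, unmapped = [], []
    for record in records:
        old_code = record[field]
        if old_code in old_to_new:
            # Keep the legacy code so the change can be traced back later.
            converted.append({**record, field: old_to_new[old_code], "legacy_code": old_code})
        else:
            unmapped.append(record)
    return converted, unmapped


# Hypothetical legacy records and mapping table.
legacy_records = [
    {"id": 1, "material_code": "OLD-001"},
    {"id": 2, "material_code": "OLD-999"},
]
mapping = {"OLD-001": "M00000001"}

converted, unmapped = convert_codes(legacy_records, mapping)
print(converted, unmapped)
```

The unmapped list is the part that needs manual work, which is one reason the mapping route carries such a heavy workload.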

Therefore, it is extremely difficult to implement master data standards. If a group company has too many systems, it may take two or three years or even longer to gradually implement this set of standards in each system.

Summary

That's all for today. To summarize briefly: master data is the source of data, the core of data asset management, the golden data within data, the cornerstone of information system interconnection, and an important foundation for informatization and digitization. Good master data governance lays a very important foundation for data analysis and for loading data into the data lake. Master data is closely tied to our business systems, and only by doing master data well can we build a better foundation for big data analysis.


Origin blog.csdn.net/esensoft123/article/details/130347427