In-depth interpretation of DAMA-DMBOK2


Foreword

Data management is an emerging field within information technology. With the rapid development of the Internet, globalization and informatization, its importance has become increasingly apparent. Data management is the set of technologies, methods, and corresponding management and governance processes needed to integrate business and information technology. Because of this position, it involves a broad and deep body of knowledge, and sorting out and clearly explaining its knowledge areas and the relationships between them is not an easy task. By analyzing and summarizing industry best practices in data management, DAMA International published the "DAMA-DMBOK2 Data Management Knowledge System Guide (Second Edition)". This book provides an industry-standard explanation of data management functions, terminology and best practices, lays out the overall framework of data management, and provides an important theoretical basis for the development of the field.

One

Background and overview of this book

The International Data Management Association (also known as DAMA International, hereinafter referred to as "DAMA") is a non-profit association of volunteer data management and business professionals from around the world, dedicated to the research and practice of data management. Since its establishment in 1980, DAMA International has been committed to theoretical research, practice, lessons learned, and the construction of a related body of knowledge for data management, and has accumulated deep knowledge and rich experience in the field.

For decades, DAMA has brought together many senior international experts in data management to compile books that elaborate, in depth, the complete body of knowledge across the fields of data management. The culmination of this work is DAMA's Data Management Body of Knowledge (DAMA-DMBOK2: Data Management Body of Knowledge, 2nd Edition), whose second English edition was published in 2017. The Chinese edition is titled "DAMA Data Management Body of Knowledge Guide (Second Edition)".

The Chinese version of "DAMA Data Management Knowledge System Guide (Second Edition)" is published in China by Machinery Industry Press and will be available at the end of May. The whole book was translated on a voluntary basis by many members of the China Branch of the International Data Management Association, and it is a milestone work.

Figure 1. Cover of DAMA-DMBOK2

This book is a summary of DAMA International's knowledge and practices in the field of data management over the past 30 years. It was written by members after years of repeated discussions with industry experts.

It is the only authoritative book on the market that synthesizes all aspects of data management. Books on a specific data field are common, but as far as I know, this is the only one that discusses the various fields of data management as a complete body of knowledge, and that is one of its main distinguishing features.

The theoretical framework of DAMA-DMBOK2 consists of the wheel diagram (11 functional areas of data management) and the hexagonal diagram of environmental factors (7 basic environmental elements). Together they form the "DAMA Data Management Knowledge System", in which the vertical axis is the 11 functional areas of data management and the horizontal axis is the 7 environmental elements. Each functional area works under the constraints of the 7 basic environmental elements.

Figure 2. DAMA Data Management Body of Knowledge

  • "DAMA-DMBOK2 Functional Framework" defines 11 main data management functions, and describes each function through 7 environmental elements. The matrix below gives this framework graphically.

  • Data management functions include data governance, data architecture, data modeling and design, data storage and operations, data security, data integration and interoperability, document and content management, reference data and master data management, data warehousing and business intelligence, metadata management, data quality management.

  • Essential context elements: goals and principles, organization and culture, tools, activities, roles and responsibilities, deliverables, technology.
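The function-by-element framework described above can be pictured as a simple matrix. The following is a minimal sketch, assuming a plain dictionary keyed by (function, element) pairs. The names are copied from the two lists above, while the dictionary layout, the `build_framework_matrix` helper and the sample note are illustrative assumptions, not something the book prescribes.

```python
from itertools import product

# Names copied from the two lists above; the layout is an illustrative assumption.
FUNCTIONS = [
    "data governance",
    "data architecture",
    "data modeling and design",
    "data storage and operations",
    "data security",
    "data integration and interoperability",
    "document and content management",
    "reference data and master data management",
    "data warehousing and business intelligence",
    "metadata management",
    "data quality management",
]
ELEMENTS = [
    "goals and principles",
    "organization and culture",
    "tools",
    "activities",
    "roles and responsibilities",
    "deliverables",
    "technology",
]


def build_framework_matrix():
    """Return an empty matrix with one cell per (function, element) pair."""
    return {(f, e): [] for f, e in product(FUNCTIONS, ELEMENTS)}


matrix = build_framework_matrix()
# Hypothetical note, only to show how a cell would be filled in.
matrix[("data security", "deliverables")].append("example: access control policy")
print(len(matrix))  # 77 cells: 11 functions x 7 elements
```

Each chapter of the guide can then be read as one row of this matrix: a single function described through all seven elements.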

Figure 3. DAMA-DMBOK2 functional framework

Each chapter of the DAMA-DMBOK2 Guide introduces a data management function and discusses the seven contextual elements of that function. Each chapter is discussed in varying degrees of depth, depending on the specific issues discussed. Each chapter follows a unified structure that includes:

(1) A brief introduction to the function, including definitions of key terms, a relationship diagram, and a list of business objectives.

(2) Description of concepts and activities: including relevant deliverables, responsible roles and organizations, best practices, common procedures and methods, supporting technologies, etc. There are chapters where concepts and activities are defined separately for each subfunction.

(3) An overview: includes a list restating the guiding principles, a table restating the activities, deliverables, and responsible roles related to the function, and a brief discussion of organizational and cultural issues.

(4) List of recommended readings: optional books and articles given for reference.

Two

Purpose of this book and audience

2.1

Purpose and objectives of this guide

The "DAMA-DMBOK2 Data Management Body of Knowledge Guide" (hereinafter the "DAMA-DMBOK2 Guide") aims to further the development of the data management profession. Its purpose is to provide a definitive overview of the discipline of data management. It is not intended to be an encyclopedia of data management or a comprehensive treatise on everything related to it. Instead, the guide briefly introduces data management concepts and identifies the objectives, functions and activities of data management, along with key deliverables, roles, principles, technologies, and organizational and cultural aspects. It also briefly introduces generally accepted good practices and important alternatives.

The 10 main uses and goals of the DAMA Data Management Body of Knowledge Guide (2nd Edition) are:

  • Builds consensus on the data management functions, so that different readers can understand the nature and importance of data management.

  • Provides standard definitions of commonly used data management functions, deliverables, roles, and related terminology, to help data management professionals understand their roles and responsibilities.

  • Helps organizations develop an enterprise data strategy, identifies guiding principles for data management, and helps build consensus in the field of data management.

  • Guides efforts to implement and improve data management functions, covering widely adopted methodologies and technologies as well as important alternatives, without reference to specific technology vendors or products.

  • Concisely identifies common organizational and cultural issues.

  • Clarifies the scope and boundaries of data management.

  • Directs readers to additional resources to enhance their understanding of data management.

  • Provides the basis for data management effectiveness and maturity assessments.

  • Directs higher education systems to develop and deliver data management course content.

  • Helps data management professionals prepare for the CDMP exam.

2.2

Suitable audience for this book

The author believes this book is suitable for the following 12 groups of readers:

  • Chief information officers (CIOs) of enterprises and institutions.

  • Chief data officers (CDOs) of enterprises and institutions.

  • IT personnel of enterprises and institutions.

  • Data management teams of enterprises and institutions, full-time and part-time data management personnel, and solution providers for data-related projects.

  • Data management specialists of various business functional departments of enterprises and institutions.

  • Practitioners at accounting firms.

  • Risk, compliance, management and data governance practitioners at consulting firms.

  • Lawyers and law firm practitioners who work on compliance and the protection of rights and interests.

  • Certified and aspiring data management professionals.

  • Educators responsible for developing and delivering data management courses.

  • MBA students, and students of information management at undergraduate level and above.

  • Researchers in the field of data management in the government sector.

At the same time, "DAMA Data Management Knowledge System Guide (Second Edition)" takes a comprehensive and systematic approach, which makes it suitable as a textbook for MBA or computer science programs at undergraduate level and above in colleges and universities.

Three

Main changes and additions from DMBOK1 to DMBOK2

Compared with DMBOK1, DMBOK2 has 8 changes, which are detailed as follows:

3.1

Chapter Changes

DMBOK2 removed the "Data Development" chapter of DMBOK1, added "Data Modeling and Design" and "Data Integration and Interoperability", and added the chapters "Data Processing Ethics", "Big Data and Data Science", "Data Management Maturity Assessment", "Data Management Organization and Role Expectations", and "Data Management and Organizational Change Management".

Figure 4. Comparison of DAMA-DMBOK wheel diagram changes in different versions

3.2

Data Governance Embedded in Knowledge Areas

(1) Data governance is not only introduced in its own chapter. Each knowledge area chapter also adds a dedicated section on the governance content relevant to that area.

(2) The book emphasizes integrating data governance into the system design and development process, so that governance becomes a strong guarantee of system quality and data quality. Governance runs through the whole process of system construction, which makes it more practical.

3.3

Changes in the body of knowledge

The importance of data architecture and of data modeling and design has been strengthened, and data standards are now included in data model design. The book emphasizes data design and the application of data standards during the design process.

3.4

Changes in Data Governance Philosophy

(1) Transform from post-governance to pre-governance, from passive governance to active governance, from theory to practice, from simple governance to governance + service expansion, and from traditional data to big data.

(2) Emphasize risk management and corporate culture factors. A section of "Implementation Guidelines" is added for each knowledge area, covering risk assessment in practice, the risks that may be encountered in each area, and suggested countermeasures.

3.5

More practical to implement

(1) Data governance is embedded in business development, system construction, and data application processes, so the concept of governance moves from the abstract to the concrete.

(2) In introducing the 11 knowledge areas, each part describes implementation methods and tools in detail, which makes them easier to put into practice.

(3) A section of "Implementation Guidelines" is added for each knowledge area, which provides suggestions and reflections on the assessment of the current situation and the transformation of corporate culture.

3.6

Hexagon content changes

"Practices and Methods" and "Main Deliverables" are merged into delivery management, and tools are added.

Figure 5. Comparison of hexagonal changes in DAMA environmental factors

  • A classification into people, process and technology is added to the diagram;

  • "Practices and Methods" is replaced with "Tools";

  • DMBOK1 focuses on the theoretical body of knowledge, and its "Practices and Methods" and "Main Deliverables" parts emphasize methodology. DMBOK2 merges these two parts into delivery management, adds tools, and emphasizes the importance of delivery and tools.

3.7

The Evolution of Data Management Frameworks

The framework starts from the guiding goal of data management, which is to derive value from data. That value is tied to the full life cycle of data, so deriving it requires life cycle management. Starting from the data life cycle, data governance runs through the entire data development process.

Figure 6. DAMA data management functional framework

Figure 7. DAMA functional domain dependency diagram

3.8

More technical and process-oriented

Data governance must be embedded in business development, system construction, and data application processes, and supported by tools. The number of sections that introduce tools, and their share of the text, have increased significantly.

Four

Core content introduction

The book can also be used as a working reference by data management professionals. It has 17 chapters in total. They are:

Figure 8. DAMA-DMBOK2 chapter distribution

  1. Data Governance: Provides guidance and oversight for data management by establishing a system of decision rights over data that meets the needs of the enterprise. These authorities and responsibilities should be established with the overall needs of the organization in mind. (see Chapter 3)

  2. Data Architecture: Defines the "blueprint" for managing data assets in line with the organization's strategy and strategic goals, and specifies a data architecture that meets strategic needs. (see Chapter 4)

  3. Data Modeling and Design: The process of discovering, analyzing, presenting, and communicating data requirements in the precise form of a data model. (see Chapter 5)

  4. Data Storage and Operations: Aimed at maximizing the value of data, it includes the design, implementation and support of stored data, as well as the operational activities carried out throughout the data life cycle, from planning to destruction. (see Chapter 6)

  5. Data Security: Ensures data privacy and security, so that data is acquired and used securely. (see Chapter 7)

  6. Data Integration and Interoperability: Includes processes related to data movement and integration between data stores, applications, and organizations. (see Chapter 8)

  7. Document and Content Management: The lifecycle processes used to manage data and information in unstructured media, including planning, implementation, and control activities, especially for the documents needed to support legal and regulatory compliance requirements. (see Chapter 9)

  8. Reference and Master Data Management: Includes the continuous coordination and maintenance of core shared data, so that accurate, timely and relevant information about key business entities can be used consistently across systems. (see Chapter 10)

  9. Data Warehousing and Business Intelligence: Includes planning, implementation, and control processes to manage decision-support data and enable knowledge workers to derive value from data through analysis and reporting. (see Chapter 11)

  10. Metadata Management: Includes planning, implementation, and control activities to enable access to high-quality, integrated metadata, including definitions, models, data flows, and other information critical to understanding data and the systems through which it is created, maintained, and accessed. (see Chapter 12)

  11. Data Quality Management: Includes planning and implementing quality management techniques to measure, evaluate and improve the fitness of data for use within an organization. (see Chapter 13)

In addition to the knowledge area chapters that make up the wheel diagram, DAMA-DMBOK2 contains the following thematic chapters:

  1. Data Handling Ethics: Describes the central role that data ethics plays in promoting information transparency and socially responsible decision-making about data and its use. Awareness of the ethics of data collection, analysis, and use should guide all data management professionals. (see Chapter 2)

  2. Big Data and Data Science: Describes the technologies and business processes that emerge as the ability to collect and analyze large, diverse data sets increases. (see Chapter 14)

  3. Data Management Maturity Assessment: Outlines methods for assessing and improving an organization's data management capabilities. (see Chapter 15)

  4. Data Management Organization and Role Expectations: Provides practices and considerations for forming a data management team and carrying out data management activities successfully. (see Chapter 16)

  5. Data Management and Organizational Change Management: Describes how to plan for and successfully drive the change in corporate culture that necessarily follows from effectively embedding data management practices in the organization. (see Chapter 17)

1

Chapter 1, the main content of data management

(1) 9 core principles of data management

Figure 9. Nine core principles of data management

  • Data is an asset in its own right: Data is an asset, but in some ways it is managed very differently from other assets. Compared with financial and physical assets, one of its most obvious features is that data is not consumed when it is used.

  • Data value can and should be expressed in economic terms: Calling data an asset implies that it has value. While there are technical means to measure the quantity and quality of data, there are not yet standards for measuring its value. Organizations that want to make better decisions about their data should develop consistent methods to quantify that value. They should also weigh the costs of low-quality data against the benefits of high-quality data.

  • Managing data means managing data quality: Ensuring that data meets application requirements is the primary goal of data management. To manage quality, organizations must ensure they understand stakeholders' quality requirements and measure data against those requirements.

  • Managing data requires metadata: Managing any asset requires first having data about that asset (number of employees, account numbers, etc.). The data used to manage data and to describe how it is used is called metadata. Because data cannot be held or touched, understanding what it is and how to use it requires defining this knowledge in the form of metadata. Metadata arises from a range of processes related to data creation, processing, and use, including architecture, modeling, management, governance, data quality management, systems development, IT and business operations, and analytics.

  • Managing data requires planning: Even small organizations can have complex technical and business process landscapes. Data is created in multiple places and moved between many storage locations as it is used. Keeping the end result consistent requires coordination, which means planning from both an architectural and a process perspective.

  • Managing data is a cross-functional effort: it requires a range of skills and expertise, so a single team cannot manage all of an organization's data. Data management requires technical competencies, non-technical skills, and the ability to collaborate.

  • Data management requires an enterprise-wide perspective: While there are many local applications for data management, it must be effectively applied across the enterprise.

  • Data is fluid, and data management must constantly evolve to keep pace with changes in how data is created, how it is used, and who consumes it.

  • Data management is the management of the whole life cycle: data has a life cycle, so data management needs to manage its life cycle. The data lifecycle itself can be quite complex, as data in turn generates more data. Data management practices need to consider the entire life cycle of data.

(2) Context diagram of knowledge domain

Figure 10. Knowledge Domain Context Diagram

  • The context diagram describes the details of a Knowledge Area, including those related to people, processes, and technology. It is based on the concept of a SIPOC diagram used in product management (suppliers, inputs, processes, outputs and consumers).

  • Context diagrams place at their center the activities that produce deliverables meeting stakeholder needs. Every Context Diagram begins with a Knowledge Area definition and objectives.

  • The activities that drive the objectives (at the center) are divided into four phases: planning (P), developing (D), operating (O), and controlling (C).

  • On the left, flowing into the activities, are inputs and suppliers. On the right, flowing out of the activities, are deliverables and consumers. Participants are listed below the activities.

  • At the bottom are the tools, techniques, and metrics that affect every aspect of the knowledge domain.
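The layout of a context diagram described above maps naturally onto a small record type. The following is a minimal sketch, assuming hypothetical class and field names (`ContextDiagram`, `Activity`, `activities_by_phase`) chosen for illustration. The sample entries are placeholders and are not taken from the book.

```python
from dataclasses import dataclass, field

# Phase codes as listed above: planning, developing, operating, controlling.
PHASES = ("P", "D", "O", "C")


@dataclass
class Activity:
    name: str
    phase: str

    def __post_init__(self):
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


@dataclass
class ContextDiagram:
    # Top of the diagram: definition and objectives of the knowledge area.
    knowledge_area: str
    definition: str
    objectives: list = field(default_factory=list)
    # Left side: what flows into the activities.
    suppliers: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    # Center, with the participants listed below it.
    activities: list = field(default_factory=list)
    participants: list = field(default_factory=list)
    # Right side: what flows out of the activities.
    deliverables: list = field(default_factory=list)
    consumers: list = field(default_factory=list)
    # Bottom: factors that affect every aspect of the knowledge area.
    tools: list = field(default_factory=list)
    techniques: list = field(default_factory=list)
    metrics: list = field(default_factory=list)

    def activities_by_phase(self):
        """Group activity names under their P/D/O/C phase."""
        grouped = {phase: [] for phase in PHASES}
        for activity in self.activities:
            grouped[activity.phase].append(activity.name)
        return grouped


# All entries below are hypothetical placeholders, not taken from the book.
diagram = ContextDiagram(
    knowledge_area="Data Quality Management",
    definition="placeholder definition",
    activities=[
        Activity("define quality requirements", "P"),
        Activity("build quality checks", "D"),
        Activity("report quality metrics", "C"),
    ],
)
print(diagram.activities_by_phase())
```

Keeping the phase on each activity, rather than four separate lists, makes it easy to check that every activity is assigned to exactly one of the four phases.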

(3) DAMA Pyramid

Figure 11. DAMA Pyramid Chart

Phase 1: The organization purchases an application that includes database functionality. This becomes its starting point for data modeling and design, data storage, and data security. For the system to operate within its data environment, work on data integration and interoperability is also required.

Phase 2: Once the organization starts using the application, it discovers data quality challenges. Achieving higher quality data depends on reliable metadata and a consistent data architecture, which show how data from different systems fits together.

Phase 3: Managing data quality, metadata, and architecture requires the disciplined practice of data governance, which provides systematic support for data management activities. Data governance also supports the implementation of strategic initiatives, such as document and content management, reference data management, master data management, data warehousing, and business intelligence, all of which fully support the advanced practices within the golden pyramid.

Phase 4: The organization leverages the benefits of well-managed data and improves its analytical capabilities.

2

Chapter 2, Data Processing Ethics

(1) Data processing ethics context diagram

This chapter describes the basic principles that make up the ethics of data management, and explains how an ethical approach to data can help organizations avoid improper use of data and the resulting harm to customers, reputation or the wider community.

Figure 12. Data processing ethics context diagram

(2) Data Ethics Guidelines

Respect for Others: This principle reflects the most basic ethical requirement for treating human beings, which is respect for the dignity and autonomy of the individual.

Beneficence: This principle has two elements: first, do no harm; second, maximize benefits and minimize harm.

Fairness: This principle requires that people be treated fairly and equitably.

(3) Establish an ethical data processing culture

Establishing an ethical data handling culture requires understanding existing practices, defining expected behaviors, codifying them into policy and a code of ethics, and providing training and oversight to enforce expected behaviors. As with other initiatives related to managing data and changing culture, this process requires strong leadership.

Ethical data processing clearly includes compliance with the law. It also affects how data is analyzed, interpreted and used both inside and outside the organization. An organizational culture that values ethical behavior not only has a code of conduct, but also ensures that clear communication and governance mechanisms are in place to support employees who become aware of unethical practices or risks.

Figure 13. Ethical Risk Model

(4) Main points of view

  • Organizations need to handle data ethically or risk losing the trust of customers, employees, partners and other stakeholders;

  • Data ethics is rooted in the basic principles of society and the basic requirements of ethics and morality;

  • Data-related regulation is based on these same principles and requirements, but regulation cannot cover all contingencies. Therefore, the organization must take into account the ethical and moral norms of its own behavior;

  • Organizations should foster a culture of ethical responsibility for the way they handle data, not only for compliance, but also as the right thing to do;

  • Ethical data handling will ultimately provide organizations with a competitive advantage as it is the foundation of trust.

3

Chapter 5, Data Modeling and Design

(1) Data modeling and design context diagram

Data modeling and design: Data modeling is the process of discovering, analyzing, and determining data requirements, then representing and communicating those data requirements in a precise form called a data model. The process is iterative and may include conceptual, logical and physical models.

Figure 14. Data modeling and design context diagram

4

Chapter 8, Data Integration and Interoperability

Definition: Manage and integrate data transferred within or between applications and organizations.

Figure 15. Data integration and interoperability context diagram

5

Chapter 14, Big Data and Data Science

Big data refers not only to large volumes of data, but also to the variety of data (structured and unstructured, documents, files, audio, video, streaming data, etc.) and the speed at which data is generated. People who explore data, develop predictive models, machine learning models, prescriptive models, and analytical methods from it, and deploy the results for analysis by interested parties are called data scientists.

Big Data and Data Science: The collection (big data) and analysis (data science, analytics, visualization) of many different types of data, aimed at gaining insights and answering questions that were not known at the outset.

Figure 16. Big data and data science context diagram

As big data is loaded into data warehouse and business intelligence environments, data science techniques are used to provide organizations with a forward-looking view (“windshield”). Using different types of data sources to achieve predictive capabilities and model-based real-time analysis capabilities can provide deeper insight into the future development direction of the organization.

Figure 17. Convergence Information Triangle

Taking advantage of big data requires a change in the way data is managed. Most data warehouses are based on a relational model, while big data generally does not use a relational model to organize data. Most data warehouses rely on the concept of ETL (Extract, Transform and Load). Big data solutions, such as data lakes, rely on the concept of ELT - load first and transform later. What's more, the velocity and volume of data production creates challenges that require different approaches in various key areas of data management, such as integration, metadata management, and data quality assessment.
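The difference between ETL and ELT described above is mainly one of ordering. The following is a minimal sketch, assuming in-memory lists stand in for the source system, the warehouse and the data lake. The function names and the cleanup rule in `transform` are hypothetical and chosen only for illustration.

```python
def transform(record):
    """Hypothetical cleanup step: tidy the name and cast the amount to float."""
    return {
        "name": record["name"].strip().title(),
        "amount": float(record["amount"]),
    }


def etl(source, warehouse):
    """Extract, transform, then load: the warehouse only holds cleaned rows."""
    for record in source:
        warehouse.append(transform(record))


def elt(source, lake):
    """Extract and load the raw records first; transformation is deferred."""
    lake.extend(source)


def read_from_lake(lake):
    """Transform later, when a consumer actually needs the data."""
    return [transform(record) for record in lake]


source = [
    {"name": "  alice ", "amount": "10.5"},
    {"name": "BOB", "amount": "3"},
]
warehouse, lake = [], []
etl(source, warehouse)
elt(source, lake)

print(warehouse)             # cleaned rows
print(lake)                  # raw rows, exactly as extracted
print(read_from_lake(lake))  # same cleaned rows, produced on read
```

In the ELT path the raw records stay available, so a different transformation can be applied later without going back to the source.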

Figure 18. DW/BI concept and big data architecture

6

Chapter 15, Data Management Maturity Assessment

The maturity model defines levels of maturity by describing the capability characteristics of each stage. When an organization meets the capability characteristics of a certain stage, it can determine its maturity level and develop a plan to improve its capabilities. The model also helps organizations make improvements guided by their ratings and compare themselves with competitors or partners. At each new level, the way capabilities are carried out becomes more consistent, predictable, and reliable. An organization moves up a level as its capabilities take on the characteristics of that level. The levels follow an established order, and no level can be skipped.

Figure 19. Data Management Maturity Assessment Context Diagram

A CMM usually defines five or six maturity levels, from the initial level to the optimization level, and each level has its own characteristics. The Data Management Maturity Assessment Framework is divided into discrete data management topics, and its focus and content depend on whether it is intended for general or industry-specific use.

Figure 20. Example of a data management maturity model
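The rule that maturity levels follow a fixed order and cannot be skipped can be expressed as a short check. The following is a minimal sketch, assuming five numbered levels and one hypothetical capability per level. The `assess_level` function and the capability names are illustrative assumptions, not a scoring method taken from the book.

```python
def assess_level(capabilities_met, required_by_level):
    """Return the highest level reached without skipping any lower level.

    Returns 0 when even the first level's requirements are not met.
    """
    level = 0
    for candidate in sorted(required_by_level):
        if required_by_level[candidate] <= capabilities_met:  # subset test
            level = candidate
        else:
            break  # a missing lower level blocks all higher ones
    return level


# Hypothetical capability names, one per level, for illustration only.
REQUIRED = {
    1: {"data is used"},
    2: {"processes are repeatable"},
    3: {"standards are defined"},
    4: {"processes are measured"},
    5: {"processes are optimized"},
}

met = {"data is used", "processes are repeatable", "processes are measured"}
print(assess_level(met, REQUIRED))  # 2: level 4 is met, but level 3 is not
```

In this example the organization already measures its processes, yet it is still rated at level 2, because the level 3 requirement has to be satisfied first.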

7

Chapter 16, Data Management Organization and Role Expectations

Most organizations are faced with an increasing amount of data. These data formats are diverse, large in quantity, and come from different channels. Due to the increase in the amount and variety of data, the complexity of data management has been exacerbated. At the same time, data consumers demand faster and easier access to data, and they want to understand and use data to solve critical business problems in a timely manner. Data management and data governance organizations need to be flexible enough to work effectively in an ever-evolving environment. Therefore, fundamental questions about ownership, collaboration, responsibility and decision-making need to be clarified.

This chapter describes a set of principles that should be considered when forming a data management or data governance organization. It covers both data governance and data management, because data governance provides the direction and business context for the activities performed by the data management organization. There is no perfect organizational structure for either. While data governance and data management organizations should follow some common principles, many details depend on the drivers of the industry in which the organization operates and on the organization's own corporate culture.

Figure 21. Assessing the data management organizational operating model

The operating model is the starting point for improving data management and data governance practices. Before introducing an operating model, one needs to understand how it will affect the current organization and how it might evolve. As the operating model will help in the definition, approval and enforcement of policies and processes, it is critical to determine which operating model is best for the organization.

Assess the current organizational structure: is it centralized, decentralized, hybrid, hierarchical or relatively flat? Describe the independence of the relevant departments or areas. Are they almost self-sufficient in their operation? Are their requirements and goals very different? Most importantly, try to determine how decisions are made (e.g., democratically or by mandate) and how those decisions are enforced.

8

Chapter 17, Data Management and Organizational Change Management

A successful data management practice requires:

  • Learn to manage horizontally by aligning accountability for data with the information value chain.

  • Transform vertical (silo) data accountability into a shared information management effort.

  • Evolve information quality from a local business concern or a job for the IT department into a core value of the entire organization.

  • Shift thinking about information quality from "data cleansing and data quality scorecards" to a fundamental organizational capability.

  • Measure the cost of poor data management against the value of disciplined data management.

Organizational change management expert John P. Kotter has summarized a basic set of "laws of change" that describe why change is not easy. Recognizing these issues at the beginning of the change process can help to achieve success.

  • Organizations don't change, people change: Change does not happen because a new organization is announced or a new system is implemented. It happens when people change their behavior because they recognize the value of doing so. Improving data management practices and implementing a formal data governance process will have a profound impact on organizations. People need to change the way they work with data and how they interact in data-related activities.

  • People don't resist change, but they resist being changed: People do not accept change that seems arbitrary or authoritarian. They are more willing to change if they are involved throughout in defining the change, if they understand the vision that drives it, and if they know when and how it will happen. The change management part of a data-related initiative involves working as a team to build an organization-wide understanding of the value of improved data management practices.

  • Things are the way they are because they got that way: There may be good historical reasons for the current state of things. At some point in the past, someone defined the business requirements, defined the process, designed the system, wrote the policy, or established a business model that now needs to change. Understanding the origins of current data management practices will help organizations avoid repeating past mistakes.

  • Unless someone drives change, progress is likely to stop: if there is to be improvement, new steps must be taken.

  • Change would be easy if the human element did not have to be taken into account: Change is usually easy to implement at the "technical" level. The challenge comes from how to deal with the natural differences between people.

Conclusion

With the advent of the big data era, "data is an asset" has become a core industry trend. In this era of "data is king", the decisive factor in the development of enterprises is neither the fight over a single piece of territory nor the traditional factors of production (land, labor, technology, and capital), but the long-neglected "data assets".

The key to turning data into an asset lies in data mining and analysis, and data governance needs to be carried out in an asset-based manner so that "application and management" advance together steadily. Only by integrating data, so that enterprise data is connected vertically and integrated horizontally, can data assets be operated well.

Data governance has become an indispensable foundation for the refined management of enterprises and institutions. Only by implementing data governance effectively can enterprises improve the quality of their data, raise the value of that data, and turn it into a real boost for coping with market challenges.


Origin blog.csdn.net/kuangfeng88588/article/details/119118390