Tag category system (business-oriented data asset design methodology) - Reading Notes 5

Chapter 5 Method: Complete Design Method

1. Three building prerequisites

1.1 Unified data thinking

1.1.1 Data awareness

Whether an enterprise has a clear, unified understanding of its own data mainly depends on whether it can give a unified answer to the following three questions:

  • Where is the data?
  • What is the value of data?
  • How to use data?

1.1.2 Data architecture

In the DT era, data architecture ensures that data flows smoothly, turns data resources into data assets, and underpins the commercial value of the business and its products.

  • Business dataization: The bottom layer of the data architecture is the information system data that each business system generates through business processes or deliberate data retention. Using collection and exchange tools, technicians clean, exchange, and aggregate the data from business systems to form the enterprise's data center.
  • Data capitalization: In the data center, data developers use different computing engines, such as offline, real-time, and algorithm development tools, to process data in various ways, sorting and processing raw data into information the business can understand, view, and use. The resulting data assets are kept in asset management tools, where they continue to be governed and optimized.
  • Asset servitization: In the asset center, data assets that have been organized and sorted in a standardized way are filtered and loaded into service component tools. In the service center, all data calls and operations can be measured, globally monitored, scheduled, and configured.
  • Businessization of services: The data services (APIs) created can connect directly, through the interface at the encapsulation layer, to existing business systems or data application products, ultimately helping to solve business problems or improve business execution efficiency and generate business value.

1.1.3 Execution guarantee

Execution guarantee means that every level of the enterprise must support and respond to the data strategy:

  • The leadership must give the direction of the data strategy and adjust the organizational structure to provide corresponding organizational support;
  • The management needs to transform the data strategy into tactical support, formulate specific data-oriented operational plans and assessment indicators, and guide the effective refinement and downward transmission of the data strategy;
  • The execution layer needs to actively work to ensure the smooth advancement of data flow and the orderly connection of data architecture links based on the operational plan and assessment indicators, and continuously improve the ability and boundaries of operating and using data through the learning of data thinking and data knowledge.

1.1.4 Value driver

The realization of the value of data assets is the most fundamental goal. What data should be connected, what data assets should be built, how to manage and optimize it, what kind of data architecture should be built, and what support should be provided all need to be considered from the perspective of value.

1.1.5 Scenario capabilities

Facing an uncertain future, enterprises need flexible data support: data items and data services are loosely coupled and can be split or combined at any time. Data asset items and data service capabilities with reuse value can be built up in the data center in advance.

1.2 Sufficient preliminary research

1.2.1 Business scenario research

Who are the customers to be served? How is the customer's business process carried out? How does the customer's daily work process proceed? What are the upstream nodes associated with it? What are the downstream nodes? What are the client's current role goals? What is the team goal? What are the departmental goals?

1.2.2 Survey on demand pain points

Which business links have pain points? What causes these pain points? How serious are they, and what consequences will they have? What demands do they create?

1.2.3 Thorough data research

  • Don't let users assume what data exists; work from the facts, and register and sort out the data that actually exists.
  • Don't let customers underestimate their data.

1.3 Implement ideas correctly

1.3.1 Sort out the data category system according to the business process

From the information systems that support the company's existing business processes, the company's existing data can be sorted out, and its data category system can then be constructed. The system consists of data organized by "process", data organized by "things", and data organized by "people".

1.3.2 Design label category system according to business needs

After the company's existing data stock has been fully sorted out, a label category system can be designed according to business needs. Labels designed around "people/things" must reflect business needs on the one hand and be grounded in the data organized by "people/things" on the other.

2. Six design steps

2.1 Identify objects

"People" are the subjects who actively initiate actions; "things" are the recipients of actions; "relationships" are the connections that exist at a certain moment between people and things, people and people, things and things, and so on. They include strong and weak relationships of various kinds, such as behavioral relationships, belonging relationships, and thinking relationships.

2.2 Connecting data of the same object

Since the same object has information records organized by different IDs in multiple systems, it is necessary to identify the same object between multiple IDs.

2.2.1 ID-Mapping technology

ID-Mapping technology in the field of big data is used to connect data about the same object from multiple sources. It takes pairs of related IDs as input, uses machine learning algorithms for probabilistic matching, and builds an ID relationship network.
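The idea of turning ID pairs into a relationship network can be sketched in a few lines. Note that this sketch swaps the probabilistic, machine-learning matching described above for a much simpler deterministic technique, union-find: any two IDs linked directly or transitively by an input pair end up in the same group. The function name `build_id_groups` and the sample IDs are assumptions made for illustration.

```python
from collections import defaultdict

def build_id_groups(pairs):
    """Group IDs that are linked, directly or transitively, by the input pairs."""
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())

# Hypothetical ID pairs observed in different systems.
pairs = [
    ("phone:13800000000", "account:alice"),
    ("account:alice", "cookie:abc123"),
    ("phone:13900000000", "account:bob"),
]
groups = build_id_groups(pairs)
```

Each resulting group is a candidate for one One-ID. A real ID-Mapping system would also weigh how trustworthy each pair is, which this sketch does not attempt.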

2.2.2 One-ID

Each website can customize a unified ID for users, referred to as One-ID. Each user account ID can uniquely correspond to a specific One-ID code.

2.2.3 Four levels of ID

  • First-level ID: Strong identity attribute ID, such as ID card information, passport number, driver's license number, face ID, fingerprint ID, iris ID, etc., which is a number used to uniquely identify individuals in real society.
  • Second level ID: Device-related ID, such as mobile phone number, mobile IMEI, mobile IDFA, mobile phone MAC, PC MAC, etc., which are closely related to individuals.
  • Third level ID: ID related to registered accounts, such as Alipay account, Taobao account, WeChat account, water meter account, medical insurance account, game account, etc. They often reflect individual socialization behaviors.
  • Fourth-level ID: IDs tied to temporary records, such as cookies, IP addresses, GPS positions, and operating behaviors. These are weak IDs. When no higher-level ID is available, they can still be used to establish a temporary relationship with the core ID.
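One way to put the four levels to work is to rank the IDs observed for an object and prefer the strongest one available. The sketch below assumes a hypothetical mapping from ID type names to levels; the names `IdLevel`, `ID_TYPE_LEVEL`, and `strongest_id` are illustrative and not from the book.

```python
from enum import IntEnum

class IdLevel(IntEnum):
    STRONG_IDENTITY = 1   # ID card, passport, biometric IDs
    DEVICE = 2            # phone number, IMEI, IDFA, MAC
    ACCOUNT = 3           # registered account IDs
    TEMPORARY = 4         # cookies, IP addresses, GPS positions

# Hypothetical mapping from an ID type name to its level.
ID_TYPE_LEVEL = {
    "id_card": IdLevel.STRONG_IDENTITY,
    "imei": IdLevel.DEVICE,
    "wechat_account": IdLevel.ACCOUNT,
    "cookie": IdLevel.TEMPORARY,
}

def strongest_id(ids):
    """Return the (id_type, value) pair with the strongest (lowest-numbered) level."""
    return min(ids, key=lambda pair: ID_TYPE_LEVEL[pair[0]])

observed = [
    ("cookie", "abc123"),
    ("wechat_account", "wx_alice"),
    ("imei", "860000000000000"),
]
best = strongest_id(observed)  # the device-level IMEI wins here
```

When only a weak ID such as a cookie is present, the same function returns it, which matches the idea of falling back to a temporary relationship.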

2.2.4 Association operation between ID and ID

After the framework rules for One-ID are set, link One-ID with the other IDs. As more and more associated IDs are connected to One-ID, the number of pairwise relationships between IDs grows faster and faster.
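The growth is easy to quantify under one simplifying assumption: every ID attached to the same One-ID is treated as related to every other one. In that case n IDs give n(n-1)/2 pairs, so the pair count grows quadratically. The helper name `pair_count` is an assumption for illustration.

```python
from math import comb

def pair_count(n_ids: int) -> int:
    """Number of distinct ID pairs among n_ids mutually related IDs."""
    return comb(n_ids, 2)

growth = [(n, pair_count(n)) for n in (2, 5, 10, 20)]
# growth == [(2, 1), (5, 10), (10, 45), (20, 190)]
```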

2.3 Data-based expression of things

Data mapping offers a quick way to read the real world: map all things into three categories, "people", "things", and "relationships", and systematically sort out the full-dimensional attributes of each object and the specific attribute values under each attribute. Expressing things as data is significant in the following ways:

  • This method is the basis for building the data category system and label category system;
  • This method can help business personnel to change their data thinking, learn to use data language to express and transform business pain points;
  • Through this digital disassembly method, data information can be cleaned and sorted out in an orderly manner;
  • This approach can not only solve current business problems, but also trigger inspiration for future data usage.

2.4 Construct a data category system

2.4.1 Object abstraction of data category system

Enterprise information construction accumulates various raw data through steps such as acquisition, storage, and communication. Conventional enterprises usually store data only according to "business processes". Enterprises with a certain informatization foundation, or with a completed data warehouse, may specifically extract information related to "people" and "things" from the databases of the business systems and combine it with the basic information tables of "people" and "things" to establish dedicated "people" and "things" databases that give a comprehensive summary of each entity object.

2.4.2 Category sorting of data category system

After the object directory is determined, the data categories under the object need to be expanded.

1) Sort out the data categories of the "process" object

Process data categories can often classify data according to business ownership, business repository, business tables, etc.

2) Sort out the categories of entity objects such as “people” and “things”

The data category system reflects the builder's understanding of the enterprise's original data. How databases, data tables, and data fields are organized in reality can be converted into data categories accordingly, without being too divergent.

After the original data is processed according to the data category system, the data remains stable once it enters the data category system, no matter how the upstream business changes the form, cycle, or transmission method of data collection. The management of original data, that is, the data category system, does not change its underlying structure when upper-level elements such as business forms and business activity plans change.

2.5 Build a label category system

After sorting out the data category system originally accumulated by the enterprise, it is necessary to design tags and tag category systems according to the needs of the business scenario. The design process of the label category system is more complicated than the data category system.

2.5.1 What is a label

The data in the data category system refers to the original data of the enterprise, which has not yet been processed, and is the category of all data that needs to be cleaned and processed. It is oriented to the data collection end and answers the question "Where is the data?" "Tag" refers to data resources that are cleaned and processed from original data and can be used by the business and generate value. Generally, they need to be structured to field granularity to ensure service-oriented use. It is oriented to the data application side and answers the questions of "how to use data" and "what is the value of data".

1) Two major prerequisites for label design

Labels cannot be designed out of thin air: the data feasibility of developing and implementing a label must be considered. At the same time, labels must meet business needs and be data items that help business personnel make business judgments or that support and assist their work.

2) The difference between tags and tag values

Tags describe the attributes of an object and are data resources structured to field granularity. A tag describes the essence of a certain type of object; a tag value is its outward appearance, which often changes with time and place, and each individual can have a different tag value.
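The distinction can be made concrete with a small sketch: the tag is a stable definition attached to an object type, while tag values are stored per individual and can be overwritten as circumstances change. The class `Tag`, the tag name `city_of_residence`, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tag:
    """A tag definition: a field-level attribute of an object type."""
    object_type: str   # "person", "thing", or "relationship"
    name: str

city = Tag(object_type="person", name="city_of_residence")

# Tag values are keyed by (object id, tag) and differ from person to person.
tag_values = {
    ("p001", city): "Hangzhou",
    ("p002", city): "Beijing",
}

# The tag definition stays the same; only the value changes when p001 moves.
tag_values[("p001", city)] = "Shanghai"
```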

2.5.2 Five ideas for label design

The design of labels generally comes from the abstraction of business demands. Decomposing business pain points into corresponding data solutions and decomposing data resources in the data solution to field granularity is the label design process.

  • Diverge from the perspective of the core word's attributes;
  • Diverge from the perspective of inclusion and ownership;
  • Think in terms of detailed content;
  • Think in terms of the development process;
  • A special label design idea comes from the unified abstraction of things of the same type. If you need to highlight the type, you can also convert the values into labels by adding a "whether" prefix or a "degree/index" suffix to each value of the label.
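The last idea, converting values into labels, resembles one-hot encoding. The sketch below covers only the "whether" prefix variant and assumes an `is_` prefix as its English equivalent; the function name `values_to_flag_tags` and the membership example are illustrative.

```python
def values_to_flag_tags(tag_name, possible_values, actual_value):
    """Turn one multi-valued tag into one boolean 'is_<tag>_<value>' tag per value."""
    return {
        f"is_{tag_name}_{value}": (value == actual_value)
        for value in possible_values
    }

# A person whose "membership" tag has the value "gold".
flags = values_to_flag_tags("membership", ["gold", "silver", "none"], "gold")
# flags == {"is_membership_gold": True,
#           "is_membership_silver": False,
#           "is_membership_none": False}
```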

2.5.3 Why is a label category system needed?

According to the principle of data mapping, an enterprise or business line will accumulate a large number of data items, and the business will need a large number of label items. A classification mechanism is needed to classify tags systematically.

  • Establish a label category system to classify and manage labels, optimize governance, and promote planning;
  • The label category system can be used to design labels based on objects and categories. Think systematically and plan the classification and grading of data assets, and organize production labels from top to bottom and from coarse to fine.

2.5.4 Structure of label category system

1) Root directory

To build a tag category system, you first need to determine the root directory. Segmented groups in reality are mapped into the label category system either as separate root directories or, when no subdivision is needed, as one object distinguished by attribute values.

  • Are the properties of the segmented objects very different? As an enterprise's information construction becomes more complete and its study of objects more in-depth, it needs to build many objects, each with its own label category system.
  • Don't use real-world classifications directly as categories. If the attribute types of the subdivided objects are not similar, split them directly into multiple objects; if the attribute types are similar, treat them as one object. The categories in the label category system are classifications of labels, not of objects.

2) Attribute tags of objects

The "person" and "thing" entity objects naturally have some static tags. When a "person" or "thing" moves and some kind of connection occurs, a new object of "XX relationship" will be generated. This new object naturally carries the dynamic label of the relationship occurrence. The dynamic tags of these "relationship" objects can be projected onto "people" or "things", causing them to generate corresponding dynamic tags, such as: browsing duration, browsing channel, browsing depth and other dynamic tags.
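Projecting relationship tags onto a person amounts to aggregating relationship records by person. The sketch below assumes a hypothetical list of browsing records and derives per-person dynamic tags from it; the record layout and the tag names are illustrative.

```python
from collections import defaultdict

# Hypothetical "browsing relationship" records: (person_id, item_id, channel, seconds).
browse_events = [
    ("p001", "sku-1", "app", 30),
    ("p001", "sku-2", "web", 90),
    ("p002", "sku-1", "app", 15),
]

def project_to_person(events):
    """Aggregate relationship records into dynamic tags on each person."""
    tags = defaultdict(
        lambda: {"browse_seconds": 0, "browse_channels": set(), "items_browsed": 0}
    )
    for person_id, _item_id, channel, seconds in events:
        person = tags[person_id]
        person["browse_seconds"] += seconds
        person["browse_channels"].add(channel)
        person["items_browsed"] += 1
    return dict(tags)

person_tags = project_to_person(browse_events)
```

Grouping the same records by `item_id` instead would project the relationship onto the "thing" side, for example as a count of how often an item was browsed.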

3) Category system

The category system is the way a certain type of item is classified and structured. It is a tree: the first-level branches growing from the root directory are first-level categories, the second-level branches growing from the first-level branches are second-level categories, the third-level branches growing from the second-level branches are third-level categories, and so on. A category with no higher-level category is a first-level category, a category with no lower-level category is a leaf category, and the individual leaves hanging on a leaf category are labels. A category with subordinate categories is the parent of those categories, and a category with a superior category is a subcategory of it.
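The tree described above can be sketched as a small recursive data structure. For simplicity, this sketch models the root directory and the categories with the same `Category` class, which is an assumption; the sample category and tag names are also illustrative.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Category:
    name: str
    children: List["Category"] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)  # only leaf categories hold tags

    def is_leaf(self) -> bool:
        return not self.children

def leaf_categories(node):
    """Yield every leaf category in the subtree rooted at node."""
    if node.is_leaf():
        yield node
    for child in node.children:
        yield from leaf_categories(child)

# Root directory "person" with two first-level categories.
root = Category("person", children=[
    Category("basic attributes", children=[
        Category("demographics", tags=["gender", "age"]),
    ]),
    Category("behavior", tags=["browse_seconds"]),
])

leaf_names = [category.name for category in leaf_categories(root)]
```

Here "behavior" is both a first-level category and a leaf category, since it has no parent category and no subcategories.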

2.5.5 Category design ideas of label category system

It is recommended to divide the data category system according to the source systems in which data is collected, stored, and managed. This helps data managers and data developers match categories and find the data they need using their own way of thinking. It is recommended to divide the label category system from the data application perspective, for example by object understanding and value scenarios, because the label category system exists so that business personnel, product managers, and other data asset users can understand, search for, and explore the labels the business needs, giving full play to the value of data assets. The traditional technical perspective must give way to a business perspective, so that tags are organized in a way business personnel can understand.

There is no universal and unified tag category architecture that can meet the data needs of all enterprises, institutions, and governments in various business and management scenarios. The only thing that remains unchanged is the idea of building a label category system according to real business and management needs. The following ideas and experiences are provided for reference.

1) Design ideas for the label category of “people”

When constructing a label category system for people, first-level categories can be considered from the following dimensions. First come the relatively static and fixed basic attributes, including a person's statistical information, file information, physiological information, education information, work information, permanent residence information, and so on. Above the basic attributes are the more dynamic, scenario-based behavioral relationships, including the content of each person's behavior and the relationships that arise from it. Above behavior are the interests and habits extracted from behavioral relationships, including hobbies and behavioral habits. From interests and habits, a person's personality characteristics can be explored further, and from personality, a person's way of thinking can be abstracted.

2) Design ideas for label categories of “things”

When constructing label categories for objects, first-level categories can be considered from the following dimensions:

  • Consider static and fixed basic attributes, including basic information of objects, category affiliation, color patterns, packaging and storage, size and weight, composition, etc.;
  • On top of the basic information, consider the functional utility of the object, including its functional role, included services, usage methods, utility cycle, etc.
  • After functional utility, more consideration is given to master-slave attributes, including affiliation, generation and manufacturing, operation and sales, release and maintenance, etc.
  • After the first several types of static information, consider more dynamic passive relationships, including various relationships during the use of objects, which can be expanded from processes such as being browsed, being collected, being purchased, being transported, being evaluated, being complained about, etc.
  • Expand from the front and back development processes such as relationship conditions, relationship behaviors, and relationship results.
  • Split according to various types of passive relationships, such as used by home, used by work, used by entertainment, etc.
  • Summarize the depth and explore the value evaluation of objects, including various dimensions such as quality evaluation, service evaluation, cost performance, safety evaluation, applicability, scalability, market share, competition ranking, certification and authorization, etc.

After the first-level categories are constructed, you can expand the second-level and third-level catalogs by referring to people's label system construction ideas, and then put tags under each leaf category.

3) Design ideas for “relationship” label categories

When building label categories for relationships, first-level categories can be considered from the following dimensions:

  • Sort out the labels of the related people and related objects involved in the relationship. The related person or related thing under the relationship object is an attribute description of the relationship, and only involves the attributes of the person/thing displayed in the entire relationship.
  • Think first from the preparation stage of the relationship, including the opportunity, mechanism, and preconditions that trigger it. Then think from the process stage, including spatio-temporal conditions such as the time, place, weather, means, and channel of the relationship, as well as process behaviors such as the participants, participating objects, steps, frequency, degree, strength, and connections. Finally, think from the result stage, including the direct results of the process, the conversion at each link, comprehensive effect evaluation, effect evaluation broken down by link and factor, and optimization recommendations.

After the first-level categories are constructed, you can expand the second-level and third-level catalogs by referring to people's label system construction ideas, and then put detailed tags under each leaf category.

4) Summary of each object category

After summarizing the label categories of people, objects, and relationships, a complete label category structure diagram of an enterprise can be obtained.

2.5.6 Frequently Asked Questions in the Construction of Tag Category System

1) The same thing forms labels on different objects. Is this information redundant?

What is the core purpose of building a tag category system? It allows business personnel to find the desired tags as quickly as possible and combine them easily and quickly. In the tag category system methodology, it is recommended to construct comprehensive tags according to their respective objects, so that the logic when searching and using tags is clear.

2) Do labels that are not currently used but may be used in the future need to be designed?

The tag category system is designed and planned on the basis of the existing data and of current and possible future business needs. It is a complete set of reusable, valuable tags and should include, as far as possible, all available tags under the people, things, and relationship objects. The tag category system not only meets customers' current needs but also serves the enterprise's future scenario-based data needs and solves future problems.

3) Will the structure of the tag category system change with business development?

  • The hierarchy and classification of the label category system should anticipate future applicability as far as possible, and frequent changes to the category structure are not recommended. Therefore, each time categories are set, allow for a certain degree of extensibility and check that parallel categories have consistent granularity. When tags are added, it is generally unnecessary to modify the original categories; instead, subdivide subcategories under them.
  • It is recommended that the category structure generally not exceed three levels. If the number of tags is large, it is better to convert category depth into category breadth, that is, to increase the number of categories at the same level.

Extensive verification across usage scenarios of the label category system has found that a three-level structure with no more than 10 categories under each parent, that is, a category tree with no more than 10 × 10 × 10 = 1,000 leaf categories, is a more appropriate category structure.
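These limits are simple to check mechanically. The sketch below assumes a category tree stored as nested dictionaries, where each key is a category name and an empty dictionary marks a leaf category; the function name `check_category_tree` and the reading of "10 per level" as "10 subcategories per parent" are assumptions.

```python
def check_category_tree(tree, max_depth=3, max_children=10, depth=0):
    """Return True if a nested-dict category tree respects the depth and breadth limits.

    `tree` maps a category name to a dict of its subcategories; an empty dict
    marks a leaf category. The dict passed in at depth 0 holds first-level categories.
    """
    if len(tree) > max_children:
        return False          # too many sibling categories under one parent
    if tree and depth >= max_depth:
        return False          # categories exist below the deepest allowed level
    return all(
        check_category_tree(sub, max_depth, max_children, depth + 1)
        for sub in tree.values()
    )

ok_tree = {"basic": {"demographics": {"identity": {}}}}   # three levels
too_deep = {"a": {"b": {"c": {"d": {}}}}}                  # four levels
too_wide = {f"category_{i}": {} for i in range(11)}        # 11 siblings

max_leaf_categories = 10 ** 3  # 10 per parent over three levels
```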

2.6 Front-end and back-end tag category system

After building the company's complete label category system, it needs to be further processed into front-end and back-end categories.

2.6.1 What are front-end and back-end categories?

Enterprise back-end tag category system: the complete set of data assets, that is, the pool of all reusable, valuable tags that need to be accumulated. Data asset designers or administrators create and maintain the back-end tag category system. Other personnel can view the category system but cannot modify it at will. The back-end tag category system is relatively stable and is a universal classification of the essential descriptions and descriptive attributes of people, things, and relationships. It is loosely coupled with business scenarios and maintains global, stable label definitions for people, things, and relationships.

The front-end tag category system focuses on the impact on business scenarios, collects the required tag collections according to business needs, and classifies tags based on business understanding for calls by business systems or data applications. Therefore, the front-end label category system will change with the change of business scenarios, and it is flexible and configurable. The business scenario itself may be short-lived, generated quickly, closed quickly, or evolve into another business scenario, and its associated front-end label categories will also be added, deleted, or evolved accordingly.

2.6.2 Connections and differences between front and back categories

Front-end label categories are generally expanded by the levels large scene/sub-scene/data service. Under each large scene/sub-scene/data service, first abstract the objects involved, then find the tags the data service requires in the tag collection of the corresponding object in the back-end tag category system, and add them to the front-end category object.

The idea of constructing front-end categories is different from that of back-end categories. The way of constructing front-end categories is often closely related to the design of data scenarios. A properly set up front-end category system is often closely related to the design of data product systems. Front-end categories can facilitate the application development of data product systems and help front-end development engineers quickly clarify the mapping relationship between each functional module of the data product and object tags.

Combining the sorted front-end and back-end tag category systems together forms a complete tag category system for an enterprise.

2.6.3 Significance of distinguishing front and back categories

Why distinguish between front-end categories and back-end categories? Because there is a conflict between business needs and technical management. On the one hand, business is flexible and changing, and its data requirements are highly responsive and flexible. Therefore, the label organization method for a certain scenario should be freely configurable. But on the other hand, when managing tags from a technical perspective, we hope that the tag organization method is relatively stable, and when business personnel come to the tag portal to view and select tags, they also hope that the tag organization method is relatively stable.

2.6.4 Complete front-end and back-end category design steps

  • Based on the existing business needs and data situation of the enterprise, identify the objects;
  • Determine these objects as the root directory of the label category system;
  • Sort out all possible tags under each root directory, and use the category architecture to classify the tags;
  • Record, plan and manage the acquired catalog;
  • Put specific tags under each backend category;
  • The back-end tag category system is now complete: it is the category management of all tags in the enterprise, captured in tag category documents or in a system where it can be consulted;
  • According to the needs of the business scenario, determine a specific data application scenario, that is, the front end;
  • Determine the objects involved in the front-end scenario. In principle, the front-end objects are a subset of all back-end objects;
  • According to the front-end data application scenario, sort out the required tags and front-end category structure;
  • Record, adjust and manage front-end categories in a unified manner;
  • Put the tags you need to use under each front-end category.
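The relationship between the two systems in these steps can be sketched as a consistency check: every tag placed under a front-end category should already exist under the same object in the back-end tag pool. The data layout (scene, sub-scene, object, tags), the sample tags, and the function name `missing_backend_tags` are assumptions for illustration.

```python
# Hypothetical back-end tag pool: object -> set of tag names.
backend_tags = {
    "person": {"age", "city", "browse_seconds", "favorite_category"},
    "thing": {"price", "brand", "times_purchased"},
}

# Hypothetical front-end category: scene -> sub-scene -> object -> tags.
frontend_category = {
    "marketing": {
        "coupon_targeting": {
            "person": {"age", "favorite_category"},
            "thing": {"price"},
        }
    }
}

def missing_backend_tags(frontend, backend):
    """List (scene, sub_scene, object, tag) entries absent from the back-end pool."""
    missing = []
    for scene, sub_scenes in frontend.items():
        for sub_scene, objects in sub_scenes.items():
            for obj, tags in objects.items():
                for tag in sorted(tags - backend.get(obj, set())):
                    missing.append((scene, sub_scene, obj, tag))
    return missing

problems = missing_backend_tags(frontend_category, backend_tags)  # empty when consistent
```

Because the front end only references back-end tags, a short-lived scenario can be added or removed without touching the stable back-end definitions.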

Label design and category design are complementary and intertwined: you can design labels according to business needs first and then classify them, or plan categories first and then design specific labels under them. In practice, the two processes may also be iterated repeatedly.

Origin blog.csdn.net/baidu_38792549/article/details/126231231