Tag category system (business-oriented data asset design methodology) - Reading Notes 6

Chapter 6 Techniques: Usage Tips and Important Issues

1. Label specifications

Data must be transformed into tags that can solve business problems and improve business efficiency before it has value, otherwise it is a data burden. The process of converting data into labels is called "tagging". Two factors need to be fully considered in labeling:

  • Whether it is feasible at the data level, that is, whether raw data is available for processing into labels;
  • Whether it can reflect business value, that is, whether it is a core business need or can innovate business scenarios.

The core of tagging is to use data thinking to understand, abstract, refine business scenarios and solve business problems. In the process of labeling, it is necessary to have labeling specifications for standard operation guidance.

1.1 Labeling

1.1.1 The root directory points to the object to which the label belongs

The root directory is often a relatively vague, broad, and simple noun or gerund. At the physical level of data, it is often mapped to the primary key of a large wide table. The information in this wide table is a detailed description and data record of the primary key object: the columns of the wide table map to labels, and the rows of the wide table correspond to the specific attribute values of specific objects for each label.

1.1.2 Category is the classification of labels

Categories are often made up of nouns. A category and its classified tags can correspond to a specific table at the data physical level. Multiple data tables with the same primary key but different information types can be associated together to form a large table under the primary key object.

1.1.3 Tags are attributes of objects, with granularity down to the field level

A label generally corresponds to a certain field in a certain data table in a certain database.

1.1.4 The tag value is the specific value of the object attribute

The tag value generally corresponds to the value of a field in a data table in the database.
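
The mapping in 1.1 (root directory to primary key, tags to columns, tag values to field values) can be sketched with a small in-memory wide table. This is only an illustration: the `user_id` key, the column names, and the rows are hypothetical and do not come from the book.

```python
# Hypothetical wide table for the root object "user": one row per object,
# one column per tag. All names and values are illustrative only.
wide_table = [
    {"user_id": "u001", "gender": "female", "city": "Hangzhou"},
    {"user_id": "u002", "gender": "male", "city": "Shanghai"},
]

PRIMARY_KEY = "user_id"  # the root directory maps to the primary key


def tags_of(table):
    """Tags are the non-key columns of the wide table."""
    return [col for col in table[0] if col != PRIMARY_KEY]


def tag_value(table, object_id, tag):
    """A tag value is the field value in the row of one specific object."""
    for row in table:
        if row[PRIMARY_KEY] == object_id:
            return row[tag]
    raise KeyError(object_id)


print(tags_of(wide_table))
print(tag_value(wide_table, "u001", "city"))
```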

1.2 Meta tags

Tags of tags are called meta tags. Meta tags are attribute descriptions of tag objects, aiming to use business-oriented terms to help front-end businesses better understand tags.

1.2.1 The root directory to which the label belongs

The root directory to which a tag belongs refers to which object the tag belongs to.

1.2.2 Category to which the label belongs

The category of the tag is the first-level directory, the second-level directory, the third-level directory, etc. mentioned above.

1.2.3 Tag name

Tag naming should follow three major principles: avoid misunderstandings that infringe privacy, use the same tag name for the same tag, and use similar sentence structures for similar tags. The basic specifications for label naming are as follows:

(1) Format specification

The same tag should be normalized to the same tag name, and similar tags should use the same sentence structure.

(2) Specification of word usage

  • It is not recommended to use words such as "ID card", "trajectory", "positioning", "tracking", "GPS", "user habits", "intention" and "minors". These words are sensitive words and may easily cause unnecessary attention and investigation.
  • For labels produced by an algorithm model, it is recommended to put the word "predict" at the start of the label name, such as "predict whether there is a house", etc.
  • Do not use discriminatory terms, such as "country bumpkin" and "manly woman".
  • Tags for user hobbies and intentions should end with "preference", such as "predict brand preference", etc.
  • "Habit" can be used alone as a verb in behavioral habits tags, such as "Habit"
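
The word-usage rules above lend themselves to an automated check. The sketch below is a minimal example under stated assumptions: the sensitive-word list is copied from the notes, while the function name `check_tag_name`, its parameters, and the message format are hypothetical.

```python
# Sensitive words copied from the naming rules above. The function name and
# the returned message format are assumptions made for this sketch.
SENSITIVE_WORDS = [
    "ID card", "trajectory", "positioning", "tracking",
    "GPS", "user habits", "intention", "minors",
]


def check_tag_name(name, is_algorithm_tag=False):
    """Return a list of naming-rule violations for a tag name."""
    problems = []
    lowered = name.lower()
    for word in SENSITIVE_WORDS:
        if word.lower() in lowered:
            problems.append(f"contains sensitive word: {word}")
    # Algorithm-produced tags are expected to start with a "predict..." prefix.
    if is_algorithm_tag and not lowered.startswith("predict"):
        problems.append("algorithm tag should start with 'predict'")
    return problems


print(check_tag_name("predict brand preference", is_algorithm_tag=True))
print(check_tag_name("GPS location"))
```

A check like this only catches literal matches; judging whether a name is discriminatory or misleading still needs human review.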

(3) Content specification

  • Data related to minors should not be counted in the data calculation content of tags
  • Tag data must be obtained legally or legally authorized to be used, and illegal or gray data information must not be used to process tags.

1.2.4 Label description

Use one or two sentences to explain the tag name, to avoid problems such as vagueness, ambiguity, and polysemy caused by the tag name being too short.

1.2.5 Types of label processing

Tags can be divided into original tags, statistical tags and algorithm tags according to different processing types.

(1) Definition of three types of processing labels

  • Original tags: fields that already exist in the original data table and can be used by business personnel as tags after simple regularization.
  • Statistical tags: tags obtained by processing raw data through ETL with simple mathematical or rule-based operations, such as sum, average, regular expressions, and rule operations.
  • Algorithm tags: deeply processed tags computed from the original data by an algorithm model, such as the comprehensive scores and prediction indexes produced by models such as pattern recognition and deep learning.

(2) The relationship between the three types of processing labels and attribute classification labels

  • Original tags are often basic attribute tags, e.g. the gender, age, name, and mobile phone number given at member registration. Basic attributes directly describe the attributes, characteristics, and information of a certain type of object. They often come from basic information tables, whose important information items can be converted into original tags through simple cleaning, data clipping, etc., for use by business personnel.
  • Statistical tags are often behavioral tags, such as the total transaction amount in the past month, which are usually obtained through ETL development on original transaction records, collection records, and browsing records. Behavioral data has too many detail records, so it usually needs to be summarized to obtain statistical tags for use by business personnel.
    The design of statistical composite tags can follow the template below. On the basis of atomic tags, add dimensional information to describe or extend a certain type of attribute in detail, that is, combine [scenario] + [space-time modification] + [calculation method] + [modifiers] and similar information as qualifiers.
    A.  [Scenario] usually refers to a certain behavioral scene, such as e-commerce transactions or offline transactions.
    B.  [Space-time modification] narrows the statistics of an atomic label to a certain time dimension and a certain spatial dimension. Time modifications include the last 1 day, the last 7 days, etc. Spatial modifications include regional divisions or channel types such as East China, Zhejiang, Hangzhou, and mobile terminal.
    C.  [Calculation method] refers to the statistical calculation used, such as sum, average, or maximum.
    D.  [Modifiers] are usually closely tied to the scenario. For example, in the "e-commerce transaction" scenario, product categories can be divided into "electronic products", "clothing", etc., and customer types into "VIP customers", "new customers", etc.
    Combining the above factors produces a statistical composite label, e.g. the total transaction amount of electronic products on the mobile terminal in the last month.
  • Algorithm tags often correspond to high-level abstract tags such as interests and hobbies, personality and thinking, and value evaluation. Because there is no simple way to confirm and judge the specific values of these high-level abstract labels, algorithm modeling is needed to learn from and make judgments on a large amount of basic and behavioral information. Algorithm technologies such as data mining and machine learning are applied to the original data to predict and evaluate these advanced features.
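
The statistical composite tag template described above ([scenario] + [space-time modification] + [calculation method] + [modifiers]) can be sketched as a filtered aggregation over detail records. This is a minimal sketch, not the book's implementation: the record fields (`day`, `channel`, `category`, `amount`), the sample data, and the function name are all assumptions, and the scenario is taken to be "e-commerce transactions".

```python
from datetime import date, timedelta

# Hypothetical e-commerce transaction records; field names are assumptions.
transactions = [
    {"user_id": "u001", "day": date(2024, 5, 20), "channel": "mobile",
     "category": "electronic products", "amount": 300.0},
    {"user_id": "u001", "day": date(2024, 5, 28), "channel": "mobile",
     "category": "electronic products", "amount": 200.0},
    {"user_id": "u001", "day": date(2024, 5, 28), "channel": "pc",
     "category": "electronic products", "amount": 1000.0},
    {"user_id": "u001", "day": date(2024, 4, 1), "channel": "mobile",
     "category": "electronic products", "amount": 50.0},
    {"user_id": "u001", "day": date(2024, 5, 10), "channel": "mobile",
     "category": "clothing", "amount": 80.0},
]


def composite_stat_tag(records, user_id, as_of, days, channel, category, method):
    """Aggregate the atomic tag 'amount' under time, space, and modifier filters."""
    start = as_of - timedelta(days=days)
    amounts = [
        r["amount"] for r in records
        if r["user_id"] == user_id
        and start < r["day"] <= as_of      # time modification: last N days
        and r["channel"] == channel        # space modification: channel type
        and r["category"] == category      # modifier: product category
    ]
    if method == "sum":
        return sum(amounts)
    if method == "max":
        return max(amounts, default=0.0)
    if method == "avg":
        return sum(amounts) / len(amounts) if amounts else 0.0
    raise ValueError(f"unknown calculation method: {method}")


# "Total transaction amount of electronic products on mobile in the last 30 days"
print(composite_stat_tag(transactions, "u001", date(2024, 5, 31), 30,
                         "mobile", "electronic products", "sum"))
```

Keeping each template slot as a separate parameter means one function can generate a whole family of composite tags from the same atomic tag.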

(3) The relationship between the three types of processing labels and various labels under people, objects and relationships

  • For "person" objects, basic attribute labels are often original labels, behavioral relationship labels are often statistical labels, and interest, habit, and thinking labels often correspond to algorithm labels.
  • For "thing" objects, basic attribute, functional utility, and master-slave attribute tags are often original tags; passive behavior tags are generally statistical tags; value evaluation tags are usually algorithm tags.
  • For "relationship" objects, the person and thing tags often point to ID basic attribute tags, which are original tags used to uniquely identify the related people and related things; relationship preparation and relationship process tags often correspond to statistical tags; relationship result tags correspond to algorithm evaluation tags.

1.2.6 Label logic

Label logic refers to the description of label development methods, processing procedures, calculation logic, etc.

  • Original class tag: The logic is generally expressed as directly using the m field in table a after simple cleaning.
  • Statistical tags: the logic is often historical accumulation / last N days / last N months / frequency of the latest XX behavior / frequent time / frequent location / quantity statistics / frequency statistics / amount statistics, etc.
  • Algorithm tags: The logic generally needs to clearly define the important features to be included in the algorithm model, the positive and negative sample definitions or learning sample logic, the model selection and model structure, the form of the model output and the threshold segmentation settings, the expected model prediction performance indicators, etc.

1.2.7 Value dictionary

A value dictionary is an enumeration of possible values for a tag.

1.2.8 Value type

The value type is the data type of the tag value.

1.2.9 Examples

Give 1 or 2 examples of label values, which are mainly used for continuous numerical labels that cannot be enumerated exhaustively or labels with hundreds or thousands of enumeration items, so as to help developers and business personnel better understand label definitions.

1.2.10 Update cycle

The update cycle generally refers to the data update cycle of the tag.

  • Original class label: The label value is unlikely to change, and the update cycle can be lengthened;
  • Statistical labels: The update cycle can be designed according to how often the original data is updated, such as every day, every 7 days, or every month;
  • Algorithmic labels: Algorithmic models are often designed for iterative optimization, so they are updated quarterly or semi-annually, and the update cycle is between the original label and the statistical label.

1.2.11 Security level

It is recommended to construct a security rating of 1~4 levels (L1~L4):

  • L1: Public label, which can be disclosed to the outside world. It is the most open data label and has the lowest security level;
  • L2: Internal tags are data tags that can be directly circulated, applied for, and used across departments within an enterprise/institution, with a low security level;
  • L3: Confidential label, a label that requires authorization for cross-department use within the enterprise and can only be used after approval, with a higher security level;
  • L4: Highly confidential label, a label that can be used by only a few people within the enterprise/institution and cannot be disseminated, with the highest security level.

Each enterprise/institution can set different application, operation, and use permissions for L1~L4 level tags based on their own actual conditions.
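
As one possible reading of the four levels, the sketch below encodes them as an enum with a simple permission check. The notes leave the actual permission rules to each enterprise, so the policy inside `can_use` and its parameter names are assumptions for illustration only.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    L1 = 1  # public: may be disclosed externally
    L2 = 2  # internal: freely usable across departments
    L3 = 3  # cross-department use requires approval
    L4 = 4  # only a few designated people may use it


def can_use(level, is_internal, has_approval=False, is_designated=False):
    """Hypothetical policy: each level adds a stricter condition."""
    if level == SecurityLevel.L1:
        return True
    if level == SecurityLevel.L2:
        return is_internal
    if level == SecurityLevel.L3:
        return is_internal and has_approval
    return is_internal and is_designated


print(can_use(SecurityLevel.L3, is_internal=True, has_approval=False))
```

Using an ordered enum also allows comparisons such as `level >= SecurityLevel.L3` when filtering which tags appear in a portal.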

1.2.12 The physical storage information corresponding to the label

Tags need to be mapped to the underlying physical tables so that real data can flow when data services are produced. Register the physical table name and field name to which each tag is mapped, so that when a tag problem needs to be investigated or the tag needs to be managed and optimized later, the corresponding physical path and actual development logic can be located quickly.

1.2.13 Label person in charge

It is necessary to register a list of people responsible for the label so that business personnel who have questions about the label can quickly locate the relevant personnel and get answers quickly when tracing.

1.2.14 Completion time

The completion time refers to the time when the latest logical confirmation development of the tag is completed, or the version time of the last stable modeling run of the algorithm type tag.
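
The meta tag items in 1.2.1 to 1.2.14 can be collected into one registration record per tag. The dataclass below is a sketch of such a record; the field names, types, and the example values are assumptions chosen for illustration, not a schema prescribed by the book.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MetaTag:
    """One registration record per tag; field order follows 1.2.1 to 1.2.14."""
    root_directory: str                    # object the tag belongs to
    category: str                          # first/second/third-level directory
    name: str
    description: str
    processing_type: str                   # "original", "statistical", or "algorithm"
    logic: str
    value_dictionary: Optional[List[str]]  # None when values cannot be enumerated
    value_type: str
    examples: List[str]
    update_cycle: str
    security_level: str                    # "L1" to "L4"
    physical_storage: str                  # "table_name.field_name"
    owner: str
    completion_time: str


# Hypothetical example record; every value here is made up.
example = MetaTag(
    root_directory="user",
    category="transaction behavior",
    name="total transaction amount in the last month",
    description="Sum of all completed order amounts in the last 30 days.",
    processing_type="statistical",
    logic="sum of order amount over the last 30 days",
    value_dictionary=None,
    value_type="decimal",
    examples=["1280.50", "0.00"],
    update_cycle="daily",
    security_level="L2",
    physical_storage="user_wide_table.trade_amt_30d",
    owner="data development team",
    completion_time="2024-05-31",
)

print([f.name for f in fields(MetaTag)])
```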

2. Combination tags

Combination tags can be divided into two levels according to the complexity of the combination:

2.1 Tag combinations under the same object

Including the value processing of a single tag and the value processing of multiple tags. Processing methods include the use of various statistical operations such as regular expressions, mathematical operators, and data functions.

2.2 Label combinations between different objects

2.2.1 Design steps for cross-object label combination

  • Identify objects in business requirements
  • Design "object" tags related to conditions, involving multiple objects
  • Split labels to the smallest detail
  • Configure base tags as composite tags.

2.2.2 Three points to note when designing cross-object label combinations

(1) Always remember that labels and data results are different

Tags are basic and reusable data assets, and the data result requirements of general business are actually the requirements for data services. Data services are often composed of related tags + the process of processing tags.

(2) It is very important to find out the relationship labels of two objects

The labels of two objects need to be combined into one label through associated labels to achieve object spanning.

(3) The label design process and label usage process are reverse processes

In complex data application scenarios, the label design process is to work backwards and disassemble the business requirement results to basic tags; while the tag usage process is from the initial basic tag operation to the output of the business requirement results.
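
To illustrate how a relationship object lets tags from two different objects be combined, the sketch below joins hypothetical "user" tags and "product" tags through purchase records. All data, names, and the function `users_matching` are assumptions made for this example.

```python
# Basic tags of two different objects (hypothetical data).
user_tags = {
    "u001": {"city": "Hangzhou"},
    "u002": {"city": "Shanghai"},
    "u003": {"city": "Hangzhou"},
}
product_tags = {
    "p01": {"category": "electronic products"},
    "p02": {"category": "clothing"},
}

# Relationship object "purchase": its ID tags link a user to a product.
purchases = [
    {"user_id": "u001", "product_id": "p01"},
    {"user_id": "u002", "product_id": "p01"},
    {"user_id": "u003", "product_id": "p02"},
]


def users_matching(city, category):
    """Users in `city` who bought at least one product of `category`."""
    result = set()
    for rel in purchases:
        user = user_tags[rel["user_id"]]
        product = product_tags[rel["product_id"]]
        if user["city"] == city and product["category"] == category:
            result.add(rel["user_id"])
    return sorted(result)


print(users_matching("Hangzhou", "electronic products"))
```

The basic tags (`city`, `category`) stay reusable assets; the business result is produced by the combining logic, which matches the distinction between labels and data results drawn above.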

3. How to use tags

3.1 What is platform-level reuse

The core essence of the data center is to improve reusability, reduce business trial and error costs, and liberate the initiative and enthusiasm of business personnel to the greatest extent. There are four levels at the system reuse level:

  • The first level [code-level reuse]: find out the reusable parts from the existing code, modify them, and then reuse them. This kind of reuse is the shallowest, and you will have problems such as code migration and usage errors.
  • The second level [component-level reuse]: common codes that meet the needs of a certain function are summarized and encapsulated into a component, and the use of this component is reusable. Technical middleware can be counted as component-level reuse.
  • The third level [product-level reuse]: Some products have general capabilities and wide applicability. After packaging, an adaptation interface is left, so that the entire product can be reused.
  • The highest level [platform-level reuse]: Various components, products, etc. exist together in one complete ecosystem. On this platform, system developers can pick the required modules (reusable units) like building blocks and quickly assemble them into the final technical system.

3.2 Label usage method for platform-level reuse

3.2.1 Free choice of labels

Tags are a concept at the data asset level and are the smallest unit of data information. After the data is encapsulated as tags, you only need to select the required tags in the tag portal/tag mart to enter the usage and configuration process. There is no need to look up tables, read them, and write code to retrieve data tables every time.

3.2.2 Label configuration

Service components have two characteristics:

  • The component tool itself does not contain data, and the label data table needs to be automatically synchronized or actively imported into these products through the label selection in the first step;
  • Various operations can be configured and dragged through the visual interface, basically achieving zero-code or low-code development.

Through the above two steps - free selection of labels and zero-code configuration of service components, the development of data services/data application systems can be completed through platform-level reuse. Only this way of using tags can empower the business side: greatly improve the efficiency of tag usage, fully optimize tag quality, and establish a value connection between the data side and the business side.

3.3 What are service components, data services, and data application systems?

In large-scale data use by enterprises, tags must be used in conjunction with service components to maximize the value of data and ensure the stability of data services.

3.3.1 Service components

A service component is an engineering encapsulation of a certain data function. It generally provides an interactive interface to implement operations such as importing or associating data tags, setting service functions, etc. There are two output methods:

  • Generate data services in the form of API, which is suitable for docking with complex systems or when interface and system customization requirements are high;
  • The generated data application system directly comes with a simple interactive interface, which can be used directly by business parties from end to end, which is simple and clear.

3.3.2 Data services

Data service means providing certain data functions through APIs to meet the calling needs of business systems.

  • Flexible in use, multiple data service APIs can be combined into a data application system;
  • The display is flexible and the API can be connected with various visual components to meet the unique needs of business-side interaction.

3.3.3 Data application

Data application refers to providing a combination of data functions with interactive interfaces to the business side, which is a systematic presentation of data application results.

4. How to operate labels

4.1 Full life cycle operation of tags

4.1.1 Label design

Data asset designers carry out label design based on business research, data research, and other preliminary work, and produce label category system architecture diagrams and label design documents. These include meta tag information such as label objects, category systems, label names, label processing types, label logic, value dictionaries, value types, examples, and update cycles.

4.1.2 Tag Development

After the label design is completed, the labels are classified by processing type and handed over for development: original and statistical labels go to data development engineers, and algorithm labels go to algorithm engineers. After tag development is completed, the data development engineer fills in the remaining tag information, such as physical table name, field name, person in charge, and completion time, to complete the mapping of the tag to the data layer.

4.1.3 Label Shelving

After the tag is developed and complete meta tag information is added, the tag needs to be listed in the tag management system. After the label is put on the shelf, it can be opened and displayed to business personnel at all ends for viewing, consultation, and use through the label portal. During this process, the system will determine the data viewing and application permissions for different accounts based on the tag's security level, department role and other information. Permissions include the range of visible tag sets, the range of tag details, the range of applicable tag sets, etc.

4.1.4 Label usage

Tags are only valuable if they are used by the business. There are three ways to use tags:

  • Data synchronization: refers to synchronizing processed label data directly to the database of the business system. Generally, only core businesses will use it this way.
  • Data application: refers to encapsulating tag functions into an interactive product for external use. This makes it possible to track tag call status and evaluate the effect of tag usage. This method is deeply bound to the business side. Because business personnel have different usage habits and many customization requirements, general-purpose products can hardly meet the individual needs of many front-line businesses, and scalability is limited.
  • Data service: The tag usage method is developed into an API form and connected to the business system. Business personnel can use tags flexibly without directly copying tag data, and the call status is easy to track and monitor. Data services are an ideal way to use tags, which can best reflect and bring into play the extensive value of tags. During the use of tags, it is necessary to monitor their calling status to audit stability, security and compliance.

4.1.5 Tag governance

  • Lineage information: The path of tag production is lineage, which records the source, processing process, application docking status, etc. of each tag based on historical facts.
  • Meta tag specification: Each tag needs to be registered with business and technical meta tag information. Meta tag management needs to form a unified specification system, and uniform information registration and inspection of tags is required.
  • Quality management: Label quality management should run through the entire process of a label, from design and use to archiving. Its core is to formulate a set of label management rules, follow label quality standards, and provide technical support such as a visual label quality monitoring platform and label cross-validation tools.
  • Safety management: a "three horizontal and three vertical" label security assurance system. The "three verticals" refer to the security concept and overall strategy: first, the use of tags must comply with national big-data-related policies and regulations; second, the security of all data assets of all customers must be guaranteed; finally, during specific use, tag sensitivity levels must be assessed and corresponding security management strategies and security implementation plans formulated. The "three horizontals" refer to the core solutions adopted: the first is a triple encryption mechanism, the second is an invisible tag security system, and the third is a core ID generated from all IDs.

4.1.6 Label marketing

After the label development is completed, the value of the label needs to be sorted out, publicized and promoted externally, so that business department personnel can understand various label information as soon as possible.

Enterprises must focus on realizing tag value and continuously operate the entire life cycle of tags. Only by letting value drive, in reverse, the goals of each link, such as tag management optimization, stable tag usage performance, shared tag shelving, improved tag development efficiency, expansion of new tags, and expansion of tag source data, can sustained and stable growth in the value of data assets ultimately be achieved.

4.2 Responsible units in the label operation link

  • In the early stages of building a label category system and when enterprises need to build unified labels at the enterprise level, it is recommended that the data department design, develop, manage, and operate labels in a unified manner.
  • After each business department has formed a certain depth of data thinking and mastered the label construction method, the authority of label design and label development can be opened to the business department, that is, the data team of the business department.
  • After the labels designed by each business end are developed, they can be put on the shelves as private labels for use only by their own business departments.
  • The enterprise data department and each business department can set the degree of openness of their own tags: level 01 is open to all and does not require review by the owning department when used by other departments; level 02 is open to all but requires review by the owning department when used by other departments; level 03 is open to targeted departments and does not require review when used by those departments; level 04 is open to targeted departments but requires review by the owning department when used by those departments.
  • The label operation team must review whether label naming is standardized, whether the label is suitable for public disclosure, whether the label information is complete, etc.; judge label quality through a unified monitoring back end or feedback mechanism and make decisions on governance optimization; and, using operational methods and a value orientation, achieve stable development across the label's full life cycle and form an operating ecosystem with strong business participation.

4.3 Operation Closed Loop of Tags

  • The first loop is the design loop, including the design, development, and shelving of labels. In this loop, data asset designers not only develop the labels needed in current business scenarios, but also purposefully and prospectively design labels for possible future scenarios.
  • The second loop is the use loop, including label selection, application, and calling. In this loop, business personnel can select appropriate labels in one place and apply to use them, drawing on the labels designed and developed in the first loop. Business personnel can also propose new labels based on actual needs.
  • The third loop is the management loop, which includes the registration of basic tag information, evaluation of usage, tag optimization to improve usage effects, etc.

5. How to assess label quality

The quality of labels can be evaluated from three major dimensions: data source, label processing process and label usage process.

5.1 Related indicators of data sources

  • Data source security: The security level of the data source, whether it is obtained legally, whether it is authorized by the user, etc. will indirectly affect the data security of the tag.
  • Data source accuracy: The accuracy of the data source, whether it is obtained on-site, indirectly, or edge-calculated, is related to the final accuracy of the label.
  • Data source stability: The stability of data generation at the data source, including the stability of the generation period, the stability of the amount of generated data, the stability of the generated data format, the stability of the generated data values, etc.
  • Data source timeliness: The time interval between the data source data being generated at the first site and being transmitted and entered. The timeliness of behavioral data will indirectly affect the accuracy of the label.
  • Comprehensiveness of data sources: Whether the data source data is comprehensive, and whether data at all levels can be integrated and opened up to perform global calculations.

5.2 Indicators related to the label processing process

  • Tag test accuracy rate: The accuracy rate obtained during the modeling and testing process of the tag is an initial accuracy rate similar to the experimental nature, for reference.
  • Label output stability: The stability of label calculation, processing, and output time every day, and whether the label can be produced on time are also key indicators that business personnel consider when using labels.
  • Timeliness of label generation: the time interval for label generation, the shorter the time interval, the stronger the timeliness. Timeliness is especially important for real-time class labels.
  • Tag value coverage: the number of individual objects with a valid tag value for a certain tag. The degree of data perfection for each individual subject is different, and the same label can cover different subject groups.
  • Tag completeness: Tags have a lot of meta tag information, that is, the "tags" of tags. The completeness of these meta tag information is a usability indicator for business use.
  • Tag normativeness: The metatag information of tags needs to be registered in a standardized format, including whether the metadata information of existing tags is compliant and to what extent.
  • Label value dispersion: whether the label values are concentrated in a certain numerical range or a few values, or are relatively spread out. Dispersion is not absolutely good or bad. In general scenarios, the higher the dispersion, the better, because it means that groups with different characteristic values can be distinguished.
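
Two of the indicators above, tag value coverage and label value dispersion, are easy to compute from a column of tag values. The sketch below is one way to do it: coverage follows the definition in the notes (objects with a valid value), while the dispersion formula is an assumption, since the notes give none. Normalized Shannon entropy is used here as a stand-in, and `None` is assumed to mark a missing value.

```python
import math
from collections import Counter


def coverage(values):
    """Count and share of objects that have a valid (non-None) tag value."""
    valid = [v for v in values if v is not None]
    share = len(valid) / len(values) if values else 0.0
    return len(valid), share


def dispersion(values):
    """Normalized Shannon entropy of valid values (assumed metric).

    0.0 means all objects share one value; 1.0 means the observed
    values are spread perfectly evenly.
    """
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    if len(counts) <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


tag_values = ["a", "a", "b", "b", None]
print(coverage(tag_values))
print(dispersion(tag_values))
```

For continuous numerical tags, a measure such as the coefficient of variation may be a better fit than entropy; which one to use is a design choice.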

5.3 Indicators related to label usage process

  • Tag usage accuracy: During the tag use process, the tag accuracy obtained through business scenario verification and feedback is a more realistic accuracy judgment.
  • Tag call volume: The label's average daily call volume, today's current cumulative call volume, historical cumulative call volume, and historical call volume peak can all be referenced, reflecting the number of times the label has been called by the business.
  • Tag audience popularity: How many business departments, business scenarios, and business personnel apply for the tag, which can reflect the applicability and generalization ability of the tag.
  • Tag call success rate: In the real usage scenario of a tag, the ratio of the number of successful calls (the total number of historical calls - the number of failed calls) to the total number of calls.
  • Tag failure rate: The proportion of cumulative failure time to the total service time of a tag in real usage scenarios.
  • Tag attention popularity: the popularity calculated by comprehensively calculating the popularity of tags in tag portals such as searches, browsing, collections, consultations, discussions, etc.
  • Label continuous optimization degree: Whether the label is continuously iteratively optimized by developers, or is still in a development stage, reflects the degree to which the label has been repeatedly tempered and continuously optimized.
  • Continuous use of tags: After a tag is used by a business application, the average calling time, frequency, and promotion status reflect whether the tag really brings value to the business.
  • Label cost/performance ratio: A comprehensive calculation that weighs the data source cost, computing cost, and storage cost incurred in label processing against the value the label brings to the business, its call volume, application importance, etc. The resulting cost/performance index is a parameter that balances cost and value.
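
The tag call success rate and tag failure rate defined above are simple ratios. A minimal sketch follows; the function and parameter names are assumptions, and returning `None` when there is no data yet is a choice made for this example.

```python
def call_success_rate(total_calls, failed_calls):
    """(total calls - failed calls) / total calls, or None if never called."""
    if total_calls == 0:
        return None
    return (total_calls - failed_calls) / total_calls


def failure_rate(failure_seconds, service_seconds):
    """Cumulative failure time / total service time, or None if not in service."""
    if service_seconds == 0:
        return None
    return failure_seconds / service_seconds


# 1000 historical calls with 25 failures; 1 hour of failure in 30 days of service.
print(call_success_rate(1000, 25))
print(failure_rate(3600, 30 * 24 * 3600))
```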

6. How to assess label cost

6.1 The cost of tag data source collection and storage

6.1.1 Information construction

As a result of informatization construction, the storage cost of source data needed for tag development is one of the sources of tag collection and storage costs.

6.1.2 Event tracking (data burying points)

Event tracking is a way to obtain data from online systems. The log data obtained through event tracking contains a large amount of low-value information, so algorithm technology is needed to model and mine this behavioral data and find the truly valuable parts. The technical cost of adding event tracking according to tag requirements and the storage cost of the tracked data are the second source of tag collection and storage costs.

6.1.3 Data supplementary recording

Some offline data outside the core information systems can be captured through a dedicated supplementary recording system or by adding the information to an existing system. The technical cost of supplementary data recording according to tag needs and the storage cost of the supplementary data are the third source of tag collection and storage costs.

6.1.4 Data crawler

Through crawler technology, enterprises can crawl information beyond their own operations, business, and knowledge, and make full use of collective wisdom that has already been made public. The technical cost of data crawling according to label requirements and the storage cost of the crawled data are the fourth source of label collection and storage costs.

6.1.5 Data Acquisition

The capital cost of acquiring data according to tag requirements, plus the storage cost of the acquired data, is the fifth source of tag collection and storage costs.

6.1.6 Data Cooperation

Data shared through cooperation is often statistical result data. Enterprises cannot obtain the detailed records, so they can only use it to supplement other information. The cost of data cooperation according to tag requirements, plus the storage cost of the cooperative data, is the sixth source of tag collection and storage costs.

6.2 Label design and processing costs

The label design process includes data research, industry business scenario research, and the design of the label category system and of specific labels. The costs incurred here are basically labor costs. The label processing process includes data synchronization, data cleaning, data development, data governance, and other sub-steps, which incur labor costs, technical investment costs, and data computing and storage costs.

6.3 Label usage and marketing costs

The cost of using tags mainly includes the cost of computing resources consumed, labor costs, and the development and operation and maintenance costs of the tag information system. A relatively large share of this is the cost of computing resources consumed while tags are in use. Different computing engines incur different storage and computing costs. Generally, the more complex the scenario and the higher the performance requirements, the higher the cost of the computing engine required.

By sorting out the costs of collection, storage, design, processing, usage and marketing, and tracing and allocating them to each label, the cost of each label or label service can be calculated. This is very important for the commercial operation of labels and labeling services.
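
The tracing and allocation step can be sketched as follows. This is a minimal illustration, not a method prescribed by the notes: the function name `allocate_costs`, the sample labels, and the amounts are hypothetical, and splitting each cost item evenly among the labels that share it is an assumption (a real allocation might weight by call volume or data size).

```python
from collections import defaultdict


def allocate_costs(cost_items):
    """Split each cost item evenly across the labels that share it.

    cost_items: iterable of (stage, amount, labels) tuples.
    Even splitting is an assumption; the source does not prescribe a rule.
    """
    per_label = defaultdict(float)
    for stage, amount, labels in cost_items:
        if not labels:
            raise ValueError(f"cost item {stage!r} is not traced to any label")
        share = amount / len(labels)
        for label in labels:
            per_label[label] += share
    return dict(per_label)


# Hypothetical cost items, one per cost category named in this chapter.
items = [
    ("collection_and_storage", 600.0, ["age", "city", "churn_risk"]),
    ("design_and_processing", 300.0, ["churn_risk"]),
    ("usage_and_marketing", 100.0, ["age", "churn_risk"]),
]
label_costs = allocate_costs(items)
```

Because every cost item is fully distributed, the per-label costs add up to the total spend, so nothing is lost or double-counted when pricing a label service.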

7. How to view label value

7.1 Classification of tag value

7.1.1 Optimization of internal business management of enterprises

Using tags in data applications such as data analysis, monitoring, and early warning helps business operators better analyze the status of the core steps in their business processes, detect whether anomalies have triggered alerts, and handle them as soon as possible.

7.1.2 Enterprise’s external data business empowerment

Tags cooperate with corresponding data engines to generate data service interfaces or data applications, and enterprises provide these data services or data applications to the outside world as a new type of data service. This data business will bring business revenue to the enterprise.

7.1.3 Compliant data trading industry

During data transactions, ensuring data compliance, security, and fairness is a top priority. If a mechanism can be established in which users of tag services pay to use tags, then tag value can be calculated from the service usage fees metered by the platform, and that value can ultimately be traced back to its source.

7.1.4 The social value of benefiting people’s livelihood

In addition to enterprises, governments and institutions also need data asset empowerment. The digital brains and smart cities being built in many cities are all modules supported by big data. With a large amount of data, governments and institutions can make reasonable assessments of the current situation, predict and warn of development trends and risks, and make overall plans.

7.2 How to measure label value

7.2.1 Income approach

For internal business management and external data business empowerment, the income approach can be used to measure the value of label services. The monetary quantification of how much internal cost has been reduced and how much external business income has been increased is regarded as the specific value that label services bring to the enterprise.
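
As a worked example of the income approach, the value can be written as cost saved plus revenue gained. The function name, parameters, and figures here are hypothetical, and the sketch assumes the whole before/after difference is attributable to the label service, which in practice would need to be justified.

```python
def income_approach_value(cost_before, cost_after, revenue_before, revenue_after):
    """Label service value = internal cost saved + external revenue gained.

    Assumes the full change is attributable to the label service.
    """
    cost_reduction = cost_before - cost_after
    revenue_increase = revenue_after - revenue_before
    return cost_reduction + revenue_increase


# Hypothetical figures: costs fall from 500 to 420, revenue rises from 1000 to 1150.
value = income_approach_value(500.0, 420.0, 1000.0, 1150.0)
```

With these figures the cost reduction is 80 and the revenue increase is 150, so the label service is valued at 230 in the same currency unit.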

7.2.2 Market approach

In the compliant data trading industry, a provider quotes a price for a tag service, and consumers either make counter-offers based on their actual needs or buy other tag services at lower prices. The price the market settles on can then serve as the measure of the tag service's value.

7.2.3 Cost method

For data services open to the general public, the question is how much money governments, institutions, and enterprises have cumulatively invested in design, construction, and ongoing operation. This continuous investment in data construction can be used as a measure of the value of label services.

8. Similarities and differences between labeling methodology and data warehouse modeling

Both labeling methodology and data warehouse modeling explore how to extract, operate, and process data assets. However, data warehouse modeling focuses on data governance, data specification, and domain-based modeling. Through domain modeling, you can see slices of existing data in a certain business scenario to solve current data problems.

Origin blog.csdn.net/baidu_38792549/article/details/126664279