Hairuisi Sharing | Discussion on Enterprise Data Quality Issues

1. Data quality problem scenarios

In day-to-day work, business leaders often rely on the BI system to track how each line of business is performing. Suppose that one day a leader opens a core report and finds that the figures for one region are blank for that day. After several hours of investigation, the BI developers find that the region's sales data for that day is missing values for the product type attribute.

This is a typical consequence of a data completeness problem.

In the digital era, data quality has become one of the key factors in enterprise development. Data quality management should put prevention first: it takes up-front control as its core and meeting business needs as its goal, and it helps enterprises improve data quality across the board. Only then can enterprises take full advantage of data-driven business development and build a long-term competitive advantage.

2. Data Quality - Quality Rules

According to DAMA's measurement standard for data quality, data quality has six core dimensions: completeness, uniqueness, consistency, accuracy, validity, and timeliness. The system provides a rich set of quality rule templates and also supports user-defined rules.
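To make the dimensions more concrete, here is a minimal sketch of three of them (completeness, uniqueness, and validity) written as plain Python functions. The function names, the row format, and the "share of passing rows" scoring are assumptions made for illustration; they are not the product's actual rule templates.

```python
from typing import Iterable

Row = dict  # one record, e.g. {"order_id": 1, "product_type": "A"}


def completeness(rows: list[Row], field: str) -> float:
    """Share of rows in which `field` is present and non-empty."""
    if not rows:
        return 1.0
    ok = sum(1 for r in rows if r.get(field) not in (None, ""))
    return ok / len(rows)


def uniqueness(rows: list[Row], field: str) -> float:
    """Number of distinct values of `field` divided by the number of rows."""
    if not rows:
        return 1.0
    values = [r.get(field) for r in rows]
    return len(set(values)) / len(values)


def validity(rows: list[Row], field: str, allowed: Iterable) -> float:
    """Share of rows whose `field` value belongs to the allowed set."""
    if not rows:
        return 1.0
    allowed_set = set(allowed)
    ok = sum(1 for r in rows if r.get(field) in allowed_set)
    return ok / len(rows)


# Hypothetical sales data: one missing product type, one duplicate order_id,
# and one product type outside the allowed list.
sales = [
    {"order_id": 1, "region": "East", "product_type": "A"},
    {"order_id": 2, "region": "East", "product_type": None},
    {"order_id": 2, "region": "West", "product_type": "B"},
    {"order_id": 4, "region": "North", "product_type": "Z"},
]

print(completeness(sales, "product_type"))          # 0.75
print(uniqueness(sales, "order_id"))                # 0.75
print(validity(sales, "product_type", {"A", "B"}))  # 0.5
```

The missing product type from the report scenario in section 1 shows up here as a completeness score below 1.0.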

3. Data Quality - Quality Check

Preset data quality rules are used to control data quality at each stage, including data integration, data processing, and data analysis. Data that does not meet the quality rules is handled according to a configured strategy, and the process can be interrupted based on a quality threshold so that downstream business is not affected.

Notes:

Interrupt process when the score is lower than the threshold: If this option is enabled, task execution is interrupted when the data quality score falls below the configured threshold.

Abnormal data processing strategy: Data that does not meet the quality rules can be handled with one of three strategies: Discard, Save, or Normal standardization.

Discard: Non-conforming data is neither written to the database nor saved.

Save: Non-conforming data is not written to the database, but it is saved separately so that it can be corrected and re-entered later.

Normal standardization: Non-conforming data is written to the database as usual.
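The threshold switch and the three strategies can be sketched as follows. This is only an illustration under stated assumptions: the score is taken to be the percentage of rows that pass a single rule, and `Strategy`, `CheckResult`, `QualityThresholdError`, and `run_quality_check` are hypothetical names, not the product's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Strategy(Enum):
    DISCARD = "discard"                    # drop non-conforming rows
    SAVE = "save"                          # keep them aside for later re-entry
    NORMAL_STANDARDIZATION = "normal"      # write them to the database anyway


class QualityThresholdError(RuntimeError):
    """Raised to interrupt the task when the score is below the threshold."""


@dataclass
class CheckResult:
    score: float
    loaded: list = field(default_factory=list)       # rows written to the database
    quarantined: list = field(default_factory=list)  # rows saved for correction


def run_quality_check(
    rows: list[dict],
    rule: Callable[[dict], bool],
    strategy: Strategy,
    threshold: Optional[float] = None,
) -> CheckResult:
    passed = [r for r in rows if rule(r)]
    failed = [r for r in rows if not rule(r)]
    # Assumed scoring: percentage of rows that satisfy the rule.
    score = 100.0 * len(passed) / len(rows) if rows else 100.0

    # "Interrupt process when the score is lower than the threshold"
    if threshold is not None and score < threshold:
        raise QualityThresholdError(f"score {score:.1f} is below {threshold:.1f}")

    result = CheckResult(score=score, loaded=list(passed))
    if strategy is Strategy.SAVE:
        result.quarantined.extend(failed)
    elif strategy is Strategy.NORMAL_STANDARDIZATION:
        result.loaded.extend(failed)
    # Strategy.DISCARD: failed rows are neither loaded nor kept.
    return result


def has_product_type(row: dict) -> bool:
    return row.get("product_type") not in (None, "")


rows = [
    {"order_id": 1, "product_type": "A"},
    {"order_id": 2, "product_type": None},
    {"order_id": 3, "product_type": "B"},
    {"order_id": 4, "product_type": "A"},
]

result = run_quality_check(rows, has_product_type, Strategy.SAVE)
print(result.score, len(result.loaded), len(result.quarantined))  # 75.0 3 1
```

Passing `threshold=80.0` with the same data raises `QualityThresholdError` before any rows are loaded, which mirrors the task being interrupted.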

4. Data Quality - Quality Closed Loop

Quality check finds problems → generate a problem ticket → assign the ticket to the person responsible for quality → the responsible person resolves the ticket → review the handling of the ticket → close the ticket

1. Generate a problem ticket

On the quality task result monitoring page, if a task scores below 100 points, a problem ticket can be generated for that task to handle and track the abnormal data. Alternatively, the abnormal data detected by the task can be marked as not requiring processing.

On the problem ticket generation page, you can set the severity of the quality problem and designate the person responsible for resolving it.

After the problem ticket is created, the responsible person is notified through in-site messages, email, DingTalk, Feishu, and other channels.
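The ticket generation step can be sketched as a small data structure plus one eligibility check. The `ProblemTicket` fields and the `maybe_create_ticket` helper are hypothetical and only reflect what this section describes (score below 100, severity, responsible person, and the option to mark abnormal data as not needing processing). Notification is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProblemTicket:
    task_name: str   # the quality check task the ticket belongs to
    score: float     # score of that task run
    severity: str    # e.g. "low", "medium", "high" (assumed levels)
    assignee: str    # person responsible for resolving the problem


def maybe_create_ticket(
    task_name: str,
    score: float,
    severity: str,
    assignee: str,
    needs_processing: bool = True,
) -> Optional[ProblemTicket]:
    """Return a ticket only when the score is below 100 and the abnormal
    data has not been marked as "no processing needed"."""
    if score >= 100 or not needs_processing:
        return None
    return ProblemTicket(task_name, score, severity, assignee)


ticket = maybe_create_ticket("daily_sales_check", 75.0, "high", "data_steward")
print(ticket)
```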

2. Problem ticket handling

View pending problem tickets on the Quality Issue Handling page.

On the problem ticket processing page, you can view the basic information of a ticket, including the associated check task, score, associated data resources, ticket status, and description. You can also view the abnormal data associated with the ticket and export it for correction.

After the abnormal data has been handled, submit your handling comments. The problem ticket is then automatically transferred back to its initiator for confirmation. When the ticket is transferred, the relevant people are notified through in-site messages, email, DingTalk, Feishu, and other channels.

3. Problem ticket review

On the problem ticket initiation record page, you can view the processing results of the problem tickets you initiated and review those that have been processed.

The initiator can verify that the abnormal data has been resolved by rerunning the quality check task, and can then either return or close the problem ticket.
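The closed loop described in this section behaves like a small state machine. The sketch below is one possible reading of it: the state names and action names are assumptions, and "return" is modeled as sending the ticket back to the responsible person for further handling.

```python
from enum import Enum


class TicketState(Enum):
    GENERATED = "generated"            # ticket created from a check result
    ASSIGNED = "assigned"              # with the person responsible for quality
    PENDING_REVIEW = "pending_review"  # handled, waiting for the initiator
    CLOSED = "closed"


# Allowed (state, action) -> next state transitions.
TRANSITIONS = {
    (TicketState.GENERATED, "assign"): TicketState.ASSIGNED,
    (TicketState.ASSIGNED, "submit"): TicketState.PENDING_REVIEW,
    (TicketState.PENDING_REVIEW, "return"): TicketState.ASSIGNED,
    (TicketState.PENDING_REVIEW, "close"): TicketState.CLOSED,
}


def advance(state: TicketState, action: str) -> TicketState:
    """Apply an action to a ticket state, rejecting transitions outside the loop."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} a ticket in state {state.name}") from None


# A ticket that is returned once before being closed.
state = TicketState.GENERATED
for action in ["assign", "submit", "return", "submit", "close"]:
    state = advance(state, action)
print(state.name)  # CLOSED
```

Keeping the transitions in one table makes it easy to see that a ticket can only be closed after review, and that a closed ticket accepts no further actions.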

5. Data Quality - Post Audit

The system provides visual development features that let users build audit models to discover and analyze data quality issues in complex scenarios.

6. Summary

The OceanMind Hai Ruisi data center follows the DCMM data quality capability domain and takes "pre-event prevention, in-process control, and post-event audit" as its core. It offers a mature methodology and rich implementation experience in data quality requirements, data quality inspection, data quality analysis, and data quality improvement.

Pre-event prevention: Improve the data quality of enterprise business systems by optimizing processes and policies and by training staff to raise quality awareness.

In-process control: In the research phase, the business blueprint that is produced covers business processes, data quality requirements, and the departments responsible for the data. In the implementation phase, data quality requirements are implemented as data quality rules and quality check tasks, and problem tickets are used to have the responsible departments rectify and resolve data quality problems.

Post-event audit: Regularly give enterprise data quality a comprehensive "health check" to find the root cause of problems, so that data quality improves continuously.

Origin blog.csdn.net/iOceanMind/article/details/131398302