An Alibaba employee on the implementation steps for databases, data warehouses, and reporting platforms: a primer series for beginners and advanced readers

Building a data warehouse is a way of solving an enterprise's data problems. It is an indispensable step once enterprise informatization reaches a certain stage, and an important foundation for data management. There are plenty of books and articles on data warehousing, but actual implementation varies by industry, and because enterprises' core demands differ, so do the technology and the methodology.

How should a data warehouse project be implemented? This article starts from data warehouses in traditional industries and discusses the implementation methodology as a whole.

General implementation steps for a data warehouse

1. Requirements analysis

Requirements analysis is the most important part of a data warehouse project. After all, a data warehouse exists to serve and support the business. If the requirements analysis is inaccurate, no one will use the system, or it will directly hamper the business or customer's use of it, and the project will ultimately fail. As the saying goes, sharpening the axe does not delay the cutting of firewood: to avoid the worst case, we must invest early in researching, uncovering, and analyzing requirements, and use rigorous, systematic methods to do so.

Here are several lessons from actual requirements research:

1. Analyze the requirements together with the business side or client as much as possible, and guide them to clarify the overall framework and the business details the project should deliver. The best approach is for the requirements staff and the designers to hold discussions around a prototype, so that the actual business needs are understood correctly.

2. Be clear and realistic about what the data warehouse can achieve and which problems it cannot easily solve, and negotiate on that basis. There are many pitfalls here. The IT side is eager to go live while the business side still only half understands the project, and when promoting the project IT may even gloss over the hard parts. The wrangling that follows eats away at the other party's trust.

Therefore, requirements discussions need to be grounded in an understanding of the business workflow. Of course, if you already have deep business knowledge of the industry, you can use the requirements research to guide the other party, as far as possible, toward a functional design of the data warehouse system that follows your own thinking.

3. Classify the groups on the demand side. The end users of a BI project fall into three categories: data queryers, report queryers, and corporate decision makers.

The needs and characteristics of these three groups are completely different. When communicating with them, you need to tell them apart and understand each one in depth.

4. No matter how thorough the requirements research is, requirement changes cannot be avoided. In reality, requirements are often uncertain and the business side cannot articulate valuable ones. Today's requirement is A, tomorrow it becomes B, and nothing is settled in one step. This is normal. As project implementers, we need to set our expectations accordingly.

Normally, the business side can provide only the overall framework of the requirements, or part of the actual requirements, and cannot foresee what will need to be added later. This means a data warehouse project is destined to be a process of continuous iteration, feedback, and improvement in which the system keeps growing.

Risks cannot be eliminated, but they can be reduced, so systematic research is particularly important. The following is the survey template. Once the requirements survey is complete, the collected results need to be analyzed, summarized, and organized into a complete requirements analysis report.

[Figure: requirements survey template]

2. Logical analysis of the data warehouse

Logically, a data warehouse can be divided into the operational database, the data warehouse layer, the data mart layer, the data analysis application layer, and the report display layer. Its structure is shown in the following figure:

[Figure: logical layers of the data warehouse]

3. Design the ODS system

An ODS (operational data store) can take two forms: an ODS data buffer and an ODS unified information view area.

① ODS data buffer

The ODS data buffer is the first storage area in the flow of business data. The data warehouse extracts data from the data sources of each business system and loads it into the ODS data buffer, which creates a unified, global enterprise data platform and lays a solid foundation for the subsequent data extraction, cleaning, and transformation steps.

Source data can be extracted incrementally, while data that is changed and updated frequently is generally extracted in full. The ODS data buffer has real-time characteristics. The ODS system integrates the production and operational data of the various isolated business systems to form a unified, global enterprise data exchange platform.
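The two extraction modes can be sketched as follows. This is a minimal illustration rather than a description of any particular ODS product: it assumes a hypothetical `orders` table with an `updated_at` column that the source system maintains, and it uses in-memory SQLite databases as stand-ins for the business system and the ODS data buffer.

```python
import sqlite3

def full_extract(src, ods, table):
    """Reload the whole table into the ODS buffer; returns rows copied."""
    # Table names are trusted constants in this sketch, not user input.
    rows = src.execute(f"SELECT id, amount, updated_at FROM {table}").fetchall()
    ods.execute(f"DELETE FROM {table}")
    ods.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    return len(rows)

def incremental_extract(src, ods, table, watermark):
    """Copy only rows changed after `watermark`; returns the new watermark."""
    rows = src.execute(
        f"SELECT id, amount, updated_at FROM {table} "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert so that an updated source row overwrites its older ODS copy.
    ods.executemany(f"INSERT OR REPLACE INTO {table} VALUES (?, ?, ?)", rows)
    return rows[-1][2] if rows else watermark

# Hypothetical schema shared by the source system and the ODS buffer.
DDL = "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
src, ods = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
src.execute(DDL)
ods.execute(DDL)
src.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 100), (2, 20.0, 110)])

watermark = incremental_extract(src, ods, "orders", 0)          # first load: both rows
src.execute("UPDATE orders SET amount = 25.0, updated_at = 120 WHERE id = 2")
watermark = incremental_extract(src, ods, "orders", watermark)  # only the changed row
```

Note that a watermark-based incremental load like this one does not notice rows deleted in the source, which is one reason a full reload can be the simpler choice for tables that change heavily.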

② ODS unified information view area

The ODS unified information view area selectively integrates the various business source data through extraction, cleaning, and transformation, and uses data subject areas as the basis for integrating, classifying, and organizing the data, so that users can obtain real-time data on a given subject area through the unified information view area.

Each business system and the ODS unified information view area can access each other, so real-time operational reports can be generated and all recent information on a subject can be queried.

③ Differences and similarities between the ODS data buffer and the ODS unified information view area

The ODS data buffer mainly provides an intermediate buffer for extracting business source data into the data warehouse. It differs from the ODS unified information view area chiefly in the rules for data extraction, cleaning, transformation, and loading, and in how the data is stored. The ODS unified information view area stores data entirely by subject, giving users fast report display and real-time data queries.

The ETL rules of the ODS data buffer generally perform only simple summarization or calculation, or extract data directly from an operational database with no transformation in between. The data in the ODS unified information view area is generally extracted from the ODS data buffer.

4. Data warehouse modeling

Data warehouse modeling has been introduced in detail earlier. The data warehouse model is a common language and platform through which IT developers, business staff, and decision makers communicate with one another.

For data modeling engineers, a deep understanding of the business is the first task. Data warehouse modeling is divided into three stages: conceptual model design, logical model design, and physical model design. The models are generally analyzed and designed in top-down order.

 

 

5. Data mart modeling

A data mart model is generally built on the results of the requirements analysis. Data mart modeling focuses mainly on the design of fact tables and dimension tables.
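To make the fact table and dimension table idea concrete, here is a small sketch of a star schema in SQLite. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all invented for illustration; a real mart would derive them from the requirements analysis.

```python
import sqlite3

mart = sqlite3.connect(":memory:")
mart.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
-- Fact table: one row per product per day, holding measures plus dimension keys.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    quantity   INTEGER,
    revenue    REAL
);
""")
mart.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Green tea", "BrandA"), (2, "Black tea", "BrandB")])
mart.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, "2024-01-01", "2024-01"),
                  (20240102, "2024-01-02", "2024-01")])
mart.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 20240101, 3, 30.0), (1, 20240102, 2, 20.0), (2, 20240101, 1, 15.0)])

# A typical mart query: join the fact table to a dimension and aggregate a measure.
revenue_by_brand = mart.execute("""
    SELECT p.brand, SUM(f.revenue)
    FROM fact_sales f JOIN dim_product p USING (product_id)
    GROUP BY p.brand ORDER BY p.brand
""").fetchall()
```

The fact table stays narrow (keys and numeric measures), while descriptive attributes such as brand live in the dimension tables, so a new way of slicing the data usually means adding a dimension column rather than reshaping the facts.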

6. Data source analysis

Data source analysis is the process of analyzing and summarizing the source data to determine its scope, format, update method, update frequency, and quality.
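Part of this analysis can be automated with a simple profiling query per source table. The sketch below is only one possible approach: the `profile_table` helper, the `customers` table, and its columns are hypothetical, and it covers just row counts, missing values, and the range of update timestamps.

```python
import sqlite3

def profile_table(conn, table, columns, ts_column):
    """Summarize row count, per-column NULL counts, and the update time range."""
    # Table and column names are trusted constants in this sketch.
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    nulls = {
        col: conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL"
        ).fetchone()[0]
        for col in columns
    }
    first, last = conn.execute(
        f"SELECT MIN({ts_column}), MAX({ts_column}) FROM {table}"
    ).fetchone()
    return {"rows": total, "nulls": nulls, "first_update": first, "last_update": last}

# Hypothetical source table used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, updated_at TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, "a@example.com", "2024-01-01"),
    (2, None, "2024-01-03"),
    (3, "c@example.com", "2024-01-02"),
])
profile = profile_table(conn, "customers", ["id", "email"], "updated_at")
```

Running such a profile on a schedule and comparing `last_update` between runs gives a rough picture of update frequency, and the NULL counts are a first indicator of data quality.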

7. Data acquisition and integration

Data acquisition and integration run through every stage of a data warehouse project. A key function of the data warehouse is to integrate the data scattered across the various business systems, standardize irregular data, and store it in the warehouse in a form that is convenient for front-end analysis and applications.

The ETL process is essentially a data flow from different data sources into a unified target database. Data acquisition and integration is the most complex part of building a data warehouse. It determines the quality of the data and is the foundation of the whole data warehouse project.
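As a rough sketch of that flow, the example below extracts records from two sources with different field names and date formats, standardizes them, and loads them into one target. The source systems (`crm_rows`, `erp_rows`), their fields, and the target layout are assumptions made up for this illustration, and a plain list stands in for the target table.

```python
from datetime import datetime

# Two hypothetical source systems that describe the same kind of record differently.
crm_rows = [{"cust_name": " Alice ", "signup": "2024/01/05"}]
erp_rows = [{"customer": "BOB", "created": "05-02-2024"}]  # day-month-year

def extract():
    """Yield (source, raw_row) pairs from every source system."""
    for row in crm_rows:
        yield "crm", row
    for row in erp_rows:
        yield "erp", row

def transform(source, row):
    """Map source-specific fields onto one standard record layout."""
    if source == "crm":
        name = row["cust_name"]
        date = datetime.strptime(row["signup"], "%Y/%m/%d")
    else:
        name = row["customer"]
        date = datetime.strptime(row["created"], "%d-%m-%Y")
    return {
        "name": name.strip().title(),          # normalize whitespace and casing
        "created": date.strftime("%Y-%m-%d"),  # one date format for the warehouse
        "source": source,                      # keep lineage for later auditing
    }

def load(records, target):
    """Append standardized records to the unified target."""
    target.extend(records)
    return target

warehouse = load([transform(s, r) for s, r in extract()], [])
```

Keeping the three steps as separate functions makes it easier to add a new source system later: only `extract` and one branch of `transform` change, while the target layout stays fixed.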

8. Data application and report display

Reporting is undeniably painful. Formats are complex and requirements keep changing: whenever the business side has a spare moment, it changes a requirement or adds a few more. Reports may sound old-fashioned, but they are where the value of the whole data warehouse project shows.

People who build a lot of reports usually end up making their own tool, or at least an engine, that lets them define the reports they need in a structured, dynamic way according to their own understanding, and flexibly choose the data, design, and display style used to generate them.
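A toy version of such an engine might look like the sketch below. It is an assumption about what "structured and dynamic" could mean in practice, not a description of any real tool: the `run_report` function, the `spec` dictionary keys, and the sample sales rows are all invented here.

```python
from collections import defaultdict

def run_report(rows, spec):
    """Group `rows` by spec["group_by"] and sum each field in spec["measures"]."""
    keep = spec.get("where", lambda row: True)
    totals = defaultdict(lambda: defaultdict(float))
    for row in filter(keep, rows):
        key = tuple(row[col] for col in spec["group_by"])
        for measure in spec["measures"]:
            totals[key][measure] += row[measure]
    header = spec["group_by"] + spec["measures"]
    body = [list(key) + [totals[key][m] for m in spec["measures"]]
            for key in sorted(totals)]
    return [header] + body

# Hypothetical sales records, as they might come out of a data mart.
sales = [
    {"store": "North", "brand": "BrandA", "qty": 3, "revenue": 30.0},
    {"store": "North", "brand": "BrandB", "qty": 1, "revenue": 15.0},
    {"store": "South", "brand": "BrandA", "qty": 2, "revenue": 20.0},
]
# The report is data, not code: changing a requirement means editing this dict.
spec = {"group_by": ["brand"], "measures": ["qty", "revenue"],
        "where": lambda row: row["revenue"] > 0}
report = run_report(sales, spec)
```

Switching the report from brand level to store level is then a one-line change to `spec["group_by"]`, which is the kind of flexibility that makes frequently changing report requirements more bearable.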

Nowadays, reports are generally built with professional low-code reporting tools such as FineReport, which improve development efficiency and let people focus on applied analysis. After all, no one wants to wrestle with reports forever.

Combined with the data layering described earlier, you will find that reporting needs exist at every layer. Personally, I think the point of reporting is not building the reports but using them to find value for the business and the project.

Large companies have dedicated project staff for report analysis. The work that extends from reporting includes report requirements analysis, planning the indicator system, classifying reports for business management and frontline staff, and designing report tiers around the business.

For frontline employees, reports are mostly used to enter and query data. For example, shop assistants in a shopping mall browse the data to check how goods are selling so they can restock in time, and they enter sales data every day.

For some business staff, reports are no longer just display and entry, and analysis needs emerge. A purchasing manager, for example, needs to decide which brands to stock, which suppliers to buy from, and how to plan the store's product range.

The method is to look at the reports to see which products sell well, and then decide whether to stock more of a brand, drop a brand, or run a promotion. Put in grander terms, this is using data to optimize the product mix and select suppliers.

For corporate management, reporting is more about dashboards that monitor indicators and about performance analysis (by time, region, and other dimensions). This process also uses data to help management make decisions in line with standard management methods (if employees make judgments, leaders make decisions...).

Origin blog.csdn.net/yuanziok/article/details/108400583