Why build a data warehouse for data analysis?

I recently read an article on cnblogs about building a data warehouse. In my earlier projects, the data was processed with PowerQuery and then analyzed and presented in PowerBI Desktop; there was no notion of building a separate data warehouse. Through that article I found presentations and videos for a free ETL tool on an official website, and saw that building a data warehouse with an ETL tool can indeed be relatively simple. (Note: "relatively simple" only. Building a data warehouse still requires the relevant background knowledge, especially dimensional modeling. I will write a dedicated series of articles on dimensional modeling later.)

 

From the figure we can see that there is no difference between the data warehouse and data warehouse. When doing data analysis, we keep running into certain terms, such as "data warehouse". A data warehouse is one of the more important parts of data analysis: it is a subject-oriented, integrated, relatively stable collection of data that reflects historical change. Here we explain the role of the data warehouse in data analysis.

Most of us are familiar with data analysis. The process has many steps: first understand the business, then understand the data, followed by data mining, data processing, data analysis, and data presentation. Done well, these steps produce a good analysis.

The most important part of data analysis work, however, is data processing. Analysis places high demands on data quality and format, and it requires a deep understanding of the data, so fitting the data to the business need takes real effort. In my own experience, processing the data often takes more than 70% of the time in the whole analysis process. How efficiently and quickly you can process and understand the data therefore often determines the progress and quality of a data analysis project. A data warehouse is integrated, stable, and of high quality, so when it supplies the data for analysis, data quality and completeness are usually better assured.

If we want to do data analysis, when should we use an ETL tool to build a data warehouse to make the analysis more efficient? The need shows up in three areas: data understanding, data quality, and cross-system data association.

First, data understanding

 

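We all know that a data warehouse is subject-oriented, so it is closely and thoroughly aligned with the business, which makes it easier for data analysts to understand the business through the data. A data warehouse is made up of many subject areas and contains a large amount of data. When we need to analyze that data, understanding the data warehouse's model makes understanding the data follow naturally.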

Second, data quality

 

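Data analysis requires data that is clean and complete. The data warehouse has already transformed the source-system data to fit the business and has cleaned out dirty data, which gives the analysis a good guarantee of data quality.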
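As a rough illustration of the kind of cleaning an ETL transform step might do before data reaches the warehouse, here is a minimal Python sketch. The field names (`customer_id`, `amount`, `trade_date`), the sample rows, and the cleaning rules are all assumptions made for this example; they do not come from any particular ETL tool.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from a source system.
RAW_ROWS = [
    {"customer_id": " C001 ", "amount": "120.50", "trade_date": "2019-08-01"},
    {"customer_id": "C002", "amount": "", "trade_date": "2019-08-01"},        # missing amount
    {"customer_id": "C001", "amount": "120.50", "trade_date": "2019-08-01"},  # duplicate once trimmed
    {"customer_id": "C003", "amount": "75", "trade_date": "2019/08/02"},      # unexpected date format
]


def clean_rows(rows):
    """Trim keys, convert types, drop exact duplicates, and set aside bad rows."""
    clean, rejected, seen = [], [], set()
    for row in rows:
        customer_id = row["customer_id"].strip()
        try:
            amount = float(row["amount"])
            trade_date = datetime.strptime(row["trade_date"], "%Y-%m-%d").date()
        except ValueError:
            # Keep rejected rows so they can be reviewed instead of silently lost.
            rejected.append(row)
            continue
        key = (customer_id, amount, trade_date)
        if key in seen:
            continue  # exact duplicate of a row already accepted
        seen.add(key)
        clean.append(
            {"customer_id": customer_id, "amount": amount, "trade_date": trade_date}
        )
    return clean, rejected


clean, rejected = clean_rows(RAW_ROWS)
print(len(clean), "clean rows;", len(rejected), "rejected rows")
```

Setting rejected rows aside rather than discarding them is a design choice for this sketch: it lets someone check later whether the problem lies in the source system or in the cleaning rule.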

Third, cross-system data association

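In a simple data warehouse architecture, data from each business source system flows into the data warehouse after passing through the ETL process. Once data from the different systems has been integrated into the warehouse, at least two problems in data analysis are solved: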

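First, cross-system data collection: in financial analysis, the same customer's savings transactions and wealth-management transactions can be found in a single fact table;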

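Second, cross-system association: when integrating data, you always need a common attribute to link information that comes from different systems. The data warehouse consolidates the relevant customer information during the ETL process, which neatly solves the cross-system association problem.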
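The two points above can be sketched with Python's built-in `sqlite3` module. Everything here is hypothetical: the two source extracts, the shared `id_number` attribute used to match customers, and the table names `dim_customer` and `fact_transaction` are assumptions chosen for illustration, not a prescribed warehouse design.

```python
import sqlite3

# Hypothetical extracts: each source system uses its own account numbers,
# but both record a shared customer attribute (here called id_number).
savings_rows = [("S-1001", "ID-A", 5000.0), ("S-1002", "ID-B", 1200.0)]
wealth_rows = [("W-77", "ID-A", 30000.0)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,   -- surrogate key assigned by the warehouse
    id_number    TEXT UNIQUE            -- attribute shared across source systems
);
CREATE TABLE fact_transaction (
    customer_key   INTEGER REFERENCES dim_customer(customer_key),
    source_system  TEXT,
    source_account TEXT,
    amount         REAL
);
""")


def customer_key_for(id_number):
    """Return the warehouse key for a customer, creating the row if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_customer (id_number) VALUES (?)", (id_number,)
    )
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE id_number = ?", (id_number,)
    ).fetchone()
    return row[0]


def load(source_system, rows):
    """Load one source system's rows into the shared fact table."""
    for account, id_number, amount in rows:
        conn.execute(
            "INSERT INTO fact_transaction VALUES (?, ?, ?, ?)",
            (customer_key_for(id_number), source_system, account, amount),
        )


load("savings", savings_rows)
load("wealth", wealth_rows)

# One query now covers both systems for every customer.
totals = conn.execute("""
    SELECT d.id_number, f.source_system, SUM(f.amount)
    FROM fact_transaction AS f
    JOIN dim_customer AS d USING (customer_key)
    GROUP BY d.id_number, f.source_system
    ORDER BY d.id_number, f.source_system
""").fetchall()

for row in totals:
    print(row)
```

Because the matching on `id_number` happens once, during loading, the analyst only joins on `customer_key` afterwards and never has to reconcile the two systems' account numbers by hand.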

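From the above it is easy to see that a data warehouse really can help a great deal. While learning data analysis, it is worth taking the time to understand the data warehouse's role in it, so that you can do analysis work better. I hope this article is helpful, and thank you for reading.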


Origin www.cnblogs.com/fly-bird/p/11311589.html