Building High-Performance Data Exchange Systems: Key Technologies in Big Data Analytics

Author: Zen and the Art of Computer Programming

Big Data and Data Warehouse Technology

With the rapid growth of applications such as the Internet, e-commerce, and social networks, more and more data is being generated. Data volumes no longer stop at tens or even hundreds of megabytes, and their exponential growth is becoming increasingly apparent. Massive amounts of raw data are now generated every day, while the cost of storing data keeps falling. How to process massive data quickly and efficiently has therefore become an increasingly important problem.

Traditional data warehouse technology can meet the analysis and query requirements for historical data, but it cannot do the same for real-time data, which has become a key requirement for enterprise data warehouses. In the era of big data, "real-time" means response times of seconds or even milliseconds, which traditional data warehouses cannot deliver. The new generation of data analysis technology is therefore moving toward real-time analysis.

Data Mart

Data marts emerged to meet the need for real-time data analysis and querying. Currently, data marts mainly take two forms: offline data warehouses and distributed real-time data streaming platforms. An offline data warehouse relies on periodic data import, ETL processing, data cleaning, and computation, which demands substantial hardware resources and maintenance costs. A distributed real-time data streaming platform uses stream processing instead: it computes on data as it arrives to achieve fast data storage and efficient query and analysis.
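
To make the contrast between the two approaches concrete, here is a minimal sketch in plain Python. It is not taken from any particular platform: the record layout `(user_id, amount)`, the cleaning rule, and the names `clean`, `batch_etl`, and `StreamingTotals` are all assumptions made for illustration. The batch function imports, cleans, and aggregates a whole data set in one pass, while the streaming class updates its aggregate as each record arrives and can be queried at any moment.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical raw record: (user_id, amount). A missing or negative
# amount is treated as dirty data (an assumption for this sketch).
Record = Tuple[str, Optional[float]]


def clean(records: Iterable[Record]) -> Iterator[Tuple[str, float]]:
    """Data cleaning step: drop records with missing or negative amounts."""
    for user, amount in records:
        if amount is not None and amount >= 0:
            yield user, amount


def batch_etl(records: Iterable[Record]) -> Dict[str, float]:
    """Offline style: take a whole imported batch, clean it, aggregate once."""
    totals: Dict[str, float] = defaultdict(float)
    for user, amount in clean(records):
        totals[user] += amount
    return dict(totals)


class StreamingTotals:
    """Streaming style: update the aggregate incrementally per record."""

    def __init__(self) -> None:
        self._totals: Dict[str, float] = defaultdict(float)

    def on_record(self, record: Record) -> None:
        # Apply the same cleaning rule, but to a single arriving record.
        for user, amount in clean([record]):
            self._totals[user] += amount

    def query(self, user: str) -> float:
        # Results are available at any time, not only after a batch run.
        return self._totals.get(user, 0.0)


raw = [("alice", 10.0), ("bob", None), ("alice", 5.5),
       ("bob", 3.0), ("carol", -1.0)]

# Batch: the answer exists only after the whole data set is processed.
batch_result = batch_etl(raw)

# Stream: the running total for "alice" is visible after every record.
stream = StreamingTotals()
partial = []
for rec in raw:
    stream.on_record(rec)
    partial.append(stream.query("alice"))
```

Both paths reach the same final totals. The difference is when an answer becomes available: the batch result appears only after the full run, whereas the streaming totals are current after every record, which is the property that real-time analysis depends on.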

Hadoop Ecosystem

Hadoop(

Reposted from: blog.csdn.net/universsky2015/article/details/131757689