Big Data and Cloud Computing

The relationship between big data and cloud computing is often misunderstood, and the two are frequently conflated. In one sentence: cloud computing is the virtualization of hardware resources; big data is the efficient processing of massive data sets.

Although this one-sentence summary is not entirely accurate, it makes the difference between the two easy to grasp. To put it more vividly, cloud computing is the equivalent of our computer and operating system: a large pool of hardware resources is virtualized and then allocated for use. The current leader in cloud computing is arguably Amazon, which has in effect set the commercial standard for the field. VMware is also worth watching (looking at it helps in understanding the relationship between cloud computing and virtualization). The most active open source cloud platform is OpenStack.

Big data is the equivalent of a "database" for massive data, and the development of the field shows that big data processing has been moving toward an experience similar to that of traditional databases. Hadoop clusters brought concepts such as parallel computing, traditionally expensive, within everyone's reach, but Hadoop is not well suited to data analysts because MapReduce development is complex. So Pig Latin and Hive appeared (initiated by Yahoo! and Facebook respectively; it is worth adding that leading Internet companies such as Google, Facebook, and Twitter have made very active and substantial contributions to the big data field), bringing us SQL-like operations. The operations resemble SQL, but processing is slow, nowhere near the efficiency of a traditional database. People therefore began to ask how big data processing could offer not only SQL-like operations but also comparable speed. Google brought us technologies such as Dremel and PowerDrill, and Cloudera (the company that has gone furthest in commercializing Hadoop, where Doug Cutting, the father of Hadoop, is responsible for technical leadership) introduced Impala.
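To make the point about MapReduce complexity concrete, here is a minimal sketch of a word count written in the map / shuffle / reduce style. It runs in memory in plain Python and is not the Hadoop API; the function names and the sample input are made up for illustration. The comment at the end shows roughly what the same job might look like as a SQL-like Hive query, assuming a hypothetical table `docs` with a string column `line`.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Chain the three phases into one job."""
    return reduce_phase(shuffle_phase(map_phase(records)))


# Roughly the same job as a SQL-like query (illustrative HiveQL only;
# the table `docs` and its column `line` are hypothetical):
#   SELECT word, COUNT(*)
#   FROM docs LATERAL VIEW explode(split(line, ' ')) t AS word
#   GROUP BY word;

if __name__ == "__main__":
    sample = ["big data on the cloud", "the cloud hosts big data"]
    print(word_count(sample))
```

Even in this toy form, the analyst has to think in three separate phases, while the query states only the desired result. That gap is what SQL-like layers on top of Hadoop set out to close.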

On the whole, the future trend is that cloud computing, as the underlying layer of computing resources, supports big data processing in the layer above, while big data itself develops toward real-time interactive querying and stronger analysis capabilities. To borrow a line from a Google technical paper: isn’t it exciting to “manipulate petabytes of data in seconds with a single mouse click”?
