Big Data Overview (II)

I. Is Big Data the Same as Hadoop?

When big data is mentioned, the first thing that comes to mind is Hadoop. Hadoop is a distributed system framework developed under the Apache Foundation; with it, users can develop distributed applications without needing to understand the underlying distributed details. So how do Hadoop and big data relate? The open-source Hadoop technology provides a relatively inexpensive distributed storage system and, through the MapReduce technique, concurrent and efficient computation. A big data engine essentially handles storage and computation, but the real value of the results comes when they are consumed by traditional business systems and other applications.
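To make the MapReduce idea concrete, here is a minimal sketch of the classic word-count job written against the standard Hadoop MapReduce Java API. The class name and the input/output paths passed on the command line are illustrative assumptions, not something taken from this article.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: split each input line into words and emit (word, 1).
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts emitted for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // local pre-aggregation on each mapper
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory on HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory on HDFS
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The job reads its input from the distributed storage layer (HDFS) and spreads the map and reduce tasks across the cluster, which is exactly the "storage plus concurrent computation" division of labor described above.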

Bo Xiaojun, general manager of business analytics and Smarter Planet solutions at IBM Software Group Greater China, believes that decisions should be made from identifiable information, and that the world generates information in every dimension. If we analyze more than just the data that already sits in databases, we can discover intrinsic value that analyzing database data alone would never reveal. Big data is raw data that can be viewed from different angles; some of it goes unused, and its quality and authenticity need to be verified before analysis. Most important of all, the results of the analysis must be accurate.

II. What is Hadoop?

Big data is currently divided into four areas: big data technology, big data engineering, big data science, and big data applications. Big data technology belongs to the category of cloud computing, a computing model in which dynamically scalable, virtualized resources are delivered as services over the Internet. How is this computing model realized? The emergence of Hadoop answered that question. Hadoop is an Apache open-source project: a software framework for the distributed processing of large amounts of data. The members organized under this framework include HDFS (the Hadoop Distributed File System), MapReduce, HBase, ZooKeeper (a reliable coordination system for large-scale distributed systems), Hive (a Hadoop-based data warehousing tool), and so on; a small sketch of using HDFS follows below.
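As a minimal sketch of the storage member of that family, the following Java snippet writes a file to HDFS and reads it back through the org.apache.hadoop.fs.FileSystem API. The NameNode address and the file path are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // "hdfs://localhost:9000" is an assumed NameNode address for a local test cluster.
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);

        Path file = new Path("/tmp/hello.txt");   // hypothetical path

        // Write a small file; HDFS splits files into blocks and replicates them
        // across DataNodes transparently.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hadoop\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back line by line.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        fs.close();
    }
}
```

The higher-level members of the family build on this layer: MapReduce jobs read and write HDFS files, Hive translates SQL-like queries into such jobs, and HBase stores its tables on HDFS while relying on ZooKeeper for coordination.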


In the next section, we will talk about the Hadoop family.


Source: www.cnblogs.com/zuozhesang/p/11568783.html