what is hadoop?

  In today's rapid development of information, hadoop is becoming more and more popular, so what is the reason for hadoop so popular? Next, Brother Xinba will introduce it in detail and let you quickly know what any hadoop is?

  Origin of hadoop idea: Google

  Xinba has noticed Google search engine, Android, translation, etc., there are many advanced technologies, but now domestic users can't access Google search engine, in fact, Google brings a lot to our life From the smartphone Android system we use to Google Translate, Google Scholar, Google+ and so on, there is a lot of new knowledge waiting for us to learn.

  Google's low-cost way

  Google's powerful search engine does not use supercomputers and does not use storage. A large number of PC servers are used, because the data in the Internet is huge, and there is a good architecture that can provide data storage and data access, and provide redundant cluster services. For example, the storage used by Taobao before is oracle. Due to the increase in the amount of data, Taobao is going to the Ieo mode and does not use storage.

  Google has multiple data centers around the world, some of which are still equipped with power plants, which can meet the search needs of users around the world. At the same time, there is an important point that the operator pays back to Google.

  Brother Xinba discovered that Hadoop is a software platform for developing and running large-scale data processing. It is an open-source software framework of Appach using the most popular java language to realize distributed computing of massive data in a cluster composed of a large number of computers. The

  core design of the Hadoop framework is: HDFS and MapReduce. HDFS provides the storage of massive data, and MapReduce provides the calculation of the data.

  HDFS (Hadoop Distributed File System, Hadoop Distributed File System), it is a highly fault-tolerant system suitable for deployment on inexpensive machines. HDFS can provide high-throughput data access and is suitable for applications with large data sets.

  MapReduce is a programming model that extracts and analyzes elements from massive source data and finally returns a result set. Distributed storage of files to hard disk is the first step, and extracting and analyzing what we need from massive data is what MapReduce does.

  In practical applications, Hadoop is very suitable for big data storage and big data analysis applications. It is suitable for cluster operation with thousands to tens of thousands of servers, and supports PB-level storage capacity. This is a point that cannot be surpassed by traditional databases and is also the most advantageous point.

  After understanding hadoop, you will find that typical applications of Hadoop are: search, log processing, recommendation system, data analysis, video image analysis, data storage, etc. The function is really powerful. In fact, there is still a lot of content about hadoop. If you are a hadoop enthusiast, welcome to pay attention. Xinba will regularly update the knowledge of big data.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=327091139&siteId=291194637