1, Shortcomings of Apache Hadoop
• Confusing versioning
• Cumbersome deployment and a complicated upgrade process
• Poor compatibility
• Weak security
2, Hadoop distributions
• Apache Hadoop
• Cloudera’s Distribution Including Apache Hadoop (CDH)
• Hortonworks Data Platform (HDP)
• MapR
• Amazon EMR
• …
3, Problems CDH can solve
• For a 1000-server cluster, what is the minimum time needed to build a working Hadoop cluster that includes Hive, HBase, Flume, Kafka, Spark, etc.?
• What if you are given only one day to complete that work?
• To upgrade the Hadoop version on the cluster above, which upgrade plan would you choose, and what is the minimum time it would take?
• Is the new version of Hadoop compatible with Hive, HBase, Flume, Kafka, Spark, and so on?
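One way to put a rough number on the first question is simple arithmetic. The sketch below is only an illustration: the function name manual_deploy_hours, the 30 minutes per node, and the 10 operators working in parallel are all hypothetical assumptions, not figures from a real deployment.

```python
import math


def manual_deploy_hours(nodes, minutes_per_node, parallel_operators):
    """Rough wall-clock hours to set up `nodes` servers by hand.

    All inputs are illustrative assumptions, not measurements.
    """
    if nodes < 0 or minutes_per_node < 0 or parallel_operators < 1:
        raise ValueError("invalid input")
    # Each operator handles one node at a time, so work proceeds in batches.
    batches = math.ceil(nodes / parallel_operators)
    return batches * minutes_per_node / 60


# Hypothetical figures: 1000 nodes, 30 minutes each, 10 operators.
hours = manual_deploy_hours(1000, 30, 10)
print(hours)  # 50.0
```

Under these assumed figures a manual rollout takes about 50 working hours, well beyond a one-day deadline, which suggests why an automated installer matters at this scale.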
4, CDH introduction
• Cloudera's Distribution Including Apache Hadoop
• One of the many Hadoop branches, maintained by Cloudera and built on the stable releases of Apache Hadoop
• Provides the core capabilities of Hadoop
- Scalable storage
- Distributed computing
• Web-based user interface
5, CDH advantages
• Clear version numbering
• Fast version updates
• Support for Kerberos security authentication
• Clear documentation
• Support for multiple installation methods (including Cloudera Manager)
6, CDH installation
• Cloudera Manager
• Yum
• RPM
• Tarball
7, CDH download
• CDH 5.4:
http://archive.cloudera.com/cdh5/
• Cloudera Manager 5.4.3:
http://www.cloudera.com/downloads/manager/5-4-3.html