Differences between CDH and native (Apache) Hadoop

Several questions to keep in mind
---------------------------------------------------------------------------------------------------------------------------
1. How many versions of Hadoop are there?
2. How many installation methods are there for CDH?
3. What changes has CDH made in terms of installation and authentication?
---------------------------------------------------------------------------------------------------------------------------
Hadoop is available in several versions, chiefly: the Apache version (the original, on which all other distributions are based and improved), the Cloudera version (Cloudera's Distribution Including Apache Hadoop, referred to as "CDH"), and the Hortonworks version (Hortonworks Data Platform, referred to as "HDP"). Domestically, the vast majority of users choose the CDH version. The main differences between the CDH and Apache versions are as follows:

(1) CDH divides Hadoop versions very clearly. At the time of writing there are only two series, cdh3 and cdh4, which correspond to first-generation Hadoop (Hadoop 1.0) and second-generation Hadoop (Hadoop 2.0) respectively. The Apache version numbering is much more confusing by comparison. CDH also improves on Apache Hadoop in terms of compatibility, security, and stability.
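The series-to-generation mapping in point (1) can be written down as a small lookup. This is only a minimal sketch: the function name `hadoop_generation` is hypothetical, and the version-string shape (for example "2.0.0-cdh4.2.0") is an assumption about how CDH builds are labelled, not something stated above.

```python
import re

# Mapping stated in point (1): cdh3 -> first generation, cdh4 -> second generation.
CDH_SERIES_TO_GENERATION = {
    3: "Hadoop 1.0 (first generation)",
    4: "Hadoop 2.0 (second generation)",
}

# Assumed version-string shape, e.g. "0.20.2-cdh3u6" or "2.0.0-cdh4.2.0".
_CDH_TAG = re.compile(r"cdh(\d+)", re.IGNORECASE)


def hadoop_generation(version: str) -> str:
    """Return the Hadoop generation implied by the cdhN tag in a version string."""
    match = _CDH_TAG.search(version)
    if match is None:
        return "not a CDH build (no cdhN tag found)"
    series = int(match.group(1))
    return CDH_SERIES_TO_GENERATION.get(series, f"unknown CDH series cdh{series}")


if __name__ == "__main__":
    for v in ("0.20.2-cdh3u6", "2.0.0-cdh4.2.0", "1.0.4"):
        print(v, "->", hadoop_generation(v))
```

A plain Apache version string such as "1.0.4" carries no cdhN tag, so the sketch reports it as a non-CDH build rather than guessing a generation.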

(2) CDH3 is based on Apache Hadoop 0.20.2 and incorporates the latest patches; CDH4 is based on Apache Hadoop 2.x. CDH always applies the latest bug fixes and feature patches, ships the same functionality earlier than Apache Hadoop does, and is updated faster than the official Apache releases.

(3) Security: CDH supports Kerberos authentication, whereas Apache Hadoop by default uses simple username-matching authentication.
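Which of the two authentication modes a cluster uses is controlled by the `hadoop.security.authentication` property in core-site.xml, whose values are `simple` (the default) and `kerberos`. The following is a minimal sketch of reading that property; the helper name `authentication_mode` and the sample XML are illustrative assumptions, and a real core-site.xml will contain many more properties.

```python
import xml.etree.ElementTree as ET

# Illustrative core-site.xml fragment (an assumption, not taken from a real cluster).
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""


def authentication_mode(core_site_xml: str) -> str:
    """Return 'kerberos' or 'simple' as configured in the given core-site.xml text."""
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hadoop.security.authentication":
            # A missing or empty <value> falls back to the default.
            return (prop.findtext("value") or "simple").strip()
    return "simple"  # Hadoop's default when the property is not set


if __name__ == "__main__":
    print(authentication_mode(SAMPLE_CORE_SITE))
    print(authentication_mode("<configuration/>"))
```

Setting the property to `kerberos` is only one part of securing a cluster; principals, keytabs, and per-service settings are outside the scope of this sketch.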

(4) CDH's documentation is clear. Many users of the Apache version also read the documentation provided by CDH, including the installation and upgrade guides.

(5) CDH supports four installation methods: Yum/Apt packages, tar packages, RPM packages, and Cloudera Manager (CM). Apache Hadoop only supports tar package installation.


Note: Installing CDH with the recommended Yum/Apt packages has the following advantages:
1. Installation and upgrades are done over the network, which is very convenient.
2. Dependent packages are downloaded automatically.

3. Hadoop ecosystem packages are matched automatically. You do not need to search for the versions of HBase, Flume, Hive, and other software that match your Hadoop version; Yum/Apt finds the matching package versions based on the currently installed Hadoop version and ensures compatibility.

4. Relevant directories are created automatically and soft-linked to the appropriate places (such as conf and logs), and the hdfs and mapred users are created automatically. The hdfs user is the highest-privileged user of HDFS, and the mapred user is responsible for the permissions of the relevant directories during MapReduce execution.
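The automatic version matching in advantage 3 can be pictured as selecting, for each ecosystem component, the package whose CDH tag equals that of the installed Hadoop. The sketch below is conceptual only: it does not reproduce how Yum or Apt actually resolve dependencies, and the repository index, version strings, and function names are all hypothetical, invented for illustration.

```python
import re
from typing import Dict, List, Optional

# Hypothetical repository index: component -> available package versions.
# These names and version strings are illustrative, not taken from a real repository.
REPO_INDEX: Dict[str, List[str]] = {
    "hbase": ["0.90.6-cdh3u6", "0.94.2-cdh4.2.0"],
    "hive": ["0.7.1-cdh3u6", "0.10.0-cdh4.2.0"],
    "flume": ["0.9.4-cdh3u6"],
}

# Assumed convention: the CDH release tag is the trailing "cdh..." part of the version.
_CDH_TAG = re.compile(r"cdh\d+\S*$")


def cdh_tag(version: str) -> Optional[str]:
    """Extract the trailing CDH release tag, e.g. 'cdh4.2.0', or None if absent."""
    match = _CDH_TAG.search(version)
    return match.group(0) if match else None


def matching_packages(
    installed_hadoop: str, index: Dict[str, List[str]]
) -> Dict[str, Optional[str]]:
    """For each component, pick the version built for the same CDH release, if any."""
    tag = cdh_tag(installed_hadoop)
    result: Dict[str, Optional[str]] = {}
    for component, versions in index.items():
        result[component] = next(
            (v for v in versions if tag is not None and cdh_tag(v) == tag), None
        )
    return result


if __name__ == "__main__":
    for name, version in matching_packages("2.0.0-cdh4.2.0", REPO_INDEX).items():
        print(name, "->", version)
```

In this made-up index, flume has no build for the installed release, so the sketch returns None for it instead of silently picking an incompatible version.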
 
