Hadoop CDH

There are currently many Hadoop distributions, including the Huawei distribution, the Intel distribution, the Cloudera distribution (CDH), and others. All of them are derived from Apache Hadoop. So many versions exist because of Apache Hadoop's open source license, which allows anyone to modify the code and release or sell it as an open source or commercial product (http://www.apache.org/licenses/LICENSE-2.0).

The vast majority of vendor distributions on the domestic market are paid, such as the Intel and Huawei distributions. Although these distributions add many features that the open source version lacks, most companies treat cost as an important criterion when choosing a Hadoop version. There are three main Hadoop versions that are free to use (all from foreign vendors), namely:
        Cloudera version (Cloudera's Distribution Including Apache Hadoop, referred to as "CDH"),
        Apache Foundation Hadoop,
        Hortonworks version (Hortonworks Data Platform, referred to as "HDP").
They are listed in order of domestic usage. Although CDH and HDP are commercial products, they are open source, and only service fees are charged.

Domestically, the vast majority of users choose CDH, mainly for the following reasons:

(1) CDH divides its Hadoop versions very clearly. There are only two series, cdh3 and cdh4 (the latest release is now CDH5.20, based on hadoop2.x), which correspond to the first generation of Hadoop (Hadoop 1.0) and the second generation (Hadoop 2.0) respectively. By comparison, the Apache versioning is much more confusing;
(2) The CDH documentation is clear. Many users of the Apache version also read the documentation provided by CDH, including the installation and upgrade documentation.

    Correspondence between CDH and Apache versions:
    cdh3 is based on Apache Hadoop 0.20.2
    cdh3u6 corresponds to the latest version of Apache Hadoop 1.x
    cdh4 corresponds to Apache Hadoop 2.x
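
The correspondence above can be sketched as a small lookup table. This is only an illustration: the names `CDH_TO_APACHE` and `apache_base` are hypothetical, and the table holds only what this article states (cdh3, cdh4, and the remark that CDH5 is based on hadoop2.x), not an official Cloudera mapping.

```python
import re

# Mapping copied from the article's correspondence list; the names are
# hypothetical and the table is not an official Cloudera reference.
CDH_TO_APACHE = {
    3: "Apache Hadoop 0.20.2 / 1.x (first generation)",
    4: "Apache Hadoop 2.x (second generation)",
    5: "Apache Hadoop 2.x (second generation)",
}


def apache_base(cdh_version: str) -> str:
    """Return the Apache Hadoop line that a CDH version string is based on."""
    # Accept strings such as "cdh3", "cdh3u6" or "CDH4"; only the major
    # series number matters for this coarse mapping.
    match = re.match(r"cdh(\d+)", cdh_version.strip().lower())
    if match is None:
        raise ValueError(f"not a CDH version string: {cdh_version!r}")
    major = int(match.group(1))
    if major not in CDH_TO_APACHE:
        raise KeyError(f"no mapping recorded for CDH {major}")
    return CDH_TO_APACHE[major]


if __name__ == "__main__":
    for version in ("cdh3", "cdh3u6", "cdh4"):
        print(version, "->", apache_base(version))
```

The mapping is keyed on the major series only, because the list above distinguishes versions at that level; update releases such as cdh3u6 stay within the same Apache generation.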

HDP is a relatively new distribution and currently stays largely in sync with Apache, because most Hortonworks employees are Apache code contributors, especially contributors to Hadoop 2.0.
