B05 - 003, Basic concepts of Hadoop clusters

0. Chapter outline: basic concepts of Hadoop clusters

Estimated time for beginners: 1 hour

Note: The CSDN mobile app does not support in-page chapter links, although external links still work. For a better experience, please read on a PC.

1. Hadoop releases
  1.1 Community edition and commercial editions.
  1.2 Hadoop versions.
  1.3 Current stable Hadoop version.
  1.4 Cluster division concept.

2. Basic concepts of a Hadoop cluster
  2.1 Hadoop core components.
  2.2 Hadoop deployment modes.

3. Hadoop cluster installation
  3.1 Hadoop, CDH5.14.0 - 2.6.0, compiling the source package.
  3.2 Pseudo-distributed [learning environment].
  3.3 Fully distributed, HA [work environment].



Recommended related posts:
   B02 - 002, Hadoop, CDH5.14.0 - 2.6.0, compiling the source package
   B02 - 003, Hadoop, fully distributed, CDH5.14.0 - 2.6.0, offline storage, computing, management
   B02 - 025, Hadoop, pseudo-distributed, CDH5.14.0 - 2.6.0



1. Hadoop releases

  1.1 ~ Community edition and commercial editions.

  • Hadoop releases are divided into the open-source community edition and commercial editions. The community edition is the version maintained by the Apache Software Foundation and is the official release line.
  • Commercial editions are released by third-party companies that modify the community edition, integrate the various service components, and test them for compatibility. Well-known examples include Cloudera's CDH and MapR.
  • In this course we use a commercial distribution: Cloudera's CDH.

Community edition (official): maintained by the Apache Software Foundation; newest versions and most features, but the weakest compatibility between components.
Commercial edition: released by a commercial company on top of the community edition (for example CDH); better compatibility, but the version lags slightly behind.

  1.2 ~ Hadoop versions.

  • Unless otherwise noted, versions in this series refer to CDH versions. Hadoop's versioning is unusual: it is developed along several parallel branches. Broadly, there are three major series: 1.x, 2.x, and 3.x.
  • Hadoop 1.0 consists of the distributed file system HDFS and the offline computing framework MapReduce.
  • Hadoop 2.0 consists of HDFS with support for NameNode horizontal scaling, the resource management system YARN, and the offline computing framework MapReduce running on YARN. Compared with Hadoop 1.0, Hadoop 2.0 is more powerful, scales and performs better, and supports multiple computing frameworks.
  • Hadoop 3.0 adds a series of enhancements over Hadoop 2.0. When this course material was prepared it was still treated as an alpha release, with many bugs and no guarantee of API stability or quality (Apache has since shipped stable 3.x releases).

  1.3 ~ Current stable Hadoop version.

  • This course uses a current stable release of the 2.x series: Hadoop 2.6.0-CDH5.14.0.

  1.4 ~ Cluster division concept.

[Figure: cluster division diagram (image missing)]



Be the master of your own life: self-care, self-discipline, self-reliance.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -


2. Basic concepts of a Hadoop cluster

  2.1 ~ Hadoop core components.

A Hadoop cluster specifically comprises two clusters: the HDFS cluster and the YARN cluster. The two are logically separate but are often physically co-located.

  • The HDFS cluster is responsible for storing massive amounts of data. Its main roles are NameNode, DataNode, and SecondaryNameNode.
  • The YARN cluster is responsible for resource scheduling when computing over massive amounts of data. Its main roles are ResourceManager and NodeManager.
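
As a minimal sketch of the role layout above, the two logical clusters and their daemons can be written down as plain data. The dictionary, the grouping into "master", "worker", and "auxiliary", and the helper name `cluster_of` are illustrative assumptions, not part of any Hadoop API.

```python
# Illustrative only: the two logical clusters inside one Hadoop cluster
# and the daemon roles named in the text. The master/worker/auxiliary
# grouping is an assumption made for this sketch.
HADOOP_ROLES = {
    "HDFS": {
        "master": ["NameNode"],
        "worker": ["DataNode"],
        "auxiliary": ["SecondaryNameNode"],  # checkpoint helper, not a standby
    },
    "YARN": {
        "master": ["ResourceManager"],
        "worker": ["NodeManager"],
    },
}


def cluster_of(role: str) -> str:
    """Return the logical cluster (HDFS or YARN) that a daemon role belongs to."""
    for cluster, groups in HADOOP_ROLES.items():
        if any(role in names for names in groups.values()):
            return cluster
    raise KeyError(f"unknown role: {role}")


print(cluster_of("DataNode"))
print(cluster_of("ResourceManager"))
```

Keeping the two clusters as separate keys mirrors the point of the text: storage (HDFS) and resource scheduling (YARN) are logically independent even when their daemons share the same machines.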


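So why doesn't MapReduce have a cluster of its own?


MapReduce is a computing program, a code-level component. It does not need to be deployed on physical servers, so it has no notion of a cluster. To use MapReduce, you write Java code and run it.

It is in fact a programming framework for distributed computing, an application development package. Users develop programs according to its programming conventions, then package them and run them against the data stored in the HDFS cluster, with the YARN cluster scheduling and managing the resources.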
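
The text notes that real MapReduce programs are written in Java against Hadoop's libraries. Purely to illustrate the programming model, here is a single-process Python sketch of a word count with no Hadoop dependency. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and are not Hadoop's API; on a real cluster the framework would run the map and reduce steps as tasks on many machines, with YARN allocating the resources.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["hadoop stores data", "yarn schedules hadoop jobs"]))
```

The user only supplies the map and reduce logic; grouping by key and distributing the work are the framework's job, which is why MapReduce is a development package rather than a cluster.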

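  2.2 ~ Hadoop deployment modes.

Hadoop has three deployment modes: Standalone mode, Pseudo-Distributed mode, and Cluster mode. The first two are deployed on a single machine.

    2.2.1 .  Standalone mode
  • Standalone mode, also called single-machine mode, runs a single Java process on one machine and is mainly used for debugging.
    2.2.2 .  Pseudo-Distributed mode
  • Pseudo-distributed mode also runs on one machine, hosting the HDFS NameNode and DataNode and the YARN
    ResourceManager and NodeManager, but each daemon starts as a separate Java process. It is mainly used for debugging.
    2.2.3 .  Cluster mode (fully distributed)
  • Cluster mode is mainly used for production deployments.
  • N hosts are combined into one Hadoop cluster.
  • In this mode, master nodes and worker nodes are deployed on different machines.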
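
The three modes differ mainly in how many hosts and Java processes are involved. The sketch below makes that concrete under stated assumptions: the function `plan_layout`, the host names, and the rule of placing all master daemons on the first host are hypothetical simplifications for illustration, not how Hadoop itself is configured.

```python
from typing import Dict, List

MASTERS = ["NameNode", "ResourceManager"]
WORKERS = ["DataNode", "NodeManager"]


def plan_layout(mode: str, hosts: List[str]) -> Dict[str, List[str]]:
    """Return a hypothetical host -> processes mapping for a deployment mode."""
    if not hosts:
        raise ValueError("at least one host is required")
    if mode == "standalone":
        # One machine, one Java process, no separate daemons.
        return {hosts[0]: ["single JVM"]}
    if mode == "pseudo-distributed":
        # One machine, but every daemon runs as its own Java process.
        return {hosts[0]: MASTERS + WORKERS}
    if mode == "cluster":
        if len(hosts) < 2:
            raise ValueError("cluster mode needs masters and workers on different hosts")
        # Simplification: all master daemons on the first host, workers on the rest.
        layout = {hosts[0]: list(MASTERS)}
        for host in hosts[1:]:
            layout[host] = list(WORKERS)
        return layout
    raise ValueError(f"unknown mode: {mode}")


for mode, hosts in [
    ("standalone", ["node1"]),
    ("pseudo-distributed", ["node1"]),
    ("cluster", ["node1", "node2", "node3"]),
]:
    print(mode, plan_layout(mode, hosts))
```

In a production cluster the master daemons are often spread over several hosts as well; the single-master layout here is only the smallest arrangement that satisfies "masters and workers on different machines".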



- - - - - - - - - - - - - - - - - - - - - - - - - - - -




3. Preparation

  3.1 ~ Hadoop, CDH5.14.0 - 2.6.0, compiling the source package.

   B02 - 002, Hadoop, CDH5.14.0 - 2.6.0, compiling the source package

  3.2 ~ Pseudo-distributed [learning environment].

   B02 - 025, Hadoop, pseudo-distributed, CDH5.14.0 - 2.6.0

Note: When first learning, use the pseudo-distributed setup. Its configuration is simple, so you see results quickly.

  3.3 ~ Fully distributed, HA [work environment].

   B02 - 003, Hadoop, fully distributed, CDH5.14.0 - 2.6.0, offline storage, computing, management




- - - - - - - - - - - - - - - - - - - - - - - - - - - -



^ This completes the basic concepts of Hadoop clusters.


- - - - - - - - - - - - - - - - - - - - - - - - - - - -


※ Worldly temptations are so strong that even the resolute are swayed.

Which of the following are common lock-free programming techniques?

...
A. For counters, use an atomic add.
B. With exactly one producer and one consumer, a ring buffer can be accessed without locks.
C. RCU (Read-Copy-Update): a mechanism that switches between an old and a new copy, where the old copy can be freed by delayed release.
D. CAS (Compare-and-Swap), used for example in lock-free stacks and lock-free queues.
...
Answer: ABCD




- - - - - - - - - - - - - - - - - - - - - - - - - - - -



I know my shortcomings, and I know you are exacting, but I am who I am, a different kind of firework. Thank you for your corrections, which help shape me :)!




Origin blog.csdn.net/weixin_42464054/article/details/91529458