hadoop-2.6.5 Cluster Setup

1. Modify the hostname: vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=node1

2. Modify the domain name mapping: vi /etc/hosts

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 // existing entry
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 // existing entry
192.168.10.11 node1
192.168.10.12 node2
192.168.10.13 node3
192.168.10.14 node4
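
The same mapping must exist on every node; one way to push it out (root account assumed; each node prompts for a password until step 6 is done):

  for host in node2 node3 node4; do scp /etc/hosts $host:/etc/hosts; done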

3. Set up time synchronization:

1) yum install ntp // if ntp is not already installed
  1.1) chkconfig ntpd on // enable ntpd at boot
2) ntpdate ntp.api.bz // sync once against a time server
3) service ntpd start/stop/restart/reload
4) set up scheduled synchronization: crontab -e
  */10 * * * * ntpdate time.nist.gov // sync every 10 minutes
  4.1) check the cron service status: chkconfig --list | grep crond
    crond 0:off 1:off 2:on 3:on 4:on 5:on 6:off
    if the system runlevel is 2-5, the cron service starts automatically at boot
  4.2) enable crond at boot: chkconfig crond on
  4.3) crontab options
    -e [username]: open an editor to edit the crontab (the default editor is vi)
    -r [username]: delete the current crontab
    -l [username]: list the current crontab
    -v [username]: list the status of the user's cron jobs
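  4.4) to confirm a node is actually syncing (ntpq ships with the ntp package):
    ntpq -p // list upstream servers and their offsets
    date // check the local clock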

4. Disable the firewall: chkconfig iptables off

5. Disable SELinux: vi /etc/selinux/config

SELINUX=disabled
SELINUXTYPE=targeted
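
The config change takes effect after a reboot; to turn SELinux off immediately for the current session:

  setenforce 0
  getenforce // should now print Permissive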

6. SSH passwordless login

1) yum list | grep ssh
2) yum install -y openssh-server openssh-clients
3) service sshd start
4) chkconfig sshd on
5) ssh-keygen // generate a key pair
6) ssh-copy-id node1 // lets the current server log in to node1 without a password
Set the namenode and resourcemanager servers up for passwordless login to all servers (namenode + datanode), as sketched below.
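
A minimal sketch for distributing keys from one node to all the others (assuming the root account and the hostnames mapped in step 2):

  ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa  # generate a key pair non-interactively
  for host in node1 node2 node3 node4; do
    ssh-copy-id root@$host                  # append our public key to each node's authorized_keys
  done

Run this on the namenode (node1) and on the resourcemanager nodes (node3, node4).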

7. Hadoop fully distributed cluster setup:

1) Configuration files
  1.1 vi + /etc/profile
    #JAVA_HOME
    export JAVA_HOME=/opt/module/jdk1.8.0_171
    #HADOOP_HOME
    export HADOOP_HOME=/opt/module/hadoop-2.6.5
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
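    a quick check after editing (assuming the JDK and Hadoop are unpacked at the paths above):
      source /etc/profile
      hadoop version
      java -version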
  1.2 hadoop-env.sh mapred-env.sh yarn-env.sh
    export JAVA_HOME=/opt/module/jdk1.8.0_171
  1.3 core-site.xml
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://node1:8020</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/opt/data/hadoop</value>
    </property>
  1.4 hdfs-site.xml
    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>node2:50090</value>
    </property>
  1.5 slaves
    node2
    node3
    node4
  1.6 format the file system: ./bin/hdfs namenode -format
    view help: ./bin/hdfs namenode -h
  1.7 start the cluster: ./sbin/start-dfs.sh (see the jps check after this list)
  1.8 view the web UI at IP:50070:
    node1:50070
  1.9 help:
    hdfs
    hdfs dfs

    create a directory: hdfs dfs -mkdir -p /user/root
    view a directory: hdfs dfs -ls /
    upload a file: hdfs dfs -put hadoop-2.6.5.tar.gz /user/root
  1.10 stop the cluster: ./sbin/stop-dfs.sh
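
  After start-dfs.sh (step 1.7), a quick sanity check is to run jps on every node; given the configuration above, the expected daemons are:
    node1: NameNode
    node2: DataNode, SecondaryNameNode
    node3: DataNode
    node4: DataNode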

8. Hadoop HA setup

  1) Configuration files
    1.1 vi + /etc/profile
      #JAVA_HOME
      export JAVA_HOME=/opt/module/jdk1.8.0_171
      #HADOOP_HOME
      export HADOOP_HOME=/opt/module/hadoop-2.6.5
      #ZOOKEEPER_HOME
      export ZOOKEEPER_HOME=/opt/module/zookeeper-3.4.6
      export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
    1.2 hadoop-env.sh mapred-env.sh yarn-env.sh
      export JAVA_HOME=/opt/module/jdk1.8.0_171
    1.3 core-site.xml
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/hadoop</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>node2:2181,node3:2181,node4:2181</value>
      </property>
    1.4 hdfs-site.xml
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>node1:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>node2:8020</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>node1:50070</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>node2:50070</value>
      </property>
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
      </property>
      <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <!-- if the key file is id_dsa, change the value below to id_dsa -->
        <value>/root/.ssh/id_rsa</value>
      </property>
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/data/hadoop/journal</value>
      </property>
      <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>
    1.5 slaves
      node2
      node3
      node4
    1.6 ZooKeeper cluster setup
      zoo.cfg
      tickTime=2000
      dataDir=/opt/data/zookeeper
      clientPort=2181
      initLimit=5
      syncLimit=2
      server.1=node2:2888:3888
      server.2=node3:2888:3888
      server.3=node4:2888:3888
      /opt/data/zookeeper/myid contains 1, 2, and 3 on node2, node3, and node4 respectively (matching server.N above); see the sketch below
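      a minimal sketch for writing each node's myid (assuming root SSH access from step 6):
        for i in 1 2 3; do
          ssh node$((i+1)) "mkdir -p /opt/data/zookeeper && echo $i > /opt/data/zookeeper/myid"
        done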
    1.7 on each zk node run: zkServer.sh start
      check whether it started successfully: zkServer.sh status
    1.8 on each journalnode node run: hadoop-daemon.sh start journalnode // the journalnodes must be started before the Hadoop cluster
    1.9 synchronize the edit log
      if converting an existing single-namenode cluster:
        hdfs namenode -initializeSharedEdits (run on the already-formatted namenode)
        hadoop-daemon.sh start namenode
        hdfs namenode -bootstrapStandby (run on the namenode that has not been formatted)
      if it is a new cluster:
        hdfs namenode -format
        hadoop-daemon.sh start namenode
        hdfs namenode -bootstrapStandby (run on the namenode that has not been formatted)
    1.10 format and start ZKFC
      hdfs zkfc -formatZK (can be run on either namenode node)
      hadoop-daemon.sh start zkfc (start it on both zkfc nodes, i.e. the namenodes), or simply start everything with start-dfs.sh
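      once everything is up, the active/standby state of each namenode can be checked (nn1/nn2 are the IDs from hdfs-site.xml above):
        hdfs haadmin -getServiceState nn1
        hdfs haadmin -getServiceState nn2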

9. YARN setup

1) Configuration files
  mapred-site.xml
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  yarn-site.xml
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.resourcemanager.ha.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.resourcemanager.cluster-id</name>
      <value>cluster1</value>
    </property>
    <property>
      <name>yarn.resourcemanager.ha.rm-ids</name>
      <value>rm1,rm2</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm1</name>
      <value>node3</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm2</name>
      <value>node4</value>
    </property>
    <property>
      <name>yarn.resourcemanager.zk-address</name>
      <value>node2:2181,node3:2181,node4:2181</value>
    </property>
2) Start
  start-yarn.sh (this only starts the nodemanagers)
  yarn-daemon.sh start resourcemanager (run this on both resourcemanager nodes)
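  to check which resourcemanager is active (rm1/rm2 are the IDs from yarn-site.xml above):
    yarn rmadmin -getServiceState rm1
    yarn rmadmin -getServiceState rm2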

3) Test with wordcount
  hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /user/jqbai/test.txt /user/jqbai/wordcount
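  to inspect the result (part-r-00000 assumes the default single-reducer output):
    hdfs dfs -ls /user/jqbai/wordcount
    hdfs dfs -cat /user/jqbai/wordcount/part-r-00000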

10. Windows development environment setup

Add environment variables:
  1) HADOOP_USER_NAME=root
  2) HADOOP_HOME=D:\Software\hadoop-2.6.5 (a Windows-specific build)
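
Note: a Windows-specific build of this sort typically needs winutils.exe under %HADOOP_HOME%\bin; many client-side operations fail without it.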

Origin www.cnblogs.com/jqbai/p/10989925.html