Distributed Hadoop installation with Docker (Aliyun Hadoop image)

I previously installed Hadoop using an image I built with Docker myself; this time the cluster is built from Alibaba's (Aliyun) Hadoop image. I also referred to several blog posts online, but each had problems of one kind or another. The steps below are the ones I tested in practice and got running successfully.

1. Install Hadoop from the image

1) Pull the image

Pull the Hadoop image from Aliyun:
docker pull registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop
2) View the image
docker images

3) Create the Hadoop containers

(1) Create the master node
docker run --name master -d -h master registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop
Parameters:
-h sets the container hostname
--name sets the container name
-d runs the container in the background

(2) Create the slave1 and slave2 nodes in the same way
docker run --name slave1 -d -h slave1 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop
docker run --name slave2 -d -h slave2 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop
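If you prefer, all three containers can be created in one small loop instead of three separate commands; a minimal sketch, assuming the same image and using the container name as its hostname:

# Sketch: create master, slave1 and slave2 in one go (names double as hostnames)
for node in master slave1 slave2; do
  docker run --name $node -d -h $node registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop
done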
(3) View the containers
docker ps -s
(4) Enter the container and check the JDK
docker exec -it master bash
The image already ships with a JDK.
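The JDK can also be checked quickly from the host without an interactive shell; a minimal sketch, assuming java is on the container's PATH and the JDK lives under /opt/tools/ as the later hadoop-env.sh step suggests:

# Sketch: check the bundled JDK from the host
docker exec master java -version
docker exec master ls /opt/tools/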
(5) Generate SSH keys; this must be done on every node
After entering a container, start the ssh service:
/etc/init.d/ssh start
Generate a key pair:
ssh-keygen -t rsa
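The same two commands have to be run in each of the three containers; as a convenience, a minimal sketch that drives them from the host with docker exec (the -N "" and -f flags make ssh-keygen non-interactive):

# Sketch: start sshd and generate a key pair on every node from the host
# (if a key already exists, ssh-keygen will ask before overwriting it)
for node in master slave1 slave2; do
  docker exec $node /etc/init.d/ssh start
  docker exec $node ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
done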
(6) On each node, copy the public keys of the other nodes into authorized_keys as well, so that every node's authorized_keys file ends up holding the same three public keys.
Copy the file from each container to the local CentOS host (repeat for slave1 and slave2, saving to authorized_keys_slave1 and authorized_keys_slave2):
docker cp <container id or name>:/root/.ssh/authorized_keys /home/hadoop/authorized_keys_master
Merge the three files into one:
cd /home/hadoop/
cat authorized_keys_master authorized_keys_slave1 authorized_keys_slave2 > authorized_keys
cat authorized_keys
Copy the merged file from the local CentOS host back into each of the three containers:
docker cp /home/hadoop/authorized_keys <container id or name>:/root/.ssh/authorized_keys
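The copy-merge-copy sequence above can also be scripted; a minimal sketch run on the host, assuming the containers are named master, slave1 and slave2 and that /home/hadoop exists locally:

# Sketch: collect every node's public key, merge them, and push the result back
for node in master slave1 slave2; do
  docker cp $node:/root/.ssh/authorized_keys /home/hadoop/authorized_keys_$node
done
cat /home/hadoop/authorized_keys_master /home/hadoop/authorized_keys_slave1 /home/hadoop/authorized_keys_slave2 > /home/hadoop/authorized_keys
for node in master slave1 slave2; do
  docker cp /home/hadoop/authorized_keys $node:/root/.ssh/authorized_keys
done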
(7) Configure the IP address mapping for each node
Inside a container you can view its IP address directly with the ip addr command.
On each container, set the addresses: edit /etc/hosts (vi /etc/hosts) and add entries for master, slave1 and slave2.
Test with ssh master; the login succeeds.
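For reference, the entries added to /etc/hosts in every container look roughly like the sketch below; the addresses are only examples, use the ones reported by ip addr in your own containers:

# Sketch: append the cluster hostnames to /etc/hosts inside each container
# (the 172.17.0.x addresses are illustrative only)
cat >> /etc/hosts <<'EOF'
172.17.0.2  master
172.17.0.3  slave1
172.17.0.4  slave2
EOF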

2. Configure Hadoop (the configuration files are generally under /opt/tools/hadoop-2.7.2/etc/hadoop/)

1) Configure hadoop-env.sh and set the JDK path

(1) Inside the container, find where hadoop-env.sh is stored:

find / -name hadoop-env.sh
(2) Edit hadoop-env.sh and set JAVA_HOME:

export JAVA_HOME=/opt/tools/jdk1.8.0_77

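If you would rather not open an editor, the same change can be scripted with sed; a minimal sketch, assuming the file lives in /opt/tools/hadoop-2.7.2/etc/hadoop/ as found above:

# Sketch: point JAVA_HOME at the bundled JDK in hadoop-env.sh
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/opt/tools/jdk1.8.0_77|' /opt/tools/hadoop-2.7.2/etc/hadoop/hadoop-env.sh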

2) Configure core-site.xml: set the HDFS address and port (these properties go inside the <configuration> element)
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>


3) Configure hdfs-site.xml: set the HDFS replication (backup) count and the namenode/datanode data paths

Create the /hadoop/data and /hadoop/name directories in advance:
mkdir -p /hadoop/data
mkdir -p /hadoop/name
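These directories are needed on every node (the name directory is used by the namenode on master, the data directory by the datanodes on the slaves); a minimal sketch that creates both on all three containers from the host:

# Sketch: create the HDFS name/data directories on every node
for node in master slave1 slave2; do
  docker exec $node mkdir -p /hadoop/data /hadoop/name
done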

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/name</value>
  </property>
</configuration>

The number of slave (datanode) nodes must be greater than or equal to the replication count, otherwise errors will be reported.

4) Configure mapred-site.xml: specify that MapReduce runs on YARN, and configure the JobTracker address and port.
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>


5) Configure yarn-site.xml: set the ResourceManager addresses and the NodeManager shuffle service

Configuration parameters

<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8032</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>master:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>master:8089</value>
</property>


6) Send these configuration files to the other nodes

scp /opt/tools/hadoop-2.7.2/etc/hadoop/yarn-site.xml slave1:/opt/tools/hadoop-2.7.2/etc/hadoop/
Send core-site.xml, hadoop-env.sh, hdfs-site.xml, mapred-site.xml and yarn-site.xml to both the slave1 and slave2 nodes in the same way.
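Copying all five files to both slaves is easy to script; a minimal sketch, run on master and relying on the passwordless ssh set up earlier:

# Sketch: distribute the edited configuration files to slave1 and slave2
for node in slave1 slave2; do
  scp /opt/tools/hadoop-2.7.2/etc/hadoop/{core-site.xml,hadoop-env.sh,hdfs-site.xml,mapred-site.xml,yarn-site.xml} \
      $node:/opt/tools/hadoop-2.7.2/etc/hadoop/
done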

3. Run Hadoop

1) Configure the slaves file

Edit the slaves file in the Hadoop configuration directory and list the worker hostnames, slave1 and slave2.
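A minimal sketch of the resulting slaves file, assuming the standard location and the node names used in this post:

# Sketch: list the worker hostnames in the slaves file
cat > /opt/tools/hadoop-2.7.2/etc/hadoop/slaves <<'EOF'
slave1
slave2
EOF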

2) Format the namenode on the master

hadoop namenode -format

3) Start the cluster on the master

cd /opt/tools/hadoop/sbin/
./start-all.sh

4) Run jps to view the processes; if they are all present, the cluster has started


5) Enter slave1 and view its processes

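Beyond jps, a couple of quick commands can confirm that HDFS sees both datanodes; a minimal sketch, run on master and assuming the Hadoop bin directory is on PATH:

# Sketch: basic post-startup checks (the report should list slave1 and slave2)
hdfs dfsadmin -report
hdfs dfs -mkdir -p /test
hdfs dfs -ls /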

Source: blog.csdn.net/k393393/article/details/91410409