I. Host planning
Three hosts: one master and two slave/worker nodes.
IP addresses (Docker's default allocation):
master:
Host name: hadoop2, IP address: 172.17.0.2
slave1:
Host name: hadoop3, IP address: 172.17.0.3
slave2:
Host name: hadoop4, IP address: 172.17.0.4
II. Software installation
1. Install the CentOS image in Docker and start a CentOS container with SSH — see the article "Installing a CentOS image on Docker".
2. Connect to the CentOS container via SSH and install JDK 1.8 and Hadoop 3.0.
They can be installed in the conventional Linux way: upload the JDK and Hadoop tar packages to the host and install from them.
Get the centos7 image
$ docker pull centos
The image is a little over 70 MB; with the Aliyun Docker accelerator it downloads quickly, after which it appears in the image list.
Command to view the image list:
$ docker images
Install SSH
Based on the centos image, build a CentOS image with SSH support.
Dockerfile content:
The effect of this Dockerfile: based on the centos image, it installs the SSH-related packages, sets the root password to abc123, and starts the SSH service.
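The Dockerfile itself is not reproduced above; a minimal sketch matching that description (the exact package names and host-key-generation steps are assumptions) could look like:

```dockerfile
FROM centos
# Install the SSH server and client packages
RUN yum install -y openssh-server openssh-clients
# Generate the SSH host keys sshd needs to start
RUN ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N "" \
 && ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N "" \
 && ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ""
# Set the root password to abc123
RUN echo "root:abc123" | chpasswd
EXPOSE 22
# Run sshd in the foreground so the container stays up
CMD ["/usr/sbin/sshd", "-D"]
```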
Build the image, naming the new image centos7-ssh:
$ docker build -t centos7-ssh .
After the build completes, the new image appears in the image list.
Build the Hadoop image
The above gives us three CentOS containers, and Hadoop would have to be installed in each one separately. Just as we built an image with SSH, we can build an image with Hadoop already installed and then run three Hadoop containers from it, which is even simpler.
Dockerfile content:
FROM centos7-ssh
ADD jdk-8u151-linux-x64.tar.gz /usr/local/
RUN mv /usr/local/jdk1.8.0_151 /usr/local/jdk1.8
ENV JAVA_HOME /usr/local/jdk1.8
ENV PATH $JAVA_HOME/bin:$PATH
ADD hadoop-3.1.0.tar.gz /usr/local
RUN mv /usr/local/hadoop-3.1.0 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH $HADOOP_HOME/bin:$PATH
RUN yum install -y which sudo
This builds on the centos7-ssh image and configures the Java and Hadoop environments.
Prerequisite: have jdk-8u151-linux-x64.tar.gz and hadoop-3.1.0.tar.gz ready in the same directory as the Dockerfile (the file names must match those in the ADD lines above).
Run the build command, naming the new image hadoop:
$ docker build -t hadoop .
Add the host name and IP address information for the three hosts to the /etc/hosts file:
172.17.0.2 hadoop2
172.17.0.3 hadoop3
172.17.0.4 hadoop4
If the /etc/hosts file of a Docker container is modified directly, the change is overwritten and reset when the container restarts. Instead, pass the host-name-to-IP mappings to the container via the --add-host parameter of docker run; they are written to the container's hosts file when it starts. For example:
docker run --name hadoop2 --add-host hadoop2:172.17.0.2 --add-host hadoop3:172.17.0.3 --add-host hadoop4:172.17.0.4 hadoop
docker exec -it hadoop2 bash
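Since the same three --add-host flags must be passed to every docker run, it can help to build them once in the shell. This is just a sketch (the ADD_HOSTS variable is our own naming, not part of Docker):

```shell
# Build the repeated --add-host flags from the host plan once.
ADD_HOSTS=""
for entry in "hadoop2:172.17.0.2" "hadoop3:172.17.0.3" "hadoop4:172.17.0.4"; do
  ADD_HOSTS="$ADD_HOSTS --add-host $entry"
done
# The flags can then be reused for each container, e.g.:
echo "docker run --name hadoop2$ADD_HOSTS hadoop"
```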
III. Hadoop deployment
1. Define the worker nodes in the workers file
Create a workers file in the etc/hadoop directory under the Hadoop root directory and add the worker nodes' host information.
Following the host planning above, the worker nodes are the two hosts hadoop3 and hadoop4. For example:
[root@9e4ede92e7db ~]# cat /usr/local/hadoop/etc/hadoop/workers
hadoop3
hadoop4
2. Edit the configuration files
a. Add the JAVA_HOME setting to hadoop-env.sh
[root@9e4ede92e7db ~]# cat /usr/local/hadoop/etc/hadoop/hadoop-env.sh | grep JAVA_HOME
# JAVA_HOME=/usr/java/testing hdfs dfs -ls
# Technically, the only required environment variable is JAVA_HOME.
# export JAVA_HOME=
JAVA_HOME=/usr/local/jdk1.8
b. core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop2:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
</configuration>
c. hdfs-site.xml
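The original hdfs-site.xml content is missing here. A typical minimal configuration for this layout might be the following; the storage paths are illustrative, the replication factor of 2 matches the two workers, and port 9870 matches the port mapping used later:

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop2:9870</value>
  </property>
</configuration>
```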
d. yarn-site.xml
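The original yarn-site.xml content is also missing. A typical minimal configuration for this cluster (ResourceManager on the master host hadoop2, shuffle service for MapReduce) might be:

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop2</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```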
e. mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop2:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop2:19888</value>
  </property>
</configuration>
f. Preparation in advance, to avoid a known pitfall
Edit start-dfs.sh and stop-dfs.sh, adding the following at the top of each:
HDFS_DATANODE_USER=root
#HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
Edit start-yarn.sh and stop-yarn.sh, adding the following at the top of each:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Note:
After the steps above are complete, stop the current container and use the docker commit command to save it as a new image. Restarting the cluster from the new image gives every machine in the cluster the same accounts, configuration, and software, with no need to configure each machine again. For example:
a. Stop the container
$ docker stop hadoop2
b. Save the image
$ docker commit hadoop2 hadoop_me:v1.0
IV. Testing
1. Port mapping
After the cluster is up, we need to view the cluster's web interfaces, so the container ports must be mapped to ports on the host machine, which can be done with the -p option of docker run. For example:
Mapping YARN's web port 8088 in the container to port 8088 on the host: -p 8088:8088
2. Start three containers from the new image
docker run --name hadoop2 --add-host hadoop2:172.17.0.2 --add-host hadoop3:172.17.0.3 --add-host hadoop4:172.17.0.4 -d -p 5002:22 -p 9870:9870 -p 8088:8088 -p 19888:19888 hadoop_me:v1.0
docker run --name hadoop3 --add-host hadoop2:172.17.0.2 --add-host hadoop3:172.17.0.3 --add-host hadoop4:172.17.0.4 -d -p 5003:22 hadoop_me:v1.0
docker run --name hadoop4 --add-host hadoop2:172.17.0.2 --add-host hadoop3:172.17.0.3 --add-host hadoop4:172.17.0.4 -d -p 5004:22 hadoop_me:v1.0
3. Format HDFS
Enter the /usr/local/hadoop directory.
Run the format command:
bin/hdfs namenode -format
On hadoop2, modify the Hadoop configuration file etc/hadoop/slaves.
Delete all of the original content and change it to:
hadoop3
hadoop4
Then execute the commands on hadoop2.
4. Start the cluster by executing the start-all.sh script in the /usr/local/hadoop directory on the master host:
sbin/start-all.sh
5. Access the cluster through its web pages, e.g. the HDFS NameNode UI on host port 9870, the YARN ResourceManager UI on port 8088, and the JobHistory server on port 19888 (the ports mapped above).