Building a dual-NameNode + Yarn Hadoop cluster manually (Part 1)

Dual NameNode implementation principle and application architecture

1. What is a dual NameNode

In the distributed file system HDFS, the NameNode is the master node. When the NameNode fails, the entire HDFS becomes unavailable, so keeping the NameNode stable is critical. In Hadoop 1.x, HDFS supported only one NameNode, and the only safeguard was the SecondaryNameNode, which is not a hot standby and whose recovered data is not the latest metadata. Starting with Hadoop 2.x, HDFS supports multiple NameNodes, which not only makes HDFS highly available but also allows its storage scale to be expanded.

In enterprise deployments, the dual NameNode architecture is the most common: one NameNode is in the Active state and the other is in the Standby state. This mechanism gives the NameNode hot-standby high availability.

2. Operating principle of dual NameNode

In the highly available NameNode architecture, only the Active NameNode serves requests; the Standby NameNode stays on standby and continuously synchronizes the Active NameNode's metadata. Once the Active NameNode fails, the Standby NameNode can be switched to the Active state, manually or automatically, so the NameNode keeps working without interruption. This is how two highly reliable NameNodes are implemented.

Switching between the active and standby NameNodes can be done manually or automatically; in an online big data environment it is done automatically. To support automatic switching, the NameNodes rely on a ZooKeeper cluster for arbitration and election. The basic idea is that both NameNodes in the HDFS cluster register with ZooKeeper; when the Active NameNode fails, ZooKeeper detects it immediately and automatically switches the Standby NameNode to the Active state.

As a highly reliable system, a ZooKeeper (ZK) cluster can monitor cluster coordination data and notify clients of changes at any time. The hot-standby function of HDFS relies on two features provided by ZK: failure detection and active-node election. The mechanism by which HDFS achieves high availability through ZK is as follows.

Each NameNode maintains a session in ZK. Once a NameNode fails, its session expires and ZK notifies the other NameNode to initiate a failover. ZK also provides a simple mechanism, an exclusive lock, to ensure that only one NameNode is active at a time: if the current Active NameNode fails, the other NameNode acquires the exclusive lock in ZK, which marks it as the active node.

ZKFailoverController (ZKFC) is a client of the ZK cluster that monitors the state of the NameNode (NN). Every node that runs a NameNode must also run a ZKFC, which provides the following functions:

Health check: ZKFC periodically sends a health-check command to the local NN. If the NN responds correctly, it is considered healthy; otherwise it is treated as a failed node.
Session management: while the local NN is healthy, ZKFC holds a session in ZK. If the local NN happens to be Active, ZKFC also holds a short-lived (ephemeral) znode as a lock; once the local NN fails, this znode is automatically deleted.
Election: if the local NN is healthy and ZKFC finds that no other NN holds the exclusive lock, it tries to acquire the lock. If it succeeds, it performs a failover and the local NN becomes the Active NN. The failover has two steps: first fence the previous Active NameNode if necessary, then switch the local NameNode to Active. The corresponding configuration is sketched below.
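
For reference, here is a minimal sketch of how ZKFC-driven automatic failover is usually wired up in the configuration. The property names are standard Hadoop HA options; the ZooKeeper hosts and port are the ones used later in this environment (slave001–slave003, clientPort 2181), so treat the values as an illustration rather than a finished configuration.

<!-- hdfs-site.xml (sketch): let ZKFC perform automatic failover -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml (sketch): ZooKeeper quorum that ZKFC uses for its session and exclusive lock -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>slave001:2181,slave002:2181,slave003:2181</value>
</property>
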
3. How to ensure metadata consistency in the dual NameNode architecture

You may well ask: how is metadata shared between the two NameNodes?

Since Hadoop 2.x, HDFS has provided a new metadata-sharing mechanism: sharing data through a Quorum Journal Manager (JournalNode) cluster or through a Network File System (NFS). NFS works at the operating-system level, while the JournalNode approach works at the Hadoop level and is mature, reliable, and easy to use. Therefore, we use a JournalNode cluster for metadata sharing here.

Refer to the figure below for how metadata is shared between the JournalNode cluster and the NameNodes.

As can be seen from the figure, the JournalNode cluster pulls metadata changes from the Active NameNode in near real time and saves them on the JournalNode cluster; at the same time, the Standby NameNode synchronizes the data from the JournalNodes (JNs) in real time. In this way, data synchronization between the two NameNodes is achieved.

So, how does the JournalNode cluster work internally?

To synchronize data, the two NameNodes communicate with a group of independent processes called JournalNodes. When the Active NameNode's metadata changes, it notifies a majority of the JournalNode processes. The Standby NameNode monitors the EditLog (transaction log) for changes, reads them from the JNs, and applies them to its own namespace. This way, the Standby NameNode can guarantee that its metadata is fully synchronized when a failover happens.

The figure below shows the internal architecture of the JournalNode cluster.

As can be seen from the figure, JN1, JN2, JN3, and so on are nodes of the JournalNode cluster. The basic principle of QJM (Quorum Journal Manager) is to store the EditLog on 2N+1 JournalNodes; a write is considered successful once a majority of nodes (at least N+1) acknowledge it, which keeps the data highly available. The algorithm can tolerate at most N failed machines; if more than N machines fail, it no longer works.

ANN denotes the NameNode in the Active state and SNN the NameNode in the Standby state. QJM takes the changes from the ANN and writes them to the EditLog on the JournalNodes; the SNN then reads the EditLog and applies the changes to itself.
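
In configuration terms, the ANN writes its edits to the JournalNode quorum and each JournalNode keeps a local copy. A minimal sketch of the relevant settings follows: the property names are standard HDFS options, the three JournalNode hosts are the ones used later in this environment, and the journal ID "mycluster" and the local edits path are placeholders.

<!-- hdfs-site.xml (sketch): shared edits written to the three JournalNodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://slave001:8485;slave002:8485;slave003:8485/mycluster</value>
</property>

<!-- Local directory where each JournalNode stores the edits it receives -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/bigdata/journalnode/edits</value>
</property>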

4. Dual NameNode high-availability Hadoop cluster architecture

The biggest changes in Hadoop 2.x are NameNode high availability and the Yarn resource manager. In this lesson we focus on how to build an online, highly available Hadoop cluster system. There are two key points: building NameNode high availability, and deploying the resource manager Yarn, which enables true distributed computing and integration with multiple computing frameworks.

The following figure is a schematic diagram of a highly available Hadoop cluster operation.

This architecture mainly solves two problems: synchronizing the NameNode metadata and switching between the active and standby NameNodes. As can be seen from the figure, metadata synchronization between the active and standby NameNodes is handled by the JournalNode cluster, while active/standby switching is handled through ZooKeeper.

ZooKeeper runs as an independent cluster. A ZKFailoverController (zkfc) process must be started on each of the two NameNode hosts; it acts as a client of the ZooKeeper cluster, and the interaction with ZooKeeper and the status monitoring are done through zkfc.

Building an HDFS high-availability Hadoop cluster with dual NameNodes + Yarn

1. Host, software functions, and disk storage planning before deployment

The roles involved in a dual-NameNode Hadoop cluster include NameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer, ZooKeeper, JournalNode, and zkfc. Each role can run on its own server, or several roles can be combined on one machine.

In general, the NameNode service should be deployed on its own, so the two NameNodes need two servers, while the DataNode and NodeManager services are recommended to run together on the same machines. The ResourceManager service is similar to the NameNode and is also best deployed on a dedicated server; the JobHistoryServer is usually placed together with the ResourceManager. The ZooKeeper and JournalNode services are cluster-based and need at least three nodes, that is, three servers; however, the ZooKeeper and JournalNode clusters can share the same three servers. Finally, zkfc arbitrates the NameNodes, so it must run alongside the NameNode service and does not need separate servers.

The following deployment uses five independent servers running CentOS 7.7. The host name, IP address, and role of each server are shown in the following table:

Host name        IP address      Roles
namenodemaster   192.168.1.31    NameNode (primary), zkfc
yarnserver       192.168.1.41    NameNode (standby), ResourceManager, JobHistoryServer, zkfc
slave001         192.168.1.70    DataNode, NodeManager, ZooKeeper, JournalNode
slave002         192.168.1.103   DataNode, NodeManager, ZooKeeper, JournalNode
slave003         192.168.1.169   DataNode, NodeManager, ZooKeeper, JournalNode

As can be seen from the table, namenodemaster and yarnserver are the primary and standby NameNodes, and yarnserver also acts as the ResourceManager and JobHistoryServer. If server resources allow, the ResourceManager and JobHistoryServer services can be placed on a separate machine.

In addition, the ZooKeeper cluster and JournalNode cluster are deployed on the three hosts slave001, slave002, and slave003, which also act as DataNodes and NodeManagers.

For the software, the version of each component is shown in the following table:

Finally, disk storage needs to be planned. HDFS data blocks are stored on the local disks of each DataNode, so every DataNode needs large-capacity disks. Ordinary mechanical hard disks are fine; SSDs are better if available. Single disks of 4 TB or 8 TB are recommended. These disks do not need RAID and can be used as individual disks, because HDFS already has its own replica-based fault tolerance.

In the environment used in this lesson, each of my DataNode nodes has two large-capacity disks for HDFS data blocks. In addition, the hosts running the NameNodes store the HDFS metadata, which governs the storage and reads/writes of the entire HDFS cluster; if it is lost, HDFS loses data or even becomes unusable. Ensuring the safety of the NameNode's HDFS metadata is therefore essential.

It is recommended to configure four disks of the same size on each NameNode node, build a RAID 1 array from each pair (two RAID 1 arrays in total), and store mirrored copies of the metadata on these two RAID 1 arrays.
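
As a rough sketch of how this disk planning maps to HDFS settings: the property names below are standard HDFS options, but the mount points (/data/meta1, /data/meta2, /data/disk1, /data/disk2) are placeholders for your own RAID 1 and data-disk mount points.

<!-- hdfs-site.xml (sketch): NameNode metadata kept on both RAID 1 arrays -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/meta1/dfs/name,/data/meta2/dfs/name</value>
</property>

<!-- DataNode blocks spread across the two large single disks -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/disk1/dfs/data,/data/disk2/dfs/data</value>
</property>
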

2. Automated installation of the basic system environment

Before deploying the Hadoop cluster, the basic system environment needs to be configured. This covers five aspects: setting host names, building a local hosts file for name resolution, establishing SSH trust from the ansible management machine to the cluster nodes, adjusting system parameters (ulimit resource limits, disabling SELinux), and creating the hadoop user. All five can be automated with ansible playbook scripts, which are introduced in turn below.

(1) Establish passwordless SSH login from the management machine to the cluster nodes

In big data operations, operating systems are usually installed automatically, and after installation all hosts share the same password. For convenient automated operations, one-way passwordless SSH login needs to be established from the ansible management machine to all nodes of the Hadoop cluster.

First, modify the ansible inventory file (/etc/ansible/hosts in this example) and add a host group, as follows:

[hadoophosts]
192.168.1.31   hostname=namenodemaster
192.168.1.41   hostname=yarnserver
192.168.1.70   hostname=slave001 myid=1
192.168.1.103  hostname=slave002 myid=2
192.168.1.169  hostname=slave003 myid=3

In this host group, each entry starts with the IP address, followed by that host's variables. Defining the hostname variable makes it possible to set each machine's host name automatically through ansible later on.

Next, create the /etc/ansible/roles/vars/main.yml file with the following content:

[root@namenodemaster ansible]# vim roles/vars/main.yml 

zk1_hostname: 192.168.1.70
zk2_hostname: 192.168.1.103
zk3_hostname: 192.168.1.169
AnsibleDir: /etc/ansible
BigdataDir: /opt/bigdata
hadoopconfigfile: /etc/hadoop

Six role variables are defined here; they will be used in the playbooks later. Finally, write the playbook script as follows:

- hosts: hadoophosts
  gather_facts: no
  roles:
   - roles
  tasks:
   - name: close ssh yes/no check
     lineinfile: path=/etc/ssh/ssh_config regexp='(.*)StrictHostKeyChecking(.*)' line="StrictHostKeyChecking no"
   - name: delete /root/.ssh/
     file: path=/root/.ssh/ state=absent
   - name: create .ssh directory
     file: dest=/root/.ssh mode=0600 state=directory
   - name: generating local public/private rsa key pair
     local_action: shell ssh-keygen -t rsa -N '' -y -f /root/.ssh/id_rsa
   - name: view id_rsa.pub
     local_action: shell cat /root/.ssh/id_rsa.pub
     register: sshinfo
   - set_fact: sshpub={{sshinfo.stdout}}
   - name: add ssh record
     local_action: shell echo {{sshpub}} > {{AnsibleDir}}/roles/templates/authorized_keys.j2
   - name: copy authorized_keys.j2 to all
     template: src={{AnsibleDir}}/roles/templates/authorized_keys.j2 dest=/root/.ssh/authorized_keys mode=0600
     tags:
     - install ssh

Name this playbook script sshk.yml, and then execute the following commands to establish the one-way SSH trust:

[root@namenodemaster ansible]# pwd
/etc/ansible
[root@namenodemaster ansible]# ansible-playbook  sshk.yml -k

Verify as follows:

[root@namenodemaster vars]# ssh 192.168.1.41
Last login: Tue Feb  2 08:47:22 2021 from 192.168.1.31
[root@yarnserver ~]# exit
logout
Connection to 192.168.1.41 closed.
[root@namenodemaster vars]# ssh 192.168.1.70
Last login: Tue Feb  2 08:47:22 2021 from 192.168.1.31
[root@slave001 ~]# exit
logout
Connection to 192.168.1.70 closed.
[root@namenodemaster vars]# ssh 192.168.1.103
Last login: Tue Feb  2 21:47:22 2021 from 192.168.1.31
[root@slave002 ~]# exit
logout
Connection to 192.168.1.103 closed.
[root@namenodemaster vars]# ssh 192.168.1.169
Last login: Tue Feb  2 08:52:00 2021 from 192.168.1.31
[root@slave003 ~]# 

(2) Automatically modify the host name

Following on from the ansible setup above, host names can be modified in batch with a playbook. Name the playbook script hostname.yml; its content is shown below, and an example invocation follows it:

[root@namenodemaster ansible]# cat hostname.yml 
- hosts: hadoophosts
  remote_user: root
  tasks:
  - name: change name
    shell: "echo {{hostname}} > /etc/hostname"
  - name:
    shell: hostname {{hostname|quote}}
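
The playbook can then be run like the other playbooks in this lesson, for example:

[root@namenodemaster ansible]# ansible-playbook hostname.yml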

(3) Automatically build the local hosts file for name resolution

Following on from the ansible setup above, the local hosts file used for name resolution can be built automatically with the following playbook script:

[root@namenodemaster ansible]# vim hosts.yml

- hosts: hadoophosts
  remote_user: root
  roles:
  - roles
  tasks:
   - name: add localhost
     local_action: shell echo "127.0.0.1   localhost" > {{AnsibleDir}}/roles/templates/hosts.j2
     run_once: true
   - set_fact: ipaddress={{inventory_hostname}}
   - set_fact: hostname={{hostname}}
   - name: add host record
     local_action: shell echo {{ipaddress}} {{hostname}} >> {{AnsibleDir}}/roles/templates/hosts.j2
   - name: copy hosts.j2 to all host
     template: src={{AnsibleDir}}/roles/templates/hosts.j2 dest=/etc/hosts

Name the playbook script hosts.yml, and then execute the following command to build the local hosts file and distribute it to every node of the cluster:

[root@namenodemaster ansible]# ansible-playbook  hosts.yml

PLAY [hadoophosts] ********************************************************************************************

TASK [Gathering Facts] ****************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.41]
ok: [192.168.1.103]
ok: [192.168.1.169]
ok: [192.168.1.70]

TASK [add localhost] ******************************************************************************************
changed: [192.168.1.31]

TASK [set_fact] ***********************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.41]
ok: [192.168.1.169]
ok: [192.168.1.70]
ok: [192.168.1.103]

TASK [set_fact] ***********************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.41]
ok: [192.168.1.169]
ok: [192.168.1.70]
ok: [192.168.1.103]

TASK [add host record] ****************************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.41]
changed: [192.168.1.169]
changed: [192.168.1.103]
changed: [192.168.1.70]

TASK [copy hosts.j2 to all host] ******************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.169]
changed: [192.168.1.70]
changed: [192.168.1.41]
changed: [192.168.1.103]

PLAY RECAP ****************************************************************************************************
192.168.1.103              : ok=5    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=5    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=6    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=5    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=5    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# 

(4) Automatically modify and optimize system parameters

Following on from the ansible setup above, system parameter optimization mainly includes disabling SELinux, disabling firewalld/iptables, adding ulimit resource limits, and adding a time-synchronization job. These optimizations can be automated with the following playbook script:

[root@namenodemaster ansible]# vim os.yml 

- hosts: hadoophosts
  remote_user: root
  gather_facts: false
  tasks:
   - name: selinux disabled
     lineinfile: dest=/etc/selinux/config regexp='SELINUX=(.*)' line='SELINUX=disabled'
   - name:
     lineinfile: dest=/etc/security/limits.conf line="{{item.value}}"
     with_items:
     - {value: "*         soft    nofile         655360"}
     - {value: "*         hard    nofile         655360"}
   - name: disabled iptables and firewalld
     shell: systemctl stop firewalld; systemctl disable firewalld
   - name: cron ntpdate
     cron: name=ntpdate minute=*/5 user=root job="source /etc/profile;/usr/sbin/ntpdate -u 192.168.1.41;/sbin/hwclock -w"

The playbook disables SELinux, adds the user resource limits, disables the firewall, and adds a time-synchronization cron job, in that order. Here 192.168.1.41 is the time server on my internal network; if you do not have such a server, an external NTP server can be used instead, as long as the machines can reach the Internet.

Name the playbook script os.yml, and then execute the following command to complete the system parameter optimization:

[root@namenodemaster ansible]# ansible-playbook  os.yml

PLAY [hadoophosts] *****************************************************************************************************************************************

TASK [selinux disabled] ************************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.169]
ok: [192.168.1.41]
ok: [192.168.1.70]
ok: [192.168.1.103]

TASK [lineinfile] ******************************************************************************************************************************************
ok: [192.168.1.70] => (item={u'value': u'*         soft    nofile         655360'})
ok: [192.168.1.41] => (item={u'value': u'*         soft    nofile         655360'})
ok: [192.168.1.31] => (item={u'value': u'*         soft    nofile         655360'})
ok: [192.168.1.103] => (item={u'value': u'*         soft    nofile         655360'})
ok: [192.168.1.169] => (item={u'value': u'*         soft    nofile         655360'})
ok: [192.168.1.70] => (item={u'value': u'*         hard    nofile         655360'})
ok: [192.168.1.103] => (item={u'value': u'*         hard    nofile         655360'})
ok: [192.168.1.41] => (item={u'value': u'*         hard    nofile         655360'})
ok: [192.168.1.31] => (item={u'value': u'*         hard    nofile         655360'})
ok: [192.168.1.169] => (item={u'value': u'*         hard    nofile         655360'})

TASK [disabled iptables and firewalld] *********************************************************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.169]
changed: [192.168.1.41]
changed: [192.168.1.70]
changed: [192.168.1.103]

TASK [cron ntpdate] ****************************************************************************************************************************************
changed: [192.168.1.103]
changed: [192.168.1.41]
changed: [192.168.1.31]
changed: [192.168.1.70]
changed: [192.168.1.169]

PLAY RECAP *************************************************************************************************************************************************
192.168.1.103              : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
[root@namenodemaster ansible]#

(5) Automatically create the hadoop user in batch

A hadoop user, which serves as the cluster administrator account, needs to be created on every cluster node. No password has to be set for this user; it just needs to exist, and all subsequent services will be started as the hadoop user. The following playbook script creates the user automatically; its content is as follows:

[root@namenodemaster ansible]# cat adduser.yml 
- name: create user
  hosts: hadoophosts
  remote_user: root
  gather_facts: true
  vars:
    user1: hadoop
  tasks:
   - name: start createuser
     user: name="{
   
   {user1}}"

Name the playbook script adduser.yml, and then execute the following command to create the user:

[root@namenodemaster ansible]# ansible-playbook  adduser.yml 

PLAY [create user] *****************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.169]
ok: [192.168.1.41]
ok: [192.168.1.103]
ok: [192.168.1.70]

TASK [start createuser] ************************************************************************************************************************************
ok: [192.168.1.31]
changed: [192.168.1.103]
changed: [192.168.1.70]
changed: [192.168.1.41]
changed: [192.168.1.169]

PLAY RECAP *************************************************************************************************************************************************
192.168.1.103              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# 

3. Automatic installation of JDK, ZooKeeper and Hadoop

Installing and deploying the whole Hadoop cluster takes three steps: installing the JDK and setting the Java environment variables, installing and deploying the ZooKeeper cluster, and installing and deploying the Hadoop cluster.

Software deployment generally consists of installation and configuration. When deploying with automation tools, the usual approach is to download the software, modify its configuration files, and then package and compress the program together with its configuration, which yields a self-contained deployment package. The packaged program is placed in the corresponding directory on the ansible management machine and called from there.
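
As a rough sketch of how such packages might be prepared on the management machine (the download file names and version numbers are placeholders; the resulting archive names and top-level directory names match what the playbooks below expect under roles/files):

cd /etc/ansible/roles/files

# JDK: unpack a downloaded JDK, rename the directory to "jdk", then repackage it
tar zxf jdk-8uXXX-linux-x64.tar.gz && mv jdk1.8.0_XXX jdk
tar zcf jdk.tar.gz jdk

# ZooKeeper: unpack, rename the directory to "zookeeper", then repackage
tar zxf apache-zookeeper-X.Y.Z-bin.tar.gz && mv apache-zookeeper-X.Y.Z-bin zookeeper
tar zcf zookeeper.tar.gz zookeeper

# Hadoop: unpack into a "hadoop" directory and point a "current" symlink at the release,
# so that HADOOP_HOME can later be set to /opt/bigdata/hadoop/current
mkdir hadoop && tar zxf hadoop-X.Y.Z.tar.gz -C hadoop
ln -s hadoop-X.Y.Z hadoop/current
tar zcf hadoop.tar.gz hadoop

# conf.tar.gz is prepared the same way: a "conf" directory holding the modified Hadoop configuration files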

Here, the JDK, ZooKeeper, and Hadoop are all installed under the /opt/bigdata directory on each server. Start with the installation and deployment of the JDK, again using an ansible-playbook script. The automated JDK installation script is as follows:

Edit the jdk.yml file:

[root@namenodemaster ansible]# cat jdk.yml 
- hosts: hadoophosts
  remote_user: root
  roles:
  - roles
  tasks:
   - name: mkdir jdk directory
     file: path={{BigdataDir}} state=directory mode=0755
   - name: copy and unzip jdk
     unarchive: src={{AnsibleDir}}/roles/files/jdk.tar.gz dest={{BigdataDir}}
   - name: chmod bin
     file: dest={{BigdataDir}}/jdk/bin mode=0755 recurse=yes
   - name: set jdk env
     lineinfile: dest=/home/hadoop/.bash_profile line="{{item.value}}" state=present
     with_items:
     - {value: "export JAVA_HOME={{BigdataDir}}/jdk"}
     - {value: "export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar"}
     - {value: "export PATH=$JAVA_HOME/bin:$PATH"}

[root@namenodemaster ansible]# 

This script again uses the role variables BigdataDir and AnsibleDir. Here jdk.tar.gz is the pre-packaged JDK; it only needs to be copied to each cluster node and unpacked to complete the installation. Name the playbook script jdk.yml, and then execute the following command to install the JDK:

[root@namenodemaster ansible]# ansible-playbook jdk.yml 

PLAY [hadoophosts] *****************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.103]
ok: [192.168.1.169]
ok: [192.168.1.41]
ok: [192.168.1.70]

TASK [mkdir jdk directory] *********************************************************************************************************************************
changed: [192.168.1.103]
changed: [192.168.1.41]
changed: [192.168.1.169]
changed: [192.168.1.70]
changed: [192.168.1.31]

TASK [copy and unzip jdk] **********************************************************************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.169]
changed: [192.168.1.41]
changed: [192.168.1.70]
changed: [192.168.1.103]

TASK [chmod bin] *******************************************************************************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.70]
changed: [192.168.1.41]
changed: [192.168.1.169]
changed: [192.168.1.103]

TASK [set jdk env] *****************************************************************************************************************************************
changed: [192.168.1.41] => (item={u'value': u'export JAVA_HOME=/opt/bigdata/jdk'})
changed: [192.168.1.103] => (item={u'value': u'export JAVA_HOME=/opt/bigdata/jdk'})
changed: [192.168.1.70] => (item={u'value': u'export JAVA_HOME=/opt/bigdata/jdk'})
changed: [192.168.1.169] => (item={u'value': u'export JAVA_HOME=/opt/bigdata/jdk'})
changed: [192.168.1.31] => (item={u'value': u'export JAVA_HOME=/opt/bigdata/jdk'})
changed: [192.168.1.41] => (item={u'value': u'export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'})
ok: [192.168.1.31] => (item={u'value': u'export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'})
changed: [192.168.1.70] => (item={u'value': u'export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'})
changed: [192.168.1.103] => (item={u'value': u'export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'})
changed: [192.168.1.169] => (item={u'value': u'export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'})
changed: [192.168.1.41] => (item={u'value': u'export PATH=$JAVA_HOME/bin:$PATH'})
changed: [192.168.1.70] => (item={u'value': u'export PATH=$JAVA_HOME/bin:$PATH'})
changed: [192.168.1.103] => (item={u'value': u'export PATH=$JAVA_HOME/bin:$PATH'})
ok: [192.168.1.31] => (item={u'value': u'export PATH=$JAVA_HOME/bin:$PATH'})
changed: [192.168.1.169] => (item={u'value': u'export PATH=$JAVA_HOME/bin:$PATH'})

PLAY RECAP *************************************************************************************************************************************************
192.168.1.103              : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# 
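
To confirm that the JDK and its environment variables are in place, a quick check can be run from the management machine with an ad-hoc ansible command, for example (the output depends on the JDK version you packaged):

[root@namenodemaster ansible]# ansible hadoophosts -m shell -a "su - hadoop -c 'java -version'"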

Next, write a script to automatically install the ZooKeeper cluster, as follows:

[root@namenodemaster ansible]# vim zk.yml 

- hosts: hadoophosts
  remote_user: root
  roles:
  - roles
  tasks:
   - name: mkdir directory for bigdata data
     file: dest={{BigdataDir}} mode=0755 state=directory
   - name: install zookeeper
     unarchive: src={{AnsibleDir}}/roles/files/zookeeper.tar.gz dest={{BigdataDir}}
   - name: install configuration file for zookeeper
     template: src={{AnsibleDir}}/roles/templates/zoo.cfg.j2 dest={{BigdataDir}}/zookeeper/conf/zoo.cfg
   - name: create data and log directory
     file: dest={{BigdataDir}}/zookeeper/{{item}} mode=0755 state=directory
     with_items:
     - dataLogDir
     - data
   - name: add myid file
     shell: echo {{ myid }} > {{BigdataDir}}/zookeeper/data/myid
   - name: chown hadoop for zk directory
     file: dest={{BigdataDir}}/zookeeper owner=hadoop group=hadoop state=directory recurse=yes

This script references a template file, zoo.cfg.j2, located under /etc/ansible/roles/templates/ on the management machine. Its content is as follows:

[root@namenodemaster ansible]# vim roles/templates/zoo.cfg.j2 

tickTime=2000
initLimit=20
syncLimit=10
dataDir={{BigdataDir}}/zookeeper/data
dataLogDir={{BigdataDir}}/zookeeper/dataLogDir
clientPort=2181
quorumListenOnAllIPs=true
server.1={{zk1_hostname}}:2888:3888
server.2={{zk2_hostname}}:2888:3888
server.3={{zk3_hostname}}:2888:3888

This template file also references several role variables: BigdataDir, zk1_hostname, zk2_hostname, and zk3_hostname. They are defined in the main.yml file under the vars subfolder of the roles folder.

Name the playbook script zk.yml, and then execute the following command to complete the automatic installation and configuration of ZooKeeper:

[root@namenodemaster ansible]# ansible-playbook zk.yml 

PLAY [hadoophosts] *****************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.41]
ok: [192.168.1.169]
ok: [192.168.1.103]
ok: [192.168.1.70]

TASK [mkdir directory for bigdata data] ********************************************************************************************************************
ok: [192.168.1.41]
ok: [192.168.1.169]
ok: [192.168.1.103]
ok: [192.168.1.70]
ok: [192.168.1.31]

TASK [install zookeeper] ***********************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.103]
ok: [192.168.1.70]
ok: [192.168.1.41]
ok: [192.168.1.169]

TASK [install configuration file for zookeeper] ************************************************************************************************************
changed: [192.168.1.70]
changed: [192.168.1.103]
changed: [192.168.1.41]
changed: [192.168.1.31]
changed: [192.168.1.169]

TASK [create data and log directory] ***********************************************************************************************************************
changed: [192.168.1.41] => (item=dataLogDir)
changed: [192.168.1.70] => (item=dataLogDir)
changed: [192.168.1.31] => (item=dataLogDir)
changed: [192.168.1.103] => (item=dataLogDir)
changed: [192.168.1.169] => (item=dataLogDir)
changed: [192.168.1.41] => (item=data)
changed: [192.168.1.70] => (item=data)
changed: [192.168.1.103] => (item=data)
changed: [192.168.1.31] => (item=data)
changed: [192.168.1.169] => (item=data)

TASK [add myid file] ***************************************************************************************************************************************
fatal: [192.168.1.31]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'myid' is undefined\n\nThe error appears to be in '/etc/ansible/zk.yml': line 17, column 6, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n     - data\n   - name: add myid file\n     ^ here\n"}
fatal: [192.168.1.41]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'myid' is undefined\n\nThe error appears to be in '/etc/ansible/zk.yml': line 17, column 6, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n     - data\n   - name: add myid file\n     ^ here\n"}
changed: [192.168.1.103]
changed: [192.168.1.70]
changed: [192.168.1.169]

TASK [chown hadoop for zk directory] ***********************************************************************************************************************
changed: [192.168.1.70]
changed: [192.168.1.103]
changed: [192.168.1.169]

PLAY RECAP *************************************************************************************************************************************************
192.168.1.103              : ok=7    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=7    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=5    changed=2    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=5    changed=2    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=7    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# 
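
Note that the "add myid file" task fails on 192.168.1.31 and 192.168.1.41. This is expected: the myid variable is defined only for the three slave hosts in the inventory, and only those three hosts run ZooKeeper, so the two failures can be ignored. After the playbook finishes, the ZooKeeper cluster can be started and checked on slave001–slave003 roughly as follows (zkServer.sh is the standard ZooKeeper control script; run it as the hadoop user on each of the three nodes):

[hadoop@slave001 ~]$ /opt/bigdata/zookeeper/bin/zkServer.sh start
[hadoop@slave001 ~]$ /opt/bigdata/zookeeper/bin/zkServer.sh status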

Finally, write the automated Hadoop installation script. The downloaded Hadoop binary package is prepared in advance and compressed as hadoop.tar.gz, and is then automatically copied from the management machine to every node of the cluster. The content of the playbook file is as follows:

[root@namenodemaster ansible]# vim hadoop.yml

- hosts: hadoophosts
  remote_user: root
  roles:
  - roles
  tasks:
   - name: create hadoop user
     user: name=hadoop state=present
   - name: mkdir directory for bigdata directory
     file: dest={{BigdataDir}} mode=0755 state=directory
   - name: mkdir directory for bigdata configfiles
     file: dest={{hadoopconfigfile}} mode=0755 state=directory
   - name: install hadoop
     unarchive: src={{AnsibleDir}}/roles/files/hadoop.tar.gz dest={{BigdataDir}}
   - name: chown hadoop configfiles directory
     file: dest={{BigdataDir}}/hadoop owner=hadoop group=hadoop state=directory
   - name: install configuration file for hadoop
     unarchive: src={{AnsibleDir}}/roles/files/conf.tar.gz dest={{hadoopconfigfile}}
   - name: chown hadoop configfiles directory
     file: dest={{hadoopconfigfile}}/conf owner=hadoop group=hadoop state=directory
   - name: set hadoop env
     lineinfile: dest=/home/hadoop/.bash_profile insertafter="{{item.position}}" line="{{item.value}}" state=present
     with_items:
     - {position: EOF, value: "export HADOOP_HOME={{BigdataDir}}/hadoop/current"}
     - {position: EOF, value: "export HADOOP_MAPRED_HOME=${HADOOP_HOME}"}
     - {position: EOF, value: "export HADOOP_COMMON_HOME=${HADOOP_HOME}"}
     - {position: EOF, value: "export HADOOP_HDFS_HOME=${HADOOP_HOME}"}
     - {position: EOF, value: "export HADOOP_YARN_HOME=${HADOOP_HOME}"}
     - {position: EOF, value: "export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat"}
     - {position: EOF, value: "export CATALINA_BASE=${HTTPFS_CATALINA_HOME}"}
     - {position: EOF, value: "export HADOOP_CONF_DIR={{hadoopconfigfile}}/conf"}
     - {position: EOF, value: "export HTTPFS_CONFIG={{hadoopconfigfile}}/conf"}
     - {position: EOF, value: "export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"}
   - name: enforce env
     shell: source /home/hadoop/.bash_profile

Name this playbook script hadoop.yml, and then execute the following command to complete the automated installation and configuration of Hadoop:

[root@namenodemaster ansible]# ansible-playbook hadoop.yml 

PLAY [hadoophosts] *****************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.103]
ok: [192.168.1.169]
ok: [192.168.1.41]
ok: [192.168.1.70]

TASK [create hadoop user] **********************************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.41]
ok: [192.168.1.70]
ok: [192.168.1.103]
ok: [192.168.1.169]

TASK [mkdir directory for bigdata directory] ***************************************************************************************************************
ok: [192.168.1.103]
ok: [192.168.1.31]
ok: [192.168.1.169]
ok: [192.168.1.70]
ok: [192.168.1.41]

TASK [mkdir directory for bigdata configfiles] *************************************************************************************************************
ok: [192.168.1.41]
ok: [192.168.1.103]
ok: [192.168.1.70]
ok: [192.168.1.31]
ok: [192.168.1.169]

TASK [install hadoop] **************************************************************************************************************************************
ok: [192.168.1.103]
ok: [192.168.1.169]
ok: [192.168.1.31]
ok: [192.168.1.70]
ok: [192.168.1.41]

TASK [chown hadoop configfiles directory] ******************************************************************************************************************
ok: [192.168.1.70]
ok: [192.168.1.31]
ok: [192.168.1.169]
ok: [192.168.1.41]
ok: [192.168.1.103]

TASK [install configuration file for hadoop] ***************************************************************************************************************
ok: [192.168.1.31]
changed: [192.168.1.70]
changed: [192.168.1.41]
changed: [192.168.1.103]
changed: [192.168.1.169]

TASK [chown hadoop configfiles directory] ******************************************************************************************************************
ok: [192.168.1.31]
ok: [192.168.1.70]
ok: [192.168.1.41]
ok: [192.168.1.103]
ok: [192.168.1.169]

TASK [set hadoop env] **************************************************************************************************************************************
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_HOME=/opt/bigdata/hadoop/current'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_MAPRED_HOME=${HADOOP_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_HOME=/opt/bigdata/hadoop/current'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_HOME=/opt/bigdata/hadoop/current'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_HOME=/opt/bigdata/hadoop/current'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_HOME=/opt/bigdata/hadoop/current'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_COMMON_HOME=${HADOOP_HOME}'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_MAPRED_HOME=${HADOOP_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_MAPRED_HOME=${HADOOP_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_MAPRED_HOME=${HADOOP_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_MAPRED_HOME=${HADOOP_HOME}'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_HDFS_HOME=${HADOOP_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_COMMON_HOME=${HADOOP_HOME}'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_COMMON_HOME=${HADOOP_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_COMMON_HOME=${HADOOP_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_COMMON_HOME=${HADOOP_HOME}'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_YARN_HOME=${HADOOP_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_HDFS_HOME=${HADOOP_HOME}'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_HDFS_HOME=${HADOOP_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_HDFS_HOME=${HADOOP_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_HDFS_HOME=${HADOOP_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_YARN_HOME=${HADOOP_HOME}'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_YARN_HOME=${HADOOP_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_YARN_HOME=${HADOOP_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_YARN_HOME=${HADOOP_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export CATALINA_BASE=${HTTPFS_CATALINA_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export CATALINA_BASE=${HTTPFS_CATALINA_HOME}'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HADOOP_CONF_DIR=/etc/hadoop/conf'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export CATALINA_BASE=${HTTPFS_CATALINA_HOME}'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HADOOP_CONF_DIR=/etc/hadoop/conf'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CONFIG=/etc/hadoop/conf'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export CATALINA_BASE=${HTTPFS_CATALINA_HOME}'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export CATALINA_BASE=${HTTPFS_CATALINA_HOME}'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HADOOP_CONF_DIR=/etc/hadoop/conf'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CONFIG=/etc/hadoop/conf'})
ok: [192.168.1.31] => (item={u'position': u'EOF', u'value': u'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HADOOP_CONF_DIR=/etc/hadoop/conf'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CONFIG=/etc/hadoop/conf'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HADOOP_CONF_DIR=/etc/hadoop/conf'})
changed: [192.168.1.70] => (item={u'position': u'EOF', u'value': u'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CONFIG=/etc/hadoop/conf'})
changed: [192.168.1.103] => (item={u'position': u'EOF', u'value': u'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export HTTPFS_CONFIG=/etc/hadoop/conf'})
changed: [192.168.1.41] => (item={u'position': u'EOF', u'value': u'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'})
changed: [192.168.1.169] => (item={u'position': u'EOF', u'value': u'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'})

TASK [enforce env] *****************************************************************************************************************************************
changed: [192.168.1.31]
changed: [192.168.1.169]
changed: [192.168.1.70]
changed: [192.168.1.103]
changed: [192.168.1.41]

PLAY RECAP *************************************************************************************************************************************************
192.168.1.103              : ok=10   changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.169              : ok=10   changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.31               : ok=10   changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.41               : ok=10   changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
192.168.1.70               : ok=10   changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[root@namenodemaster ansible]# 
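
At this point the JDK, ZooKeeper, and Hadoop packages are in place on every node. A quick sanity check can be run from the management machine with ad-hoc ansible commands, for example (the version string printed depends on the Hadoop package you prepared):

[root@namenodemaster ansible]# ansible hadoophosts -m shell -a "ls /opt/bigdata"
[root@namenodemaster ansible]# ansible hadoophosts -m shell -a "su - hadoop -c 'hadoop version'"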

The complete directory structure on the ansible management machine is as follows:

[root@namenodemaster ansible]# pwd
/etc/ansible
[root@namenodemaster ansible]# tree
.
├── adduser.yml
├── ansible.cfg
├── hadoop.yml
├── hostname.yml
├── hosts
├── hosts.yml
├── jdk.yml
├── os.yml
├── roles
│   ├── files
│   │   ├── conf.tar.gz
│   │   ├── hadoop.tar.gz
│   │   ├── jdk.tar.gz
│   │   └── zookeeper.tar.gz
│   ├── templates
│   │   ├── authorized_keys.j2
│   │   ├── hosts.j2
│   │   └── zoo.cfg.j2
│   └── vars
│       └── main.yml
├── sshk.yml
└── zk.yml

4 directories, 18 files
[root@namenodemaster ansible]#

The configuration information used by these playbooks is kept in the roles folder shown above.
