Building a Hadoop high-availability architecture
Prepare the environment
Prepare three virtual machines with JDK, Hadoop, and ZooKeeper installed, and configure passwordless SSH between them. Make sure the fully distributed Hadoop and fully distributed ZooKeeper environments are already working.
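As a rough sketch of the passwordless-SSH step (assuming the root user and the hostnames hadoop100, hadoop101, and hadoop102 used below):

```shell
# Generate an RSA key pair on each node (skip if ~/.ssh/id_rsa already exists)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# From each node, copy the public key to all three nodes (including itself)
for host in hadoop100 hadoop101 hadoop102; do
  ssh-copy-id root@"$host"
done
```

Repeat this on all three machines so every node can SSH to every other node without a password; the sshfence mechanism configured below depends on it.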
HDFS-HA cluster configuration
You can configure one virtual machine first and then copy the files to the other two, or use a tool to modify all three virtual machines at once. The configuration below is all new content and does not affect the existing fully distributed Hadoop and ZooKeeper setups.
Configure the core-site.xml file
Set the value of hadoop.tmp.dir to a directory under your actual Hadoop installation path.
<configuration>
<!-- Combine the addresses of the two NameNodes into a single cluster, hadoopHA -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoopHA</value>
</property>
<!-- Directory where Hadoop stores files generated at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/root/software/hadoop/data/tmp</value>
</property>
</configuration>
Configure hdfs-site.xml
<configuration>
<!-- Name of the fully distributed cluster -->
<property>
<name>dfs.nameservices</name>
<value>hadoopHA</value>
</property>
<!-- The NameNode nodes in the cluster: here, nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.hadoopHA</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.hadoopHA.nn1</name>
<value>hadoop100:9000</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.hadoopHA.nn2</name>
<value>hadoop102:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.hadoopHA.nn1</name>
<value>hadoop100:50070</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.hadoopHA.nn2</name>
<value>hadoop102:50070</value>
</property>
<!-- Location where the NameNode shared edits log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop100:8485;hadoop101:8485;hadoop102:8485/hadoopHA</value>
</property>
<!-- Configure fencing so that only one NameNode serves requests at any time -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- The sshfence method requires passwordless SSH login -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Declare the JournalNode storage directory -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/root/software/hadoop/data/jn</value>
</property>
<!-- Disable permission checking -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- Failover proxy provider: the class HDFS clients use to find the active NameNode and fail over automatically -->
<property>
<name>dfs.client.failover.proxy.provider.hadoopHA</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
</configuration>
Configure yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the cluster ID and the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop100</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop102</value>
</property>
<!-- Address of the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop100:2181,hadoop101:2181,hadoop102:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
Start HDFS-HA cluster
1. Start the ZooKeeper cluster
Run the following command on all three virtual machines
zkServer.sh start
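Before continuing, you can confirm the ZooKeeper ensemble is healthy by checking each node's role (the exact output wording can vary by ZooKeeper version):

```shell
# Run on each node: one should report Mode: leader, the others Mode: follower
zkServer.sh status
```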
2. Start the JournalNodes
Run the following command on all three virtual machines
Note: Hadoop 2.x uses the qjournal (Quorum Journal Manager) mechanism to share the NameNode edits log, and the JournalNode daemons store it, so start a JournalNode on each of the three nodes.
hadoop-daemon.sh start journalnode
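A quick way to verify the JournalNodes actually started:

```shell
# Each of the three nodes should list a JournalNode process
jps | grep JournalNode
```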
3. Format HDFS (first startup only)
Run the following command on the hadoop100 master node
Note: if you have run HDFS before, first delete the directory that hadoop.tmp.dir points to. After formatting completes, distribute the new tmp directory to the other two virtual machines.
hdfs namenode -format
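One way to distribute the freshly formatted metadata, assuming the hadoop.tmp.dir path configured in core-site.xml above:

```shell
# Copy the formatted NameNode metadata to the other two nodes
scp -r /root/software/hadoop/data/tmp root@hadoop101:/root/software/hadoop/data/
scp -r /root/software/hadoop/data/tmp root@hadoop102:/root/software/hadoop/data/
```

Alternatively, running `hdfs namenode -bootstrapStandby` on hadoop102 copies the metadata needed by the standby NameNode.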
4. Format ZKFC (first startup only)
Note: ZKFC (ZKFailoverController) manages failover between the two NameNodes and also relies on ZooKeeper. When the active NameNode becomes unhealthy, its ZKFC reports the state to ZooKeeper; the ZKFC on the standby NameNode detects the abnormal state, logs in to the active node via SSH, and kills the NameNode process (kill -9), then switches its own NameNode to active. This fencing guards against a half-dead active NameNode and split-brain; if the SSH fencing fails, a custom shell script can be run as a fallback to forcibly kill the active NameNode process.
Run the following command on the hadoop100 master node
hdfs zkfc -formatZK
5. Start HDFS
Run the following command on the hadoop100 master node
start-dfs.sh
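After startup, you can query which NameNode is active (nn1 and nn2 as defined in hdfs-site.xml above):

```shell
# One should print "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```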
6. Test HDFS
Check the services running on each virtual machine with the jps command:
You can also check each service's status on the web UI:
HDFS: http://hadoop100:50070 master node
http://hadoop102:50070 sub-node
Now manually kill the NameNode process on hadoop100:
When the active node fails, HDFS automatically fails over to the standby node; the configuration is complete!
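The failover test above can be scripted roughly like this (extracting the PID with awk is just one way to do it):

```shell
# On hadoop100: kill the active NameNode process
kill -9 "$(jps | awk '$2 == "NameNode" {print $1}')"

# nn2 on hadoop102 should then report "active"
hdfs haadmin -getServiceState nn2
```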
7. Start YARN
Run the following command on the hadoop100 master node
start-yarn.sh
Run the following command on the hadoop102 standby node
yarn-daemon.sh start resourcemanager
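You can also check the ResourceManager HA state from the command line (rm1 and rm2 as defined in yarn-site.xml above):

```shell
# One should print "active", the other "standby"
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```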
8. Test yarn-HA
Open hadoop102:8088 (the standby) in a browser; it should automatically redirect to the active node at hadoop100:8088, which proves that yarn-HA is working. If the redirect does not happen, the likely cause is that the local machine cannot resolve the cluster hostnames.
Note that there may be several files named hosts on the Windows C drive; check the paths to find the right one (normally C:\Windows\System32\drivers\etc\hosts).
Open it, add the cluster's IP addresses and hostnames, and the page will then redirect normally.
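For example, the entries might look like this (the IP addresses here are placeholders; use your VMs' actual addresses):

```
192.168.10.100 hadoop100
192.168.10.101 hadoop101
192.168.10.102 hadoop102
```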
The high-availability configuration ends here, thank you!