Overview: configuring an HA (high-availability) HDFS cluster with QJM (Quorum Journal Manager).
2. Using QJM to share edit data between NameNodes
1. Edit the hdfs-site.xml file
1.1 dfs.nameservices — the logical name of the new nameservice
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
1.2 dfs.ha.namenodes.[nameservice ID] — unique identifiers for the NameNodes
Each nameservice supports at most two NameNodes.
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
1.3 dfs.namenode.rpc-address.[nameservice ID].[name node ID]
The fully-qualified RPC address each NameNode listens on:
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>
1.4 dfs.namenode.http-address.[nameservice ID].[name node ID]
The fully-qualified HTTP address each NameNode listens on:
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
Note: if Hadoop's security features are enabled, also set https-address for each NameNode in the same way.
1.5 dfs.namenode.shared.edits.dir — the URI of the JournalNode group to which the NameNodes write and from which they read edits
qjournal://host1:port1;host2:port2;host3:port3/journalId
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
</property>
1.6 dfs.client.failover.proxy.provider.[nameservice ID] — the Java class HDFS clients use to locate the active NameNode
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
1.7 dfs.ha.fencing.methods — scripts or Java classes run to fence the formerly active NameNode during a failover
QJM allows only one NameNode at a time to act as the writer to the JournalNodes.
sshfence requires dfs.ha.fencing.ssh.private-key-files to be configured:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
sshfence optionally accepts a username and SSH port, and dfs.ha.fencing.ssh.connect-timeout bounds the SSH connection attempt (in milliseconds):
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence([[username][:port]])</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
shell: run an arbitrary shell command to fence the active NameNode.
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
</property>
The command may reference $target_* variables, which are substituted before it runs:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
</property>
$target_host          — hostname of the node to be fenced
$target_port          — IPC port of the node to be fenced
$target_address       — the above two, combined as host:port
$target_nameserviceid — the nameservice ID of the NN to be fenced
$target_namenodeid    — the namenode ID of the NN to be fenced
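As a sketch, a custom script plugged into the shell(...) fencing method could consume these variables as positional arguments. The function name and the echo-only body below are illustrative assumptions, not part of Hadoop; a real script would actually kill or isolate the old active NameNode.

```shell
# Hypothetical skeleton for a shell(...) fencing command. Hadoop expands the
# $target_* variables before running the command; the exit status reports
# whether fencing succeeded (0) or the next configured method should be tried.
fence_target() {
  host="$1"
  port="$2"
  # A real script would ssh to "$host" and kill the NameNode JVM, or
  # power-cycle the machine; here we only report the intended action.
  echo "fencing ${host}:${port}"
}

# Configured as: shell(/path/to/fence.sh $target_host $target_port)
fence_target machine1.example.com 8020
```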
1.8 fs.defaultFS — the default filesystem URI used by HDFS clients; set it to the logical nameservice in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
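With the default filesystem pointing at the logical nameservice, ordinary client commands go through the failover proxy provider and reach whichever NameNode is active. For example (run against a live cluster):

```shell
# Clients can name the nameservice explicitly...
hdfs dfs -ls hdfs://mycluster/
# ...or rely on fs.defaultFS and use plain paths:
hdfs dfs -ls /
```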
1.9 dfs.journalnode.edits.dir — the local path where each JournalNode daemon stores edits and other state:
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/path/to/journal/node/local/data</value>
</property>
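Before the NameNodes can use the shared edits directory, the JournalNode daemons must be running. A typical first-time bring-up, assuming a standard Hadoop sbin layout on each host, looks roughly like:

```shell
# On each JournalNode machine (node1..node3 in the example above):
hadoop-daemon.sh start journalnode

# On the first NameNode (nn1): format and start it.
hdfs namenode -format
hadoop-daemon.sh start namenode

# On the second NameNode (nn2): copy nn1's metadata, then start.
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
```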
2. Configuring automatic failover
In hdfs-site.xml:
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
In core-site.xml:
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
Initialize the HA state in ZooKeeper:
$ hdfs zkfc -formatZK
Running start-dfs.sh will then automatically start a ZKFC daemon on each NameNode host.
To start it manually:
$ hadoop-daemon.sh start zkfc
To connect to ZooKeeper securely, add to core-site.xml:
<property>
  <name>ha.zookeeper.auth</name>
  <value>@/path/to/zk-auth.txt</value>
</property>
<property>
  <name>ha.zookeeper.acl</name>
  <value>@/path/to/zk-acl.txt</value>
</property>
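The two files hold the ZooKeeper auth credentials and the matching ACL. As a sketch (the hdfs-zkfcs username and mypassword value are placeholders, and the classpath depends on your ZooKeeper install), zk-auth.txt carries a plaintext digest credential while zk-acl.txt carries its hashed form:

```shell
# zk-auth.txt: plaintext digest credentials the ZKFC authenticates with, e.g.:
#   digest:hdfs-zkfcs:mypassword
# Generate the hashed form with ZooKeeper's DigestAuthenticationProvider
# (classpath is an assumption; point it at your ZooKeeper jars):
java -cp "$ZK_HOME/lib/*" \
  org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
# The tool prints "user:password->user:hash"; put the hashed identity into
# zk-acl.txt, granting it all permissions, e.g.:
#   digest:hdfs-zkfcs:<hash-from-above>:rwcda
```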