1. Prerequisites:
1. Set the Linux hostname
2. Set the IP address
3. Map hostnames to IP addresses in /etc/hosts
4. Turn off the firewall
5. Configure passwordless SSH login
6. Install the JDK and configure the environment variables
7. Make sure the clocks of the cluster nodes are synchronized
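As a sketch of prerequisite 3, the hostname mappings can be appended on every node like this. The IP addresses are hypothetical placeholders, and a temp file stands in for /etc/hosts so the snippet is safe to dry-run:

```shell
# Sketch of prerequisite 3. The IPs are placeholders; a temp file
# stands in for /etc/hosts so this can be run without root.
HOSTS_FILE="$(mktemp)"
cat >> "$HOSTS_FILE" <<'EOF'
192.168.1.101   node01
192.168.1.102   node02
192.168.1.103   node03
EOF
# Every node in the cluster should carry the same mappings.
mapped=$(grep -c 'node0' "$HOSTS_FILE")
echo "$mapped mappings added"
rm -f "$HOSTS_FILE"
```

On the real machines, replace the temp file with /etc/hosts and use your actual addresses.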
2. Cluster deployment plan (3 nodes)
server01  namenode  resourcemanager  zkfc  nodemanager  datanode  zookeeper  journalnode
server02  namenode  resourcemanager  zkfc  nodemanager  datanode  zookeeper  journalnode
server03  datanode  nodemanager  zookeeper  journalnode
3. Installation steps:
1. Install and configure the ZooKeeper cluster
2. Install and configure Hadoop
2.1 Extract the tarball
tar -zxvf hadoop-2.6.4.tar.gz -C /home/hadoop/app/
2.2 Configure HDFS (in hadoop2.0 all configuration files live under $HADOOP_HOME/etc/hadoop)
# Add hadoop to the environment variables
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_55
export HADOOP_HOME=/hadoop/hadoop-2.6.4
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
# All hadoop2.0 configuration files are under $HADOOP_HOME/etc/hadoop
cd /home/hadoop/app/hadoop-2.6.4/etc/hadoop
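A quick way to sanity-check the /etc/profile edits is to re-create the exports in a subshell and confirm the Hadoop bin directory landed on PATH (paths taken from the exports above):

```shell
# Re-create the exports from /etc/profile and verify PATH picks up $HADOOP_HOME/bin.
JAVA_HOME=/usr/java/jdk1.7.0_55
HADOOP_HOME=/hadoop/hadoop-2.6.4
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) path_ok=yes ;;
  *)                      path_ok=no  ;;
esac
echo "hadoop on PATH: $path_ok"
```

On the cluster, `source /etc/profile` and then `command -v hadoop` gives the same answer.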
2.2.1 Modify hadoop-env.sh
export JAVA_HOME=/export/servers/jdk1.8.0_65
###############################################################################
2.2.2 Modify core-site.xml
<configuration>
<!-- Specify the cluster name here; this value comes from the hdfs-site.xml configuration -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
<!-- The default base directory where the NameNode, DataNode, JournalNode and other daemons store their data -->
<property>
<name>hadoop.tmp.dir</name>
<value>/export/servers/hadoop-2.6.0-cdh5.14.0/HAhadoopDatas/tmp</value>
</property>
<!-- Addresses and ports of the ZooKeeper cluster; the node count must be odd and no fewer than three -->
<property>
<name>ha.zookeeper.quorum</name>
<value>amdha01:2181,amdha02:2181,node03:2181</value>
</property>
</configuration>
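A typo in these XML files is easy to make and hard to spot at startup time; a crude but useful check is to count opening and closing property tags. The sketch below runs against a minimal stand-in file rather than the real config:

```shell
# Crude well-formedness check: <property> open/close counts must match.
# Uses a minimal stand-in file; point CONF at the real core-site.xml on the cluster.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
</configuration>
EOF
opens=$(grep -c '<property>' "$CONF")
closes=$(grep -c '</property>' "$CONF")
rm -f "$CONF"
if [ "$opens" -eq "$closes" ]; then balance=balanced; else balance=unbalanced; fi
echo "$opens/$closes -> $balance"
```

This only catches mismatched property tags, not every XML error, but it costs nothing to run before starting the daemons.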
###############################################################################
2.2.3 Modify hdfs-site.xml
<configuration>
<!-- Specify the nameservice of this HDFS cluster as cluster1; it must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<!-- cluster1 has two NameNodes: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.cluster1.nn1</name>
<value>amdha01:8020</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.cluster1.nn1</name>
<value>amdha01:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.cluster1.nn2</name>
<value>amdha02:8020</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.cluster1.nn2</name>
<value>amdha02:50070</value>
</property>
<!-- Where the NameNode stores its edits metadata on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node01:8485;node02:8485;node03:8485/cluster1</value>
</property>
<!-- Where each JournalNode stores its data on the local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/export/servers/hadoop-2.6.0-cdh5.14.0/journaldata</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- The class clients use to find the active NameNode and fail over when the cluster has trouble -->
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; list multiple mechanisms one per line, separated by newlines -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
</value>
</property>
<!-- The sshfence mechanism requires passwordless SSH login -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Timeout for the sshfence mechanism -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
###############################################################################
2.2.4 Modify mapred-site.xml
<configuration>
<!-- Specify YARN as the MapReduce framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
###############################################################################
2.2.5 Modify yarn-site.xml
<configuration>
<!-- Enable ResourceManager high availability -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Specify the RM cluster id -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<!-- Specify the logical names of the RMs -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Specify the address of each RM -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node01</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>node02</value>
</property>
<!-- Specify the zk cluster addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>node01:2181,node02:2181,node03:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
2.2.6 Modify slaves
node01
node02
node03
Copy the configured software to all the other nodes
scp -r hadoop-2.6.0-cdh5.14.0 amdha02:$PWD
scp -r hadoop-2.6.0-cdh5.14.0 node03:$PWD
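With more target nodes, the copy step above can be written as a loop. In this sketch echo stands in for scp so it runs without a cluster; the hostnames are the ones used above:

```shell
# Loop form of the copy step. Swap echo for the real command on the cluster:
#   scp -r hadoop-2.6.0-cdh5.14.0 "$host:$PWD"
copied=0
for host in amdha02 node03; do
  echo "copy hadoop-2.6.0-cdh5.14.0 -> $host"
  copied=$((copied + 1))
done
echo "$copied nodes"
```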
2.2.7 Configure passwordless login
# First configure passwordless login from node01 to node01, node02 and node03
# Generate a key pair on node01
ssh-keygen
# Copy the public key to the other nodes, including node01 itself
ssh-copy-id amdha01
ssh-copy-id amdha02
ssh-copy-id node03
# Note: passwordless SSH must also be configured between the two NameNodes, because the standby needs to SSH into the active when fencing it
# Generate a key pair on node02
ssh-keygen
# Copy the public key to node01
ssh-copy-id node01
### Note: strictly follow the steps below!
2.5 Start the zookeeper cluster (start zk on amdha01, amdha02 and node03 respectively)
bin/zkServer.sh start
# Check status: one leader, two followers
bin/zkServer.sh status
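The expected status result can be checked mechanically. The sketch below hardcodes a sample of the three Mode lines that zkServer.sh status reports, then counts the roles:

```shell
# Count leader/follower roles from zkServer.sh status output.
# The three mode lines are a hardcoded sample; on the cluster, collect them with
# something like: for h in amdha01 amdha02 node03; do ssh "$h" zkServer.sh status; done
modes="follower
leader
follower"
leaders=$(printf '%s\n' "$modes" | grep -c '^leader$')
followers=$(printf '%s\n' "$modes" | grep -c '^follower$')
echo "$leaders leader, $followers followers"
```

Anything other than exactly one leader means the quorum is not healthy yet.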
2.6 Manually start the journalnodes (execute on node01, node02 and node03 respectively)
hadoop-daemon.sh start journalnode
# Run jps to check: node01, node02 and node03 should each show an extra JournalNode process
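The jps check can be scripted; this sketch greps a hardcoded sample listing (on a real node, pipe the output of jps straight into grep):

```shell
# Check a jps listing for the JournalNode process.
# jps_output is a hardcoded sample; on a node use: jps_output=$(jps)
jps_output="2901 QuorumPeerMain
3120 JournalNode
3245 Jps"
if printf '%s\n' "$jps_output" | grep -q 'JournalNode'; then
  jn_state=running
else
  jn_state=missing
fi
echo "JournalNode: $jn_state"
```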
2.7 Format the NameNode
# Execute on node01:
hdfs namenode -format
# Formatting generates the initial HDFS files in the hadoop.tmp.dir directory configured in core-site.xml.
# Copy everything under hadoop.tmp.dir to the machine hosting the other NameNode:
scp -r tmp/ node02:/home/hadoop/app/hadoop-2.6.4/
## Alternatively (recommended): hdfs namenode -bootstrapStandby
2.8 Format ZKFC (can be executed on the active NameNode)
hdfs zkfc -formatZK
2.9 Start HDFS (execute on node01)
start-dfs.sh
2.10 Start YARN
start-yarn.sh
# The standby ResourceManager must also be started manually on the standby node
yarn-daemon.sh start resourcemanager
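Once both ResourceManagers are up, their HA states can be queried with `yarn rmadmin -getServiceState rm1` (expect active) and `rm2` (expect standby). The sketch below hardcodes the response string, since the real command needs a live cluster:

```shell
# Validate the state string returned by yarn rmadmin.
# On the cluster:  state=$(yarn rmadmin -getServiceState rm1)
# Here the response is hardcoded so the check itself can be exercised.
state="active"
if [ "$state" = "active" ] || [ "$state" = "standby" ]; then
  echo "rm1 reports: $state"
else
  echo "unexpected state: $state"
fi
```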
At this point the hadoop-2.6.4 configuration is complete, and you can verify it in a browser:
http://node01:50070
NameNode 'node01:8020' (active)
http://node02:50070
NameNode 'node02:8020' (standby)
Verify HDFS HA
First upload a file to HDFS
hadoop fs -put /etc/profile /profile
hadoop fs -ls /
Then kill the active NameNode
kill -9 <pid of NN>
Access via browser: http://node02:50070
NameNode 'node02:8020' (active)
The NameNode on node02 has now become active
Run:
hadoop fs -ls /
-rw-r--r--   3 root supergroup       1926 2014-02-06 15:36 /profile
The file uploaded earlier is still there!
Manually start the NameNode that was killed
hadoop-daemon.sh start namenode
Access via browser: http://node01:50070
NameNode 'node01:8020' (standby)
Verify YARN:
Run the WordCount demo program shipped with Hadoop:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount /profile /out
OK, done!
Some commands for checking cluster state:
hdfs dfsadmin -report                      # show status information for each HDFS node
bin/hdfs haadmin -getServiceState nn1      # get the HA state of one NameNode
sbin/hadoop-daemon.sh start namenode       # start a single NameNode process
./hadoop-daemon.sh start zkfc              # start a single zkfc process