Hadoop 3 installation and configuration

 

Three virtual machines: node1 is the master node and also serves as a data node; node2 and node3 are data nodes.

Basic setup
1. Time synchronization (NTP), networking, and the hosts file
Time synchronization (against an NTP time server):
yum install ntpdate
ntpdate -u 3.tw.pool.ntp.org

Optionally set a static IP:
vi /etc/sysconfig/network-scripts/ifcfg-ens33
Add IPADDR=192.168.3.4

Change the hostname:
hostnamectl set-hostname node1.test.com
Configure the hosts file:
vim /etc/hosts
192.168.3.4 node1
192.168.3.5 node2
192.168.3.6 node3
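Once the hosts entries are in place on every machine, a quick sanity check is to confirm each node name actually resolves. A minimal sketch, assuming the three node names configured above (getent consults /etc/hosts as well as DNS):

```shell
# Each node name should resolve to the address configured above; an
# unresolved name is reported explicitly instead of failing silently.
for h in node1 node2 node3; do
  getent hosts "$h" || echo "$h does not resolve"
done
```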

Restart network services
service network restart

2. Passwordless SSH (without it, you will be prompted for a password each time a service starts; set up passwordless SSH from the NameNode to the DataNodes), from node1 to the other nodes.
Passwordless setup: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Setup_passphraseless_ssh
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cd /root/.ssh
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys

Copy the key to the other nodes:
[root@localhost .ssh]# scp id_dsa.pub node2:/tmp/
[root@localhost .ssh]# scp id_dsa.pub node3:/tmp/
Or run the following on node1 to get passwordless access to node2 and node3:
cat ~/.ssh/id_dsa.pub | ssh root@node2 "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_dsa.pub | ssh root@node3 "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys"
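After copying the keys, it is worth verifying that the setup actually works before starting any Hadoop services. A small sketch, assuming the node names above:

```shell
# BatchMode makes ssh fail instead of prompting, so a misconfigured key
# shows up immediately as an error rather than a password prompt.
for h in node2 node3; do
  ssh -o BatchMode=yes "root@$h" hostname
done
```

Each command should print the remote hostname without asking for a password.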

3. JDK
Copy the JDK to node2 and node3:
[root@localhost tools]# scp jdk-8u211-linux-x64.rpm node2:`pwd`
[root@localhost tools]# scp jdk-8u211-linux-x64.rpm node3:`pwd`
Install it (XShell's "send input to all sessions" feature lets you install on every node at once):
rpm -ivh /usr/tools/jdk*

Set JAVA_HOME (this can also go in ~/.bashrc):
vim /etc/profile
Add JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.8.0_211-amd64
export PATH=$PATH:$JAVA_HOME/bin

Apply the configuration:
source /etc/profile
Verify the configuration:
java -version
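Since the JDK was installed on all three nodes, it is also worth confirming that every machine reports the same version. A sketch, assuming the passwordless SSH from step 2 is already in place:

```shell
# java -version prints to stderr, hence the 2>&1 redirection.
for h in node1 node2 node3; do
  echo "== $h =="
  ssh "root@$h" 'java -version' 2>&1 | head -2
done
```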


4. Upload the installation archive, extract it, and configure the path
[root@localhost tools]# tar zxvf hadoop-3.2.0.tar.gz -C /home
Configure the Hadoop path (in /etc/profile):
export HADOOP_HOME=/home/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the configuration:
source / etc / profile

5. Edit the configuration files
[root @ localhost hadoop-3.2.0] # cd etc / hadoop

vim hadoop-env.sh
(inside vim you can run: !echo $JAVA_HOME)
JAVA_HOME=/usr/java/jdk1.8.0_211-amd64
HDFS_DATANODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_NAMENODE_USER=root
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root


vim core-site.xml
<configuration>
<!-- HDFS temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hdfs/tmp</value>
</property>
<!-- default HDFS address and access port -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:9820</value>
</property>
</configuration>


hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- whether to enable HDFS permission checking; false disables it -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hdfs/data</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node2:9868</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
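The directories named in core-site.xml and hdfs-site.xml (hadoop.tmp.dir, dfs.namenode.name.dir, dfs.datanode.data.dir) are not necessarily created automatically, so it is safest to create them on every node up front. A sketch, assuming the paths configured above and passwordless SSH from step 2:

```shell
# Create the HDFS storage directories on every node; the paths match
# hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir above.
for h in node1 node2 node3; do
  ssh "root@$h" 'mkdir -p /opt/hdfs/tmp /opt/hdfs/name /opt/hdfs/data'
done
```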

 

yarn-site.xml
<configuration>

<!-- Site specific YARN configuration properties -->
<!-- cluster master -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node1</value>
</property>

<!-- auxiliary service that runs on the NodeManager -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>


vim mapred-site.xml
<configuration>
<!-- local means run locally, classic means the classic MapReduce framework, yarn means the new framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
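Note: on Hadoop 3.x, MapReduce jobs submitted to YARN can fail with "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster" unless the MapReduce classpath is also set. The official single-node cluster guide adds a property like the following to mapred-site.xml (the path below assumes the /home/hadoop-3.2.0 install location used in this guide; adjust if yours differs):

```xml
<property>
<name>mapreduce.application.classpath</name>
<value>/home/hadoop-3.2.0/share/hadoop/mapreduce/*:/home/hadoop-3.2.0/share/hadoop/mapreduce/lib/*</value>
</property>
```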


Create a masters file (the NameNode master node):
node1

Edit workers (node1, although the master node, also serves as a DataNode):
node1
node2
node3

Add to start-dfs.sh and stop-dfs.sh:
HDFS_DATANODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_NAMENODE_USER=root

Add to start-yarn.sh and stop-yarn.sh:
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root


6. Sync the configuration files so the whole cluster is configured identically
Sync to the other nodes:
[root@localhost home]# scp -r hadoop-3.2.0 node2:/home
[root@localhost home]# scp -r hadoop-3.2.0 node3:/home


7. Format the NameNode:
hdfs namenode -format

8. Start the cluster with start-all.sh

9. Check the running processes
jps
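With the layout configured above (workers on all three nodes, SecondaryNameNode on node2, ResourceManager on node1), jps on each node should show roughly the daemons listed in the comments below. A sketch for checking all nodes at once, assuming passwordless SSH:

```shell
# Expected daemons, given the configuration above:
#   node1: NameNode, DataNode, ResourceManager, NodeManager
#   node2: DataNode, SecondaryNameNode, NodeManager
#   node3: DataNode, NodeManager
for h in node1 node2 node3; do
  echo "== $h =="
  ssh "root@$h" jps
done
```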

10. Access the web UIs
NameNode UI: http://192.168.3.4:9870

YARN UI: http://192.168.3.4:8088/cluster


To reformat, you must first delete the data storage directories.
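A sketch of a full re-format under the directory layout configured above; this destroys all HDFS data, so only run it on a disposable cluster:

```shell
stop-all.sh
# Remove the NameNode metadata and DataNode block storage on every node;
# the paths match hadoop.tmp.dir, dfs.namenode.name.dir and
# dfs.datanode.data.dir from the configuration above.
for h in node1 node2 node3; do
  ssh "root@$h" 'rm -rf /opt/hdfs/tmp /opt/hdfs/name /opt/hdfs/data'
done
hdfs namenode -format
start-all.sh
```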


If a DataNode starts normally but does not show up in the UI,
add the following configuration to hdfs-site.xml:
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>

 

Origin www.cnblogs.com/pashanhu/p/10950144.html