Environment preparation
Install the JDK: download JDK 8 from the official site and configure the environment variables.
Configuration file location: /etc/profile
#java_path
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
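After saving /etc/profile, reload it in the current shell and confirm the JDK is picked up:
source /etc/profile
java -version      # should report a 1.8.0 build
echo $JAVA_HOME    # should print the JDK path configured above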
Host configuration (password-free SSH login is required)
Password-free login setup:
# Generate an RSA private/public key pair:
ssh-keygen -t rsa
# Append the public key to the authorized keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Set the authorized_keys file to 600 permissions
chmod 600 ~/.ssh/authorized_keys
After this you can log in without a password.
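A quick local check (assuming sshd is running) is to SSH to the machine itself; it should not prompt for a password:
ssh localhost exit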
Hostname | IP |
---|---|
master | 192.168.10.194 |
node1 | 192.168.10.195 |
node2 | 192.168.10.196 |
Set up hostname resolution
# /etc/hosts
192.168.10.194 master
192.168.10.195 node1
192.168.10.196 node2
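For the start scripts on master to reach the workers over SSH, the public key generated above must also be authorized on node1 and node2. One way to push it (ssh-copy-id and the root account are assumptions; use whichever user will run Hadoop):
ssh-copy-id root@node1
ssh-copy-id root@node2
ssh node1 hostname    # should print "node1" without asking for a password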
Hadoop operating modes
- Stand-alone mode
- Pseudo-distributed mode
- Fully distributed mode
Fully distributed setup
Hadoop version: 2.9.0
Installation directory: /usr/local/hadoop/
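One way to put the 2.9.0 release into place (the Apache archive URL is an assumption; any mirror with the same file works):
cd /usr/local
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz
tar -zxf hadoop-2.9.0.tar.gz
mv hadoop-2.9.0 hadoop    # install directory becomes /usr/local/hadoop/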
Configure hadoop-env.sh, adding the following line (/usr/local/hadoop/etc/hadoop):
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0   # must match the JDK path set in /etc/profile
Configure core-site.xml (/usr/local/hadoop/etc/hadoop)
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop</value>
  </property>
</configuration>
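Make sure the directory named by hadoop.tmp.dir exists and is writable by the user running Hadoop on every node, since HDFS keeps its name and data directories under it:
mkdir -p /var/hadoop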
Format a new distributed filesystem (run from /usr/local/hadoop/)
bin/hadoop namenode -format
Start the NameNode (/usr/local/hadoop/sbin/)
hadoop-daemon.sh start namenode
Start the DataNode (/usr/local/hadoop/sbin/)
hadoop-daemon.sh start datanode
View the Java processes with jps:
[root@master data]# jps
4373 NameNode
4506 DataNode
4586 Jps
Access: http://master:50070
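At this point a small HDFS smoke test confirms the filesystem is usable (run from /usr/local/hadoop/; the directory name is just an example):
bin/hadoop fs -mkdir /test
bin/hadoop fs -ls /    # should list the /test directory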
Stop the daemons (/usr/local/hadoop/sbin/)
hadoop-daemon.sh stop datanode
hadoop-daemon.sh stop namenode
Configure YARN
yarn-site.xml (/usr/local/hadoop/etc/hadoop/)
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
mapred-site.xml (/usr/local/hadoop/etc/hadoop/)
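In Hadoop 2.x this file ships only as a template, so copy it before editing:
cd /usr/local/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml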
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Start and stop YARN (/usr/local/hadoop/sbin/)
start-yarn.sh
stop-yarn.sh
YARN processes:
[root@master sbin]# jps
27538 NodeManager
30134 Jps
27373 ResourceManager
Access: http://master:8088
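A quick end-to-end check is to run the bundled example job and watch it appear on the 8088 page (run from /usr/local/hadoop/; the jar name matches release 2.9.0):
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar pi 2 10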
Start a distributed cluster
Copy the Hadoop directory to the other nodes.
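A simple way to do the copy, assuming the same /usr/local/hadoop path on every node (the JDK and /etc/profile settings must already be in place on node1 and node2):
scp -r /usr/local/hadoop root@node1:/usr/local/
scp -r /usr/local/hadoop root@node2:/usr/local/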
Configure the slaves file (/usr/local/hadoop/etc/hadoop/)
[root@master hadoop]# cat slaves
master
node1
node2
Start the fully distributed cluster (/usr/local/hadoop/sbin/)
start-all.sh
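start-all.sh is a deprecated wrapper that simply calls start-dfs.sh and start-yarn.sh. To confirm the cluster is up, check the daemons on every node over the password-free SSH configured earlier (if jps is not on the non-interactive PATH, use $JAVA_HOME/bin/jps instead):
for h in master node1 node2; do echo "== $h =="; ssh $h jps; done
With the slaves file above, master should show DataNode and NodeManager alongside NameNode, SecondaryNameNode and ResourceManager, while node1 and node2 show only DataNode and NodeManager.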