Environment preparation:
Prepare 3 virtual machines (CentOS 7): one master and two slaves. The master functions as NameNode, DataNode, ResourceManager, and NodeManager; each slave functions as a DataNode and NodeManager.
master: 10.0.83.71
slave1: 10.0.83.72
slave2: 10.0.83.73
Execute on each machine:
Turn off the firewall:
systemctl stop firewalld.service
systemctl disable firewalld.service
Set the hostname corresponding to each IP and add the mappings to the hosts file on every machine:
vi /etc/hosts
10.0.83.71 node1
10.0.83.72 node2
10.0.83.73 node3
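On CentOS 7 the hostname itself can be set with `hostnamectl`; a short sketch, run once per machine with its own name:

```shell
# Run on each machine with its own name (the master is shown here);
# use node2 on 10.0.83.72 and node3 on 10.0.83.73
hostnamectl set-hostname node1
```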
Set up passwordless SSH login among the 3 virtual machines.
On each of 71, 72, and 73, run: ssh-keygen -t rsa
Then, on each machine, copy the public key to all 3 addresses:
ssh-copy-id 10.0.83.71
ssh-copy-id 10.0.83.72
ssh-copy-id 10.0.83.73
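To confirm passwordless login actually works, a quick check from any of the three machines (the hostnames assume the hosts file above):

```shell
# Each iteration should print the remote hostname without a password prompt;
# BatchMode makes ssh fail instead of prompting if key auth is broken
for h in node1 node2 node3; do
  ssh -o BatchMode=yes "$h" hostname
done
```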
Create data storage and other required directories
mkdir -p /data/hdfs/name
mkdir -p /data/hdfs/data
mkdir -p /data/hdfs/tmp
mkdir -p /opt/
Synchronize time:
yum -y install ntp
systemctl enable ntpd
systemctl start ntpd
timedatectl set-timezone Asia/Shanghai
timedatectl set-ntp yes
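A quick way to confirm the clock is actually synchronised (`ntpq` ships with the ntp package installed above):

```shell
ntpq -p              # peer list; an asterisk marks the currently selected server
timedatectl status   # shows "NTP synchronized: yes" once in sync
```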
Execute on the master:
cd /opt/
wget https://apache.mirrors.nublue.co.uk/hadoop/common/hadoop-3.1.4/hadoop-3.1.4.tar.gz
tar zxvf hadoop-3.1.4.tar.gz
mv hadoop-3.1.4 hadoop
scp -r hadoop root@node2:/opt/
scp -r hadoop root@node3:/opt/
Create a dedicated group hadoop for the Hadoop components, and create the users hdfs, yarn, and mapred:
groupadd hadoop
useradd hdfs -g hadoop
useradd yarn -g hadoop
useradd mapred -g hadoop
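To verify the accounts landed in the right group:

```shell
id hdfs    # should report gid=...(hadoop)
id yarn
id mapred
```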
Create the related directories.
Data storage directories:
NameNode data directory: /data/hadoop/namenode
DataNode data directory: /data/hadoop/datanode
Temporary data directory: /data/hadoop/tmp
HADOOP_MAPRED_HOME:
mkdir -p /data/hadoop/namenode
mkdir -p /data/hadoop/datanode
mkdir -p /data/hadoop/tmp
chown -R hdfs:hadoop /opt/hadoop
chown -R hdfs:hadoop /data/hadoop
Create a new log directory
mkdir /var/log/hadoop
chown hdfs:hadoop /var/log/hadoop
chmod -R 770 /var/log/hadoop
Create a new PID directory
mkdir /var/run/hadoop
chown hdfs:hadoop /var/run/hadoop
chmod -R 770 /var/run/hadoop
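The `source /etc/profile` step that follows assumes Hadoop environment variables were appended to /etc/profile beforehand; the guide does not show them, so the sketch below is an assumption based on the paths used in earlier steps:

```shell
# Assumed /etc/profile additions (paths taken from earlier steps in this guide)
cat >> /etc/profile <<'EOF'
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_LOG_DIR=/var/log/hadoop
export HADOOP_PID_DIR=/var/run/hadoop
EOF
```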
source /etc/profile
3. Cluster operation test
3.1 Start the cluster
1. Format the namenode
su hdfs -c 'hdfs namenode -format'
2. Start the namenode
su hdfs -c 'hdfs --daemon start namenode'
3. Start each datanode node separately
su hdfs -c 'hdfs --daemon start datanode'
4. Start the resourcemanager
su yarn -c 'yarn --daemon start resourcemanager'
5. Start each nodemanager node separately
su yarn -c 'yarn --daemon start nodemanager'
6. Start the historyserver (mr-jobhistory-daemon.sh is deprecated in Hadoop 3)
su mapred -c 'mapred --daemon start historyserver'
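Once everything is started, `jps` (shipped with the JDK) lists the running Java daemons, which is a quick sanity check:

```shell
jps
# Expected on the master: NameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer
# Expected on node2/node3: DataNode, NodeManager
```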
3.2 WebUI access
HDFS UI http://10.0.83.71:9870/ (Hadoop 3.x moved the NameNode web UI from port 50070 to 9870)
YARN UI http://10.0.83.71:8088/
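A scriptable liveness check against the two UIs (assuming the Hadoop 3.x default NameNode HTTP port 9870):

```shell
curl -sf http://10.0.83.71:9870/ >/dev/null && echo "NameNode UI up"
curl -sf http://10.0.83.71:8088/ >/dev/null && echo "ResourceManager UI up"
```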
3.3 Stop the cluster
1. Stop the namenode
su hdfs -c 'hdfs --daemon stop namenode'
2. Stop the datanode nodes
su hdfs -c 'hdfs --daemon stop datanode'
3. Stop the resourcemanager
su yarn -c 'yarn --daemon stop resourcemanager'
4. Stop the nodemanager nodes
su yarn -c 'yarn --daemon stop nodemanager'
5. Stop the historyserver
su mapred -c 'mapred --daemon stop historyserver'
3.4 Other operating instructions
View an HDFS directory:
su hdfs -c 'hdfs dfs -ls /'
Create a new HDFS directory:
su hdfs -c 'hdfs dfs -mkdir PATH'
Change a file's owner:
su hdfs -c 'hdfs dfs -chown OWNER:GROUP PATH'
Change file permissions:
su hdfs -c 'hdfs dfs -chmod 644 PATH'
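A small end-to-end smoke test combining these operations (the paths here are arbitrary examples):

```shell
su hdfs -c 'hdfs dfs -mkdir -p /tmp/smoketest'
su hdfs -c 'hdfs dfs -put /etc/hosts /tmp/smoketest/'
su hdfs -c 'hdfs dfs -cat /tmp/smoketest/hosts'   # prints the uploaded file
su hdfs -c 'hdfs dfs -rm -r /tmp/smoketest'       # clean up
```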
WebHDFS operations
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS"
curl -i -X DELETE "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DELETE
[&recursive=<true|false>]"
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE
[&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>]
[&permission=<OCTAL>][&buffersize=<INT>][&noredirect=<true|false>]"
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETOWNER
[&owner=<USER>][&group=<GROUP>]"
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETPERMISSION
[&permission=<OCTAL>]"
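A concrete instance of the templates above, listing the HDFS root on this cluster (WebHDFS is served on the NameNode HTTP port, 9870 by default in Hadoop 3.x; user.name selects the acting user):

```shell
curl -i "http://node1:9870/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs"
```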