If the cluster's resources are no longer sufficient, it can be expanded dynamically without shutting it down. The specific steps are as follows:
192.168.111.11 lyy1 ---master node
192.168.111.12 lyy2
192.168.111.13 lyy3
192.168.111.14 lyy4
Added:
192.168.111.15 lyy5
192.168.111.16 lyy6
1. Clone two virtual machines from the lyy1 node so that the installed software and configuration are identical, then modify the IP address and hostname on each clone.
(This cluster is a virtual cluster based on Proxmox, which makes it easy to clone and migrate virtual machines. For a physical cluster, you can copy the master node's image onto the new nodes.)
vim /etc/network/interfaces
vim /etc/hostname
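For example, on the node that will become lyy5 the two files might end up as follows (a sketch only; the interface name ens18, netmask, and gateway are assumptions that must match your own network):
# /etc/network/interfaces
auto ens18
iface ens18 inet static
    address 192.168.111.15
    netmask 255.255.255.0
    gateway 192.168.111.1
# /etc/hostname
lyy5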
2. Edit /etc/hosts (vim /etc/hosts) to add the IP mappings of the new nodes, then sync it to all machines with a batch command:
for i in $(seq 1 6); do echo lyy$i; scp /etc/hosts root@lyy$i:/etc/;done
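After the edit, /etc/hosts on every machine should contain all six mappings:
192.168.111.11 lyy1
192.168.111.12 lyy2
192.168.111.13 lyy3
192.168.111.14 lyy4
192.168.111.15 lyy5
192.168.111.16 lyy6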
At the same time, the new hostnames must be added to Hadoop's workers file, Spark's slaves file, and HBase's regionservers file, and the updated files synced to all machines (an example of the resulting file contents follows these commands):
for i in $(seq 1 6); do echo lyy$i; scp /opt/hadoop-3.0.0/etc/hadoop/workers root@lyy$i:/opt/hadoop-3.0.0/etc/hadoop; done
for i in $(seq 2 6); do echo lyy$i; scp /opt/hbase-1.2.4/conf/regionservers root@lyy$i:/opt/hbase-1.2.4/conf; done
for i in $(seq 2 6); do echo lyy$i; scp /opt/spark-2.2.0-bin-hadoop2.7/conf/slaves root@lyy$i:/opt/spark-2.2.0-bin-hadoop2.7/conf; done
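After editing, each of these files is simply a list of worker hostnames, one per line, now including the new nodes (whether lyy1 itself is listed depends on whether the master also runs worker processes in this cluster), e.g.:
lyy2
lyy3
lyy4
lyy5
lyy6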
Also sync the hbase-site.xml configuration file, and copy it into the Spark and Hadoop configuration directories on every node:
for i in $(seq 2 6); do echo lyy$i; scp /opt/hbase-1.2.4/conf/hbase-site.xml root@lyy$i:/opt/hbase-1.2.4/conf; done
for i in $(seq 1 6); do echo lyy$i; ssh lyy$i "cp /opt/hbase-1.2.4/conf/hbase-site.xml /opt/spark-2.2.0-bin-hadoop2.7/conf && cp /opt/hbase-1.2.4/conf/hbase-site.xml /opt/hadoop-3.0.0/etc/hadoop";done
Note: You can directly start the following processes on the newly added nodes without restarting the cluster:
3. Add DataNode nodes to Hadoop
hadoop-daemon.sh start datanode      # starts the DataNode process
yarn-daemon.sh start nodemanager     # starts the NodeManager process
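The new nodes are not yet running any services, so these commands must be executed on lyy5 and lyy6 themselves. A sketch of doing this from the master over ssh, using the full paths under /opt (passwordless ssh is assumed, as the scp loops above already imply):
for i in 5 6; do ssh lyy$i "/opt/hadoop-3.0.0/sbin/hadoop-daemon.sh start datanode && /opt/hadoop-3.0.0/sbin/yarn-daemon.sh start nodemanager"; done
for i in 5 6; do echo lyy$i; ssh lyy$i jps; done    # verify that DataNode and NodeManager appear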
4. Add Worker nodes to Spark
start-slave.sh spark://lyy1:7077     # starts the Worker process
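The same over-ssh approach works for the Spark Worker; start-slave.sh lives in Spark's sbin directory and takes the master URL spark://lyy1:7077:
for i in 5 6; do ssh lyy$i "/opt/spark-2.2.0-bin-hadoop2.7/sbin/start-slave.sh spark://lyy1:7077"; done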
5. Add RegionServers to HBase
hbase-daemon.sh start regionserver   # starts the HRegionServer process
hbase-daemon.sh start zookeeper      # starts the HQuorumPeer process
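Likewise, a sketch for starting the RegionServers on the new nodes from the master over ssh (the zookeeper command is only needed if the new nodes are also meant to join the HQuorumPeer ensemble):
for i in 5 6; do ssh lyy$i "/opt/hbase-1.2.4/bin/hbase-daemon.sh start regionserver"; done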
Enter status in the hbase shell to view the cluster status.
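The same check can be run non-interactively, for example:
echo "status" | hbase shell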
6. Load Balancing
If the cluster is not rebalanced, new data will be written mainly to the new nodes, which reduces efficiency.
View hdfs node status:
hdfs dfsadmin -report
Then run the balancer with a bandwidth limit and a usage threshold:
hdfs dfsadmin -setBalancerBandwidth 104857600
# Limits the bandwidth for copying data between nodes; the default is 1 MB/s (1048576 bytes/s), raised here to 100 MB/s (104857600 bytes/s).
start-balancer.sh -threshold 1
# If a DataNode's disk usage is more than 1% above the average, its blocks are moved to DataNodes below the average, i.e. the usage difference between nodes does not exceed 1%.
Or simply start and stop the balancer with the default settings:
start-balancer.sh
stop-balancer.sh
After load balancing, the hard disk usage of each node tends to even out.
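One way to confirm this is to look at the per-node usage fields in the report, for example:
hdfs dfsadmin -report | grep "DFS Used%"    # prints the DFS Used% line for each DataNode (plus the cluster total)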
In addition, HBase also needs load balancing:
Enter in hbase shell:
balance_switch true
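balance_switch true only enables HBase's balancer; a balancing run can also be triggered and checked from the same shell, for example:
balancer            # triggers an immediate region balancing run
balancer_enabled    # shows whether the balancer is currently enabled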
At this point, the node expansion is complete and the cluster now has 6 nodes. You can view them on the monitoring web pages of Hadoop, Spark, and HBase respectively.