Hadoop distributed cluster setup (extreme focus)
3.1 Build goals
Because of hardware limitations (Cris's machine has 16 GB of memory), the environment below is what can be built here.
A complete environment really requires at least six virtual machines; with limited resources, three will have to do.
The names of the specific components and the role of each node are not introduced here; if anything is unfamiliar, please look it up.
3.2 Construction process
Machines 101, 102, and 103 each already have Java and Hadoop installed. Machine 101 is used as the host for editing the Hadoop configuration; once it is done, the files are synchronized directly to machines 102 and 103.
① Core configuration files
Cris sets permissions first: make sure the owner and group of /opt/software and /opt/module are both cris.
If you followed the first two chapters carefully, this problem will not come up.
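A minimal sketch of the permission fix (assuming the user and group are both cris, as stated above):

```shell
# Make cris the owner and group of both directories, recursively
sudo chown -R cris:cris /opt/software /opt/module
# Verify the owner and group of each directory
ls -ld /opt/software /opt/module
```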
Modify the core configuration file core-site.xml
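A sketch of a typical Hadoop 2.x core-site.xml for this layout; the hostname hadoop101 and the install path are assumptions and should match your own machines:

```xml
<configuration>
  <!-- NameNode address; hadoop101 is an assumed hostname for machine 101 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop101:9000</value>
  </property>
  <!-- Base directory for Hadoop's working data; the path is an assumption -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
  </property>
</configuration>
```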
Modify HDFS configuration file
hadoop-env.sh
hdfs-site.xml
enter vim hdfs-site.xml and modify it as follows:
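In hadoop-env.sh it is usually enough to point JAVA_HOME at your JDK install. For hdfs-site.xml, a minimal sketch (a replication factor of 3 matches the three nodes; placing the SecondaryNameNode on machine 103 is an assumption):

```xml
<configuration>
  <!-- One replica per DataNode in this three-node cluster -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- SecondaryNameNode on machine 103; the hostname is an assumption -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop103:50090</value>
  </property>
</configuration>
```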
Modify the YARN configuration file
yarn-env.sh
yarn-site.xml
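As with HDFS, yarn-env.sh usually only needs JAVA_HOME set. A sketch of yarn-site.xml (running the ResourceManager on machine 102 is an assumption):

```xml
<configuration>
  <!-- Auxiliary shuffle service required by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- ResourceManager host; hadoop102 is an assumed hostname -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop102</value>
  </property>
</configuration>
```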
MapReduce configuration file
mapred-env.sh
mapred-site.xml
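In Hadoop 2.x, mapred-site.xml is created by copying mapred-site.xml.template; the one setting it needs here is to run MapReduce on YARN:

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```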
Configure the history server
To be able to view a program's historical runs, enter vim mapred-site.xml
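A sketch of the history-server properties added to mapred-site.xml (ports 10020 and 19888 are the Hadoop 2.x defaults; running the server on machine 101 is an assumption):

```xml
<!-- JobHistory server RPC address -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hadoop101:10020</value>
</property>
<!-- JobHistory server web UI -->
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hadoop101:19888</value>
</property>
```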
Configure log aggregation
Log aggregation means that after an application finishes, its run logs are uploaded to HDFS. The benefit: you can easily inspect the details of a program run, which is convenient for development and debugging.
Enter vim yarn-site.xml
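A sketch of the log-aggregation properties added to yarn-site.xml (the seven-day retention is an example value):

```xml
<!-- Upload container logs to HDFS when an application finishes -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<!-- Keep aggregated logs for 7 days (value is in seconds) -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
```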
② Distribute the configured Hadoop configuration files across the cluster
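A sketch of the distribution step using plain rsync (the hostnames hadoop102/hadoop103, the install path, and the cris user are assumptions; an xsync wrapper script, if you have set one up, works the same way):

```shell
# Push the configuration directory from 101 to the other two nodes
for host in hadoop102 hadoop103; do
  rsync -av /opt/module/hadoop-2.7.2/etc/hadoop/ \
    "cris@${host}:/opt/module/hadoop-2.7.2/etc/hadoop/"
done
```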
Check whether the files on 102 and 103 were synchronized successfully
102
103
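One way to spot-check the sync without logging in interactively (the hostnames and path are assumptions):

```shell
ssh hadoop102 cat /opt/module/hadoop-2.7.2/etc/hadoop/core-site.xml
ssh hadoop103 cat /opt/module/hadoop-2.7.2/etc/hadoop/core-site.xml
```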
③ Cluster single-point start
Format the NameNode
Start NameNode on 101
101, 102, 103 start DataNode respectively
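The single-point sequence sketched as commands, run from the Hadoop install directory (format exactly once, on 101 only, since formatting wipes the NameNode metadata):

```shell
# On 101: one-time format of the NameNode
bin/hdfs namenode -format
# On 101: start the NameNode daemon
sbin/hadoop-daemon.sh start namenode
# On each of 101, 102, 103: start a DataNode daemon
sbin/hadoop-daemon.sh start datanode
# On each node: confirm the expected daemons are running
jps
```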
I personally recommend starting the daemons individually first after setup is complete, so problems can be found and fixed early, and then stopping the single-point services.
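Stopping the single-point daemons uses the same script, run on the matching nodes:

```shell
sbin/hadoop-daemon.sh stop datanode   # on each of 101, 102, 103
sbin/hadoop-daemon.sh stop namenode   # on 101
```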
3.3 Start the cluster
Configure slaves
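The slaves file (etc/hadoop/slaves) lists one worker hostname per line; the hadoop101 to hadoop103 names are assumptions. Note that trailing spaces or blank lines in this file are a classic source of startup failures:

```
hadoop101
hadoop102
hadoop103
```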
Then sync the file
Start the cluster and test
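The cluster start sketched as commands (start-yarn.sh must run on the ResourceManager's node, assumed to be 102 here):

```shell
# On 101 (the NameNode host)
sbin/start-dfs.sh
# On 102 (the assumed ResourceManager host)
sbin/start-yarn.sh
# Optional smoke test: upload a file into HDFS
bin/hdfs dfs -mkdir -p /user/cris/input
bin/hdfs dfs -put etc/hadoop/core-site.xml /user/cris/input
```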
Visit the corresponding web pages. For Hadoop 2.x the usual ones are the NameNode UI on port 50070, the ResourceManager UI on port 8088, and the JobHistory UI on port 19888, each on whichever host runs that daemon.
That's it for this part; see you in the next one.