Build a Hadoop stand-alone pseudo-distributed environment
This tutorial uses Ubuntu 16.04 as the system environment, running in VMware Workstation 12 as the virtual machine.
Building a Hadoop single-node pseudo-distributed environment on CentOS or RedHat is not covered this time.
The Hadoop version used is 2.6.4.
Okay, now let's begin our Hadoop environment setup journey!!!
First step:
Open the Ubuntu terminal. If there is no hadoop user yet, create one with the command:
sudo useradd -m hadoop -s /bin/bash
Here is a brief review of users and groups:

Linux is a multi-user system, and users and groups are the core elements it uses to allocate resources. A group is a container to which you can add users, granting them the group's permissions.

User types:
- Administrator: UID 0
- System users: UID 1-499, used to run background programs (daemons)
- Ordinary users: UID 500+, for interactive logins

Group categories:
- Administrator group: GID 0
- System groups: GID 1-499
- User groups: GID 500+

Related files:
- /etc/passwd: user account information
- /etc/shadow: user passwords and related account settings
- /etc/group: group account information
- /etc/gshadow: group password information
After creating the user, run sudo cat /etc/group to confirm it.
Use sudo passwd hadoop to set a password for the hadoop user.
sudo adduser hadoop sudo grants administrator privileges to the hadoop user.
Finally, run the reboot command to restart Ubuntu and log in as the hadoop user.
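The user-creation steps above can be summarized as the following command sequence (a sketch; run as a user with sudo privileges):

```shell
# Create the hadoop user with a home directory and bash as its login shell
sudo useradd -m hadoop -s /bin/bash
# Set a password for the new user (you will be prompted twice)
sudo passwd hadoop
# Add the hadoop user to the sudo group (administrator privileges)
sudo adduser hadoop sudo
# Confirm the user appears in the group file
sudo cat /etc/group | grep hadoop
```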
The second step:
After logging in as the hadoop user, update the package lists first, or some software will fail to install.
I use the US mirror source; sometimes updates are slow, or you get hash mismatch errors saying the software source cannot find resources. In that case, change the software source. The 163 mirror is recommended.
I won't go into how to change the software source here; you can work that out yourself.
Then install a text editor. There are many choices: nano, vi, vim (an enhanced vi), and gedit (a graphical editor, recommended if you are not yet comfortable with vi or vim, though you should still learn vi or vim eventually). Since I have only just started with Linux, I use gedit this time, because it makes editing the XML configuration files and .sh scripts later more convenient, which speeds things up and prevents mistakes.
On a system that only has nano and vi, install vim with sudo apt-get install vim.
Because I have already installed it, the screen above appears. If it is not installed, you will see a y/n prompt; just enter y.
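The update and editor installation together look like this (assuming the apt-based Ubuntu 16.04 used in this tutorial):

```shell
# Refresh the package lists from the configured mirror
sudo apt-get update
# Install vim; -y answers the y/n prompt automatically
sudo apt-get install -y vim
```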
Third step:
Install ssh to enable remote control.
Ubuntu installs the ssh client by default, so we can install the server directly.
The command is: sudo apt-get install openssh-server
Mine is already installed, so the screen above appears.
There are many remote control tools, such as Xshell and SecureCRT.
Type ifconfig to view the IP address.
I use Xshell 5 to connect to Ubuntu.
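The SSH setup can be sketched as follows (a local login test is included to confirm the server is running):

```shell
# Install the OpenSSH server (the client ships with Ubuntu by default)
sudo apt-get install -y openssh-server
# Look up this machine's IP address so a remote tool such as Xshell can connect
ifconfig
# Verify that you can log in to this machine over SSH
ssh localhost
```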
The fourth step:
Install the Java JDK and JRE.
After installing OpenJDK, you need to find its installation path; this path is used to configure the JAVA_HOME environment variable.
Use sudo gedit ~/.bashrc to edit the file and create the environment variable:
export JAVA_HOME=<JDK installation path>
Then you must run source ~/.bashrc to make the variable take effect.
Finally, verify that the Java environment variable is valid, for example with echo $JAVA_HOME and java -version. If it works, continue to the next step; otherwise, fix the Java configuration first, because the following steps depend on it.
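The Java setup can be sketched as below. The JDK path shown is the typical one for OpenJDK 8 on Ubuntu 16.04; use `readlink` to find your actual path and adjust accordingly:

```shell
# Install OpenJDK 8, a common choice for Hadoop 2.x
sudo apt-get install -y openjdk-8-jdk
# Find the real installation path behind the java command
readlink -f /usr/bin/java
# Append JAVA_HOME to ~/.bashrc (path is typical for Ubuntu 16.04; adjust to yours)
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' >> ~/.bashrc
source ~/.bashrc
# Verify the variable and the installation
echo $JAVA_HOME
java -version
```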
The fifth step:
Install Hadoop.
I am using hadoop-2.6.4.tar.gz.
I put it on the desktop by dragging it from Windows into Ubuntu. To make drag-and-drop work, you must install (or reinstall) VMware Tools and reboot. (You can also create a shared folder, use FTP file transfer, and many other methods.)
I suggest downloading it directly from the Internet inside Ubuntu and then finding the file in the Downloads folder. To test whether you can reach the Internet, use ping www.baidu.com or similar.
Next, install Hadoop. My installation package is on the Desktop; it is recommended to install it in the same place as Java. Mine is in /usr/local.
Change the ownership:
sudo chown -R hadoop ./hadoop
Check the Hadoop version with ./bin/hadoop version.
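Putting the installation steps together, assuming the archive sits on the Desktop as described above:

```shell
# Extract the Hadoop tarball from the Desktop into /usr/local
sudo tar -zxf ~/Desktop/hadoop-2.6.4.tar.gz -C /usr/local
cd /usr/local
# Rename the directory for convenience (the shorter name is my choice)
sudo mv hadoop-2.6.4 hadoop
# Give the hadoop user ownership of the installation
sudo chown -R hadoop ./hadoop
# Verify the installation
cd /usr/local/hadoop
./bin/hadoop version
```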
Let me write a little about permissions:

chown: only the administrator has permission to change the owner of a file.
- chown USERNAME file... (-R recursively changes the owner of a directory and all files inside it)
- chown --reference=/path/to/somefile file... (copy the owner from a reference file)
- chgrp GRPNAME file... changes the group, with the same -R and --reference options
- chown USERNAME:GROUP file... (or the older form chown USERNAME.GROUP file...) changes owner and group at once

chmod: modify file permissions.
- Modify the permissions of all three user classes at once: chmod MODE file... (e.g. chmod 750 gives rwxr-x---), with the same -R and --reference options
- Modify the permissions of one class of users (u = owner, g = group, o = others, a = all): chmod CLASS=PERMS file... (e.g. chmod g=rx file)
- Add or remove individual permissions for a class: chmod CLASS+PERMS file... or chmod CLASS-PERMS file... (e.g. chmod u+x file)
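A small worked example of the chmod forms above (the file name is just for demonstration):

```shell
# Create a scratch file to demonstrate permission changes
touch demo.sh
# Start from a known mode: rw-r--r--
chmod 644 demo.sh
# Owner gains execute, others lose all access
chmod u+x,o-rwx demo.sh
# Show the resulting mode string: -rwxr-----
stat -c '%A' demo.sh
# Clean up
rm demo.sh
```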
The sixth step:
Hadoop runs in non-distributed mode by default, i.e. as a single Java process. Next we configure pseudo-distributed mode.
The Hadoop configuration files are located in /usr/local/hadoop/etc/hadoop/. We need to modify core-site.xml and hdfs-site.xml. For example: sudo gedit ./etc/hadoop/core-site.xml
The modified core-site.xml file looks like the following:
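The original screenshot is not reproduced here; a typical pseudo-distributed core-site.xml for Hadoop 2.x, matching the /usr/local/hadoop installation above, looks like this (localhost:9000 is the usual default; adjust if yours differs):

```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```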
Similarly, the modified hdfs-site.xml looks like the following:
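Again in place of the original screenshot, a typical pseudo-distributed hdfs-site.xml sets the replication factor to 1 (there is only one node) and points the NameNode and DataNode at local directories:

```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
```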
Step 7: Start Hadoop
./bin/hdfs namenode -format formats the NameNode. Remember that you must run this from inside the Hadoop installation directory.
An exit status of 0 in the output means the format succeeded, so we can continue with the operations below. We are about to succeed, let's go!!! If something goes wrong, check the output to see what failed.
./sbin/start-dfs.sh starts the NameNode and DataNode daemons. The first time, an SSH warning will appear; answer yes.
Run jps to see whether the daemons started successfully.
If it fails, delete the logs and tmp directories with rm -r ./logs ./tmp, then format the NameNode again.
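The startup sequence, run from the Hadoop installation directory:

```shell
cd /usr/local/hadoop
# Format the NameNode (only needed the first time)
./bin/hdfs namenode -format
# Start the NameNode and DataNode daemons
./sbin/start-dfs.sh
# List running Java processes; on success you should see NameNode,
# DataNode and SecondaryNameNode alongside Jps itself
jps
```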
Step 8: After a successful startup, you can access the web interface to view related information
Open http://<your IP address>:50070 in a browser to view the information.
It worked. Haha, cool!!!
Step 9:
Add Hadoop to the PATH:
Because the operations above must all be run from inside the Hadoop installation directory, which is troublesome, we use the PATH environment variable so we can start our services from any directory.
Modify the ~/.bashrc file, similar to the JAVA_HOME setting.
Finally, remember to run source ~/.bashrc to make the settings take effect.
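A minimal sketch of the PATH change, assuming the /usr/local/hadoop installation from step five (the HADOOP_HOME variable name is a common convention, not required):

```shell
# Append the Hadoop bin and sbin directories to the PATH via ~/.bashrc
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> ~/.bashrc
# Reload so the change takes effect in the current shell
source ~/.bashrc
# Now commands such as hdfs and start-dfs.sh work from any directory
echo $PATH
```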