Build a Hadoop stand-alone pseudo-distributed environment


This tutorial uses Ubuntu 16.04 as the system environment, with VMware Workstation 12 as the virtual machine.

This time we will not cover building a Hadoop stand-alone pseudo-distributed environment on CentOS or RedHat.

The Hadoop version used is 2.6.4.

Okay, now let's start our Hadoop environment setup journey!!!

Step 1:

Open the Ubuntu terminal. If there is no hadoop user yet, create one:

Use the command sudo useradd -m hadoop -s /bin/bash

Here is a brief review:

Users and groups

Linux is a multi-user system; users and groups are the core elements by which resources and permissions are allocated.

A group is a container of users: you can add users to it and grant permissions through it.

User types:

Administrator: UID 0

System users: UID 1–499 — run background programs (daemons)

Ordinary users: UID 500+ — interactive logins

Group categories:

Administrator group: GID 0

System groups: GID 1–499

User groups: GID 500+

On Linux:

/etc/passwd: user account information

/etc/shadow: user passwords and related account settings

/etc/group: group account information

/etc/gshadow: group password information
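These databases are plain colon-separated text files (the two shadow files are readable only by root). A quick peek, as a sketch:

```shell
# /etc/passwd: one line per user -> name:x:UID:GID:comment:home:shell
grep '^root:' /etc/passwd

# /etc/group: one line per group -> name:x:GID:member,member,...
cut -d: -f1,3 /etc/group | head -n 3
```

Note that the UID and GID of root are both 0, matching the "Administrator" entries in the tables above.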

After creating the user, check with sudo cat /etc/group.

Use sudo passwd hadoop to set a password for the hadoop user.

sudo adduser hadoop sudo grants administrator privileges to the hadoop user.

Finally, use the reboot command to restart Ubuntu and log in as the hadoop user.

Step 2:

After logging in as the hadoop user, run sudo apt-get update first, or some software will fail to install.

I use the US mirror source; sometimes updates are slow, or a hash error says the source cannot find resources — in that case, change the software source (163 is recommended).

I won't go into how to change the software source; please work that out yourself.

Then install a text editor. There are many choices: nano, vi, vim (an enhanced vi), and gedit (graphical — recommended if you are not very familiar with vi or vim, though you should still learn vi or vim eventually). Having only just started with Linux, I use gedit this time, because it makes editing the XML configuration files and .sh scripts later convenient, which speeds things up and helps prevent mistakes.

For a system that only has nano and vi, install vim with: sudo apt-get install vim

Because I had already installed it, apt reported it as already installed. If it is not installed, you will get a y/n prompt; just enter y.

Step 3:

Install SSH to enable remote control.

Ubuntu installs the SSH client by default, so we only need to install the server.

The command is: sudo apt-get install openssh-server

Mine is already installed, so apt reports as much.

There are many remote-control tools, such as Xshell and SecureCRT.

Type ifconfig to view the IP address.

I use Xshell 5 to connect to Ubuntu.

Step 4:

Install the Java JDK and JRE.

After installing OpenJDK, you need to find its installation path; this path is used to configure the JAVA_HOME environment variable.

Use sudo gedit ~/.bashrc to edit the file and add the environment variable:

export JAVA_HOME=<JDK installation path>

Then you must run source ~/.bashrc to make the variable take effect.

Finally, verify that the Java environment variable works (for example, with echo $JAVA_HOME and java -version). If it does, continue to the next step; otherwise fix the Java configuration first — the next steps will not work without it.
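As a sketch, the ~/.bashrc additions look like this. The path below is the usual OpenJDK 8 location on Ubuntu and is an assumption — check yours (e.g. with update-alternatives --list java) and substitute it:

```shell
# Assumed OpenJDK 8 path on Ubuntu 16.04; replace with your actual JDK path.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin

# After `source ~/.bashrc`, this should print the JDK path:
echo "$JAVA_HOME"
```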

Step 5:

Install Hadoop.

I am using hadoop-2.6.4.tar.gz.

I put it on the desktop, dragging it into Ubuntu from Windows. To use drag-and-drop you must reinstall VMware Tools first (you can also use a shared folder, FTP file transfer, and many other methods).

I suggest downloading it directly inside Ubuntu and then finding the file in Downloads. To test whether you can reach the Internet, use ping www.baidu.com or a similar method.

Next, install Hadoop. My package is on the Desktop; it is recommended to install it in the same place as Java — mine is in /usr/local (for example, sudo tar -zxf ~/Desktop/hadoop-2.6.4.tar.gz -C /usr/local, then sudo mv /usr/local/hadoop-2.6.4 /usr/local/hadoop).

Fix the ownership:

sudo chown -R hadoop ./hadoop

Then check the Hadoop version with ./bin/hadoop version (run from /usr/local/hadoop).

Let me write a little about permissions:

chown: only the administrator has permission to change the owner of a file.

chown USERNAME file...    (-R recursively changes the owner of a directory and of the files inside it and its subdirectories; --reference=/path/to/somefile file... copies the owner from a reference file)

chgrp GRPNAME file...    (takes the same -R and --reference options)

chown USERNAME:GROUP file...  (or chown USERNAME.GROUP file...) changes owner and group at once.

chmod: modify permissions.

To set the permissions of all three user classes at once:

chmod MODE file...    (e.g. a MODE of 750 yields rwxr-x---; -R and --reference work as above)

To set the permissions of particular user classes — u (user/owner), g (group), o (others), a (all):

chmod <class>=<perms> file...

To add or remove permissions for particular user classes:

chmod <class>+<perms> file...  or  chmod <class>-<perms> file...
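The two forms can be tried on a scratch file; the octal and symbolic spellings below both produce the rwxr-x--- example above:

```shell
touch demo.txt

# Mode form: one octal digit per class (u, g, o); 750 = rwxr-x---
chmod 750 demo.txt

# Symbolic form: the same permissions written out per class
chmod u=rwx,g=rx,o= demo.txt

# Add / remove individual bits: group gains write, owner loses execute
chmod g+w demo.txt
chmod u-x demo.txt
ls -l demo.txt    # now rw-rwx---
```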

Step 6:

Hadoop runs in non-distributed mode by default — a single Java process. Next we do the pseudo-distributed configuration.

The Hadoop configuration files are located in /usr/local/hadoop/etc/hadoop/. We go in and modify them; what needs changing is core-site.xml and hdfs-site.xml. Use sudo gedit ./etc/hadoop/core-site.xml (from the Hadoop directory).

The modified core-site.xml is shown below:
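The original shows the file as a screenshot; a typical pseudo-distributed core-site.xml, assuming the /usr/local/hadoop install location used in this guide, looks like:

```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```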


Similarly, the modified hdfs-site.xml is shown below:
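Again in place of the original screenshot, a typical single-node hdfs-site.xml (replication of 1; the name/data directories are conventional choices under the same tmp directory as above):

```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
```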


Step 7: Start Hadoop

./bin/hdfs namenode -format formats the NameNode. Remember that you must run this from the Hadoop installation directory (the hdfs command lives under its bin).

A status of 0 in the output (as in the screenshot) indicates success — we're nearly there, let's keep going!!! If something goes wrong, check the output to see what the error is.

./sbin/start-dfs.sh starts the NameNode and DataNode daemons. The first time, an SSH authenticity warning will appear; answer yes.

Run jps to see whether they started successfully.

If startup fails, delete the logs and tmp directories (using the form rm -r ./<directory>) and format again.

Step 8: After a successful startup, you can access the web interface to view related information.

Use your IP address (the one shown in red in the original screenshot) with port 50070 to view the information.

This has been a success. Haha, cool!!!

Step 9:

Add Hadoop to the PATH:

Because the operations above must be performed inside the Hadoop installation directory, which is troublesome, we use the PATH environment variable so that we can start our services from any directory.

Modifying the ~/.bashrc file is similar to the JAVA_HOME setting; finally, remember to run source ~/.bashrc to make the settings take effect.
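A sketch of the ~/.bashrc additions, assuming the /usr/local/hadoop install location used throughout this guide:

```shell
# Hadoop install location from this guide (an assumption; adjust to yours).
export HADOOP_HOME=/usr/local/hadoop

# bin holds commands such as hdfs; sbin holds start-dfs.sh and friends.
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

echo "$PATH"
```

After source ~/.bashrc, commands like hdfs and start-dfs.sh can be run from any directory.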


Origin blog.csdn.net/tanjunchen/article/details/79733866