(1) Installation of Linux virtual machine

　　I believe that most of the people who embark on the road of no return with Hadoop are users who are accustomed to using Windows. Based on this, bloggers share their own experiences for readers, and friends are welcome to comment and pay attention.

1. Download of VMware and Ubuntu

　　 My own machine configuration: Window10.

　　 The download link is attached as follows,

VMware： https://www.vmware.com/

　　 Ubuntu ： https://www.ubuntu.com/download

The blogger downloaded the latest version in 2018.

2. Installation

　　There are too many tutorials on the Internet during the installation process. The installation link is attached here, so I won't go into details.

　　https://blog.csdn.net/iqmae68024/article/details/54772918

　　After the installation is complete, you can enter the Linux system. In order to use many functions and commands in the future, the blogger recommends installing VMware as well. For the installation tutorial, see the link: https://blog.csdn.net/tjcwt2011/article/details/ 72638977

(2) Install Hadoop on a Linux virtual machine

1. Create a Hadoop user

　　The reason why you choose to create a new user is that on the one hand, you can have a relatively independent environment to toss with, and on the other hand, it is also better to manage hadoop-related files, avoiding manual slippage during operation.

sudo useradd -m hadoop -s /bin/bash        // create a user named hadoop and store it in bash 

sudo passwd hadoop      // set password 

sudo adduser hadoop sudo    // give hadoop administrator privileges

2, SSH installation and configuration

　　Because hadoop is a clustered file management system, if ssh is not installed, each machine cannot access each other (of course, it is another matter for just installing a virtual machine to play). SSH, short for Secure Shell , is a relatively reliable protocol that provides security for remote login sessions and other network services. Using the SSH protocol can effectively prevent information leakage during remote management. Because Ubuntu has installed the SSH client by default, we only need to install the server here.

1 sudo apt - get install openssh-server   // Use the apt-get command to install the ssh server for the virtual machine 
2  
3 ssh localhost   // Use ssh to log in to the machine, you will be prompted to enter a password when logging in, just enter the hadoop user created before password is fine

　　Generally speaking, the installation and login of ssh ends here, but in this case, the password must be entered repeatedly for each login in the future. For convenience, the ssh security key can be generated without entering the password every time.

exit                            // Exit the ssh login status just now 
cd ~/.ssh/                      // There will be after login once. ssj directory. If not, execute once ssh localhost 
ssh-keygen -t rsa               // There will be a prompt, press Enter to 
cat ./id_rsa.pub >> ./authorized_keys   // Add authorization

　　Try ssh login again, you don't need to enter the password again, isn't it much more convenient.

3. Install Java and configure environment variables

　　Hadoop itself is developed in the Java language, so in order to use it later, you must first install Java and configure the environment variables. The new version of Ubuntu system no longer supports the java version below the JDK8 version, so for the convenience of installation, the blogger chooses to install openjdk8.

　　JRE (Java Runtime Environment, Java Runtime Environment) is the environment required to run Java. JDK (Java Development Kit, Java software development kit) includes JRE, and also includes tools and class libraries required for developing Java programs.

sudo apt-get install openjdk-8-jre openjdk-8-jdk

　　Next, configure the environment variables, first find the path during installation:

　　dpkg: "dpkg" is short for "Debian Packager". A package management system specially developed for "Debian" to facilitate software installation, update and removal. All "Linux" distributions derived from "Debian" use "dpkg", such as "Ubuntu", "Knoppix", etc.

1 dpkg -L openjdk- 8 -jdk | grep ' /bin '    // Regular matching search path

　　The content of the first line is what we are looking for, copy the path /usr/lib/jvm/java-8-openjdk-amd64. Configure java environment variables:

1 gedit ~/.bashrc   // Open the bashrc file with gedit and set the environment variables. If gedit has not been installed, you can install it with sudo apt-get install gedit

　　After opening, add export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 at the beginning.

　　Then save it, and then re-run the .bashrc file to make it take effect.

1 source ~/.bashrc 
2 echo $JAVA_HOME //Output the value of JAVA_HOME to see if the global variable has taken effect

　　Well, the environment variables of java have been completed here.

　　After pouring water for two hours, the blogger went out for a walk, and the later configuration and sharing of java programs will continue to be updated. It is not easy to write by hand, please indicate the source when forwarding or citing.

　　I am Fang Ling Xiaoxiaosheng, a programmer who combines talent and baldness.

Windows Hadoop installation learning tutorial series (1)