Building a Hadoop Cluster with Raspberry Pi

Hadoop is implemented in Java, so it runs on the Raspberry Pi much as it does on x86 platforms. First, we need to install a JVM that supports the Raspberry Pi. You can choose OpenJDK or Oracle's JDK 8. I personally recommend JDK 8, which is slightly faster, but OpenJDK is easier to install.

1. Install Java

Installing OpenJDK is very simple; just execute the following commands:

pi@raspberrypi ~ $ sudo apt-get install openjdk-7-jdk
pi@raspberrypi ~ $ java -version
java version "1.7.0_07"
OpenJDK Runtime Environment (IcedTea7 2.3.2) (7u7-2.3.2a-1+rpi1)
OpenJDK Zero VM (build 22.0-b10, mixed mode)

Alternatively, we can install Oracle's JDK 8.
It can be obtained from: https://jdk8.java.net/fxarmpreview/index.html

pi@raspberrypi ~ $ sudo tar zxvf jdk-8-ea-b36e-linux-arm-hflt-*.tar.gz -C /opt
pi@raspberrypi ~ $ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/jdk1.8.0/bin/java" 1
pi@raspberrypi ~ $ java -version
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)

If you have both installed, use the following command to switch:

sudo update-alternatives --config java

2. Add a hadoop system user

pi@raspberrypi ~ $ sudo addgroup hadoop
pi@raspberrypi ~ $ sudo adduser --ingroup hadoop hduser
pi@raspberrypi ~ $ sudo adduser hduser sudo

3. Set up SSH

pi@raspberrypi ~ $ su - hduser
hduser@raspberrypi ~ $ ssh-keygen -t rsa -P ""

This generates an RSA key pair with an empty passphrase, so Hadoop will not prompt for a password when communicating with other nodes.

hduser@raspberrypi ~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Appending the public key to authorized_keys allows SSH access with the key you just generated. Test the connection:

hduser@raspberrypi ~$ ssh localhost

The first connection will ask you to confirm the host key; after that, we should be able to log in without a password.
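The key setup above can also be sketched end to end in a scratch directory (an illustration only; the real commands write to ~/.ssh, and -N "" is the equivalent way to pass an empty passphrase):

```shell
# Illustration only: the same key setup in a scratch directory
# (the real commands above write to ~/.ssh).
tmpdir=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$tmpdir/id_rsa"          # key pair with empty passphrase
cat "$tmpdir/id_rsa.pub" >> "$tmpdir/authorized_keys"   # authorize the public key
head -c 7 "$tmpdir/authorized_keys"                     # entries start with "ssh-rsa"
```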

4. Install Hadoop
We can download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core

hduser@raspberrypi ~$ wget http://mirror.catn.com/pub/apache/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz
hduser@raspberrypi ~$ sudo tar vxzf hadoop-1.1.2.tar.gz -C /usr/local
hduser@raspberrypi ~$ cd /usr/local
hduser@raspberrypi /usr/local$ sudo mv hadoop-1.1.2 hadoop
hduser@raspberrypi /usr/local$ sudo chown -R hduser:hadoop hadoop

Hadoop is now installed. Edit the .bashrc file in the home directory and add the following to it:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin

If you are using Oracle's JDK, set JAVA_HOME to /opt/jdk1.8.0 instead.

Restart the Raspberry Pi (or log out and back in so the new environment takes effect), then verify that the installation was successful:

hduser@raspberrypi ~$ hadoop version
Hadoop 1.1.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782
Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013
From source with checksum c720ddcf4b926991de7467d253a79b8b

5. Configure Hadoop
Note: this is the minimal configuration for Hadoop in single-node mode.

The configuration files are located in /usr/local/hadoop/conf/; we need to modify three files: core-site.xml, hdfs-site.xml, and mapred-site.xml.

core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/fs/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>

hdfs-site.xml (a replication factor of 1 is appropriate because a single node has nowhere to replicate blocks to)

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Almost done; one last step remains: create the temporary directory and format the NameNode.

hduser@raspberrypi ~$ sudo mkdir -p /fs/hadoop/tmp
hduser@raspberrypi ~$ sudo chown hduser:hadoop /fs/hadoop/tmp
hduser@raspberrypi ~$ sudo chmod 750 /fs/hadoop/tmp
hduser@raspberrypi ~$ hadoop namenode -format

Note:

If you chose JDK 8, you need to force the DataNode's JVM to run in client mode, because the ARM preview of JDK 8 does not support server mode. Go to the /usr/local/hadoop/bin directory and edit the hadoop script (please back it up first). With nano, the steps are: run "nano hadoop", press Ctrl-W and search for "-server", delete the "-server" parameter, then save and exit.
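If you prefer not to edit by hand, the same change can be scripted with sed. This is a sketch under the assumption that the flag appears as a literal " -server" token in the script; it is demonstrated on a scratch file here, and on a real node you would point HADOOP_SCRIPT at /usr/local/hadoop/bin/hadoop.

```shell
# Sketch: strip the "-server" flag with sed instead of nano. Demonstrated on a
# scratch file; on a real node, set HADOOP_SCRIPT=/usr/local/hadoop/bin/hadoop.
# Assumes the flag appears as a literal " -server" token in the script.
HADOOP_SCRIPT=$(mktemp)
echo 'HADOOP_OPTS="$HADOOP_OPTS -server"' > "$HADOOP_SCRIPT"   # stand-in content
cp "$HADOOP_SCRIPT" "$HADOOP_SCRIPT.bak"                       # always back up first
sed -i 's/ -server//g' "$HADOOP_SCRIPT"                        # remove the flag
cat "$HADOOP_SCRIPT"                                           # -> HADOOP_OPTS="$HADOOP_OPTS"
```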

The single-node Hadoop system is now set up. Here are some useful commands:

1. jps           // print local JVM identifiers (shows the running Hadoop daemons)
2. start-all.sh  // start all Hadoop daemons
3. stop-all.sh   // stop all Hadoop daemons
