Hadoop is implemented in Java, so running it on the Raspberry Pi is much the same as running it on any other Linux platform, even though the Pi uses an ARM CPU rather than x86. First, we need to install a JVM that supports the Raspberry Pi. You can choose OpenJDK or Oracle's JDK 8 early-access build for ARM. I personally recommend JDK 8, which is slightly faster, but OpenJDK is easier to install.
1. Install Java
Installing OpenJDK is very simple; just execute the following commands:
pi@raspberrypi ~ $ sudo apt-get install openjdk-7-jdk
pi@raspberrypi ~ $ java -version
java version "1.7.0_07"
OpenJDK Runtime Environment (IcedTea7 2.3.2) (7u7-2.3.2a-1+rpi1)
OpenJDK Zero VM (build 22.0-b10, mixed mode)
Alternatively, we can install Oracle's JDK 8 for ARM, which can be obtained from: https://jdk8.java.net/fxarmpreview/index.html
pi@raspberrypi ~ $ sudo tar zxvf jdk-8-ea-b36e-linux-arm-hflt-*.tar.gz -C /opt
pi@raspberrypi ~ $ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/jdk1.8.0/bin/java" 1
pi@raspberrypi ~ $ java -version
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
If you have both installed, use the following command to switch:
sudo update-alternatives --config java
2. Add a hadoop system user
pi@raspberrypi ~ $ sudo addgroup hadoop
pi@raspberrypi ~ $ sudo adduser --ingroup hadoop hduser
pi@raspberrypi ~ $ sudo adduser hduser sudo
3. Set up SSH
pi@raspberrypi ~ $ su - hduser
hduser@raspberrypi ~ $ ssh-keygen -t rsa -P ""
This generates an RSA key with an empty passphrase, so Hadoop will not prompt for a password when communicating with other nodes.
hduser@raspberrypi ~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Now connect to localhost to confirm that SSH accepts the key you just generated:
hduser@raspberrypi ~$ ssh localhost
Now we should be able to log in without a password
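If you want to experiment with the key-generation steps first, the same commands can be rehearsed in a throwaway temporary directory instead of hduser's real ~/.ssh; this sketch is purely illustrative and assumes openssh-client is installed:

```shell
# Illustrative only: repeat the key steps in a temporary directory
# rather than ~/.ssh, so nothing in the real setup is touched.
D=$(mktemp -d)
ssh-keygen -t rsa -P "" -f "$D/id_rsa" -q   # empty passphrase, no prompt
cat "$D/id_rsa.pub" >> "$D/authorized_keys"
chmod 600 "$D/authorized_keys"              # sshd insists on strict permissions
ls "$D"
```

The real steps above do exactly this, only with hduser's home directory as the target.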
4. Install Hadoop
We can download hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core
hduser@raspberrypi ~$ wget http://mirror.catn.com/pub/apache/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz
hduser@raspberrypi ~$ sudo tar vxzf hadoop-1.1.2.tar.gz -C /usr/local
hduser@raspberrypi ~$ cd /usr/local
hduser@raspberrypi /usr/local$ sudo mv hadoop-1.1.2 hadoop
hduser@raspberrypi /usr/local$ sudo chown -R hduser:hadoop hadoop
Now Hadoop is installed. Edit the .bashrc file in the home directory and add the following to it:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
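Which path JAVA_HOME should point to depends on which JDK you installed earlier. A small sketch like the following, using the two install locations from the steps above, can pick the right one automatically in .bashrc:

```shell
# Pick JAVA_HOME based on which JDK from the steps above is present.
# /opt/jdk1.8.0 is where the Oracle early-access tarball was unpacked;
# the other path is where apt-get puts openjdk-7-jdk on Raspbian.
if [ -d /opt/jdk1.8.0 ]; then
    export JAVA_HOME=/opt/jdk1.8.0
else
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf
fi
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
echo "$JAVA_HOME"
```

This way the same .bashrc works whether you stay on OpenJDK or later switch to Oracle's JDK 8.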
If you are using Oracle's JDK, set JAVA_HOME to /opt/jdk1.8.0 instead.
Log out and back in (or run source ~/.bashrc) so the new environment takes effect, then verify that the installation was successful:
hduser@raspberrypi ~$ hadoop version
Hadoop 1.1.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/
branch-1.1 -r 1440782
Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013
From source with checksum c720ddcf4b926991de7467d253a79b8b
5. Configure Hadoop
Note: The configuration here is the minimum configuration for hadoop single-node mode.
The configuration files are located in "/usr/local/hadoop/conf/". We need to modify three files: core-site.xml, hdfs-site.xml, and mapred-site.xml.
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/fs/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
We're almost finished; there is just one last step left.
hduser@raspberrypi ~$ sudo mkdir -p /fs/hadoop/tmp
hduser@raspberrypi ~$ sudo chown hduser:hadoop /fs/hadoop/tmp
hduser@raspberrypi ~$ sudo chmod 750 /fs/hadoop/tmp
hduser@raspberrypi ~$ hadoop namenode -format
Notice:
If you chose JDK 8, you need to force the DataNode to run in JVM client mode, because the ARM early-access build of JDK 8 does not support server mode. Go to the /usr/local/hadoop/bin directory and edit the hadoop script (please back it up first). With nano the steps are: run nano hadoop, press Ctrl-W and search for "-server", delete the "-server" parameter, then save and exit.
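On GNU sed this edit can also be done non-interactively. The sketch below demonstrates the substitution on a stand-in file; for the real change, point it at /usr/local/hadoop/bin/hadoop (the -i.bak flag keeps a backup copy automatically):

```shell
# Demonstrated on a stand-in file with a hypothetical options line; for
# the real edit, replace the path with /usr/local/hadoop/bin/hadoop.
printf 'HADOOP_OPTS="-server -Xmx256m"\n' > /tmp/hadoop-script-demo
sed -i.bak 's/-server //g' /tmp/hadoop-script-demo   # .bak keeps a backup
cat /tmp/hadoop-script-demo   # -> HADOOP_OPTS="-Xmx256m"
```

The same one-liner against the real script achieves what the nano steps above do by hand.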
The Hadoop single-node system is built. Here are some useful commands.
1. jps // lists the local VM identifiers (running Java processes)
2. start-all.sh // starts all Hadoop daemons
3. stop-all.sh // stops all Hadoop daemons