MapReduce is a programming model for processing data in parallel across many servers: the Map phase distributes work to the nodes, and the Reduce phase collects and summarizes their results. This section uses WordCount as an example, counting how many times each English word appears in a file.
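Before turning to Hadoop, the map/reduce idea can be sketched in plain Java running on a single machine (the class and method names here are illustrative, not Hadoop's API): the map step turns each word into a (word, 1) pair, and the reduce step sums the counts for each word.

```java
import java.util.Map;
import java.util.TreeMap;

// Single-machine sketch of the MapReduce idea (illustrative, not Hadoop's API):
// "map" emits (word, 1) for every word, "reduce" sums the counts per word.
public class WordCountSketch {
    public static Map<String, Integer> wordCount(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.split("\\s+")) {   // map: split input into words
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum); // reduce: sum counts per key
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount("the quick fox and the dog"));
        // prints {and=1, dog=1, fox=1, quick=1, the=2}
    }
}
```

Hadoop performs the same two steps, but splits the map work across many servers and merges the per-word sums in the reduce phase.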
1) Create a directory wordcount
mkdir -p ~/wordcount/input
cd ~/wordcount
Use sudo gedit WordCount.java to create and edit the source file.
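The contents of WordCount.java are not shown above; the standard WordCount example from the Apache Hadoop MapReduce tutorial, which this walkthrough follows, is:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: tokenize each line of input and emit (word, 1) for every word.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sum the counts emitted for each word.
  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input path from args
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path from args
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that this class only compiles with the Hadoop libraries on the classpath and runs as a job submitted to the cluster, not as a standalone program.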
2) Compile WordCount.java
sudo gedit ~/.bashrc
Add the environment variables that Hadoop's compiler invocation needs to the end of the file.
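The exact lines to add are not shown above; the Hadoop MapReduce tutorial uses exports like the following (the JAVA_HOME path is illustrative and depends on your installation):

```shell
# Assumed installation path; adjust JAVA_HOME for your system.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=${JAVA_HOME}/bin:${PATH}
# Needed so `hadoop com.sun.tools.javac.Main` can find the Java compiler.
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
```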
Make the ~/.bashrc settings take effect
source ~/.bashrc
Then compile the program and package the class files into a jar
hadoop com.sun.tools.javac.Main WordCount.java
jar cf wc.jar WordCount*.class
ll
3) Create a test text file
cp /usr/local/hadoop/LICENSE.txt ~/wordcount/input
ll ~/wordcount/input
Next, start all of the virtual servers, then start the Hadoop cluster
start-all.sh
Create the input directory on HDFS
hadoop fs -mkdir -p /user/wordcount/input
Switch to the ~/wordcount/input directory
cd ~/wordcount/input
Upload the text file to HDFS
hadoop fs -copyFromLocal LICENSE.txt /user/wordcount/input
List the files in the HDFS input directory
hadoop fs -ls /user/wordcount/input
4) Run WordCount.java
Change directory
cd ~/wordcount
Run the WordCount program
hadoop jar wc.jar WordCount /user/wordcount/input/LICENSE.txt /user/wordcount/output
5) Check the results
List the HDFS output directory
hadoop fs -ls /user/wordcount/output
View the contents of the output file in HDFS; each line is a word and its count, separated by a tab
hadoop fs -cat /user/wordcount/output/part-r-00000 | more
To run the WordCount program again, first delete the output directory, because Hadoop will not overwrite an existing output directory
hadoop fs -rm -R /user/wordcount/output
This has been only a brief introduction to Hadoop's MapReduce.