Reference blog posts:
https://blog.csdn.net/huhui_cs/article/details/9907951
http://dbaplus.cn/news-21-1277-1.html
1. With the cluster environment set up, let's first run an example.
You can monitor progress at any time through the NameNode web UI at http://172.17.0.2:50070/explorer.html
Create three files on ubuntu1.
a.txt contents:
this is first file
one
two
three
four
b.txt contents:
this is second file
aa
bb
cc
dd
ee
ff
c.txt contents:
this is third file
11
22
33
44
55
one
two
aa
bb
Create an /input directory in HDFS and upload the three files to it.
root@ubuntu1:/home/software/hadoop# bin/hdfs dfs -mkdir /input
root@ubuntu1:/home/software/hadoop# vim a.txt
root@ubuntu1:/home/software/hadoop# vim b.txt
root@ubuntu1:/home/software/hadoop# vim c.txt
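As an alternative to editing each file in vim, the three files can be created non-interactively with heredocs; a minimal sketch, run from the same working directory:

```shell
# Create the three sample files without an interactive editor.
cat > a.txt <<'EOF'
this is first file
one
two
three
four
EOF

cat > b.txt <<'EOF'
this is second file
aa
bb
cc
dd
ee
ff
EOF

cat > c.txt <<'EOF'
this is third file
11
22
33
44
55
one
two
aa
bb
EOF

# Byte counts should match the sizes HDFS reports after upload: 38, 38, 48.
wc -c a.txt b.txt c.txt
```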
root@ubuntu1:/home/software/hadoop# bin/hdfs dfs -put a.txt b.txt c.txt /input
root@ubuntu1:/home/software/hadoop# bin/hdfs dfs -ls /input
Found 3 items
-rw-r--r-- 3 root supergroup 38 2018-05-26 10:33 /input/a.txt
-rw-r--r-- 3 root supergroup 38 2018-05-26 10:33 /input/b.txt
-rw-r--r-- 3 root supergroup 48 2018-05-26 10:33 /input/c.txt
Next, search the three files with the grep example. Note that the pattern '[a-z]' matches single lowercase letters (a pattern like '[a-z]+' would match whole words instead), and the /output directory must not already exist or the job will fail.
root@ubuntu1:/home/software/hadoop# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep /input /output '[a-z]'
View the results:
root@ubuntu1:/home/software/hadoop# bin/hdfs dfs -get /output output
root@ubuntu1:/home/software/hadoop# cat output/*
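The grep example writes lines of the form "count<TAB>match", sorted by count in descending order. As a rough local analogue (not the MapReduce job itself), the same tally for the pattern '[a-z]' can be reproduced with standard Unix tools, assuming the three files are still in the working directory:

```shell
# Count every single-lowercase-letter match across the three files,
# mirroring what the grep example job computes over /input.
cat a.txt b.txt c.txt | grep -o '[a-z]' | sort | uniq -c | sort -rn
```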