Copyright notice: This is an original post; please do not reproduce without the author's permission. https://blog.csdn.net/yulei_qq/article/details/82453592
Using netcat (the network "Swiss-army knife") as the Flume source, with HDFS as the sink.
The Flume configuration file is as follows:
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 8888
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://mycluster/flume/events/%y-%m-%d/%H/%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
# Whether to round down the event timestamp used in the hdfs.path escapes;
# with the settings below a new bucket directory is created every 10 seconds.
# 2017-12-12 -->
# 2017-12-12 -->%H%M%S
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.useLocalTimeStamp=true
# When to roll to a new file: after rollInterval seconds, rollSize bytes,
# or rollCount events, whichever threshold is reached first.
a1.sinks.k1.hdfs.rollInterval=10
a1.sinks.k1.hdfs.rollSize=10
a1.sinks.k1.hdfs.rollCount=3
a1.channels.c1.type=memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
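The memory channel above runs with its defaults (capacity and transactionCapacity are both 100 events), which is plenty for this netcat demo; under real load you would usually raise them. A sketch with illustrative values, not tuned numbers:

```properties
# Hypothetical sizing, not from the original post: buffer up to 10000 events,
# with up to 1000 events per source/sink transaction.
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000
```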
Here hdfs://mycluster is the address (nameservice) of the HDFS cluster.
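With round=true, roundValue=10 and roundUnit=second, the event timestamp (taken from the local clock, since useLocalTimeStamp=true) is rounded down to the nearest 10 seconds before being substituted into the %y-%m-%d/%H/%M/%S escapes, which is why the bucket directories in the agent log end in 40 and 50. A quick sketch of that arithmetic with GNU date; the epoch value comes from one of the file names in the log, and TZ=Asia/Shanghai is an assumption matching the +8 offset of the log timestamps:

```shell
# Round an event timestamp down to the nearest 10 seconds, then format it
# the way the %y-%m-%d/%H/%M/%S escapes in hdfs.path would.
ts=1536195162                   # epoch seconds, from events-.1536195162064
rounded=$(( ts - ts % 10 ))     # 1536195160
TZ=Asia/Shanghai date -d "@$rounded" +'%y-%m-%d/%H/%M/%S'   # GNU date
# prints: 18-09-06/08/52/40  (matches the bucket directory in the log)
```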
The agent is started with flume-ng agent -f hdfs_k.conf -n a1:
[hadoop@s202 /soft/flume/conf]$flume-ng agent -f hdfs_k.conf -n a1
Start a netcat client and type a couple of test lines (the source answers OK for each event it accepts):
[hadoop@s202 ~]$nc localhost 8888
he ll sss adsd
OK
hellow ssdfvas fffg
OK
The Flume console then shows the collected events being bucketed into HDFS directories:
18/09/06 08:52:12 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 started
18/09/06 08:52:12 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:8888]
18/09/06 08:52:42 INFO hdfs.HDFSSequenceFile: writeFormat = Writable, UseRawLocalFileSystem = false
18/09/06 08:52:42 INFO hdfs.BucketWriter: Creating hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064.tmp
18/09/06 08:52:51 INFO hdfs.HDFSSequenceFile: writeFormat = Writable, UseRawLocalFileSystem = false
18/09/06 08:52:51 INFO hdfs.BucketWriter: Creating hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721.tmp
18/09/06 08:52:56 INFO hdfs.BucketWriter: Closing hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064.tmp
18/09/06 08:52:56 INFO hdfs.BucketWriter: Renaming hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064.tmp to hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064
18/09/06 08:52:56 INFO hdfs.HDFSEventSink: Writer callback called.
18/09/06 08:53:01 INFO hdfs.BucketWriter: Closing hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721.tmp
18/09/06 08:53:02 INFO hdfs.BucketWriter: Renaming hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721.tmp to hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721
18/09/06 08:53:02 INFO hdfs.HDFSEventSink: Writer callback called.
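Each "Renaming ... to ..." line marks a finished file (the .tmp suffix is dropped once the file is closed). A small sketch, run against the two log lines above saved to a temporary file, pulls out the final HDFS paths:

```shell
# Save the relevant BucketWriter lines (copied from the agent log above).
cat > /tmp/flume-agent.log <<'EOF'
18/09/06 08:52:56 INFO hdfs.BucketWriter: Renaming hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064.tmp to hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064
18/09/06 08:53:02 INFO hdfs.BucketWriter: Renaming hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721.tmp to hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721
EOF
# The final path is the last field of each "Renaming ... to <path>" line.
awk '/Renaming/ {print $NF}' /tmp/flume-agent.log
# prints:
# hdfs://mycluster/flume/events/18-09-06/08/52/40/events-.1536195162064
# hdfs://mycluster/flume/events/18-09-06/08/52/50/events-.1536195171721
```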
The resulting file can be inspected with:
[hadoop@s202 ~]$hdfs dfs -cat /flume/events/18-09-06/08/52/50/events-.1536195171721
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?.~?;...s.e..hellow ssdfvas fffg?.~?;...s.[hadoop@s202 ~]$
The output shows this is a Hadoop SequenceFile (note the SEQ magic and the LongWritable/BytesWritable class names in the header), with the text "hellow ssdfvas fffg" faintly visible among the raw bytes.
It can also be viewed in decoded form with:
[hadoop@s202 ~]$hdfs dfs -text /flume/events/18-09-06/08/52/50/events-.1536195171721
1536195171972 68 65 6c 6c 6f 77 20 73 73 64 66 76 61 73 20 66 66 66 67
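In the -text output, the first column (1536195171972) is the event timestamp in milliseconds (the LongWritable key) and the remaining columns are the event body as hex bytes (the BytesWritable value). The body can be recovered with xxd (ships with vim; an assumption about the local toolchain):

```shell
# Decode the hex byte columns from `hdfs dfs -text` back into the event body;
# xxd -r -p reverses a plain hexdump and ignores whitespace.
echo '68 65 6c 6c 6f 77 20 73 73 64 66 76 61 73 20 66 66 66 67' | xxd -r -p
# prints: hellow ssdfvas fffg
```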