Collection requirement: use an exec source (that is, run a command) to collect the data written to a file and transfer it to HDFS.

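# Exec source: watch the file in real time and emit each newly appended line as an event
a1.sources = s1
a1.channels = c1
a1.sinks = sk1

a1.sources.s1.type = exec
a1.sources.s1.command = tail -F /opt/flume_Method/ceshi/dataList.txt
# Maximum number of lines sent to the channel in one batch
a1.sources.s1.batchSize = 150
a1.sources.s1.channels = c1

# File channel: events are persisted to local disk until the sink takes them
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/flume_Method/data/checkpoint
a1.channels.c1.dataDirs = /opt/flume_Method/data/dataDir

# HDFS sink: one directory per day and per minute
a1.sinks.sk1.type = hdfs
a1.sinks.sk1.hdfs.path = hdfs://master:8020/user/root/flume/%y-%m-%d/%H-%M
# Resolve the time escapes from the agent's local clock, not from an event header
a1.sinks.sk1.hdfs.useLocalTimeStamp = true
# Round the timestamp down to the nearest minute
a1.sinks.sk1.hdfs.round = true
a1.sinks.sk1.hdfs.roundUnit = minute
a1.sinks.sk1.hdfs.roundValue = 1
# Write plain, uncompressed event bodies
a1.sinks.sk1.hdfs.fileType = DataStream
# Time-based and size-based rolling are disabled (0); roll after every 100 events
a1.sinks.sk1.hdfs.rollInterval = 0
a1.sinks.sk1.hdfs.rollSize = 0
a1.sinks.sk1.hdfs.rollCount = 100
# Number of events written before the file is flushed to HDFS
a1.sinks.sk1.hdfs.batchSize = 5
a1.sinks.sk1.channel = c1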
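
The post does not show how the agent is started. A minimal sketch, assuming the configuration above is saved as `exec-hdfs.conf` (a hypothetical file name) and that Flume's `flume-ng` launcher is on the `PATH`, might look like this. The variable names `FLUME_BIN`, `CONF_DIR`, and `CONF_FILE` are illustrative only.

```shell
# Assumed values; adjust them to your installation.
FLUME_BIN="${FLUME_BIN:-flume-ng}"         # launcher script shipped with Flume
CONF_DIR="${CONF_DIR:-conf}"               # directory with flume-env.sh and the logging config
CONF_FILE="${CONF_FILE:-exec-hdfs.conf}"   # hypothetical name of the file holding the config above

start_agent() {
  # --name must match the agent prefix used in the configuration (a1).
  "$FLUME_BIN" agent --conf "$CONF_DIR" --conf-file "$CONF_FILE" --name a1
}
```

Calling `start_agent` runs the agent in the foreground; stop it with Ctrl+C.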

Append data to the file (/opt/flume_Method/ceshi/dataList.txt), then check whether new files appear under the HDFS directory (/user/root/flume/).
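
The check described above can be sketched as two small shell helpers. The function names `append_test_lines` and `list_output` are hypothetical; `hdfs dfs -ls -R` is the standard recursive listing command. Because the sink rolls only on `rollCount=100`, a file keeps its in-progress `.tmp` suffix until 100 events have been written to it, so appending at least 100 lines is probably the easiest way to see a finalized file.

```shell
# Append COUNT numbered test lines to FILE (the file watched by tail -F).
append_test_lines() {
  file="$1"
  count="$2"
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'test line %s\n' "$i" >> "$file"
    i=$((i + 1))
  done
}

# Recursively list what the sink has written so far (requires the hdfs client).
list_output() {
  hdfs dfs -ls -R /user/root/flume/
}
```

For example, run `append_test_lines /opt/flume_Method/ceshi/dataList.txt 200` on the host where the agent is running, then `list_output` a few seconds later.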

Reposted from blog.csdn.net/GX_0824/article/details/126964619