Case background:
In a Flume case collecting with multiple Sources, multiple Channels and multiple Sinks, the Channel section of the configuration file test-flume-into-hbase-multi-position.conf looked like this:
# Channel configuration
agent.channels.file-channel-1.type = file
agent.channels.file-channel-1.checkpointDir = /data/flume-hbase-test/checkpoint
agent.channels.file-channel-1.dataDirs = /home/hadoop/flume-hbase-test/data/
agent.channels.file-channel-2.type = file
agent.channels.file-channel-2.checkpointDir = /data/flume-hbase-test/checkpoint2
agent.channels.file-channel-2.dataDirs = /home/hadoop/flume-hbase-test/data2/
When starting Flume with:
hadoop@master:/data/bigdata/apache-flume-1.8.0-bin/bin$ ./flume-ng agent -n agent -f ../conf/test-flume-into-hbase-multi-position.conf -c ../conf/ -Dflume.root.logger=INFO,console
(note that the name passed to -n must match the property prefix, agent., used in the file), the console kept printing an error.
The error was:
Cannot lock /home/hadoop/.flume/file-channel/data. The directory is already locked. [channel=file-channel-1]
Roughly, it means the directory /home/hadoop/.flume/file-channel/data is locked. But my dataDirs is clearly set to /home/hadoop/flume-hbase-test/ in the configuration. (As far as I can tell, /home/hadoop/.flume/file-channel/data is the File Channel's built-in default dataDir, so Flume seemed to be falling back to the default instead of picking up my setting.) After searching all the major platforms without finding a solution, I finally came across the following advice on sites outside the firewall:
There was also a screenshot (which I forgot to save) saying that you should not place the directories of different channels as sibling subdirectories under the same parent, e.g. /data/data1 and /data/data2 (this is where the directory can end up occupied or locked); you should use separate parent directories instead, e.g. /data/data and /data2/data.
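That advice can be sketched as a before/after on the dataDirs properties (the paths here are illustrative, not from my cluster):

```properties
# Risky layout: both channels' directories are siblings under one parent
# agent.channels.file-channel-1.dataDirs = /data/data1
# agent.channels.file-channel-2.dataDirs = /data/data2

# Safer layout: each channel gets its own parent directory
agent.channels.file-channel-1.dataDirs = /data/data
agent.channels.file-channel-2.dataDirs = /data2/data
```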
So I changed the configuration file:
# Channel configuration
agent.channels.file-channel-1.type = file
agent.channels.file-channel-1.checkpointDir = /data/flume-hbase-test/checkpoint
agent.channels.file-channel-1.dataDirs = /home/hadoop/flume-hbase-test/data/
agent.channels.file-channel-2.type = file
agent.channels.file-channel-2.checkpointDir = /data/flume-hbase-test2/checkpoint
agent.channels.file-channel-2.dataDirs = /home/hadoop/flume-hbase-test2/data/
That solved the problem, and the console no longer reports any errors!
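As an aside, if the lock error ever comes back even with separate directories, my understanding is that the File Channel writes an in_use.lock file into each checkpoint and data directory, and a lock left behind by an agent that died uncleanly can block the next start. A minimal sketch of clearing such a stale lock, using a temporary directory as a stand-in for a real dataDir (only do this after confirming no flume-ng process is still running, e.g. with pgrep -f flume-ng):

```shell
# Stand-in for a channel dataDir such as /home/hadoop/flume-hbase-test/data
DATA_DIR=$(mktemp -d)

# Simulate a lock file left over by a crashed agent
touch "$DATA_DIR/in_use.lock"

# Remove the stale lock (safe only when no Flume agent is running)
find "$DATA_DIR" -name in_use.lock -delete

# Confirm it is gone
if [ ! -e "$DATA_DIR/in_use.lock" ]; then echo "lock cleared"; fi

rm -rf "$DATA_DIR"
```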
Note:
Check the configuration file very carefully. It is case-sensitive, and a property name that does not match exactly somewhere can cause exactly these kinds of directory-not-found problems.
Finally, the complete configuration file:
# Read real-time data from a file and store it directly into HBase without any processing
agent.sources = logfile-source-1 logfile-source-2
agent.channels = file-channel-1 file-channel-2
agent.sinks = hbase-sink-1 hbase-sink-2
# logfile-source configuration
agent.sources.logfile-source-1.type = exec
agent.sources.logfile-source-1.command = tail -f /data/flume-hbase-test/mkhbasetable/data/test.log
agent.sources.logfile-source-1.checkperiodic = 50
agent.sources.logfile-source-2.type = exec
agent.sources.logfile-source-2.command = tail -f /data/flume-hbase-test/mkhbasetable/data/test2.log
agent.sources.logfile-source-2.checkperiodic = 50
# Channel configuration
agent.channels.file-channel-1.type = file
agent.channels.file-channel-1.checkpointDir = /data/flume-hbase-test/checkpoint
agent.channels.file-channel-1.dataDirs = /home/hadoop/flume-hbase-test/data/
agent.channels.file-channel-2.type = file
agent.channels.file-channel-2.checkpointDir = /data/flume-hbase-test2/checkpoint
agent.channels.file-channel-2.dataDirs = /home/hadoop/flume-hbase-test2/data/
# Sink configuration
agent.sinks.hbase-sink-1.type = org.apache.flume.sink.hbase.HBaseSink
# must match the corresponding settings in hbase-site.xml
agent.sinks.hbase-sink-1.zookeeperQuorum = master:2181,slave1:2181,slave2:2181
agent.sinks.hbase-sink-1.table = mikeal-hbase-table
agent.sinks.hbase-sink-1.columnFamily = familyclom1
agent.sinks.hbase-sink-1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
# text slicing
agent.sinks.hbase-sink-1.serializer.regex = (.*) (.*?) (.*?)
agent.sinks.hbase-sink-1.serializer.colNames = ip,time,url
agent.sinks.hbase-sink-2.type = org.apache.flume.sink.hbase.HBaseSink
# must match hbase-site.xml; note that the default port was changed here earlier because of a port conflict
agent.sinks.hbase-sink-2.zookeeperQuorum = master:2181,slave1:2181,slave2:2181
agent.sinks.hbase-sink-2.table = mikeal-hbase-table
agent.sinks.hbase-sink-2.columnFamily = familyclom2
agent.sinks.hbase-sink-2.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent.sinks.hbase-sink-2.serializer.regex = (.*?) (.*?) (.*?)
agent.sinks.hbase-sink-2.serializer.colNames = ip,time,url
# bind Sources, Sinks and Channels
agent.sources.logfile-source-1.channels = file-channel-1
agent.sinks.hbase-sink-1.channel = file-channel-1
agent.sources.logfile-source-2.channels = file-channel-2
agent.sinks.hbase-sink-2.channel = file-channel-2
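To sanity-check the ip,time,url column mapping, here is a hypothetical log line in the space-separated layout the serializer regexes assume, split into the three fields the same way the capture groups are (the values are made up for illustration):

```shell
# Hypothetical sample event: three space-separated fields (ip, time, url)
LINE="192.168.1.10 2018-05-01T12:00:00 /index.html"

# Split on single spaces, mirroring the serializer's three capture groups
echo "$LINE" | awk '{print "ip=" $1, "time=" $2, "url=" $3}'
# prints: ip=192.168.1.10 time=2018-05-01T12:00:00 url=/index.html
```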